Department of Ecoscience, Aarhus University, Roskilde, Denmark.
Girton College, Department of Zoology, University of Cambridge, Cambridge, UK.
Department of Psychology, University of Warwick, Coventry, UK.
Human Biology Program, Michigan State University, East Lansing, MI, USA.
Department of Zoology, Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic.
Department of Biology and Emory National Primate Research Center, Emory University, Atlanta, GA, USA.
Department of Zoology, Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic.
Department of Computer Science, Oxford University, Oxford, UK.
Department of Computer Science, San Diego State University, San Diego, CA, USA.
School of Life and Environmental Sciences, University of Lincoln, Lincoln, UK.
Czech Academy of Sciences, Institute of Vertebrate Biology, Brno, Czech Republic; Faculty of Environmental Sciences, Czech University of Life Sciences Prague, Prague, Czech Republic; Forestry and Game Management Research Institute, Jíloviště, Czech Republic.
Department of Integrative Biology, Michigan State University, East Lansing, MI, USA; Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, MI, USA; Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, MI, USA.
Biology Department, University of Konstanz, Konstanz, Germany; Department for the Ecology of Animal Societies, Max Planck Institute of Animal Behavior, Konstanz, Germany.
Department of Biology and Emory National Primate Research Center, Emory University, Atlanta, GA, USA.
Wildlife Conservation Research Unit, Recanati-Kaplan Centre, Department of Biology, University of Oxford, Oxford, UK.
Université de Toulon, Aix Marseille University, CNRS, LIS, Toulon, France.
2025 (English). In: Bioacoustics, ISSN 0952-4622, E-ISSN 2165-0586, Vol. 34, no. 4, p. 419-446. Article in journal (Refereed), Published.
Abstract [en]
The fundamental frequency (F0) is a key parameter for characterising structures in vertebrate vocalisations, for instance when defining vocal repertoires and their variations at different biological scales (e.g. population dialects, individual signatures). However, extracting F0 manually is too laborious, and automating the extraction is complex. Despite significant advances in automatic F0 estimation for speech and music, similar progress in bioacoustics has been limited. To address this gap, we compile and publish a benchmark dataset of over 250,000 calls from 14 taxa, each paired with ground truth F0 values. These vocalisations range from infrasound to ultrasound and from high to low harmonicity, and some include non-linear phenomena. Testing different algorithms on these signals, we demonstrate the potential of neural networks for F0 estimation, even for taxa not seen in training or when trained without labels. In addition, to indicate how applicable an algorithm is to a given set of signals, we propose spectral measurements of F0 quality that correlate well with performance. While current performance is not satisfactory for all studied taxa, the results suggest that deep learning could yield a more generic and reliable bioacoustic F0 tracker, helping the community to analyse vocalisations via their F0 contours.
Place, publisher, year, edition, pages
Informa UK Limited, 2025
Keywords
cross-species dataset, deep learning, fundamental frequency (F0), vocalisation analysis
National Category
Artificial Intelligence
Identifiers
urn:nbn:se:kth:diva-366189 (URN)
10.1080/09524622.2025.2500380 (DOI)
001501315800001 ()
2-s2.0-105007437974 (Scopus ID)
Note
QC 20250704
2025-07-04 2025-07-04 2025-07-04 Bibliographically approved