TY - JOUR
T1 - Voice Analysis in Dogs with Deep Learning
T2 - Development of a Fully Automatic Voice Analysis System for Bioacoustics Studies
AU - Karaaslan, Mahmut
AU - Turkoglu, Bahaeddin
AU - Kaya, Ersin
AU - Asuroglu, Tunc
N1 - Publisher Copyright:
© 2024 by the authors.
PY - 2024/12
Y1 - 2024/12
N2 - Extracting behavioral information from animal sounds has long been a focus of research in bioacoustics, as sound-derived data are crucial for understanding animal behavior and environmental interactions. Traditional methods, which involve manual review of extensive recordings, pose significant challenges. This study proposes an automated system for detecting and classifying animal vocalizations, enhancing efficiency in behavior analysis. The system uses a preprocessing step to segment relevant sound regions from audio recordings, followed by feature extraction using Short-Time Fourier Transform (STFT), Mel-frequency cepstral coefficients (MFCCs), and linear-frequency cepstral coefficients (LFCCs). These features are input into convolutional neural network (CNN) classifiers to evaluate performance. Experimental results demonstrate the effectiveness of different CNN models and feature extraction methods, with AlexNet, DenseNet, EfficientNet, ResNet50, and ResNet152 being evaluated. The system achieves high accuracy in classifying vocal behaviors, such as barking and howling in dogs, providing a robust tool for behavioral analysis. The study highlights the importance of automated systems in bioacoustics research and suggests future improvements using deep learning-based methods for enhanced classification performance.
AB - Extracting behavioral information from animal sounds has long been a focus of research in bioacoustics, as sound-derived data are crucial for understanding animal behavior and environmental interactions. Traditional methods, which involve manual review of extensive recordings, pose significant challenges. This study proposes an automated system for detecting and classifying animal vocalizations, enhancing efficiency in behavior analysis. The system uses a preprocessing step to segment relevant sound regions from audio recordings, followed by feature extraction using Short-Time Fourier Transform (STFT), Mel-frequency cepstral coefficients (MFCCs), and linear-frequency cepstral coefficients (LFCCs). These features are input into convolutional neural network (CNN) classifiers to evaluate performance. Experimental results demonstrate the effectiveness of different CNN models and feature extraction methods, with AlexNet, DenseNet, EfficientNet, ResNet50, and ResNet152 being evaluated. The system achieves high accuracy in classifying vocal behaviors, such as barking and howling in dogs, providing a robust tool for behavioral analysis. The study highlights the importance of automated systems in bioacoustics research and suggests future improvements using deep learning-based methods for enhanced classification performance.
KW - audio processing
KW - automatic behavior analysis
KW - bioacoustics
KW - CNN
UR - http://www.scopus.com/inward/record.url?scp=85213281431&partnerID=8YFLogxK
U2 - 10.3390/s24247978
DO - 10.3390/s24247978
M3 - Article
AN - SCOPUS:85213281431
SN - 1424-8220
VL - 24
JO - Sensors
JF - Sensors
IS - 24
M1 - 7978
ER -