Abstract
This paper discusses state-of-the-art automatic analysis tools for personal
audio content management. Our main goal is to create a system in which several
co-operating management tools for an audio database improve each other's
results. A Bayesian-network-based audio classification algorithm classifies
audio into four main classes (silence, speech, music, and noise) and serves as
the first step for the subsequent analysis tools. For speech analysis, we
propose an improved Bayesian information criterion based speaker segmentation
and clustering algorithm that also applies a combined gender and emotion
detection algorithm based on prosodic features. For the other main classes, it
is often hard to devise a general, well-functioning pre-categorization that
would fit the unforeseeable types of user-recorded data. To compensate for the
absence of analysis tools for these classes, we propose an efficient audio
similarity measure and a query-by-example algorithm with database clustering
capabilities. The experimental results show that the combined use of the
algorithms is feasible in practice.
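
As an illustration of the Bayesian information criterion (BIC) idea behind the speaker segmentation step, the following is a minimal sketch of the standard ΔBIC change-point test with one full-covariance Gaussian per segment. It is not the improved algorithm proposed in the paper, whose details the abstract does not give. The function names `delta_bic` and `best_change_point`, the parameters `penalty_weight` and `margin`, and the small covariance regularization term are all assumptions made for this example.

```python
import numpy as np


def delta_bic(features, split, penalty_weight=1.0):
    """Standard delta-BIC for splitting `features` (frames x dims) at `split`.

    A positive value favors modelling the two halves with separate
    full-covariance Gaussians, i.e. it suggests a speaker change.
    """
    n, d = features.shape
    left, right = features[:split], features[split:]

    def logdet_cov(x):
        # Maximum-likelihood covariance, lightly regularized (assumption)
        # so that the log-determinant stays finite.
        cov = np.cov(x, rowvar=False, bias=True) + 1e-6 * np.eye(d)
        _, logdet = np.linalg.slogdet(cov)
        return logdet

    # Penalty for the extra parameters of the two-model hypothesis:
    # d mean values plus d(d+1)/2 covariance values.
    penalty = 0.5 * penalty_weight * (d + 0.5 * d * (d + 1)) * np.log(n)
    return (0.5 * n * logdet_cov(features)
            - 0.5 * len(left) * logdet_cov(left)
            - 0.5 * len(right) * logdet_cov(right)
            - penalty)


def best_change_point(features, margin=20, penalty_weight=1.0):
    """Return (index, score) of the best split, or (None, score) if no
    candidate split has a positive delta-BIC."""
    n = features.shape[0]
    candidates = range(margin, n - margin)
    scores = [delta_bic(features, i, penalty_weight) for i in candidates]
    best = int(np.argmax(scores))
    if scores[best] > 0:
        return margin + best, scores[best]
    return None, scores[best]
```

In practice `features` would hold per-frame acoustic features such as MFCCs, and the test would be applied inside a sliding or growing window. Raising `penalty_weight` makes the detector more conservative about reporting a change.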
Original language | English |
---|---|
Title of host publication | MIR '08: Proceedings of the 1st ACM international conference on multimedia information retrieval |
Publisher | Association for Computing Machinery (ACM) |
Pages | 113-120 |
ISBN (Print) | 978-1-60558-312-9 |
DOIs | |
Publication status | Published - 2008 |
MoE publication type | A4 Article in a conference publication |
Event | 1st ACM International Conference on Multimedia Information Retrieval, MIR 2008 - Vancouver, Canada. Duration: 30 Oct 2008 → 31 Oct 2008 |
Conference
Conference | 1st ACM International Conference on Multimedia Information Retrieval, MIR 2008 |
---|---|
Abbreviated title | MIR 2008 |
Country/Territory | Canada |
City | Vancouver |
Period | 30/10/08 → 31/10/08 |
Keywords
- Personal audio content management
- audio classification
- speaker segmentation
- emotion detection
- query-by-example