On Enabling Techniques for Personal Audio Content Management

Tommi Lahti, Marko Helén, Olli Vuorinen, Eero Väyrynen, Juha Partala, Johannes Peltola, Satu-Marja Mäkelä

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

1 Citation (Scopus)


This paper discusses state-of-the-art automatic analysis tools for personal audio content management. Our main goal is to create a system in which several co-operating management tools for an audio database improve each other's results. A Bayesian-network-based audio classification algorithm classifies audio into four main classes (silence, speech, music, and noise) and serves as the first step for the subsequent analysis tools. For speech analysis, we propose an improved speaker segmentation and clustering algorithm based on the Bayesian information criterion, which also applies a combined gender and emotion detection algorithm that uses prosodic features. For the other main classes, it is often hard to devise a general, well-functioning pre-categorization that would fit the unforeseeable types of user-recorded data. To compensate for the absence of analysis tools for these classes, we propose an efficient audio similarity measure and a query-by-example algorithm with database clustering capabilities. The experimental results show that the combined use of the algorithms is feasible in practice.
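To illustrate the Bayesian information criterion (BIC) idea that the abstract names for speaker segmentation, the following is a minimal sketch of the standard delta-BIC change-point test. It is not the improved algorithm proposed in the paper. It assumes a one-dimensional feature sequence modelled by a single Gaussian per segment, and the function names, `margin`, and `penalty_weight` are hypothetical choices made for this example.

```python
import math


def log_var(xs):
    """Log of the maximum-likelihood variance of a 1-D sample."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return math.log(max(var, 1e-12))  # floor avoids log(0) on constant input


def delta_bic(xs, split, penalty_weight=1.0):
    """Delta-BIC for splitting xs at index `split`; positive favours a change point."""
    left, right = xs[:split], xs[split:]
    n, n1, n2 = len(xs), len(left), len(right)
    # Likelihood gain of two Gaussians over one Gaussian.
    gain = 0.5 * (n * log_var(xs) - n1 * log_var(left) - n2 * log_var(right))
    # The second Gaussian adds 2 free parameters (mean, variance) in 1-D.
    penalty = penalty_weight * 0.5 * 2 * math.log(n)
    return gain - penalty


def best_change_point(xs, margin=10, penalty_weight=1.0):
    """Return (index, delta_bic) of the strongest split, or None if no split is favoured."""
    candidates = [
        (delta_bic(xs, i, penalty_weight), i)
        for i in range(margin, len(xs) - margin)
    ]
    if not candidates:
        return None  # sequence too short for the chosen margin
    score, index = max(candidates)
    return (index, score) if score > 0 else None


if __name__ == "__main__":
    # Two synthetic "speakers" with clearly different feature levels.
    signal = [0.0, 1.0] * 100 + [10.0, 11.0] * 100
    print(best_change_point(signal))
```

In practice the test would run on multi-dimensional features such as MFCC vectors with full covariance matrices, where the log-variance terms become log-determinants and the penalty grows with the feature dimension.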
Original language: English
Title of host publication: MIR '08 Proceeding of the 1st ACM international conference on multimedia information retrieval
Publisher: Association for Computing Machinery (ACM)
ISBN (Print): 978-1-60558-312-9
Publication status: Published - 2008
MoE publication type: A4 Article in a conference publication
Event: 1st ACM International Conference on Multimedia Information Retrieval, MIR 2008 - Vancouver, Canada
Duration: 30 Oct 2010 → 31 Oct 2010


Conference: 1st ACM International Conference on Multimedia Information Retrieval, MIR 2008
Abbreviated title: MIR 2008


  • Personal audio content management
  • audio classification
  • speaker segmentation
  • emotion detection
  • query-by-example
