On Enabling Techniques for Personal Audio Content Management

Tommi Lahti, Marko Helén, Olli Vuorinen, Eero Väyrynen, Juha Partala, Johannes Peltola, Satu-Marja Mäkelä

    Research output: Chapter in Book/Report/Conference proceeding; Conference article in proceedings; Scientific; peer-review

    1 Citation (Scopus)

    Abstract

    State-of-the-art automatic analysis tools for personal audio content management are discussed in this paper. Our main target is to create a system with several co-operating management tools for an audio database that improve each other's results. A Bayesian network based audio classification algorithm classifies audio into four main classes (silence, speech, music, and noise) and serves as the first step for the subsequent analysis tools. For speech analysis we propose an improved Bayesian information criterion based speaker segmentation and clustering algorithm, which also applies a combined gender and emotion detection algorithm utilizing prosodic features. For the other main classes it is often hard to devise any general, well-functioning pre-categorization that would fit the unforeseeable types of user-recorded data. To compensate for the absence of analysis tools for these classes, we propose the use of an efficient audio similarity measure and a query-by-example algorithm with database clustering capabilities. The experimental results show that the combined use of the algorithms is feasible in practice.
    Original language: English
    Title of host publication: MIR '08 Proceeding of the 1st ACM international conference on multimedia information retrieval
    Publisher: Association for Computing Machinery ACM
    Pages: 113-120
    ISBN (Print): 978-1-60558-312-9
    Publication status: Published - 2008
    MoE publication type: A4 Article in a conference publication
    Event: 1st ACM International Conference on Multimedia Information Retrieval, MIR 2008 - Vancouver, Canada
    Duration: 30 Oct 2010 to 31 Oct 2010

    Conference

    Conference: 1st ACM International Conference on Multimedia Information Retrieval, MIR 2008
    Abbreviated title: MIR 2008
    Country/Territory: Canada
    City: Vancouver
    Period: 30/10/10 to 31/10/10

    Keywords

    • Personal audio content management
    • audio classification
    • speaker segmentation
    • emotion detection
    • query-by-example
