TRECVID 2003 experiments at MediaTeam Oulu and VTT

Mika Rautiainen, Jani Penttilä, Paavo Pietarila, Kai Noponen, Matti Hosio, Timo Koskela, Satu-Marja Mäkelä, Johannes Peltola, Jialin Liu, Timo Ojala, Tapio Seppänen

Research output: Contribution to conference › Conference article › Scientific › peer-review


MediaTeam Oulu and VTT Technical Research Centre of Finland participated jointly in the semantic feature extraction, manual search and interactive search tasks of TRECVID 2003. For semantic feature extraction, we submitted results for 15 of the 17 defined semantic categories. Our approach utilized spatio-temporal visual features based on correlations of quantized gradient edges and color values, together with several physical features from the audio signal. The most recent version of our Video Browsing and Retrieval System (VIRE) contains an interactive cluster-temporal browser of video shots that exploits three semantic levels of similarity: visual, conceptual and lexical. The informativeness of the browser was enhanced by incorporating automatic speech recognition (ASR) transcripts into the visual views based on shot key frames. The experimental results for the interactive search task were obtained from a user experiment with eight participants and two system configurations: browsing by (I) visual features only (visual and conceptual browsing was allowed, no browsing with ASR text) or (II) visual features and ASR text (all semantic browsing levels were available and ASR text content was visible). The interactive results using ASR-based features were better than the results using only visual features, which indicates the importance of successfully integrating visual and textual features for video browsing. In contrast to the previous version of VIRE, which performed early feature fusion by training unsupervised self-organizing maps, the newest version capitalises on late fusion of feature queries, which was evaluated in the manual search task. This paper gives an overview of the developed system and summarises the results.
Original language: English
Publication status: Published - 2003
MoE publication type: Not Eligible
Event: TRECVID Workshop at Text Retrieval Conference, TREC 2003 - Gaithersburg, United States
Duration: 1 Jan 2003 → 1 Jan 2003


Conference: TRECVID Workshop at Text Retrieval Conference, TREC 2003
Country: United States
