Combining context and existing knowledge when recognizing biological entities - Early results

Mika Timonen, Antti Pesonen

Research output: Chapter in Book/Report/Conference proceeding, Chapter or book article, Scientific, peer-review

1 Citation (Scopus)

Abstract

Entity recognition has been studied for several years with good results. However, as the focus of information extraction (IE) and entity recognition (ER) has shifted to biology and bioinformatics, existing methods no longer produce results as good as before. This is mainly due to the complex naming conventions of biological entities. For OAT (Ontology Aided Text mining system), our information extraction system for biomedical documents, we developed our own method for recognizing biological entities. Unlike existing methods, which use lexicons, rules and statistics, we combine the context of the entity with existing knowledge about the relationships between the entities. This has produced encouraging preliminary results. This paper describes the approach to entity recognition used in our information extraction system.
Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining
Subtitle of host publication: 12th Pacific-Asia Conference, PAKDD 2008 Osaka, Japan, May 20-23, 2008 Proceedings
Editors: Takashi Washio, Einoshin Suzuki, Kai Ming Ting, Akihiro Inokuchi
Publisher: Springer
Pages: 1028-1034
ISBN (Electronic): 978-3-540-68125-0
ISBN (Print): 978-3-540-68124-3
DOIs: https://doi.org/10.1007/978-3-540-68125-0
Publication status: Published - 2008
MoE publication type: A3 Part of a book or another research book
Event: 12th Pacific-Asia Conference, PAKDD 2008. Osaka, Japan, 20 - 23 May, 2008
Duration: 1 Jan 2008 → …

Publication series

Series: Lecture Notes in Computer Science
Volume: 5102
ISSN: 0302-9743

Conference

Conference: 12th Pacific-Asia Conference, PAKDD 2008. Osaka, Japan, 20 - 23 May, 2008
Period: 1/01/08 → …

Fingerprint

Bioinformatics
Ontology
Statistics

Keywords

  • entity recognition
  • entity classification
  • information extraction

Cite this

Timonen, M., & Pesonen, A. (2008). Combining context and existing knowledge when recognizing biological entities - Early results. In T. Washio, E. Suzuki, K. M. Ting, & A. Inokuchi (Eds.), Advances in Knowledge Discovery and Data Mining: 12th Pacific-Asia Conference, PAKDD 2008 Osaka, Japan, May 20-23, 2008 Proceedings (pp. 1028-1034). Springer. Lecture Notes in Computer Science, Vol. 5102. https://doi.org/10.1007/978-3-540-68125-0
Timonen, Mika ; Pesonen, Antti. / Combining context and existing knowledge when recognizing biological entities - Early results. Advances in Knowledge Discovery and Data Mining: 12th Pacific-Asia Conference, PAKDD 2008 Osaka, Japan, May 20-23, 2008 Proceedings. editor / Takashi Washio ; Einoshin Suzuki ; Kai Ming Ting ; Akihiro Inokuchi. Springer, 2008. pp. 1028-1034 (Lecture Notes in Computer Science, Vol. 5102).
@inbook{08051f9a01574f988c772331b18b4865,
title = "Combining context and existing knowledge when recognizing biological entities - Early results",
abstract = "Entity recognition has been studied for several years with good results. However, as the focus of information extraction (IE) and entity recognition (ER) has been set on biology and bioinformatics, the existing methods do not produce as good results as before. This is mainly due to the complex naming conventions of biological entities. In our information extraction system for biomedical documents called OAT (Ontology Aided Text mining system) we developed our own method for recognizing the biological entities. The difference to the existing methods, which use lexicons, rules and statistics, is that we combine the context of the entity with the existing knowledge about the relationships of the entities. This has produced encouraging preliminary results. This paper describes the approach we are using in our information extraction system for entity recognition.",
keywords = "entity recognition, entity classification, information extraction",
author = "Mika Timonen and Antti Pesonen",
year = "2008",
doi = "10.1007/978-3-540-68125-0",
language = "English",
isbn = "978-3-540-68124-3",
series = "Lecture Notes in Computer Science",
publisher = "Springer",
pages = "1028--1034",
editor = "Takashi Washio and Einoshin Suzuki and Ting, {Kai Ming} and Akihiro Inokuchi",
booktitle = "Advances in Knowledge Discovery and Data Mining",
address = "Germany",
volume = "5102",
}

Timonen, M & Pesonen, A 2008, Combining context and existing knowledge when recognizing biological entities - Early results. in T Washio, E Suzuki, KM Ting & A Inokuchi (eds), Advances in Knowledge Discovery and Data Mining: 12th Pacific-Asia Conference, PAKDD 2008 Osaka, Japan, May 20-23, 2008 Proceedings. Springer, Lecture Notes in Computer Science, vol. 5102, pp. 1028-1034, 12th Pacific-Asia Conference, PAKDD 2008. Osaka, Japan, 20 - 23 May, 2008, 1/01/08. https://doi.org/10.1007/978-3-540-68125-0

Combining context and existing knowledge when recognizing biological entities - Early results. / Timonen, Mika; Pesonen, Antti.

Advances in Knowledge Discovery and Data Mining: 12th Pacific-Asia Conference, PAKDD 2008 Osaka, Japan, May 20-23, 2008 Proceedings. ed. / Takashi Washio; Einoshin Suzuki; Kai Ming Ting; Akihiro Inokuchi. Springer, 2008. p. 1028-1034 (Lecture Notes in Computer Science, Vol. 5102).

TY - CHAP

T1 - Combining context and existing knowledge when recognizing biological entities - Early results

AU - Timonen, Mika

AU - Pesonen, Antti

PY - 2008

Y1 - 2008

N2 - Entity recognition has been studied for several years with good results. However, as the focus of information extraction (IE) and entity recognition (ER) has been set on biology and bioinformatics, the existing methods do not produce as good results as before. This is mainly due to the complex naming conventions of biological entities. In our information extraction system for biomedical documents called OAT (Ontology Aided Text mining system) we developed our own method for recognizing the biological entities. The difference to the existing methods, which use lexicons, rules and statistics, is that we combine the context of the entity with the existing knowledge about the relationships of the entities. This has produced encouraging preliminary results. This paper describes the approach we are using in our information extraction system for entity recognition.

KW - entity recognition

KW - entity classification

KW - information extraction

U2 - 10.1007/978-3-540-68125-0

DO - 10.1007/978-3-540-68125-0

M3 - Chapter or book article

SN - 978-3-540-68124-3

T3 - Lecture Notes in Computer Science

SP - 1028

EP - 1034

BT - Advances in Knowledge Discovery and Data Mining

A2 - Washio, Takashi

A2 - Suzuki, Einoshin

A2 - Ting, Kai Ming

A2 - Inokuchi, Akihiro

PB - Springer

ER -

Timonen M, Pesonen A. Combining context and existing knowledge when recognizing biological entities - Early results. In Washio T, Suzuki E, Ting KM, Inokuchi A, editors, Advances in Knowledge Discovery and Data Mining: 12th Pacific-Asia Conference, PAKDD 2008 Osaka, Japan, May 20-23, 2008 Proceedings. Springer. 2008. p. 1028-1034. (Lecture Notes in Computer Science, Vol. 5102). https://doi.org/10.1007/978-3-540-68125-0