We describe OAT, a new information extraction system under development. It extracts (subject, predicate, object) triplets from natural language text. It uses ontologies extensively: the extraction results are saved in an ontology, and ontologies are also used in the extraction process itself. It can be adapted to a new domain of discourse and, within a domain, extended by finding new concepts. This paper concentrates on the requirements and architecture of OAT.
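As a rough illustration of the data model the abstract describes, the following sketch stores (subject, predicate, object) triplets in a toy ontology and reports concepts it has not seen before. It is not OAT's implementation: the names `Triple`, `TinyOntology`, and `add`, as well as the example statements, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Triple:
    """One extracted (subject, predicate, object) statement."""
    subject: str
    predicate: str
    obj: str


@dataclass
class TinyOntology:
    """Hypothetical stand-in for an ontology: known concepts plus stored triples."""
    concepts: set = field(default_factory=set)
    triples: list = field(default_factory=list)

    def add(self, triple: Triple) -> list:
        """Store a triple and return any concepts that were not known before."""
        new_concepts = []
        for name in (triple.subject, triple.obj):
            if name not in self.concepts:
                self.concepts.add(name)
                new_concepts.append(name)
        self.triples.append(triple)
        return new_concepts


# Illustrative data only; these statements are not taken from the paper.
ontology = TinyOntology(concepts={"p53", "apoptosis"})
found = ontology.add(Triple("p53", "regulates", "apoptosis"))
novel = ontology.add(Triple("MDM2", "inhibits", "p53"))
```

Here `found` is empty because both concepts were already known, while `novel` contains the previously unseen concept `MDM2`. A real system would presumably rely on an ontology language and reasoner rather than plain sets and lists.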
|Title of host publication|Poster proceedings|
|Subtitle of host publication|Industrial Conference on Data Mining, ICDM 2006. Leipzig, DE, 13-14 July 2006.|
|Place of Publication|Leipzig|
|Publication status|Published - 2006|
|MoE publication type|Not Eligible|
- information extraction
- software requirements
- software architecture
Karanta, I., Pesonen, A., Seitsonen, L., & Silvonen, P. (2006). A text mining system for bioinformatics: requirements and architecture. In P. Perner (Ed.), Poster proceedings: Industrial Conference on Data Mining, ICDM 2006. Leipzig, DE, 13 - 14 July 2006. (pp. 225-229).