Data Mining Tools for Technology and Competitive Intelligence

Laura Ruotsalainen

Research output: Book/Report › Report

5 Citations (Scopus)


According to a study carried out by the European Patent Office, approximately 80% of scientific and technical information can be found in patent documents alone. Patents are also a unique source of information because they are collected, screened and published according to internationally agreed standards. In addition to being an extremely valuable source of technology intelligence, patent documents offer businesses competitive intelligence by revealing a competitor's strengths and strategies. Information gained from patents can also help in locating partners for cross-licensing and collaboration. Since the patent system was established, more than 60 million patent applications have been published, so finding and analyzing the relevant documents manually would be impossible. Many solution providers have acknowledged the need for patent analysis and evaluation tools, and new solutions are continuously coming onto the market: tools for reading and evaluating individual patents, and tools for analyzing sets of patent documents. Solutions of the latter type can be further divided roughly into two groups: tools for retrieving and preparing basic statistics on patent documents, and tools for visualization and advanced analysis of patents. The former group deals only with data in a structured form, whereas the latter also analyzes unstructured text and other data. In this study, four efficient tools for analyzing patent documents were tested: Thomson Reuters' Aureka and Thomson Data Analyzer, BioWisdom's OmniViz, and STN's STN AnaVist. All four tools analyze structured and unstructured data alike. They all visualize the results of clustering the text fields of patent documents, and they either produce basic statistics graphs themselves or contain filters for producing them with other solutions. The tools were tested with two cases that evaluated their ability to provide technology and business intelligence from patent documents for companies' daily business.
Being aware of the state of the art in relevant technology areas is crucial for a company's innovation process. Knowledge of existing techniques and products prevents overlapping R&D projects and thereby unnecessary investment. Equally important is recognizing the other actors operating in the field. Benchmarking and evaluating a competitor's R&D and market strategies helps in managing one's own processes and in locating possible parties for collaboration or cross-licensing. This study took the point of view of a patent analyst with a basic understanding of patent data but no special knowledge of data mining techniques or of the tools tested. All the tools evaluated are very useful for the task and quite easy to adopt for daily work. Each had strengths and weaknesses relative to the others. In conclusion, OmniViz and Thomson Data Analyzer are tools for sophisticated and diversified mathematical analysis of the data, whereas Aureka and AnaVist are convenient for easily visualizing basic statistics and "top lists" of the data and for making stylish patent maps. The unique features of OmniViz, compared to the other tools tested, are the ability to visualize clustered data from many different points of view and the ability to evaluate some attributes with patent map animations. Thomson Data Analyzer offers efficient tools for comparing different subsets of the data, e.g. for identifying the unique values of an attribute. Aureka is the only tool that allows citation analyses, and it has the most illustrative patent map. STN AnaVist is the best at retrieving basic statistics quickly and smoothly. The results obtained with all four tools were very much alike, even though the data were retrieved from different databases. The lists of top assignees and inventors were consistent, as were the year trends and both the technological and geographical business areas. Only the relative order of the entries and the document counts varied. However, the conclusions drawn from the results, and the business decisions made with them, would be similar regardless of the tool used.
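The "basic statistics" mentioned in the abstract (top lists, year trends, and unique values of an attribute within a subset) can be illustrated with a minimal Python sketch. This is not how any of the four tested tools works internally; the records, the field names (`assignee`, `year`, `ipc`) and the helper functions are all hypothetical and invented for illustration only.

```python
from collections import Counter

# Hypothetical patent records; names and values are invented for illustration.
patents = [
    {"assignee": "Alpha Oy", "year": 2005, "ipc": "G06F"},
    {"assignee": "Alpha Oy", "year": 2006, "ipc": "H04L"},
    {"assignee": "Beta Ltd", "year": 2006, "ipc": "G06F"},
    {"assignee": "Alpha Oy", "year": 2006, "ipc": "G06F"},
    {"assignee": "Gamma AB", "year": 2007, "ipc": "G01S"},
]


def top_list(records, field, n=3):
    """Return the n most frequent values of a field with their document counts."""
    return Counter(r[field] for r in records).most_common(n)


def year_trend(records):
    """Return (year, document count) pairs in chronological order."""
    counts = Counter(r["year"] for r in records)
    return sorted(counts.items())


def unique_values(records, field, group_field, group):
    """Return values of `field` that occur only in the subset where
    `group_field` equals `group`, and nowhere else in the data."""
    inside = {r[field] for r in records if r[group_field] == group}
    outside = {r[field] for r in records if r[group_field] != group}
    return inside - outside


top_assignees = top_list(patents, "assignee")
trend = year_trend(patents)
alpha_only = unique_values(patents, "ipc", "assignee", "Alpha Oy")

print(top_assignees)
print(trend)
print(alpha_only)
```

In this sketch, two tools that retrieve slightly different record sets could report different counts or a different order of entries in `top_assignees` while still naming the same leading assignees, which is the kind of variation the abstract describes.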
Original language: English
Place of Publication: Espoo
Publisher: VTT Technical Research Centre of Finland
Number of pages: 68
ISBN (Electronic): 978-951-38-7241-0
ISBN (Print): 978-951-38-7240-3
Publication status: Published - 2008
MoE publication type: Not Eligible

Publication series

Series: VTT Tiedotteita - Meddelanden - Research Notes


Keywords

  • patent data
  • text mining
  • data mining
  • patent mining
  • patent mapping
  • competitive intelligence
  • technology intelligence
  • visualization

