Data Mining Tools for Technology and Competitive Intelligence

Laura Ruotsalainen

Research output: Book/Report > Report > Professional

4 Citations (Scopus)

Abstract

Approximately 80% of scientific and technical information can be found in patent documents alone, according to a study carried out by the European Patent Office. Patents are also a unique source of information because they are collected, screened and published according to internationally agreed standards. In addition to being an extremely valuable source of technology intelligence, patent documents offer a business competitive intelligence by revealing a competitor's strengths and strategies. Information gained from patents can also help in locating partners for cross-licensing and collaboration. Since the patent system was established, more than 60 million patent applications have been published, so finding and analyzing the relevant documents manually would be impossible. Many solution providers have acknowledged the need for patent analysis and evaluation tools, and new solutions are continuously coming onto the market: tools for reading and evaluating individual patents, and tools for analyzing sets of patent documents. Solutions of the latter type can be further divided into two rough groups: tools for retrieving and preparing basic statistics on patent documents, and tools for visualization and more advanced analysis of patents. The former group deals only with structured data, whereas the latter also analyzes unstructured text and other data. In this study, four efficient tools for analyzing patent documents were tested: Thomson Reuters' Aureka and Thomson Data Analyzer, BioWisdom's OmniViz, and STN's STN AnaVist. All four tools analyze structured and unstructured data alike. They all visualize the results of clustering the text fields of patent documents, and they either produce basic statistics graphs themselves or include filters for producing them with other solutions. The tools were tested with two cases, evaluating their ability to offer technology and business intelligence from patent documents for companies' daily business.
Being aware of the state of the art in relevant technology areas is crucial for a company's innovation process. Knowledge of existing techniques and products forestalls overlapping R&D projects and thereby prevents unnecessary investment. Equally important is recognizing the other actors operating in the field. Benchmarking and evaluating a competitor's R&D and market strategies helps in managing one's own processes and in locating possible parties for collaboration or cross-licensing. This study took the point of view of a patent analyst with a basic understanding of patent data but no special knowledge of data mining techniques or of the tools tested. All the tools evaluated are very useful for the task and quite easy to adopt for daily work. Each had strengths and weaknesses relative to the others. In conclusion, OmniViz and Thomson Data Analyzer are tools for sophisticated and diversified mathematical analysis of the data, whereas Aureka and AnaVist are convenient for easily visualizing basic statistics and "top lists" of the data and for making stylish patent maps. The unique features of OmniViz, compared with the other tools tested, are the ability to visualize clustered data from many different points of view and the ability to evaluate some attributes with patent map animations. Thomson Data Analyzer offers efficient tools for comparing different subsets of the data, e.g. for identifying unique values of an attribute. Aureka is the only tool that allows citation analyses, and it has the most illustrative patent map. STN AnaVist is the best at retrieving basic statistics quickly and smoothly. The results obtained with all four tools were very much alike, even though different databases were used for retrieving the data. The lists of top assignees and inventors were consistent, as were the year trends and both the technological and the geographical business areas. Only the relative rankings and the document counts varied. However, the conclusions drawn from the results, and the business decisions made with them, would be similar regardless of the tool used.
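The "basic statistics" the abstract refers to (top lists, year trends, and unique values of an attribute in one subset compared with another) can be illustrated with a minimal Python sketch. It is not taken from the report and does not reproduce how any of the four tools works; the record layout, the field names `assignee` and `year`, the helper names and all sample values are hypothetical.

```python
from collections import Counter

# Hypothetical structured patent records; names and years are invented.
records_a = [
    {"assignee": "Alpha Oy", "year": 2005},
    {"assignee": "Beta Inc", "year": 2006},
    {"assignee": "Alpha Oy", "year": 2006},
    {"assignee": "Gamma AG", "year": 2006},
]
records_b = [
    {"assignee": "Alpha Oy", "year": 2006},
    {"assignee": "Delta Ltd", "year": 2007},
]


def top_list(records, field, n=3):
    """Return the n most frequent values of `field` with their document counts."""
    return Counter(r[field] for r in records).most_common(n)


def year_trend(records):
    """Return (year, document count) pairs in chronological order."""
    return sorted(Counter(r["year"] for r in records).items())


def unique_values(records, other, field):
    """Return values of `field` that occur in `records` but not in `other`."""
    return {r[field] for r in records} - {r[field] for r in other}


if __name__ == "__main__":
    print(top_list(records_a, "assignee"))
    print(year_trend(records_a))
    print(unique_values(records_a, records_b, "assignee"))
```

Two result sets drawn from different databases could be compared in the same way: the top-list entries may match while their order and counts differ.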
Original language: English
Place of Publication: Espoo
Publisher: VTT Technical Research Centre of Finland
Number of pages: 68
ISBN (Electronic): 978-951-38-7241-0
ISBN (Print): 978-951-38-7240-3
Publication status: Published - 2008
MoE publication type: Not Eligible

Publication series

Name: VTT Tiedotteita - Research Notes
Publisher: VTT
No.: 2451
ISSN (Print): 1235-0605
ISSN (Electronic): 1455-0865

Fingerprint

Competitive intelligence
Data mining
Statistics
Industry
Benchmarking
Animation

Keywords

  • patent data
  • text mining
  • data mining
  • patent mining
  • patent mapping
  • competitive intelligence
  • technology intelligence
  • visualization

Cite this

Ruotsalainen, L. (2008). Data Mining Tools for Technology and Competitive Intelligence. Espoo: VTT Technical Research Centre of Finland. VTT Tiedotteita - Meddelanden - Research Notes, No. 2451