TY - BOOK
T1 - Data Mining Tools for Technology and Competitive Intelligence
AU - Ruotsalainen, Laura
PY - 2008
Y1 - 2008
N2 - Approximately 80% of scientific and technical information
can be found in patent documents alone, according to a
study carried out by the European Patent Office. Patents
are also a unique source of information since they are
collected, screened and published according to
internationally agreed standards. In addition to being an
extremely valuable source of technology intelligence,
patent documents offer businesses competitive
intelligence by revealing a competitor's strengths and
strategies. Information gained from patents can also help
in locating partners for cross-licensing and
collaboration.
Since the patent system was established, more than 60
million patent applications have been published. It would
be impossible to find and analyze relevant documents
manually. The need for analysis and evaluation tools for
patents has been acknowledged by many solution providers.
New solutions are continuously coming onto the market:
tools for reading and evaluating individual patents and
tools for analyzing sets of patent documents. Solutions
of the latter type can be further divided into two
groups: tools for retrieving and preparing basic
statistics for patent documents, and tools for
visualization and advanced analysis of patents. The
former group deals only with data in a structured form,
whereas the latter also analyzes unstructured text and
other data.
In this study, four efficient tools for analyzing patent
documents were tested: Thomson Reuters' Aureka and
Thomson Data Analyzer, BioWisdom's OmniViz, and STN's STN
AnaVist. All four tools analyze structured and
unstructured data alike. They all visualize the results
achieved from clustering the text fields of patent
documents and either provide basic statistical graphs
themselves or contain filters for producing them with
other solutions.
The tools were tested with two cases, evaluating their
ability to offer technology and business intelligence
from patent documents for companies' daily business.
Being aware of the state of the art of relevant
technology areas is crucial for a company's innovation
process. Knowledge of developed techniques and products
forestalls overlapping R&D projects and thereby prevents
unnecessary investment. Equally important is the
recognition of other actors operating in the field.
Benchmarking and evaluating a competitor's R&D and market
strategies aids in managing one's own processes and
locating possible parties for collaboration or
cross-licensing.
This study took the point of view of a patent analyst
with a basic understanding of patent data but no special
knowledge of data mining techniques or the tools tested.
All the tools evaluated are very useful for the task and
quite easy to adopt for daily work. Each has strengths
and weaknesses relative to the others. In conclusion,
OmniViz and Thomson Data Analyzer are tools for
sophisticated and diversified
mathematical analysis of the data. Aureka and AnaVist are
convenient for easily visualizing basic statistics and
"top lists" of the data and for making stylish patent
maps. The unique features of OmniViz, compared with the
other tools tested, are the ability to visualize
clustered data from many different points of view and the
ability to evaluate some attributes with patent map
animations. Thomson Data Analyzer offers efficient tools
for comparing different subsets of the data, e.g. for
identifying unique values of an attribute. Aureka is the
only tool that allows citation analysis and has the most
illustrative patent map. STN AnaVist is the best at
retrieving basic statistics quickly and smoothly.
The results obtained with all four tools were very much
alike, even though the data were retrieved from different
databases. The lists of top assignees and inventors
were consistent, as were the year trends and both
technological and geographical business areas. Only the
relative rankings and numbers of documents varied.
However, the conclusions drawn from the results, and
business decisions made with them, would all be similar
regardless of the tool used.
KW - patent data
KW - text mining
KW - data mining
KW - patent mining
KW - patent mapping
KW - competitive intelligence
KW - technology intelligence
KW - visualization
M3 - Report
SN - 978-951-38-7240-3
T3 - VTT Tiedotteita - Meddelanden - Research Notes
BT - Data Mining Tools for Technology and Competitive Intelligence
PB - VTT Technical Research Centre of Finland
CY - Espoo
ER -