A topic model analysis of science and technology linkages: A case study in pharmaceutical industry

Samira Ranaei, Arho Suominen, Ozgur Dedehayir

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

1 Citation (Scopus)

Abstract

Science and technology (S&T) linkages have been studied extensively using patent and scientific publication databases. Existing methods for tracking S&T linkages, such as analysis of non-patent literature (NPL) or author-inventor matching, offer only a narrow window for industry-level analysis of the data. This paper examines the application of a machine learning algorithm, namely Latent Dirichlet Allocation, to detect semantic relationships between patent and scientific publication corpora. The case of "Taxol", a cancer drug, is used to illustrate the performance of the unsupervised algorithm in clustering documents with similar topics. In total, 26 475 documents retrieved from the Europe PMC database were used as a sample for the analysis. Qualitative analysis of the clusters shows that the topic clustering algorithm is a valuable approach for detecting patent and publication linkages.
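
As a rough illustration of the idea in the abstract, the sketch below fits Latent Dirichlet Allocation to a mixed set of patent and publication texts, assigns each document to its dominant topic, and counts how many documents of each type land in each topic cluster. It is not the authors' pipeline: the four toy documents are invented, and the inference scheme (collapsed Gibbs sampling), the function name `lda_gibbs`, and the hyperparameters `alpha`, `beta`, `n_topics`, and `n_iter` are all assumptions made for this example.

```python
import random
from collections import Counter


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (an assumed inference choice).

    docs is a list of token lists; returns one topic distribution per document.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(n_topics)]  # count of word w in topic k
    topic_total = [0] * n_topics                       # tokens assigned to topic k

    # Start from a random topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]


# Invented toy corpus: (document type, tokens). Not data from the paper.
corpus = [
    ("patent", "paclitaxel formulation nanoparticle delivery composition".split()),
    ("publication", "paclitaxel nanoparticle delivery tumor formulation".split()),
    ("patent", "taxane synthesis intermediate process yield".split()),
    ("publication", "taxane synthesis baccatin intermediate yield".split()),
]

theta = lda_gibbs([tokens for _, tokens in corpus], n_topics=2)

# Group documents by dominant topic and tally patents vs. publications.
clusters = {}
for (source, _), dist in zip(corpus, theta):
    dominant = max(range(len(dist)), key=dist.__getitem__)
    clusters.setdefault(dominant, Counter())[source] += 1

for topic, sources in sorted(clusters.items()):
    print(topic, dict(sources))
```

A topic cluster that contains both patents and publications is a candidate S&T linkage under this reading; whether it is a meaningful one still requires the kind of qualitative inspection the abstract describes.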
Original language: English
Title of host publication: 2017 IEEE Technology & Engineering Management Conference (TEMSCON)
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 49-54
Number of pages: 6
ISBN (Electronic): 978-1-5090-1114-8
ISBN (Print): 978-1-5090-1115-5
DOI: https://doi.org/10.1109/TEMSCON.2017.7998353
Publication status: Published - 31 Jul 2017
MoE publication type: A4 Article in a conference publication
Event: IEEE Technology & Engineering Management Conference, TEMSCON 2017 - Santa Clara, United States
Duration: 8 Jun 2017 - 10 Jun 2017

Conference

Conference: IEEE Technology & Engineering Management Conference, TEMSCON 2017
Abbreviated title: TEMSCON 2017
Country: United States
City: Santa Clara
Period: 8/06/17 - 10/06/17

Fingerprint

Drug products
Clustering algorithms
Learning algorithms
Learning systems
Industry
Semantics
Pharmaceutical industry
Linkage
Topic model
Patents
Scientific publications
Data base
Cancer
Dirichlet
Clustering algorithm
Qualitative analysis
Document clustering
Levels of analysis
Drugs
Learning algorithm

Keywords

  • patents
  • couplings
  • drugs
  • machine learning algorithms
  • algorithm design and analysis
  • classification algorithms
  • analytical models
  • topic modeling
  • technology management
  • taxol
  • machine learning
  • science and technology

Cite this

Ranaei, S., Suominen, A., & Dedehayir, O. (2017). A topic model analysis of science and technology linkages: A case study in pharmaceutical industry. In 2017 IEEE Technology & Engineering Management Conference (TEMSCON) (pp. 49-54). [7998353] Institute of Electrical and Electronics Engineers (IEEE). https://doi.org/10.1109/TEMSCON.2017.7998353
@inproceedings{07a38444714b46af97e8e8d96853b783,
title = "A topic model analysis of science and technology linkages: A case study in pharmaceutical industry",
abstract = "Science and technology (S\&T) linkages have been studied extensively using patent and scientific publication databases. Existing methods for tracking S\&T linkages, such as analysis of non-patent literature (NPL) or author-inventor matching, offer only a narrow window for industry-level analysis of the data. This paper examines the application of a machine learning algorithm, namely Latent Dirichlet Allocation, to detect semantic relationships between patent and scientific publication corpora. The case of {"}Taxol{"}, a cancer drug, is used to illustrate the performance of the unsupervised algorithm in clustering documents with similar topics. In total, 26 475 documents retrieved from the Europe PMC database were used as a sample for the analysis. Qualitative analysis of the clusters shows that the topic clustering algorithm is a valuable approach for detecting patent and publication linkages.",
keywords = "patents, couplings, drugs, machine learning algorithms, algorithm design and analysis, classification algorithms, analytical models, topic modeling, technology management, taxol, machine learning, science and technology",
author = "Samira Ranaei and Arho Suominen and Ozgur Dedehayir",
year = "2017",
month = "7",
day = "31",
doi = "10.1109/TEMSCON.2017.7998353",
language = "English",
isbn = "978-1-5090-1115-5",
pages = "49--54",
booktitle = "2017 IEEE Technology \& Engineering Management Conference (TEMSCON)",
publisher = "Institute of Electrical and Electronics Engineers (IEEE)",
address = "United States",

}

