A topic model analysis of science and technology linkages: A case study in pharmaceutical industry

Samira Ranaei, Arho Suominen, Ozgur Dedehayir

    Research output: Chapter in Book/Report/Conference proceeding; Conference article in proceedings; Scientific; peer-review

    1 Citation (Scopus)

    Abstract

    Science and technology (S&T) linkages have been studied extensively using patent and scientific publication databases. Existing methods for tracking S&T linkages, such as analysis of non-patent literature (NPL) or author-inventor matching, offer a narrow window for industry-level analysis of the data. This paper examines the application of a machine learning algorithm, namely Latent Dirichlet Allocation, to detect the semantic relationship between patent and scientific publication corpora. The case of "Taxol", a cancer drug, is used to illustrate the performance of the unsupervised algorithm in clustering documents with similar topics. In total, 26 475 documents retrieved from the Europe PMC database were used as a sample for the analysis. Qualitative analysis of the clusters shows that the topic clustering algorithm is a valuable approach for detecting patent and publication linkages.
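The approach summarised in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a small collapsed Gibbs sampler for Latent Dirichlet Allocation written with the Python standard library only, and the function names (`fit_lda`, `cluster_by_kind`), the hyperparameter values, and the toy corpus are all assumptions made for illustration. Documents are grouped by their dominant topic, and a group that contains both patents and publications is read as a candidate S&T linkage.

```python
import random
from collections import Counter


def fit_lda(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampler for LDA (illustrative, not the paper's code).

    docs is a list of token lists; returns one topic-proportion row per document.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed topic proportions per document; each row sums to 1.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]


def cluster_by_kind(kinds, theta):
    """Group documents by dominant topic and count document kinds per group."""
    groups = {}
    for kind, row in zip(kinds, theta):
        top = max(range(len(row)), key=row.__getitem__)
        groups.setdefault(top, Counter())[kind] += 1
    return groups


if __name__ == "__main__":
    # Hypothetical toy corpus; the real study used Europe PMC records.
    kinds = ["patent", "publication", "patent", "publication"]
    docs = [
        "paclitaxel formulation nanoparticle delivery".split(),
        "paclitaxel nanoparticle delivery tumor".split(),
        "taxane synthesis precursor yew extraction".split(),
        "yew bark extraction taxane yield".split(),
    ]
    theta = fit_lda(docs, n_topics=2)
    for topic, counts in sorted(cluster_by_kind(kinds, theta).items()):
        print(topic, dict(counts))
```

Because the sampler is stochastic, the grouping depends on the seed and the number of iterations; on a corpus this small the output is only a demonstration of the mechanics, not evidence about linkage quality.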
    Original language: English
    Title of host publication: 2017 IEEE Technology & Engineering Management Conference (TEMSCON)
    Publisher: IEEE Institute of Electrical and Electronic Engineers
    Pages: 49-54
    Number of pages: 6
    ISBN (Electronic): 978-1-5090-1114-8
    ISBN (Print): 978-1-5090-1115-5
    DOI: https://doi.org/10.1109/TEMSCON.2017.7998353
    Publication status: Published - 31 Jul 2017
    MoE publication type: A4 Article in a conference publication
    Event: IEEE Technology & Engineering Management Conference, TEMSCON 2017 - Santa Clara, United States
    Duration: 8 Jun 2017 to 10 Jun 2017

    Conference

    Conference: IEEE Technology & Engineering Management Conference, TEMSCON 2017
    Abbreviated title: TEMSCON 2017
    Country: United States
    City: Santa Clara
    Period: 8/06/17 to 10/06/17

    Fingerprint

    Drug products
    Clustering algorithms
    Learning algorithms
    Learning systems
    Industry
    Semantics
    Pharmaceutical industry
    Linkage
    Topic model
    Patents
    Scientific publications
    Data base
    Cancer
    Dirichlet
    Clustering algorithm
    Qualitative analysis
    Document clustering
    Levels of analysis
    Drugs
    Learning algorithm

    Keywords

    • patents
    • couplings
    • drugs
    • machine learning algorithms
    • algorithm design and analysis
    • classification algorithms
    • analytical models
    • topic modeling
    • technology management
    • taxol
    • machine learning
    • science and technology

    Cite this

    Ranaei, S., Suominen, A., & Dedehayir, O. (2017). A topic model analysis of science and technology linkages: A case study in pharmaceutical industry. In 2017 IEEE Technology & Engineering Management Conference (TEMSCON) (pp. 49-54). [7998353] IEEE Institute of Electrical and Electronic Engineers. https://doi.org/10.1109/TEMSCON.2017.7998353
    Ranaei, Samira; Suominen, Arho; Dedehayir, Ozgur. / A topic model analysis of science and technology linkages: A case study in pharmaceutical industry. 2017 IEEE Technology & Engineering Management Conference (TEMSCON). IEEE Institute of Electrical and Electronic Engineers, 2017. pp. 49-54
    @inproceedings{07a38444714b46af97e8e8d96853b783,
    title = "A topic model analysis of science and technology linkages: A case study in pharmaceutical industry",
    abstract = "Science and technology (S&T) linkages have been studied extensively using patent and scientific publication databases. Existing methods for tracking S&T linkages, such as analysis of non-patent literature (NPL) or author-inventor matching, offer a narrow window for industry-level analysis of the data. This paper examines the application of a machine learning algorithm, namely Latent Dirichlet Allocation, to detect the semantic relationship between patent and scientific publication corpora. The case of {"}Taxol{"}, a cancer drug, is used to illustrate the performance of the unsupervised algorithm in clustering documents with similar topics. In total, 26 475 documents retrieved from the Europe PMC database were used as a sample for the analysis. Qualitative analysis of the clusters shows that the topic clustering algorithm is a valuable approach for detecting patent and publication linkages.",
    keywords = "patents, couplings, drugs, machine learning algorithms, algorithm design and analysis, classification algorithms, analytical models, topic modeling, technology management, taxol, machine learning, science and technology",
    author = "Samira Ranaei and Arho Suominen and Ozgur Dedehayir",
    year = "2017",
    month = "7",
    day = "31",
    doi = "10.1109/TEMSCON.2017.7998353",
    language = "English",
    isbn = "978-1-5090-1115-5",
    pages = "49--54",
    booktitle = "2017 IEEE Technology & Engineering Management Conference (TEMSCON)",
    publisher = "IEEE Institute of Electrical and Electronic Engineers",
    address = "United States",

    }

    Ranaei, S, Suominen, A & Dedehayir, O 2017, A topic model analysis of science and technology linkages: A case study in pharmaceutical industry. in 2017 IEEE Technology & Engineering Management Conference (TEMSCON)., 7998353, IEEE Institute of Electrical and Electronic Engineers, pp. 49-54, IEEE Technology & Engineering Management Conference, TEMSCON 2017, Santa Clara, United States, 8/06/17. https://doi.org/10.1109/TEMSCON.2017.7998353

    A topic model analysis of science and technology linkages: A case study in pharmaceutical industry. / Ranaei, Samira; Suominen, Arho; Dedehayir, Ozgur.

    2017 IEEE Technology & Engineering Management Conference (TEMSCON). IEEE Institute of Electrical and Electronic Engineers, 2017. p. 49-54 7998353.

    TY - GEN
    T1 - A topic model analysis of science and technology linkages
    T2 - A case study in pharmaceutical industry
    AU - Ranaei, Samira
    AU - Suominen, Arho
    AU - Dedehayir, Ozgur
    PY - 2017/7/31
    Y1 - 2017/7/31
    AB - Science and technology (S&T) linkages have been studied extensively using patent and scientific publication databases. Existing methods for tracking S&T linkages, such as analysis of non-patent literature (NPL) or author-inventor matching, offer a narrow window for industry-level analysis of the data. This paper examines the application of a machine learning algorithm, namely Latent Dirichlet Allocation, to detect the semantic relationship between patent and scientific publication corpora. The case of "Taxol", a cancer drug, is used to illustrate the performance of the unsupervised algorithm in clustering documents with similar topics. In total, 26 475 documents retrieved from the Europe PMC database were used as a sample for the analysis. Qualitative analysis of the clusters shows that the topic clustering algorithm is a valuable approach for detecting patent and publication linkages.
    KW - patents
    KW - couplings
    KW - drugs
    KW - machine learning algorithms
    KW - algorithm design and analysis
    KW - classification algorithms
    KW - analytical models
    KW - topic modeling
    KW - technology management
    KW - taxol
    KW - machine learning
    KW - science and technology
    UR - http://www.scopus.com/inward/record.url?scp=85028575629&partnerID=8YFLogxK
    U2 - 10.1109/TEMSCON.2017.7998353
    DO - 10.1109/TEMSCON.2017.7998353
    M3 - Conference article in proceedings
    SN - 978-1-5090-1115-5
    SP - 49
    EP - 54
    BT - 2017 IEEE Technology & Engineering Management Conference (TEMSCON)
    PB - IEEE Institute of Electrical and Electronic Engineers
    ER -

    Ranaei S, Suominen A, Dedehayir O. A topic model analysis of science and technology linkages: A case study in pharmaceutical industry. In 2017 IEEE Technology & Engineering Management Conference (TEMSCON). IEEE Institute of Electrical and Electronic Engineers. 2017. p. 49-54. 7998353 https://doi.org/10.1109/TEMSCON.2017.7998353