Quality management architecture for social media data

Pekka Pääkkönen (Corresponding Author), Juha Jokitulppo

    Research output: Contribution to journal, Article, Scientific, peer-review

    13 Citations (Scopus)


    Social media data has provided various insights into the behaviour of consumers and businesses. However, extracted data may be erroneous, or may have originated from a malicious source. The quality of social media data should therefore be managed. It should also be understood how data quality can be managed across a big data pipeline, which may consist of several processing and analysis phases. The contribution of this paper is an evaluation of a data quality management architecture for social media data. Theoretical concepts from previous work have been implemented for evaluating the data quality of Twitter-based data sets. In particular, a reference architecture for quality management in social media data has been extended and evaluated based on the implementation architecture. Experiments indicate that 150-800 tweets/s can be evaluated with two cloud nodes, depending on the configuration.
    Original language: English
    Article number: 6
    Number of pages: 26
    Journal: Journal of Big Data
    Issue number: 6
    Publication status: Published - 1 Dec 2017
    MoE publication type: A1 Journal article-refereed


    • quality attribute
    • quality metric
    • quality policy
    • Spark
    • Cassandra
    • Word2Vec

