No landslide for the human journalist: An empirical study of computer-generated election news in Finland

Magnus Melin (Corresponding Author), Asta Bäck, Caj Södergård, Myriam Munezero, Leo Leppänen

Research output: Contribution to journal, Article, Scientific, peer-review

1 Citation (Scopus)

Abstract

In an age of struggling news media, automated generation of news via Natural Language Generation (NLG) methods could be of great help, especially in areas where the amount of raw input data is large and the structure of the data is known in advance. One such news automation system is the Valtteri NLG system, which generates news articles about the Finnish municipal elections of 2017. To evaluate the quality of Valtteri-produced articles, and to identify aspects to improve, n=152 users were asked to evaluate the output of Valtteri. Each evaluator rated six preselected computer-generated articles, four control articles written by journalists, and four computer-generated articles of their own choice. All the articles were evaluated along four dimensions: credibility, liking, quality and representativeness. As expected, the texts written by Valtteri received lower ratings than those written by journalists, but overall the ratings were satisfactory (avg. 2.9 vs. 4.0 for journalists on a 5-point scale). Valtteri's best rating (3.6) was for credibility. The computer-written articles that the evaluators could freely select got slightly better ratings than the preselected computer-written articles. When looking at the results by demographic groups, males aged 55 or older liked the automatic articles best and females aged 34 or younger liked them the least. Evaluators mistook 21% of the computer-written articles for human-written ones, and 10% of the human-written articles for computer-written ones. The share of users making these mistakes grew with age. Overall, male evaluators made fewer writer-identification mistakes than female evaluators did.
Original language: English
Article number: 8424161
Pages (from-to): 43356-43367
Number of pages: 12
Journal: IEEE Access
Volume: 6
Early online date: 1 Aug 2018
DOI: 10.1109/ACCESS.2018.2861987
Publication status: Published - 1 Aug 2018
MoE publication type: Not Eligible

Keywords

  • Artificial intelligence
  • Automated content generation
  • Automated storytelling
  • Natural language processing
  • Robot journalism

Cite this

@article{8d2f47f8c40f48629d1356191ff2d21a,
title = "No landslide for the human journalist: An empirical study of computer-generated election news in Finland",
abstract = "In an age of struggling news media, automated generation of news via Natural Language Generation (NLG) methods could be of great help, especially in areas where the amount of raw input data is large and the structure of the data is known in advance. One such news automation system is the Valtteri NLG system, which generates news articles about the Finnish municipal elections of 2017. To evaluate the quality of Valtteri-produced articles, and to identify aspects to improve, n=152 users were asked to evaluate the output of Valtteri. Each evaluator rated six preselected computer-generated articles, four control articles written by journalists, and four computer-generated articles of their own choice. All the articles were evaluated along four dimensions: credibility, liking, quality and representativeness. As expected, the texts written by Valtteri received lower ratings than those written by journalists, but overall the ratings were satisfactory (avg. 2.9 vs. 4.0 for journalists on a 5-point scale). Valtteri's best rating (3.6) was for credibility. The computer-written articles that the evaluators could freely select got slightly better ratings than the preselected computer-written articles. When looking at the results by demographic groups, males aged 55 or older liked the automatic articles best and females aged 34 or younger liked them the least. Evaluators mistook 21{\%} of the computer-written articles for human-written ones, and 10{\%} of the human-written articles for computer-written ones. The share of users making these mistakes grew with age. Overall, male evaluators made fewer writer-identification mistakes than female evaluators did.",
keywords = "Artificial intelligence, Automated content generation, Automated storytelling, Natural language processing, Robot journalism",
author = "Magnus Melin and Asta B{\"a}ck and Caj S{\"o}derg{\aa}rd and Myriam Munezero and Leo Lepp{\"a}nen",
year = "2018",
month = "8",
day = "1",
doi = "10.1109/ACCESS.2018.2861987",
language = "English",
volume = "6",
pages = "43356--43367",
journal = "IEEE Access",
issn = "2169-3536",
publisher = "Institute of Electrical and Electronics Engineers (IEEE)",
}

No landslide for the human journalist: An empirical study of computer-generated election news in Finland. / Melin, Magnus (Corresponding Author); Bäck, Asta; Södergård, Caj; Munezero, Myriam; Leppänen, Leo.

In: IEEE Access, Vol. 6, 8424161, 01.08.2018, p. 43356-43367.

TY  - JOUR
T1  - No landslide for the human journalist
T2  - An empirical study of computer-generated election news in Finland
AU  - Melin, Magnus
AU  - Bäck, Asta
AU  - Södergård, Caj
AU  - Munezero, Myriam
AU  - Leppänen, Leo
PY  - 2018/8/1
Y1  - 2018/8/1
N2  - In an age of struggling news media, automated generation of news via Natural Language Generation (NLG) methods could be of great help, especially in areas where the amount of raw input data is large and the structure of the data is known in advance. One such news automation system is the Valtteri NLG system, which generates news articles about the Finnish municipal elections of 2017. To evaluate the quality of Valtteri-produced articles, and to identify aspects to improve, n=152 users were asked to evaluate the output of Valtteri. Each evaluator rated six preselected computer-generated articles, four control articles written by journalists, and four computer-generated articles of their own choice. All the articles were evaluated along four dimensions: credibility, liking, quality and representativeness. As expected, the texts written by Valtteri received lower ratings than those written by journalists, but overall the ratings were satisfactory (avg. 2.9 vs. 4.0 for journalists on a 5-point scale). Valtteri's best rating (3.6) was for credibility. The computer-written articles that the evaluators could freely select got slightly better ratings than the preselected computer-written articles. When looking at the results by demographic groups, males aged 55 or older liked the automatic articles best and females aged 34 or younger liked them the least. Evaluators mistook 21% of the computer-written articles for human-written ones, and 10% of the human-written articles for computer-written ones. The share of users making these mistakes grew with age. Overall, male evaluators made fewer writer-identification mistakes than female evaluators did.
KW  - Artificial intelligence
KW  - Automated content generation
KW  - Automated storytelling
KW  - Natural language processing
KW  - Robot journalism
UR  - http://www.scopus.com/inward/record.url?scp=85050960136&partnerID=8YFLogxK
U2  - 10.1109/ACCESS.2018.2861987
DO  - 10.1109/ACCESS.2018.2861987
M3  - Article
VL  - 6
SP  - 43356
EP  - 43367
JO  - IEEE Access
JF  - IEEE Access
SN  - 2169-3536
M1  - 8424161
ER  -