Automated preprocessing of environmental data

M. Rönkkö, J. Heikkinen, Ville Kotovirta

Research output: Contribution to journal › Article › Scientific › peer-review

6 Citations (Scopus)

Abstract

In this article, we discuss automated preprocessing of environmental data for further use. Environmental data is by default heterogeneous, as it may consist of data from sources such as weather stations, weather radars, chemical sensors, acoustic sensors, and off-line laboratory analysis. When data from such heterogeneous sources is integrated, it needs to be processed in a context-dependent manner. In addition, there is no single generic processing method; rather, several atomic methods need to be applied in an appropriate sequence. Furthermore, the problem is complicated by the requirements set by the intended use of the data. The requirements influence not only the set of applicable methods but also the application sequence. We study automation of the selection and sequencing of preprocessing methods based on the user requirements. As the main contribution, we propose the use of characterizations and a reachability algorithm to solve the selection and sequencing problem. We present the algorithm and argue for its correctness. We also discuss how the algorithm is implemented as a cloud service, and illustrate the use of the service with simple case studies.
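
The abstract does not spell out the reachability algorithm. As a rough illustration only, the selection and sequencing problem can be sketched as a search over sets of data properties, assuming each preprocessing method is characterized by the properties it requires and the properties it adds. The method names, the property labels, and the `find_sequence` function below are hypothetical and are not taken from the article; the sketch also assumes that methods only add properties and never remove them.

```python
from collections import deque

# Hypothetical method catalogue: name -> (required properties, added properties).
# Names and properties are illustrative assumptions, not from the article.
METHODS = {
    "parse_timestamps": (frozenset({"raw"}), frozenset({"timestamped"})),
    "remove_outliers": (frozenset({"timestamped"}), frozenset({"cleaned"})),
    "resample_hourly": (frozenset({"timestamped", "cleaned"}), frozenset({"hourly"})),
    "convert_units": (frozenset({"raw"}), frozenset({"si_units"})),
}


def find_sequence(initial, required, methods=METHODS):
    """Breadth-first reachability search over sets of data properties.

    Returns a shortest list of method names that turns data with the
    `initial` properties into data with all `required` properties,
    or None if no such sequence exists.
    """
    start = frozenset(initial)
    goal = frozenset(required)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if goal <= state:  # every required property has been reached
            return path
        for name, (needs, adds) in methods.items():
            if needs <= state:  # method is applicable in this state
                nxt = state | adds
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [name]))
    return None


if __name__ == "__main__":
    print(find_sequence({"raw"}, {"hourly"}))
    print(find_sequence({"raw"}, {"gridded"}))
```

In this sketch the user requirements are the `required` set, so changing them changes both which methods are selected and the order in which they are applied. A requirement that no method can provide yields `None`.
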
Original language: English
Pages (from-to): 13-24
Journal: Future Generation Computer Systems
Volume: 45
DOI: 10.1016/j.future.2014.10.011
Publication status: Published - 2015
MoE publication type: A1 Journal article-refereed

Fingerprint

Chemical sensors
Automation
Acoustics
Sensors
Processing

Keywords

  • environmental informatics
  • workflows
  • data preprocessing
  • reachability analysis
  • formal methods

Cite this

Rönkkö, M. ; Heikkinen, J. ; Kotovirta, Ville. / Automated preprocessing of environmental data. In: Future Generation Computer Systems. 2015 ; Vol. 45. pp. 13 - 24.
@article{ecec86b5b3394030b4484c86980dfeda,
title = "Automated preprocessing of environmental data",
abstract = "In this article, we discuss automated preprocessing of environmental data for further use. Environmental data is by default heterogeneous, as it may consist of data from sources such as weather stations, weather radars, chemical sensors, acoustic sensors, and off-line laboratory analysis. When data from such heterogeneous sources is integrated, it needs to be processed in a context-dependent manner. In addition, there is no single generic processing method; rather, several atomic methods need to be applied in an appropriate sequence. Furthermore, the problem is complicated by the requirements set by the intended use of the data. The requirements influence not only the set of applicable methods but also the application sequence. We study automation of the selection and sequencing of preprocessing methods based on the user requirements. As the main contribution, we propose the use of characterizations and a reachability algorithm to solve the selection and sequencing problem. We present the algorithm and argue for its correctness. We also discuss how the algorithm is implemented as a cloud service, and illustrate the use of the service with simple case studies.",
keywords = "environmental informatics, workflows, data preprocessing, reachability analysis, formal methods",
author = "M. R{\"o}nkk{\"o} and J. Heikkinen and Ville Kotovirta",
year = "2015",
doi = "10.1016/j.future.2014.10.011",
language = "English",
volume = "45",
pages = "13 -- 24",
journal = "Future Generation Computer Systems",
issn = "0167-739X",
publisher = "Elsevier",
}

Automated preprocessing of environmental data. / Rönkkö, M.; Heikkinen, J.; Kotovirta, Ville.

In: Future Generation Computer Systems, Vol. 45, 2015, p. 13 - 24.

TY - JOUR

T1 - Automated preprocessing of environmental data

AU - Rönkkö, M.

AU - Heikkinen, J.

AU - Kotovirta, Ville

PY - 2015

Y1 - 2015

N2 - In this article, we discuss automated preprocessing of environmental data for further use. Environmental data is by default heterogeneous, as it may consist of data from sources such as weather stations, weather radars, chemical sensors, acoustic sensors, and off-line laboratory analysis. When data from such heterogeneous sources is integrated, it needs to be processed in a context-dependent manner. In addition, there is no single generic processing method; rather, several atomic methods need to be applied in an appropriate sequence. Furthermore, the problem is complicated by the requirements set by the intended use of the data. The requirements influence not only the set of applicable methods but also the application sequence. We study automation of the selection and sequencing of preprocessing methods based on the user requirements. As the main contribution, we propose the use of characterizations and a reachability algorithm to solve the selection and sequencing problem. We present the algorithm and argue for its correctness. We also discuss how the algorithm is implemented as a cloud service, and illustrate the use of the service with simple case studies.

KW - environmental informatics

KW - workflows

KW - data preprocessing

KW - reachability analysis

KW - formal methods

U2 - 10.1016/j.future.2014.10.011

DO - 10.1016/j.future.2014.10.011

M3 - Article

VL - 45

SP - 13

EP - 24

JO - Future Generation Computer Systems

JF - Future Generation Computer Systems

SN - 0167-739X

ER -