Abstract
In this article we discuss automated preprocessing of
environmental data for further use. Environmental data is
inherently heterogeneous, as it may consist of data from
sources such as weather stations, weather radars,
chemical sensors, acoustic sensors, and off-line
laboratory analysis. When data from such heterogeneous
sources is integrated, it needs to be processed in a
context-dependent manner. In addition, there is no single
generic processing method; rather, several atomic methods
need to be applied in an appropriate sequence.
Furthermore, the problem is complicated by the
requirements set by the intended use of the data. The
requirements influence not only the set of applicable
methods but also the application sequence. In this
article, we study automation of the selection and
sequencing of preprocessing methods based on the user
requirements. As the main contribution, we propose
the use of characterizations and a reachability algorithm
to solve the selection and sequencing problem. We
present the algorithm and argue for its
correctness. We also discuss how the algorithm is
implemented as a cloud service, and illustrate the use of
the service with simple case studies.
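
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how selection and sequencing could be posed as a reachability problem. It assumes that a characterization is a set of property labels, that each atomic method declares the properties it requires, adds, and removes, and that a user requirement is a set of properties the output must have. The `Method` type, the `find_sequence` function, and all method and property names are hypothetical illustrations, not taken from the article or its cloud service.

```python
from collections import deque
from typing import NamedTuple


class Method(NamedTuple):
    """Hypothetical atomic preprocessing method described by characterizations."""
    name: str
    requires: frozenset  # properties the data must already have
    adds: frozenset      # properties the data gains
    removes: frozenset   # properties the data loses


def find_sequence(initial, required, methods):
    """Breadth-first reachability search over data characterizations.

    Returns the shortest list of method names that turns data with the
    `initial` properties into data having all `required` properties,
    or None when no such sequence exists.
    """
    start = frozenset(initial)
    goal = frozenset(required)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if goal <= state:  # every required property is present
            return path
        for m in methods:
            if m.requires <= state:  # method is applicable in this state
                nxt = (state - m.removes) | m.adds
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [m.name]))
    return None


# Assumed example methods; the names and properties are illustrative only.
METHODS = [
    Method("remove_outliers", frozenset({"raw"}),
           frozenset({"outlier_free"}), frozenset()),
    Method("fill_gaps", frozenset({"outlier_free"}),
           frozenset({"gap_free"}), frozenset()),
    Method("resample_hourly", frozenset({"gap_free"}),
           frozenset({"hourly"}), frozenset({"raw"})),
]

plan = find_sequence({"raw"}, {"hourly", "outlier_free"}, METHODS)
print(plan)
```

In this sketch the user requirement determines both which methods are selected and the order in which they run, because a method becomes applicable only once earlier methods have established the properties it requires.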
Original language | English |
---|---|
Pages (from-to) | 13 - 24 |
Journal | Future Generation Computer Systems |
Volume | 45 |
Publication status | Published - 2015 |
MoE publication type | A1 Journal article-refereed |
Keywords
- environmental informatics
- workflows
- data preprocessing
- reachability analysis
- formal methods