The L3Pilot Common Data Format - Enabling Efficient Automated Driving Data Analysis

Johannes Hiller, Erik Svanberg, Sami Koskinen, Francesco Bellotti, Nisrine Osman

Research output: Contribution to conference, Conference article, Scientific

Abstract

Analyzing road-test data is important for developing automated vehicles. L3Pilot is a European pilot project on level 3 automation, with 34 partners among manufacturers, suppliers and research institutions. Targeting around 100 cars and 1000 test subjects, the project will generate large amounts of data. We present a data format that allows efficient data collection, handling and analysis by multiple organizations. A project of the scope of L3Pilot involves various challenges. Data come from a multitude of heterogeneous sources and are processed by a variety of tools. Recorded data span all data types generated by various vehicular sensors and systems and are enriched with external data sources. Videos supplement the time-series data as external files. Derived measures and performance indicators, which are required to answer research questions about the effectiveness of automated driving, are computed by analysis partners and included for each test session. As a file format, we chose HDF5, which offers a data model and software libraries for storing and managing data. HDF5 is designed for flexible and efficient I/O and for high-volume, complex data. The portability of the format facilitates the use of different computing environments for specific tasks. Portability is also important for exploiting the rising potential of artificial intelligence (e.g. automatic scene detection and video annotation). Based on lessons learned from past field tests, we defined a general frame for the common data format that is aligned with the data processing steps of the FESTA "V" evaluation methodology. The definitions include a representation of the source signals and a hierarchical structure for including multiple datasets that are gradually supplemented (post-processed or annotated) during the various analysis steps. By using the HDF5 format, analysis partners are free to use their familiar tools: MATLAB, Java, Python, R, etc.
First comparisons between time-series data in previous projects (e.g. AdaptIVe) and the proposed data format show a reduction in storage size of around 80 %, without losses in performance. Much of that reduction is due to efficient internal compression and structuring of the data. Considering the amount of objective data involved in automated driving, this is a great benefit in terms of usability. This paper presents a compact, portable and extensible format aimed at handling extremely large amounts of field test data collected in automated driving pilots. As a harmonized format between tens of organizations performing tests in the L3Pilot project, the proposed format has the potential to promote data sharing and the development of common tools, and to gain popularity for use in other projects. The format is designed to allow efficient storage of data and its iterative processing with analysis and evaluation tools. The format also considers the requirements of AI tools supporting neural network training and use.
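
As a rough illustration of the ideas in the abstract (a hierarchical HDF5 file, compressed source signals, and derived measures added in a later analysis step), the following is a minimal sketch using the h5py and NumPy libraries. The group and dataset names ("raw/vehicle", "speed_mps", "derived", "mean_speed_mps") and the helper functions are hypothetical and chosen only for this example; they are not the L3Pilot schema, which this record does not spell out.

```python
import os
import tempfile

import h5py
import numpy as np


def write_session(path, t, speed):
    """Write one hypothetical test session as a hierarchical HDF5 file."""
    with h5py.File(path, "w") as f:
        # Assumption: this layout is illustrative, not the L3Pilot definition.
        f.attrs["note"] = "illustrative layout only"
        raw = f.create_group("raw/vehicle")
        # Source signals stored as gzip-compressed time series.
        raw.create_dataset("time_s", data=t, compression="gzip", shuffle=True)
        ds = raw.create_dataset(
            "speed_mps", data=speed, compression="gzip", shuffle=True
        )
        ds.attrs["unit"] = "m/s"


def add_derived(path):
    """Later analysis step: append a derived measure, leaving raw data as is."""
    with h5py.File(path, "a") as f:
        speed = f["raw/vehicle/speed_mps"][...]
        derived = f.require_group("derived")
        derived.create_dataset("mean_speed_mps", data=float(speed.mean()))


def demo():
    """Write a small synthetic session, post-process it, and read it back."""
    t = np.linspace(0.0, 9.9, 100)      # 100 synthetic time stamps
    speed = np.full_like(t, 20.0)       # constant synthetic speed signal
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "session.h5")
        write_session(path, t, speed)
        add_derived(path)
        with h5py.File(path, "r") as f:
            return {
                "compression": f["raw/vehicle/speed_mps"].compression,
                "unit": f["raw/vehicle/speed_mps"].attrs["unit"],
                "mean": float(f["derived/mean_speed_mps"][()]),
                "n": f["raw/vehicle/time_s"].shape[0],
            }
```

Calling demo() writes the file, reopens it in append mode to add the derived group, and reads the results back. Because the file is plain HDF5, the same file could also be opened from MATLAB, Java or R. The compression settings here are an assumption and say nothing about how the roughly 80 % reduction reported in the abstract was obtained.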
Original language: English
Number of pages: 8
Publication status: Published - Jun 2019
MoE publication type: Not Eligible
Event: 26th International Technical Conference and exhibition on the Enhanced Safety of Vehicles (ESV) - Eindhoven, Netherlands
Duration: 10 Jun 2019 to 13 Jun 2019
https://www.esv2019.com/

Conference

Conference: 26th International Technical Conference and exhibition on the Enhanced Safety of Vehicles (ESV)
Abbreviated title: ESV2019
Country: Netherlands
City: Eindhoven
Period: 10/06/19 to 13/06/19
Internet address: https://www.esv2019.com/

Fingerprint

Time series
MATLAB
Artificial intelligence
Data structures
Railroad cars
Automation
Neural networks
Sensors
Processing

Cite this

Hiller, J., Svanberg, E., Koskinen, S., Bellotti, F., & Osman, N. (2019). The L3Pilot Common Data Format - Enabling Efficient Automated Driving Data Analysis. Paper presented at 26th International Technical Conference and exhibition on the Enhanced Safety of Vehicles (ESV), Eindhoven, Netherlands.
@conference{d89200c7e3444a9295892d4736740886,
title = "The L3Pilot Common Data Format - Enabling Efficient Automated Driving Data Analysis",
author = "Johannes Hiller and Erik Svanberg and Sami Koskinen and Francesco Bellotti and Nisrine Osman",
year = "2019",
month = "6",
language = "English",
note = "26th International Technical Conference and exhibition on the Enhanced Safety of Vehicles (ESV), ESV2019 ; Conference date: 10-06-2019 Through 13-06-2019",
url = "https://www.esv2019.com/",

}


TY - CONF

T1 - The L3Pilot Common Data Format - Enabling Efficient Automated Driving Data Analysis

AU - Hiller, Johannes

AU - Svanberg, Erik

AU - Koskinen, Sami

AU - Bellotti, Francesco

AU - Osman, Nisrine

PY - 2019/6

Y1 - 2019/6

M3 - Conference article

ER -
