The L3Pilot Common Data Format - Enabling Efficient Automated Driving Data Analysis

Johannes Hiller, Erik Svanberg, Sami Koskinen, Francesco Bellotti, Nisrine Osman

    Research output: Contribution to conference › Conference article › Scientific

    Abstract

    Analyzing road-test data is important for developing automated vehicles. L3Pilot is a European pilot project on level 3 automation with 34 partners, including manufacturers, suppliers and research institutions. Targeting around 100 cars and 1000 test subjects, the project will generate large amounts of data. We present a data format that allows efficient data collection, handling and analysis by multiple organizations. A project of the scope of L3Pilot involves various challenges. Data come from a multitude of heterogeneous sources and are processed by a variety of tools. Recorded data span all data types generated by the various vehicular sensors and systems and are enriched with external data sources. Videos supplement the time-series data as external files. Derived measures and performance indicators, which are required to answer research questions about the effectiveness of automated driving, are computed by analysis partners and included for each test session. As a file format, we chose HDF5, which offers a data model and software libraries for storing and managing data. HDF5 is designed for flexible and efficient I/O and for high-volume, complex data. The portability of the format makes it easier to use different computing environments for specific tasks. Portability is also important for exploiting the rising potential of artificial intelligence (e.g. automatic scene detection and video annotation). Based on lessons learned from past field tests, we defined a general frame for the common data format that is aligned with the data processing steps of the FESTA "V" evaluation methodology. The definitions include a representation of the source signals and a hierarchical structure for including multiple datasets that are gradually supplemented (post-processed or annotated) during the various analysis steps. By using the HDF5 format, analysis partners are free to use the tools they are familiar with: MATLAB, Java, Python, R, etc.
First comparisons between time-series data from previous projects (e.g. AdaptIVe) and the proposed data format show a reduction in storage size of around 80%, without losses in performance. Much of that is due to efficient internal compression and structuring of the data. Considering the amount of objective data involved in automated driving, this is a considerable benefit in terms of usability. This paper presents a compact, portable, and extensible format aimed at handling extremely large amounts of field test data collected in automated driving pilots. As a harmonized format shared by tens of organizations performing tests in the L3Pilot project, the proposed format has the potential to promote data sharing and the development of common tools, and to gain popularity for use in other projects. The format is designed to allow efficient storage of data and its iterative processing with analysis and evaluation tools. The format also considers the requirements of AI tools supporting neural network training and use.
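
To illustrate the hierarchical, gradually supplemented layout described in the abstract, the following is a minimal sketch using the h5py Python library. The group and dataset names (`source_signals`, `derived_measures`, `speed_mps`, and so on), the toy signal values, and the choice of gzip compression are assumptions made for illustration; they are not taken from the L3Pilot format definition.

```python
import numpy as np
import h5py


def write_source_signals(path):
    """Create a toy session file holding recorded time-series signals.

    All names and values here are hypothetical, not the L3Pilot definitions.
    """
    rng = np.random.default_rng(0)
    time_s = np.arange(100) * 0.1                     # 10 s sampled at 10 Hz
    speed = 20.0 + rng.normal(0.0, 0.5, time_s.size)  # synthetic speed signal
    with h5py.File(path, "w") as f:
        raw = f.create_group("source_signals")
        # HDF5 compresses datasets internally, chunk by chunk.
        raw.create_dataset("time_s", data=time_s, compression="gzip")
        raw.create_dataset("speed_mps", data=speed, compression="gzip")


def add_derived_measures(path):
    """Append a derived measure in a later step, leaving the source data as is."""
    with h5py.File(path, "a") as f:
        speed = f["source_signals/speed_mps"][...]
        derived = f.require_group("derived_measures")
        derived.attrs["processing_step"] = "derived"
        derived.create_dataset("mean_speed_mps", data=float(speed.mean()))


def list_datasets(path):
    """Return the sorted paths of all datasets in the file."""
    names = []

    def collect(name, obj):
        if isinstance(obj, h5py.Dataset):
            names.append(name)

    with h5py.File(path, "r") as f:
        f.visititems(collect)
    return sorted(names)
```

Calling `write_source_signals("session.h5")` followed by `add_derived_measures("session.h5")` yields one file in which a later analysis step has added its own group next to the recorded signals. The same file can then be opened from other environments that provide HDF5 bindings.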
    Original language: English
    Number of pages: 8
    Publication status: Published - Jun 2019
    MoE publication type: Not Eligible
    Event: 26th International Technical Conference and exhibition on the Enhanced Safety of Vehicles (ESV) - Eindhoven, Netherlands
    Duration: 10 Jun 2019 to 13 Jun 2019
    https://www.esv2019.com/

    Conference

    Conference: 26th International Technical Conference and exhibition on the Enhanced Safety of Vehicles (ESV)
    Abbreviated title: ESV2019
    Country: Netherlands
    City: Eindhoven
    Period: 10/06/19 to 13/06/19
    Internet address: https://www.esv2019.com/

    Fingerprint

    Time series
    MATLAB
    Artificial intelligence
    Data structures
    Railroad cars
    Automation
    Neural networks
    Sensors
    Processing

    Cite this

    Hiller, J., Svanberg, E., Koskinen, S., Bellotti, F., & Osman, N. (2019). The L3Pilot Common Data Format - Enabling Efficient Automated Driving Data Analysis. Paper presented at 26th International Technical Conference and exhibition on the Enhanced Safety of Vehicles (ESV), Eindhoven, Netherlands.
    Hiller, Johannes; Svanberg, Erik; Koskinen, Sami; Bellotti, Francesco; Osman, Nisrine. / The L3Pilot Common Data Format - Enabling Efficient Automated Driving Data Analysis. Paper presented at 26th International Technical Conference and exhibition on the Enhanced Safety of Vehicles (ESV), Eindhoven, Netherlands. 8 p.
    @conference{d89200c7e3444a9295892d4736740886,
    title = "The L3Pilot Common Data Format - Enabling Efficient Automated Driving Data Analysis",
    author = "Johannes Hiller and Erik Svanberg and Sami Koskinen and Francesco Bellotti and Nisrine Osman",
    year = "2019",
    month = "6",
    language = "English",
    note = "26th International Technical Conference and exhibition on the Enhanced Safety of Vehicles (ESV), ESV2019 ; Conference date: 10-06-2019 Through 13-06-2019",
    url = "https://www.esv2019.com/",

    }

    Hiller, J, Svanberg, E, Koskinen, S, Bellotti, F & Osman, N 2019, 'The L3Pilot Common Data Format - Enabling Efficient Automated Driving Data Analysis', Paper presented at 26th International Technical Conference and exhibition on the Enhanced Safety of Vehicles (ESV), Eindhoven, Netherlands, 10/06/19 - 13/06/19.

    The L3Pilot Common Data Format - Enabling Efficient Automated Driving Data Analysis. / Hiller, Johannes; Svanberg, Erik; Koskinen, Sami; Bellotti, Francesco; Osman, Nisrine.

    2019. Paper presented at 26th International Technical Conference and exhibition on the Enhanced Safety of Vehicles (ESV), Eindhoven, Netherlands.


    TY - CONF

    T1 - The L3Pilot Common Data Format - Enabling Efficient Automated Driving Data Analysis

    AU - Hiller, Johannes

    AU - Svanberg, Erik

    AU - Koskinen, Sami

    AU - Bellotti, Francesco

    AU - Osman, Nisrine

    PY - 2019/6

    Y1 - 2019/6

    M3 - Conference article

    ER -

    Hiller J, Svanberg E, Koskinen S, Bellotti F, Osman N. The L3Pilot Common Data Format - Enabling Efficient Automated Driving Data Analysis. 2019. Paper presented at 26th International Technical Conference and exhibition on the Enhanced Safety of Vehicles (ESV), Eindhoven, Netherlands.