TY - CHAP
T1 - Object detection in design diagrams with machine learning
AU - Nurminen, Jukka K.
AU - Rainio, Kari
AU - Numminen, Jukka Pekka
AU - Syrjänen, Timo
AU - Paganus, Niklas
AU - Honkoila, Karri
PY - 2020/1/1
Y1 - 2020/1/1
N2 - Over the years, companies have accumulated large amounts of legacy data. With modern data mining and machine learning techniques, this data is increasingly valuable. Therefore, being able to convert legacy data into a computer-understandable form is important. In this work, we investigate how to convert schematic diagrams, such as process and instrumentation diagrams (P&I diagrams), into such a form. We use modern machine-learning-based approaches, in particular the Yolo neural network system, to detect high-level objects, e.g. pumps or valves, in diagrams which are scanned from paper archives or stored in pixel or vector form. Together with connection detection and OCR, this is an essential step for the reuse of old planning data. Our results show that Yolo, as an instance of modern machine-learning-based object detection systems, works well with schematic diagrams. In our concept, we use a simulator to automatically generate labeled training material for the system. We then retrain a previously trained network to detect the components of interest. Detection of large components is accurate, but small components with sizes below 15% of page size are missed. However, this can be worked around by dividing a big diagram into a set of smaller subdiagrams with different scales, processing them separately, and combining the results.
AB - Over the years, companies have accumulated large amounts of legacy data. With modern data mining and machine learning techniques, this data is increasingly valuable. Therefore, being able to convert legacy data into a computer-understandable form is important. In this work, we investigate how to convert schematic diagrams, such as process and instrumentation diagrams (P&I diagrams), into such a form. We use modern machine-learning-based approaches, in particular the Yolo neural network system, to detect high-level objects, e.g. pumps or valves, in diagrams which are scanned from paper archives or stored in pixel or vector form. Together with connection detection and OCR, this is an essential step for the reuse of old planning data. Our results show that Yolo, as an instance of modern machine-learning-based object detection systems, works well with schematic diagrams. In our concept, we use a simulator to automatically generate labeled training material for the system. We then retrain a previously trained network to detect the components of interest. Detection of large components is accurate, but small components with sizes below 15% of page size are missed. However, this can be worked around by dividing a big diagram into a set of smaller subdiagrams with different scales, processing them separately, and combining the results.
KW - Legacy data
KW - Machine learning
KW - Object detection
KW - Schematic diagrams
UR - http://www.scopus.com/inward/record.url?scp=85065814159&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-19738-4_4
DO - 10.1007/978-3-030-19738-4_4
M3 - Chapter or book article
AN - SCOPUS:85065814159
SN - 978-3-030-19737-7
T3 - Advances in Intelligent Systems and Computing
SP - 27
EP - 36
BT - CORES 2019
PB - Springer
T2 - International Conference on Computer Recognition Systems, CORES 2019
Y2 - 20 May 2019 through 22 May 2019
ER -