Abstract
The challenge of data integration is not new, but it has grown with the increasing use of data across applications. Such applications often require information from several sources, which provide data that can be heterogeneous in structure or content. These issues are exacerbated when the data is text-based, since many traditional data integration and analysis strategies cannot fully utilize the semantic meaning of the text and lack the ability to understand context clues. The rise of artificial intelligence (AI) offers a potential solution to one data integration task, entity matching, with the help of large language models.
This thesis examines the use of large language models for entity matching in the mobility domain, using data from two distinct sources: public transportation data and geospatial data on the road network and buildings. Our findings show that large language models have potential as an entity matching tool and can be used as accurate classifiers, even when the data originates from different data models or ontologies.
With our work, we offer a new application of large language models in a domain that is not well represented in the research, while showing potential for future expansion of our algorithm and its use cases.
Original language | English |
---|---|
Qualification | Master Degree |
Awarding Institution | |
Supervisors/Advisors | |
Thesis sponsors | |
Award date | 31 Dec 2024 |
Publisher | |
Publication status | Published - 10 Dec 2024 |
MoE publication type | G2 Master's thesis, polytechnic Master's thesis |