Abstract
Poor-quality or noisy annotations make it challenging to achieve state-of-the-art performance in Named Entity Recognition (NER), as in any other NLP task. In this paper, we present a multi-step framework that improves the annotation quality of NER datasets using automated techniques. We propose a frequency-based iterative approach that leverages self-training and a dual-threshold mechanism to increase inference confidence. Experimental evaluations on different NER datasets demonstrate significant improvements in NER performance relative to the original datasets. We also explore the potential of generative Large Language Models (LLMs) to perform NER for low-resource languages.
| Original language | English |
|---|---|
| Title of host publication | Findings of the Association for Computational Linguistics: EACL 2026 |
| Editors | V. Demberg, K. Inui, L. Marquez |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 4138-4151 |
| Number of pages | 14 |
| ISBN (Electronic) | 979-8-89176-386-9 |
| Publication status | Published - 2026 |
| MoE publication type | A4 Article in a conference publication |
| Event | 19th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2026; Rabat, Morocco; Duration: 24 Mar 2026 → 29 Mar 2026 |
Conference
| Conference | 19th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2026 |
|---|---|
| Country/Territory | Morocco |
| City | Rabat |
| Period | 24/03/26 → 29/03/26 |
Title: A Scalable Framework for Automated NER Annotation Correction in Low-Resource Languages