A Hybrid Method for Low-Resource Named Entity Recognition
Abstract
Named Entity Recognition (NER) is a core component of Natural Language Processing, with applications in information extraction and conversational AI. In specialized domains for low-resource languages, however, NER is hampered by limited annotated data and heterogeneous label sets. This study addresses these issues by proposing a hybrid neurosymbolic framework that integrates rule-based processing with deep learning models for Vietnamese NER. The core idea is a two-stage pipeline: first, a rule-based component reduces label complexity by grouping relational and special categories; second, pre-trained language models are fine-tuned for high-precision extraction. A post-processing module then restores the fine-grained labels, preserving the expressiveness needed at the application level. To mitigate data scarcity, a scalable data augmentation strategy based on Large Language Models (LLMs) is introduced to expand the label set without full re-annotation, which the authors present as a significant novelty of this work. The method was evaluated on five domain-specific datasets covering areas including logistics, wildlife, and healthcare. Experimental results show substantial improvements over strong RoBERTa-based baselines. Specifically, the proposed system achieved F1 scores of 90% on Customer Service (up from 83%), 84% on GAM (up from 73%), 83% on AI Fluent (up from 80%), 94% on PhoNER_Covid19 (up from 91%), and 60% on Rare Wildlife (up from 36%). These findings indicate that the hybrid approach captures the linguistic complexity of Vietnamese and the contextual nuances of specialized domains, offering a robust contribution to low-resource NER research.
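The label grouping and restoration steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label names, the fine-to-coarse map, the English cue words, and the helper functions `coarsen` and `restore` are all hypothetical, chosen only to show how fine-grained relational labels might be collapsed before fine-tuning and recovered afterwards by simple context rules.

```python
import re

# Hypothetical fine-to-coarse label map (the abstract does not list the real label sets).
FINE_TO_COARSE = {
    "SENDER_PHONE": "PHONE",
    "RECEIVER_PHONE": "PHONE",
    "SENDER_NAME": "PERSON",
    "RECEIVER_NAME": "PERSON",
}

# Hypothetical cue words that signal a role; English is used here for readability.
ROLE_CUES = [
    (re.compile(r"\b(sender|from)\b", re.IGNORECASE), "SENDER"),
    (re.compile(r"\b(receiver|recipient)\b", re.IGNORECASE), "RECEIVER"),
]

# Suffix used to rebuild a fine label from a role and a coarse label.
COARSE_TO_SUFFIX = {"PHONE": "PHONE", "PERSON": "NAME"}


def coarsen(tags):
    """Rule-based stage: rewrite BIO tags so fine labels become coarse ones."""
    out = []
    for tag in tags:
        if tag == "O":
            out.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        out.append(f"{prefix}-{FINE_TO_COARSE.get(label, label)}")
    return out


def restore(tokens, spans, window=3):
    """Post-processing stage: restore fine labels for predicted coarse spans.

    spans is a list of (start, end, coarse_label) with end exclusive.
    The role is guessed from cue words in the `window` tokens before the span;
    if no cue is found, the coarse label is kept unchanged.
    """
    restored = []
    for start, end, label in spans:
        fine = label
        if label in COARSE_TO_SUFFIX:
            context = " ".join(tokens[max(0, start - window):start])
            for pattern, role in ROLE_CUES:
                if pattern.search(context):
                    fine = f"{role}_{COARSE_TO_SUFFIX[label]}"
                    break
        restored.append((start, end, fine))
    return restored


tokens = ["Sender", "Lan", "phone", "0901234567", ",", "receiver", "Minh"]
fine_tags = ["O", "B-SENDER_NAME", "O", "B-SENDER_PHONE", "O", "O", "B-RECEIVER_NAME"]
coarse_tags = coarsen(fine_tags)  # training targets for the fine-tuned model
predicted_spans = [(1, 2, "PERSON"), (3, 4, "PHONE"), (6, 7, "PERSON")]
fine_spans = restore(tokens, predicted_spans)
```

In this sketch the fine-tuned model only has to separate two coarse classes instead of four, and the role distinction is delegated to the rules. A real system would need cue lists and window sizes tuned per domain and written for Vietnamese text.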
Keywords
Named Entity Recognition; Hybrid Model; Deep Learning; Rule-based System; Information Extraction
DOI:
https://doi.org/10.47738/jads.v7i2.1161
Journal of Applied Data Sciences
| ISSN | : | 2723-6471 (Online) |
| Collaborated with | : | Computer Science and Systems Information Technology, King Abdulaziz University, Kingdom of Saudi Arabia. |
| Publisher | : | Bright Publisher |
| Website | : | http://bright-journal.org/JADS |
| Email | : | taqwa@amikompurwokerto.ac.id (principal contact); support@bright-journal.org (technical issues) |
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0



