Attention-CNN with Multi-Task Learning for Chinese Named Entity Recognition

Yanhong Fu, Fuwang Chen

Abstract


Named entity recognition (NER) serves as a cornerstone of natural language processing and has attracted extensive research attention owing to its importance in a wide range of downstream applications. Because Chinese texts exhibit complex syntactic structures and lack explicit word boundaries, conventional NER methods often struggle to optimize recognition accuracy and computational efficiency simultaneously. To address this issue, this study proposes a named entity recognition algorithm that integrates an attention mechanism with Convolutional Neural Networks and incorporates them into a Transformer-based bidirectional encoder framework for training. A multi-head self-attention mechanism is employed to capture the global semantic information of the text, and multi-task learning is introduced to construct the final model. When evaluated on datasets with sample sizes of 200, 1000, and 3000, the proposed model consistently outperforms the baseline models in precision, recall, and F1 score. In the low-resource setting with 200 samples, it achieves a precision of 98.62%, a recall of 98.10%, and an F1 score of 98.36%; in terms of inference efficiency, it processes 2618 tokens per second. These results indicate that the method can be applied broadly in fields such as information extraction and text understanding, providing strong technical support for related research.
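To make the described architecture concrete, the following is a minimal, illustrative sketch rather than the authors' implementation. It wires together the components the abstract names: contextual embeddings (the paper uses a Transformer-based bidirectional encoder; a plain embedding layer stands in here so the example is self-contained), a 1-D convolution for local features, multi-head self-attention for global semantic information, and two task heads sharing one encoder to reflect the multi-task learning component. All class and parameter names (AttentionCNNTagger, hidden, num_seg_labels, and so on) are hypothetical.

```python
import torch
import torch.nn as nn

class AttentionCNNTagger(nn.Module):
    """Sketch of the abstract's pipeline: embeddings -> CNN (local
    features) -> multi-head self-attention (global context) -> two
    task heads sharing the encoder (multi-task learning)."""

    def __init__(self, vocab_size, hidden=256, num_labels=9,
                 num_seg_labels=4, kernel=3, heads=8):
        super().__init__()
        # Stand-in for a Transformer-based bidirectional encoder (BERT).
        self.embed = nn.Embedding(vocab_size, hidden)
        # CNN branch: captures local character n-gram features.
        self.conv = nn.Conv1d(hidden, hidden, kernel, padding=kernel // 2)
        # Multi-head self-attention: captures global semantics.
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.norm = nn.LayerNorm(hidden)
        # Two heads on one shared encoder = the multi-task component.
        self.ner_head = nn.Linear(hidden, num_labels)      # NER tags
        self.seg_head = nn.Linear(hidden, num_seg_labels)  # auxiliary task

    def forward(self, token_ids):
        x = self.embed(token_ids)                         # (B, T, H)
        c = self.conv(x.transpose(1, 2)).transpose(1, 2)  # (B, T, H)
        a, _ = self.attn(c, c, c)                         # global attention
        h = self.norm(c + a)                              # residual + norm
        return self.ner_head(h), self.seg_head(h)

# Toy usage: a batch of 2 sequences, 16 characters each.
model = AttentionCNNTagger(vocab_size=5000)
ner_logits, seg_logits = model(torch.randint(0, 5000, (2, 16)))
print(ner_logits.shape, seg_logits.shape)
# torch.Size([2, 16, 9]) torch.Size([2, 16, 4])
```

In a multi-task setup like this, training would minimize a weighted sum of the two cross-entropy losses, so the auxiliary task regularizes the shared encoder; that is one plausible reason such models hold up in low-resource settings like the 200-sample experiments reported above.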


DOI: https://doi.org/10.31449/inf.v49i7.8344

This work is licensed under a Creative Commons Attribution 3.0 License.