An Automatic Labeling Method for Subword-Phrase Recognition in Effective Text Classification

Yusuke Kimura, Takahiro Komamizu, Kenji Hatano

Abstract


Text classification methods based on deep learning, trained on tremendous amounts of text, have achieved superior performance to traditional methods. Building on this success, multi-task learning (MTL for short) has become a promising approach for text classification; for instance, one MTL approach employs named entity recognition as an auxiliary task for text classification and shows that the auxiliary task helps the classification model achieve higher performance. Existing MTL-based text classification methods, however, depend on auxiliary tasks that require supervised labels, and obtaining such supervision signals incurs additional human and financial costs beyond those of the main text classification task. To reduce these costs, this paper proposes an MTL-based text classification framework that avoids the additional cost of supervised label creation by automatically labeling phrases in texts for an auxiliary recognition task. The basic idea behind the proposed framework is to utilize phrasal expressions consisting of subwords (called subword-phrases), reflecting the fact that recent pre-trained neural language models such as BERT are built on subword-based tokenization to avoid missing out-of-vocabulary words. To the best of our knowledge, no text classification approach has been built on top of subword-phrases, because subwords do not always express a coherent set of meanings. The proposed framework is novel in adding subword-phrase recognition as an auxiliary task and in utilizing subword-phrases for text classification. It extracts subword-phrases in an unsupervised manner, specifically with a statistical approach. To construct labels for an effective subword-phrase recognition task, the extracted subword-phrases are classified by document class so that subword-phrases dedicated to particular classes become distinguishable. An experimental evaluation on five popular text classification datasets demonstrates the effectiveness of incorporating subword-phrase recognition as an auxiliary task. It also shows results competitive with the state-of-the-art method, and a comparison of various labeling schemes yields insights into labeling subword-phrases common to several document classes.
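To make the two components of the framework concrete, here is a minimal sketch of the unsupervised extraction and automatic labeling step. The abstract says only that subword-phrases are extracted statistically; pointwise mutual information (PMI) over adjacent subwords is used below as one plausible instantiation, not the authors' exact criterion, and the function names, threshold, and class-specific BIO tag scheme are illustrative assumptions.

```python
# Sketch: unsupervised subword-phrase extraction via PMI, plus
# class-aware BIO labeling. PMI is an assumed stand-in for the
# paper's unspecified statistical approach.
import math
from collections import Counter

def extract_subword_phrases(token_seqs, pmi_threshold=3.0):
    """Collect adjacent subword pairs whose PMI exceeds a threshold."""
    unigrams, bigrams = Counter(), Counter()
    for seq in token_seqs:
        unigrams.update(seq)
        bigrams.update(zip(seq, seq[1:]))
    total_uni, total_bi = sum(unigrams.values()), sum(bigrams.values())
    if total_bi == 0:
        return set()
    phrases = set()
    for (a, b), n_ab in bigrams.items():
        pmi = math.log((n_ab / total_bi) /
                       ((unigrams[a] / total_uni) * (unigrams[b] / total_uni)))
        if pmi >= pmi_threshold:
            phrases.add((a, b))
    return phrases

def bio_labels(seq, phrases, doc_class):
    """Tag phrase spans with class-specific BIO labels, e.g. B-sports."""
    labels = ["O"] * len(seq)
    for i in range(len(seq) - 1):
        if labels[i] == "O" and (seq[i], seq[i + 1]) in phrases:
            labels[i], labels[i + 1] = f"B-{doc_class}", f"I-{doc_class}"
    return labels
```

The multi-task model implied by the abstract can likewise be sketched as a shared BERT encoder with two heads: a document-classification head over the [CLS] representation and a token-level head for subword-phrase recognition. The head design, loss weighting, and all identifiers below are assumptions consistent with the abstract, not the authors' published architecture.

```python
# Sketch: MTL over a shared BERT encoder, assuming a weighted sum of
# the main (document) and auxiliary (tagging) cross-entropy losses.
import torch.nn as nn
from transformers import BertModel

class MtlTextClassifier(nn.Module):
    def __init__(self, num_doc_classes, num_tag_labels,
                 model_name="bert-base-uncased", tag_loss_weight=0.5):
        super().__init__()
        self.encoder = BertModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.doc_head = nn.Linear(hidden, num_doc_classes)  # main task
        self.tag_head = nn.Linear(hidden, num_tag_labels)   # auxiliary task
        self.tag_loss_weight = tag_loss_weight

    def forward(self, input_ids, attention_mask,
                doc_labels=None, tag_labels=None):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        doc_logits = self.doc_head(out.last_hidden_state[:, 0])  # [CLS]
        tag_logits = self.tag_head(out.last_hidden_state)        # per subword
        loss = None
        if doc_labels is not None and tag_labels is not None:
            ce = nn.CrossEntropyLoss(ignore_index=-100)  # -100 masks padding
            loss = ce(doc_logits, doc_labels) + self.tag_loss_weight * ce(
                tag_logits.view(-1, tag_logits.size(-1)), tag_labels.view(-1))
        return doc_logits, tag_logits, loss
```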


DOI: https://doi.org/10.31449/inf.v47i3.4742

This work is licensed under a Creative Commons Attribution 3.0 License.