Clause-Level Subjective Classification for Thai Article Using Bidirectional Long Short-Term Memory

Nutdanai Sritiparkorn
Songsakdi Rongviriyapanish

Abstract

Sentence subjectivity classification is a crucial step in analyzing opinions from data such as articles and online media, the volume of which has grown greatly. Opinions extracted from sentences can be used as information to create or improve products. This research presents a method for building a model that classifies opinions at the clause level in Thai-language articles using a Bidirectional Long Short-Term Memory (BiLSTM) deep learning model, an architecture widely used for sequential data. The FastText model was used to convert words into numerical vectors. We built models from multi-domain texts and measured classification accuracy using the LST20 dataset. This dataset contains 44,423 pre-segmented clauses with Part of Speech and Named Entity annotations, which are used as features for model learning. Model performance was evaluated with 5-fold cross-validation. The best model was the BiLSTM with 200 neurons in the Long Short-Term Memory unit, using words and Part of Speech as features. It achieved a precision of 62.562%, a recall of 51.151%, an accuracy of 79.407%, and an F1-score of 56.284%.
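As a reading aid, the reported F1-score can be checked against the reported precision and recall, since F1 is their harmonic mean. The short Python sketch below does only that arithmetic. The function name `f1_from_precision_recall` is hypothetical and is not taken from the article, and the abstract does not say how the metrics were aggregated across the five folds, so this is a consistency check rather than a reproduction of the authors' evaluation.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall.

    Inputs and output share the same unit (here, percent).
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, in percent.
reported_precision = 62.562
reported_recall = 51.151
reported_f1 = 56.284

computed_f1 = f1_from_precision_recall(reported_precision, reported_recall)

# The computed value should agree with the reported one to three decimals.
print(f"computed F1 = {computed_f1:.3f}%, reported F1 = {reported_f1:.3f}%")
```

The gap between precision and recall suggests that the best model misses a sizable share of the subjective clauses while being more reliable on the ones it does flag, which is why the F1-score sits well below the accuracy of 79.407%.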

Article Details

How to Cite
Sritiparkorn, N., & Rongviriyapanish, S. (2023). Clause-Level Subjective Classification for Thai Article Using Bidirectional Long Short-Term Memory. Journal of Science Ladkrabang, 32(2), 1–16. Retrieved from https://li01.tci-thaijo.org/index.php/science_kmitl/article/view/259167
Section
Research article
