Topic Modeling and Text Clustering of Foreign Tourists' Reviews on Khao Yai National Park Using Unsupervised Learning

Chakkarin Santirattanaphakdi
Suphakit Niwattanakul

Abstract

This study applies topic modeling and text clustering, both unsupervised learning techniques, to international tourists’ reviews of Khao Yai National Park. A total of 3,491 reviews were collected from various online platforms between January 1, 2023, and December 31, 2024. The data were cleaned and then transformed into numerical vectors using the Term Frequency-Inverse Document Frequency (TF-IDF) technique. Topic modeling was then conducted using Latent Dirichlet Allocation (LDA). The analysis identified five main topics: 1) tourist attractions and the natural environment, 2) service quality and available facilities, 3) costs and value for money, 4) transportation and accessibility, and 5) personal experiences and enjoyment. These results are consistent with the K-means clustering of negative reviews, which fell into three major groups: 1) entrance fees and pricing structures, 2) accessibility and service satisfaction, and 3) facilities and tourism activities. These factors were found to significantly influence tourist satisfaction. Addressing the identified shortcomings can support the development of strategies that meet the diverse needs of tourists. Furthermore, applying big data to systematic policy development can help shape responsive management approaches that tackle emerging issues and contribute to the long-term sustainability of Thailand’s tourism industry.
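The vectorization and clustering steps described in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the four toy reviews are invented, the helper names (`tfidf_vectors`, `kmeans`, `top_terms`) are hypothetical, the smoothed IDF formula is one common variant (the abstract does not state which weighting was used), and two clusters are used only because the toy corpus is tiny, whereas the study reports three groups. The LDA step is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors) for tokenised docs."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def kmeans(vectors, init_indices, n_iter=20):
    """Lloyd's algorithm; centroids start at the rows given by init_indices."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(vec, centroids[c])))
            for vec in vectors
        ]
        # Update step: mean of the members; an empty cluster keeps its centroid.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def top_terms(centroid, vocab, k=3):
    """Return the k vocabulary terms with the largest centroid weights."""
    ranked = sorted(zip(centroid, vocab), reverse=True)
    return [term for _, term in ranked[:k]]


# Invented toy "negative reviews"; NOT data from the study.
reviews = [
    "entrance fee expensive for foreigners",
    "expensive entrance fee double price",
    "road access difficult no public transport",
    "no public transport road long",
]
docs = [r.split() for r in reviews]
vocab, X = tfidf_vectors(docs)
labels, centroids = kmeans(X, init_indices=[0, 2])  # fixed init for repeatability
for c, centroid in enumerate(centroids):
    print(c, top_terms(centroid, vocab))
```

Reading the highest-weighted terms of each centroid is one simple way to give a cluster a human-readable label, such as a pricing-related group versus an accessibility-related group. On a real corpus, the number of clusters and the initialization would need to be chosen and validated rather than fixed by hand as here.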

Article Details

How to Cite
Santirattanaphakdi, C., & Niwattanakul, S. (2025). Topic Modeling and Text Clustering of Foreign Tourists’ Reviews on Khao Yai National Park Using Unsupervised Learning. Journal of SciTech-ASEAN, 5(1), 62–78. Retrieved from https://li01.tci-thaijo.org/index.php/STJS/article/view/266565
Section
Research Article
