Features Extraction Based on Probability Weighting for Fake News Classification on Social Media

Sherly Valentina
Wararat Songpan*

Abstract

Fake news is a massive problem globally, especially on social media. Most people spend a lot of time on social media every day, and users can easily receive fake news without realizing it. To address this situation, we developed a machine learning tool for detecting fake news that relies on several algorithms: Decision Tree, K-Nearest Neighbor, and Naïve Bayes. Our experiments were based on machine learning in which only one technique at a time was selected to classify the data by finding the model set. The performance of this set characterizes the model's classification and the resolution of inconsistency in each iteration. This study proposes a model that uses probability weighting during feature extraction for data classification. The concept is to enhance the probability weighting features so that they converge on the class labels of the classification. The work was implemented on top of the traditional Count Vectorizer and TF-IDF Vectorizer for sentiment analysis, combined with probability weighting features for fake news articles. The experimental results show that the proposed model, which uses probability weighting features to determine their impact on the classifier models, achieved the best accuracy. The results also show an improvement in the overall performance of Decision Tree, K-Nearest Neighbor, and Naïve Bayes on various datasets. Finally, the proposed model reached the highest performance in precision, recall, F1-measure, AUC, and accuracy, both overall and for each class.
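The abstract does not state how the probability weighting features are computed. One possible reading, sketched here only as an illustration, is to estimate per-class probabilities for each article from bag-of-words counts and append them to the count features before a downstream classifier is trained. Everything in the sketch is an assumption rather than the authors' procedure: the function names, the toy headlines, and the choice of multinomial Naïve Bayes posteriors as the probability source are hypothetical.

```python
import math

def count_vectorize(docs, vocab=None):
    """Bag-of-words counts; builds the vocabulary from docs when none is given."""
    tokens = [d.lower().split() for d in docs]
    if vocab is None:
        vocab = sorted({t for doc in tokens for t in doc})
    index = {t: i for i, t in enumerate(vocab)}
    rows = []
    for doc in tokens:
        row = [0] * len(vocab)
        for t in doc:
            if t in index:  # words outside the vocabulary are ignored
                row[index[t]] += 1
        rows.append(row)
    return rows, vocab

def fit_naive_bayes(rows, labels, alpha=1.0):
    """Multinomial Naive Bayes with Laplace smoothing (assumed probability source)."""
    n_features = len(rows[0])
    model = {}
    for c in sorted(set(labels)):
        class_rows = [r for r, y in zip(rows, labels) if y == c]
        totals = [sum(col) for col in zip(*class_rows)]
        denom = sum(totals) + alpha * n_features
        log_prior = math.log(len(class_rows) / len(rows))
        log_likelihood = [math.log((t + alpha) / denom) for t in totals]
        model[c] = (log_prior, log_likelihood)
    return model

def class_probabilities(model, row):
    """Posterior P(class | row), in sorted class order, normalised stably."""
    scores = {
        c: prior + sum(x * w for x, w in zip(row, logp))
        for c, (prior, logp) in model.items()
    }
    peak = max(scores.values())
    exps = {c: math.exp(s - peak) for c, s in scores.items()}
    total = sum(exps.values())
    return [exps[c] / total for c in sorted(exps)]

def add_probability_features(model, rows):
    """Append the class probabilities to each count vector as extra features."""
    return [row + class_probabilities(model, row) for row in rows]

# Toy data, invented for illustration only.
train_docs = [
    "shocking miracle cure doctors hate",
    "celebrity secret shocking truth revealed",
    "council approves new city budget",
    "university publishes climate study results",
]
train_labels = ["fake", "fake", "real", "real"]

rows, vocab = count_vectorize(train_docs)
model = fit_naive_bayes(rows, train_labels)
augmented = add_probability_features(model, rows)
```

Each row of `augmented` holds the original counts followed by one probability per class, and could then be passed to a Decision Tree or K-Nearest Neighbor classifier. In a real experiment the probabilities for training articles would normally be estimated out of fold (for example with scikit-learn's `cross_val_predict(..., method="predict_proba")`), since computing them on the same data the probability model was fitted on leaks label information.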


Keywords: fake news; machine learning; sentiment analysis; probability weighting; data visualization


*Corresponding author: E-mail: [email protected]

Section: Original Research Articles
