Predicting Heart Disease Using FTGM-PCA Based Informative Entropy Based-Random Forest

Deepika Deenathayalan*
Balaji Narayanan

Abstract

In recent years, heart disease has become a major cause of mortality, and data mining has gained attention in the medical domain. Predicting the disease at an early stage helps to save lives and reduce treatment costs. Various classification models have been introduced recently, but their prediction accuracy remains limited. This study therefore applied data mining techniques to heart disease prediction with a focus on higher accuracy. The Cleveland heart disease dataset was used. Deep CNN models, chosen for their efficient and automatic learning, extracted the relevant features, which were then combined by feature-level fusion. FTGM-PCA (Fast Track Gram Matrix-Principal Component Analysis) was proposed for dimensionality reduction and fusion in order to address overfitting, minimise time and space complexity, eliminate redundant data, and enhance classifier performance. Classification was then performed with the newly introduced IEB-RF (Informative Entropy Based-Random Forest), which offers high accuracy and handles large amounts of data flexibly. The proposed system was evaluated in terms of accuracy, sensitivity, precision, F1-score, and AUC (area under the curve). The results showed that the proposed system outperformed traditional techniques.
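
As an illustration of the evaluation measures named above, the following minimal sketch computes accuracy, sensitivity, precision, F1-score, and AUC for a binary classifier. It is not the authors' evaluation code: the function names, the label convention (1 = disease present, 0 = absent), and the sample labels and scores are all assumptions made for illustration, and no values from the study are reproduced.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), assuming 1 = disease present and 0 = absent."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity (recall), precision, and F1 from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


def auc_score(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels, predictions, and scores (not data from the study).
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
    print(classification_metrics(y_true, y_pred))
    print("auc:", auc_score(y_true, scores))
```

The pairwise AUC shown here is quadratic in the number of samples, which is acceptable for a dataset of a few hundred records; a rank-based formulation would be preferable for larger data.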


Keywords: heart disease prediction; FTGM-PCA; informative entropy based-random forest; dimensionality reduction; Cleveland heart disease dataset


*Corresponding author: Tel.: (+91) 9647533289
E-mail: deepikaphd11@gmail.com

Article Details

Section
Original Research Articles
