Forward selection models for classifying mild cognitive impairment and Alzheimer’s disease based on single nucleotide polymorphisms
Abstract
Early detection of Alzheimer’s disease (AD) is crucial because it allows patients to begin treatment early and slow the disease’s progression. Mild cognitive impairment (MCI) is considered an early transitional stage of AD, but it is difficult to diagnose clinically because its symptoms are inconsistent and standardized diagnostic tests are lacking. In this work, we propose forward selection models to classify patients with AD, patients with MCI, and healthy controls (HCs) based on single nucleotide polymorphisms (SNPs). The initial SNP data were first prescreened via genome-wide association studies (GWAS) using a suggestive significance threshold. The SNPs that passed this prescreening were then reselected with a forward SNP selection algorithm to build the classification models. The forward selection models significantly outperformed the preselection models (those based on all prescreened SNPs), achieving an area under the precision-recall curve (AUPRC) of 0.93 for AD-HC classification, 0.94 for MCI-HC classification, and 0.81 for AD-MCI classification. Moreover, the proposed method can identify AD-associated and MCI-associated SNPs, which could support the clinical diagnosis of AD and MCI in the future.
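The two-stage idea described above (a GWAS prescreen followed by greedy forward selection scored by AUPRC) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the function names, the toy genotype matrix, the additive allele-count score that stands in for a trained classifier, and the threshold of 1e-5 (a commonly used suggestive level; the abstract does not state the value). For brevity, the score is computed on the same samples used for selection, whereas a real analysis would use held-out data or cross-validation.

```python
from typing import List, Sequence, Tuple

Genotypes = Sequence[Sequence[int]]  # rows = samples, columns = SNPs (0/1/2 allele counts)


def prescreen(p_values: Sequence[float], threshold: float = 1e-5) -> List[int]:
    """Keep SNP indices whose GWAS p-value is below an ASSUMED suggestive threshold."""
    return [j for j, p in enumerate(p_values) if p < threshold]


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (average precision)."""
    # Sort by descending score; within ties, put negatives first (pessimistic).
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], y_true[i]))
    n_pos = sum(y_true)
    if n_pos == 0:
        return 0.0
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each positive
    return total / n_pos


def risk_score(X: Genotypes, cols: Sequence[int]) -> List[int]:
    """Hypothetical stand-in for a classifier: sum of allele counts over chosen SNPs."""
    return [sum(row[c] for c in cols) for row in X]


def forward_select(
    X: Genotypes, y: Sequence[int], candidates: Sequence[int], min_gain: float = 1e-9
) -> Tuple[List[int], float]:
    """Greedily add the SNP that most improves AUPRC; stop when nothing improves it."""
    selected: List[int] = []
    best = 0.0
    remaining = list(candidates)
    while remaining:
        scored = [(average_precision(y, risk_score(X, selected + [c])), c) for c in remaining]
        # Highest AUPRC wins; ties go to the lowest SNP index.
        top_score, top_col = max(scored, key=lambda t: (t[0], -t[1]))
        if top_score - best <= min_gain:
            break
        selected.append(top_col)
        remaining.remove(top_col)
        best = top_score
    return selected, best


# Toy data (fabricated for illustration only): 6 samples, 4 SNPs, 1 = case, 0 = control.
X = [
    [2, 0, 1, 0],
    [2, 1, 0, 0],
    [1, 1, 1, 2],
    [0, 0, 1, 2],
    [0, 0, 0, 2],
    [1, 0, 1, 0],
]
y = [1, 1, 1, 0, 0, 0]
p_values = [1e-6, 3e-6, 8e-6, 0.2]  # hypothetical GWAS p-values, one per SNP

candidates = prescreen(p_values)
selected, auprc = forward_select(X, y, candidates)
print("prescreened SNPs:", candidates)
print("forward-selected SNPs:", selected, "AUPRC:", auprc)
```

On this toy data the prescreen drops the fourth SNP, and forward selection stops after two SNPs because adding the third no longer raises the AUPRC. This mirrors the comparison in the abstract between a model built on all prescreened SNPs and one built on a forward-selected subset.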
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.