Tuning Parameter Selection for Lasso Regression

จุฑาทิพย์ นันทสุวรรณ
วิฐรา พึ่งพาพงศ์

Abstract


This research proposes a method for selecting the tuning parameter in lasso regression using regression diagnostics. We compare its results with two popular approaches to lasso tuning parameter selection: cross-validation and the Bayesian information criterion. Simulation studies were carried out in six cases emphasizing violations of the linearity and homoscedasticity assumptions. The performance of the three methods is compared in terms of false positive rate, false negative rate, prediction error, and estimation error. Our simulation studies show that the regression diagnostics approach yields the lowest false positive rates, while the cross-validation method provides lower false negative rates. In addition, the regression diagnostics and cross-validation methods are comparable in terms of prediction error and estimation error. For the real data analysis, we applied all three methods to an Alzheimer's disease microarray data set. The results show that regression diagnostics is the most appropriate method.


Keywords: high-dimensional data; lasso regression; tuning parameter; regression diagnostics; cross-validation; Bayesian information criterion
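
To make the evaluation criteria concrete, the following is a minimal sketch of how a lasso tuning parameter might be chosen with a Bayesian information criterion and then scored by false positive and false negative rates. It is not the authors' procedure: the regression diagnostics method is not specified in the abstract and is not implemented here. The function names (`lasso_cd`, `bic`, `select_by_bic`, `selection_error_rates`, `simulate`), the simulated design, and the BIC form n·log(RSS/n) + log(n)·df, with df taken as the number of nonzero coefficients, are all assumptions made for illustration.

```python
import math
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/(2n))*||y - X b||^2 + lam*||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta; beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def bic(X, y, beta):
    """Assumed BIC form: n*log(RSS/n) + log(n)*(number of nonzero coefficients)."""
    n = len(y)
    rss = sum((y[i] - sum(x * b for x, b in zip(X[i], beta))) ** 2 for i in range(n))
    df = sum(1 for b in beta if b != 0.0)
    return n * math.log(max(rss, 1e-12) / n) + math.log(n) * df


def select_by_bic(X, y, lambdas):
    """Fit the lasso on a grid of tuning parameters and keep the BIC minimiser."""
    fits = [(bic(X, y, lasso_cd(X, y, lam)), lam) for lam in lambdas]
    best_lam = min(fits)[1]
    return best_lam, lasso_cd(X, y, best_lam)


def selection_error_rates(beta_hat, beta_true):
    """Return (false positive rate, false negative rate) of the selected support."""
    zeros = [j for j, b in enumerate(beta_true) if b == 0.0]
    nonzeros = [j for j, b in enumerate(beta_true) if b != 0.0]
    fp = sum(1 for j in zeros if beta_hat[j] != 0.0)
    fn = sum(1 for j in nonzeros if beta_hat[j] == 0.0)
    fpr = fp / len(zeros) if zeros else 0.0
    fnr = fn / len(nonzeros) if nonzeros else 0.0
    return fpr, fnr


def simulate(n=40, p=8, seed=0):
    """Hypothetical sparse linear model with standard normal predictors and noise."""
    rng = random.Random(seed)
    beta_true = [3.0, -2.0, 1.5] + [0.0] * (p - 3)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [sum(x * b for x, b in zip(row, beta_true)) + rng.gauss(0.0, 1.0) for row in X]
    return X, y, beta_true


if __name__ == "__main__":
    X, y, beta_true = simulate()
    grid = [1.0, 0.5, 0.2, 0.1, 0.05, 0.02]
    lam, beta_hat = select_by_bic(X, y, grid)
    print("chosen lambda:", lam)
    print("FPR, FNR:", selection_error_rates(beta_hat, beta_true))
```

Swapping `select_by_bic` for a cross-validated choice over the same grid would allow the kind of side-by-side comparison the abstract describes. The plain-Python solver is meant only for small illustrative problems.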

Article Details

Section
Physical Sciences
Author Biographies

จุฑาทิพย์ นันทสุวรรณ

Department of Statistics, Faculty of Commerce and Accountancy, Chulalongkorn University, Mueang Mai Subdistrict, Pathum Wan District, Bangkok 10330

วิฐรา พึ่งพาพงศ์

Department of Statistics, Faculty of Commerce and Accountancy, Chulalongkorn University, Mueang Mai Subdistrict, Pathum Wan District, Bangkok 10330
