TY - JOUR
AU - จันทราช, จิตกานต์
AU - ชัยมงคล, มนทิราลัย
AU - แซ่โง้ว, รัตนชัย
AU - พลอยสัมฤทธิ์, สายทิพย์
AU - สินสมบูรณ์ทอง, สายชล
PY - 2019/12/16
Y2 - 2024/03/28
TI - Comparison of Prediction Efficiency of Classification in the Case of Missing Data Using Data Mining Techniques
JF - Thai Journal of Science and Technology
JA - Thai J. Sci. Technol.
VL - 9
IS - 1
SE - Physical Sciences
DO - 10.14456/tjst.2020.2
UR - https://li01.tci-thaijo.org/index.php/tjst/article/view/229672
SP - 1-15
AB - <p>This research compared the efficiency of four classification methods (K-nearest neighbor, decision tree, artificial neural network, and support vector machine) on three datasets with missing data. The datasets were: incidents of liver disease in Andhra Pradesh, India; annual incomes and expenditures of Filipino families; and credit cards issued and not issued by a bank. Missing values were replaced using five replacement methods offered in SPSS: series mean, mean of nearby points, median of nearby points, linear interpolation, and linear trend at point. Classification efficiency was measured by prediction accuracy and the mean squared error of classification. Each dataset was divided into a learning set, a validation set, and a test set at a ratio of 70:20:10. The liver disease dataset had 1.89 percent missing data, the smallest amount of the three. For this dataset, the highest mean prediction accuracy and the lowest mean of the mean squared error came from the artificial neural network method with missing data replaced by the mean of nearby points method. The Filipino family income and expenditure dataset had 4.21 percent missing data, a moderate amount.
For this dataset, the most accurate outcomes came from the artificial neural network method with missing data replaced by the linear interpolation method. The bank credit card dataset had 9.72 percent missing data, the largest amount of the three. For this dataset, the most accurate outcomes came from the artificial neural network method with missing data replaced by the series mean method.</p>
ER -