Comparison of Water Quality Classification Methods in Thailand
Abstract
The objective of this research was to compare three methods for classifying water quality: binary logistic regression analysis, decision tree, and K-Nearest Neighbors. Classification accuracy was the criterion for comparing the efficiency of each method. The data came from water sources throughout Thailand and were collected by the Water Quality Management Division, Pollution Control Department, from January 1, 2018, to January 1, 2021. Water quality was classified into two groups: water that meets the standard and water that does not meet the standard. A total of 14 independent variables were used in the study. The methods were evaluated with 10-fold cross-validation. The results showed that binary logistic regression analysis with selected variables, namely water turbidity, total coliform bacteria, ammonia-nitrogen, fecal coliform bacteria, dissolved oxygen, and organic water, had the highest accuracy, 89.64%. The decision tree using all independent variables had an accuracy of 88.71%, while K-Nearest Neighbors using all independent variables had the lowest accuracy, 79.05%.
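As an illustration of the evaluation criterion described above, the following minimal sketch estimates classification accuracy with 10-fold cross-validation for a simple K-Nearest Neighbors classifier on a two-class problem. It is not the authors' implementation: the data are synthetic stand-ins for the water quality measurements, and the function names (k_fold_indices, knn_predict, cross_val_accuracy), the choice of k = 5 neighbors, and the random seeds are all assumptions made for this example.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and split them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return 1 if sum(votes) * 2 > len(votes) else 0


def cross_val_accuracy(X, y, predict, k=10, seed=0):
    """Mean accuracy over k folds; each fold is held out once as the test set."""
    scores = []
    for fold in k_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        correct = sum(predict(train_X, train_y, X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)


# Synthetic two-class data (assumption): label 1 = "meets the standard",
# label 0 = "does not meet the standard"; two made-up numeric features.
rng = random.Random(42)
X, y = [], []
for _ in range(200):
    label = rng.randint(0, 1)
    X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(-2.0 * label, 1.0)])
    y.append(label)

accuracy = cross_val_accuracy(X, y, knn_predict)
print(f"10-fold cross-validated accuracy: {accuracy:.4f}")
```

Because cross_val_accuracy accepts any function with the same signature as knn_predict, a logistic regression or decision tree predictor could be passed in the same way, so that all methods are scored on identical folds.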