A Comparison of K-means Clustering Methods for Brain Tumor Genetic Codes in High-Dimensional Data

อัชฌา อระวีพร; จารวี พร้อมสง่า

Published: Dec 30, 2023

Keywords:

clustering, K-means clustering, Hartigan-Wong, Forgy, MacQueen

อัชฌา อระวีพร

Department of Statistics, Faculty of Science, King Mongkut's Institute of Technology Ladkrabang

จารวี พร้อมสง่า

Department of Statistics, Faculty of Science, King Mongkut's Institute of Technology Ladkrabang

Abstract

This research compares the performance of three K-means clustering methods, namely the Hartigan-Wong, Forgy, and MacQueen methods, for clustering the genetic codes of patients with brain tumors. The independent variables are 989 genetic codes, and the dependent variable is the brain tumor grade of 43 patients. Because the number of independent variables exceeds the number of patients, the data are high-dimensional. In the experiment, 200, 400, 600, and 800 genetic codes were randomly sampled, the number of clusters was set to 5, 10, 15, 20, 25, and 30, and the sampling was repeated 1,000 times. The criterion for comparing clustering performance was the mean between-cluster difference in the data. Across the three K-means methods, the Hartigan-Wong method performed best in every scenario, giving the largest between-cluster difference compared with the Forgy and MacQueen methods. The number of independent variables did not affect clustering performance.
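The experimental loop described in the abstract can be sketched in Python as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the data are synthetic Gaussian noise, the sizes are far smaller than in the study, patients are treated as the rows being clustered, and the between-cluster sum of squares stands in for the paper's between-cluster difference criterion, whose exact formula the abstract does not give. `kmeans_batch` is a Forgy/Lloyd-style batch update and `kmeans_online` is a MacQueen-style update after each single reassignment; the Hartigan-Wong method is omitted for brevity. All function names are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def nearest(p, centers):
    """Index of the center closest to point p (lowest index wins ties)."""
    return min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))

def update_centers(data, labels, centers, which):
    """Recompute the listed centers; an emptied cluster keeps its old center."""
    for j in which:
        members = [p for p, lab in zip(data, labels) if lab == j]
        if members:
            centers[j] = centroid(members)

def kmeans_batch(data, init, max_iter=100):
    """Batch style: reassign every point, then recompute every center."""
    centers = [list(c) for c in init]
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(p, centers) for p in data]
        if new_labels == labels:
            break
        labels = new_labels
        update_centers(data, labels, centers, range(len(centers)))
    return labels

def kmeans_online(data, init, max_iter=100):
    """Online style: recompute the two affected centers after each move."""
    centers = [list(c) for c in init]
    labels = [nearest(p, centers) for p in data]
    update_centers(data, labels, centers, range(len(centers)))
    for _ in range(max_iter):
        moved = False
        for i, p in enumerate(data):
            j = nearest(p, centers)
            if j != labels[i]:
                old, labels[i] = labels[i], j
                update_centers(data, labels, centers, (old, j))
                moved = True
        if not moved:
            break
    return labels

def between_ss(data, labels):
    """Between-cluster sum of squares (assumed stand-in for the criterion)."""
    grand = centroid(data)
    total = 0.0
    for j in set(labels):
        members = [p for p, lab in zip(data, labels) if lab == j]
        total += len(members) * sq_dist(centroid(members), grand)
    return total

def run_experiment(data, n_features, k, reps, seed=0):
    """Average the criterion over repeated random feature subsets."""
    rng = random.Random(seed)
    scores = {"batch": 0.0, "online": 0.0}
    for _ in range(reps):
        cols = rng.sample(range(len(data[0])), n_features)
        sub = [[row[c] for c in cols] for row in data]
        init = rng.sample(sub, k)  # same starting centers for both methods
        scores["batch"] += between_ss(sub, kmeans_batch(sub, init)) / reps
        scores["online"] += between_ss(sub, kmeans_online(sub, init)) / reps
    return scores

# Toy run: 43 synthetic "patients", 60 features, 20 sampled per repetition.
toy_rng = random.Random(1)
toy = [[toy_rng.gauss(0, 1) for _ in range(60)] for _ in range(43)]
print(run_experiment(toy, n_features=20, k=5, reps=20))
```

Both methods start from the same randomly chosen centers in each repetition, so any difference in the averaged score comes from the update rule rather than from initialization. Scaling the sketch to the study's settings would mean looping over the feature counts and cluster counts listed in the abstract and raising `reps` to 1,000.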

How to Cite

อระวีพร อ., & พร้อมสง่า จ. (2023). A Comparison of K-means Clustering Methods for Brain Tumor Genetic Codes in High-Dimensional Data. วารสารวิทยาศาสตร์ลาดกระบัง, 32(2), 67–79. Retrieved from https://li01.tci-thaijo.org/index.php/science_kmitl/article/view/256290

Issue

Vol. 32 No. 2 (2023): July - December 2023 (B.E. 2566)

Article Type

Research Article

Licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

References

Zarikas, V., Poulopoulos, S.G., Gareiou, Z. and Zervas, E. 2020. Clustering analysis of countries using the COVID-19 cases dataset. Data in Brief, 31, 1-8.

Nurlaila, I., Irawati, W., Purwandari, K. and Pardamean, B. 2021. K-Means Clustering Model to Discriminate Copper-Resistant Bacteria as Bioremediation Agents. Procedia Computer Science, 179, 804-812.

Shan, P. 2018. Image segmentation method based on K-mean algorithm. EURASIP Journal on Image and Video Processing, 81, 1-9.

Hartigan, J.A. and Wong, M.A. 1979. Algorithm AS 136: A K-means Clustering Algorithm. Applied Statistics, 28, 100-108.

MacQueen, J. 1967. Some Methods for Classification and Analysis of Multivariate Observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, 281-297.

Forgy, E.W. 1965. Clustering Analysis of Multivariate Data: Efficiency vs Interpretability of Classifications. Biometrics, 21, 768-769.

Lloyd, S.P. 1982. Least Squares Quantization in PCM. IEEE Transactions on Information Theory, 28, 128-137.

Yadav, J. and Sharma, Monika. 2013. A Review of K-mean Algorithm. International Journal of Engineering Trends and Technology, 4(7), 2972-2976.

Singh, R. P. and Rajpoot, D. S. 2019. Efficient Identification of Initial Clusters Centers for Partitioning Clustering Methods. 2019 Fifth International Conference on Image Processing, Shimla, India, 131-136.

Arika Thammano, Muthita Wangkid and Arit Thammano. 2020. Breast Cancer Prediction Using K-mean Classification Algorithm with Self-adaptive Weight. Journal of Information Science and Technology, 10(2), 1-9. (in Thai)

Jothi, R., Mohanty, S. K. and Ojha, A. 2017. DK-means: a deterministic K-means clustering algorithm for gene expression analysis. Pattern Analysis and Application, 22, 649-667.

Saadeh, H., Al Fayez, R. Q. and Elshqeirat, B. 2020. Application of K-Means Clustering to Identify Similar Gene Expression Patterns during Erythroid Development. International Journal of Machine Learning and Computing, 10(3), 452-457.

Joshi, R., Prasad, R., Mewada, P. and Saurabh. 2020. Modified LDA Approach For Cluster Based Gene Classification Using K-Mean Method. Procedia Computer Science, 171, 2493-2500.

Bhatt, V., Dhakar, M. and Chaurasia, B. K. 2016. Filtered Clustering Based on Local Outlier Factor in Data Mining. International Journal of Database Theory and Application, 9(5), 275-282.

Thakare, Y.S. and Bagal, S.B. 2015. Performance Evaluation of K-means Clustering Algorithm with Various Distance. International Journal of Computer Application, 110, 12-16.

Meng, Y., Liang, J., Cao, F. and He, Y. 2018. A New distance with derivative information for functional k-means clustering. Information Sciences, 463-464, 166-185.

Morissette, L. and Chartier, S. 2013. The k-means clustering technique: General considerations and implementation in Mathematica. Tutorials in Quantitative Methods for Psychology, 9(1), 15-24.
