Data Mining from Education-Related Search Suggested Text on the Web.

Jincheng Zhang
Thada Jantakoon
Rukthin Laoha
Potsirin Limpinan

Abstract

The accelerated development of digital technology has led a growing number of industries to adopt it to improve their effectiveness, and the education sector is no exception. This research reports the first data mining and analysis of Internet search-suggestion text, and it develops and proposes what it describes as the world's first algorithm and technology for obtaining such text automatically and rapidly. The study aims to extract educationally useful information by mining education-related search-suggestion texts on the Internet, using YouTube as the example platform for data collection. The research methods employed include literature review and theoretical analysis; algorithm design and optimization; system design and implementation; application support; data set construction and annotation; empirical research and experimental validation; and interdisciplinary research and application. An analysis of more than 50,000 lines of collected data yielded valuable insights, including popular keywords and the sentiment tendencies of users. These findings can support individuals' instructional and decision-making practices.

Article Details

Section
Articles
