Learning analytics with machine learning for classification of student teachers' research engagement: The use of k-means and naïve Bayes

Authors

  • Laphatphitcha Surawatakul, Faculty of Education, Chulalongkorn University
  • Chayut Piromsombut, Faculty of Education, Chulalongkorn University
  • Suwimon Wongwanich, Faculty of Education, Chulalongkorn University

Keywords:

Learning Analytics, Research Engagement, K-means, Naïve Bayes

Abstract

Learning analytics applied to students' behavioral data in an online research course can provide useful information for classifying students' research engagement. This study aimed to 1) cluster students by research engagement and identify a suitable number of groups, 2) classify students by research engagement and compare the performance of the classification models across different numbers of groups, and 3) suggest potential proxies for classifying students' research engagement. Log data served as proxies for research engagement. The participants were 253 student teachers enrolled in an educational research course at the time of data collection. The data were analyzed with two machine learning methods, k-means and naïve Bayes. The key findings were as follows:

1) The k-means analysis indicated that the suitable numbers of groups for clustering students were 3 and 2. Clustering students into 3 groups explained a higher percentage of variance than clustering them into 2 groups (59.6% versus 40.7%). 2) The naïve Bayes model achieved an accuracy of 86.67% when classifying students into 3 groups and 84.21% when classifying them into 2 groups. 3) The proxies that played important roles in classification were largely the same for the 3-group and 2-group models. Most were learning behaviors derived from interaction with content or activities in the learning system. However, the ranking of variable importance differed between the two models.
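The two quantities reported in the findings, the percentage of variance explained by a clustering and the accuracy of a classifier, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes synthetic two-feature data (not the study's log data), a Gaussian form of naïve Bayes, cluster labels used as the classification targets, and a 70/30 train/test split. The abstract does not specify any of these choices, and all function and variable names are hypothetical.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vec(rows):
    """Column-wise mean of a list of points."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm with random initial centroids; returns one label per point."""
    centroids = random.Random(seed).sample(points, k)
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = mean_vec(members)
    return labels

def variance_explained(points, labels):
    """Between-cluster sum of squares as a share of the total sum of squares."""
    grand = mean_vec(points)
    total = sum(sq_dist(p, grand) for p in points)
    within = 0.0
    for c in set(labels):
        members = [p for p, l in zip(points, labels) if l == c]
        center = mean_vec(members)
        within += sum(sq_dist(p, center) for p in members)
    return 1.0 - within / total

def fit_gaussian_nb(X, y, eps=1e-9):
    """Per class: log prior, feature means, feature variances (eps avoids zero variance)."""
    model = {}
    for c in set(y):
        rows = [x for x, l in zip(X, y) if l == c]
        means = mean_vec(rows)
        variances = [sum((v - m) ** 2 for v in col) / len(rows) + eps
                     for col, m in zip(zip(*rows), means)]
        model[c] = (math.log(len(rows) / len(X)), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior, treating features as independent."""
    def log_posterior(c):
        log_prior, means, variances = model[c]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
            for v, m, s in zip(x, means, variances))
    return max(model, key=log_posterior)

# Synthetic data (assumption): three blobs standing in for engagement groups.
rng = random.Random(42)
centers = [(2.0, 2.0), (6.0, 6.0), (10.0, 2.0)]
X = [[rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)] for cx, cy in centers for _ in range(30)]

labels2 = kmeans(X, 2)
labels3 = kmeans(X, 3)
ve2 = variance_explained(X, labels2)
ve3 = variance_explained(X, labels3)

# Assumed evaluation scheme: 70/30 split, cluster labels as classification targets.
order = list(range(len(X)))
rng.shuffle(order)
split = int(0.7 * len(order))
train, test = order[:split], order[split:]
model = fit_gaussian_nb([X[i] for i in train], [labels3[i] for i in train])
hits = sum(predict_gaussian_nb(model, X[i]) == labels3[i] for i in test)
accuracy = hits / len(test)

print(f"variance explained: k=2 {ve2:.3f}, k=3 {ve3:.3f}; accuracy (k=3): {accuracy:.3f}")
```

Here the variance explained is computed as one minus the ratio of the within-cluster sum of squares to the total sum of squares, which is one common reading of "percentage of variance explained" for k-means. The study may have used a different definition.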

Published

2023-07-12