The role of data science in online education




Data Science, Big Data, Data Analytics, Learning Analytics, Online Education


This article reviews the major literature on data science and its implications over the past 15 years. The review concurs that studying data science requires a research focus on big data, data analytics, learning analytics, machine learning with modelling, and prediction. It begins with the importance of data science and three types of data, then covers big data and its five properties (the five Vs), machine learning and its many concerns, the four types of data analytics, learning analytics, online education (MOOCs and blended learning) in relation to big data, and data science in online education, including perspectives on the ‘data scientist’ and the benefits of data science for educational institutions. In the online education environments of MOOCs and blended learning, big data and learning analytics are used to analyse students’ learning and to track their learning success efficiently. The literature points to the crucial importance of data quality and data management. The major implications, especially those for online education, are summarized and can serve as a reference for researchers.



