DATA-DRIVEN ITEM ANALYSIS: EVALUATING THE QUALITY OF MULTIPLE CHOICE QUESTIONS USING THE CITAS PROGRAM
Keywords:
Multiple Choice Questions, Item Analysis, Item Quality, CITAS Program

Abstract
This article explains how to analyze the quality of multiple-choice questions under classical test theory using the CITAS program, and provides guidelines for interpreting the statistical values the analysis produces. The analysis covers both item-level indices (difficulty index, discrimination index, and distractor efficiency) and test-level indices (reliability and standard error of measurement), which together reflect the overall quality of the test. CITAS operates through Microsoft Excel, enabling rapid statistical computation, reducing human calculation errors, and making it well suited to classroom assessment or tryouts with small samples. Analytical software, however, is only a decision-support tool: teachers and researchers must understand the principles of item analysis and the proper interpretation of the resulting statistics in order to make sound decisions about revising, eliminating, or retaining test items. This leads to high-quality tests grounded in empirical evidence of their effectiveness.
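To make the indices named above concrete, the sketch below computes them with the standard classical-test-theory formulas on a small hypothetical 0/1 score matrix. It is an illustration of the textbook formulas only, not the CITAS implementation; the score data are invented, and the population variance of total scores is used in KR-20 (sources vary on this choice).

```python
import math

# Hypothetical scored responses: rows = examinees, columns = items,
# 1 = correct, 0 = incorrect.
scores = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 0, 1],
    [0, 1, 0, 1, 0],
]

n_people = len(scores)
n_items = len(scores[0])
totals = [sum(row) for row in scores]

# Difficulty index p: proportion of examinees answering each item correctly.
difficulty = [sum(row[j] for row in scores) / n_people for j in range(n_items)]

# Discrimination index D: p(upper group) - p(lower group), here using the
# top and bottom halves ranked by total score (27% groups are common
# with larger samples).
ranked = sorted(scores, key=sum, reverse=True)
half = n_people // 2
upper, lower = ranked[:half], ranked[-half:]
discrimination = [
    sum(r[j] for r in upper) / half - sum(r[j] for r in lower) / half
    for j in range(n_items)
]

# KR-20 reliability: (k / (k - 1)) * (1 - sum(p*q) / var(total scores)).
mean_total = sum(totals) / n_people
var_total = sum((t - mean_total) ** 2 for t in totals) / n_people
pq = sum(p * (1 - p) for p in difficulty)
kr20 = (n_items / (n_items - 1)) * (1 - pq / var_total)

# Standard error of measurement: SD(total) * sqrt(1 - reliability).
sem = math.sqrt(var_total) * math.sqrt(1 - kr20)

print("difficulty:", [round(p, 2) for p in difficulty])
print("discrimination:", [round(d, 2) for d in discrimination])
print("KR-20:", round(kr20, 3), "SEM:", round(sem, 3))
```

With these invented data the fourth item shows negative discrimination (weaker examinees answer it correctly more often than stronger ones), which is exactly the kind of empirical signal that would flag an item for revision or elimination.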
References
Kanjanawasee, S. (2013). Classical test theory (7th ed.). Bangkok: Chulalongkorn University Press. (In Thai).
Allen, M. J. & Yen, W. M. (1979). Introduction to Measurement Theory. Monterey, CA: Brooks/Cole Publishing Company.
Coughlin, P. A. & Featherstone, C. R. (2017). How to write a high quality multiple choice question (MCQ): A guide for clinicians. European Journal of Vascular and Endovascular Surgery, 54(5), 654–658.
Crocker, L. & Algina, J. (1986). Introduction to classical and modern test theory. New York: Holt, Rinehart and Winston.
Fan, X. (1998). Item response theory and classical test theory: An empirical comparison of their item/person statistics. Educational and Psychological Measurement, 58(3), 357–381.
Haladyna, T. M. (2004). Developing and validating multiple-choice test items (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Hambleton, R. K. & Jones, R. W. (1993). Comparison of classical test theory and item response theory and their applications to test development. Educational Measurement: Issues and Practice, 12(3), 38–47.
KR-20. (2010). In N. J. Salkind (Ed.), Encyclopedia of research design (p. 668). Thousand Oaks, CA: SAGE Publications.
Nitko, A. J. & Brookhart, S. M. (2011). Educational assessment of students (6th ed.). Boston: Pearson Education.
Thompson, N. A. (2016). Classical item and test analysis using CITAS. St. Paul, MN: Assessment Systems Corporation.
Thompson, N. A. (2019). CITAS [Computer software]. St. Paul, MN: Assessment Systems Corporation.