The Effects of Different Item Selection Methods on Test Information and Test Efficiency in Computer Adaptive Testing

Year: 2023 Volume: 14 Issue: 1 Page Range: 33-46 Language: English DOI: 10.21031/epod.1140757 Index Date: 18-05-2023


Abstract:
The purpose of this study is to examine the effect of different item selection methods on the test information function (TIF) and test efficiency in computer adaptive testing (CAT). TIF indicates the amount of information a test produces. Test efficiency reflects the amount of information obtained from each item; more efficient tests are built from the smallest number of high-quality items. The study was conducted with simulated data, and its constants are sample size, ability parameter distribution, item pool size, item response theory (IRT) model and distribution of item parameters, ability estimation method, starting rule, item exposure control, and stopping rule. The item selection methods, the independent variables of this study, are the interval information criterion, efficiency balanced information, b-value matching, Kullback-Leibler information, maximum Fisher information, likelihood-weighted information, and random selection. Among these methods, the best performance in terms of TIF was provided by the maximum Fisher information method. In terms of test efficiency, the methods performed similarly, except for random selection, which performed worst on both TIF and test efficiency.
Keywords: computer adaptive testing, test information function, test efficiency

Document Type: Article Article Type: Research Article Access Type: Open Access
  • Babcock, B. & Albano, A. D. (2012). Rasch scale stability in the presence of item parameter and trait drift. Applied Psychological Measurement, 36(7), 565- 580. https://doi.org/10.1177/0146621612455090
  • Babcock, B. & Weiss, D.J. (2012). Termination criteria in computerized adaptive tests: Do variable-length CATs provide efficient and effective measurement? International Association for Computerized Adaptive Testing, 1, 1-18. http://dx.doi.org/10.7333%2Fjcat.v1i1.16
  • Baker, F. (1986). The basics of item response theory. Journal of Educational Measurement, 23(3), 267-270.
  • Baker, F. B. (1992). Item response theory: Parameter estimation techniques. Marcel Dekker.
  • Balta, E., & Uçar, A. (2022). Investigation of measurement precision and test length in computerized adaptive testing under different conditions, E-International Journal of Educational Research, 13(1), 51-68. https://doi.org/10.19160/e-ijer.1023098
  • Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord and M. R. Novick (Eds.), Statistical theories of mental test scores (chaps. 17–20). Addison-Wesley.
  • Blais, J. & Raiche, G. (2010). Features of the sampling distribution of the ability estimate in computerized adaptive testing according to two stopping rules. Journal of Applied Measurement, 11(4), 424-431.
  • Bock, R. D. & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46(4), 443–459. https://link.springer.com/article/10.1007/BF02293801
  • Bock, R. D. & Mislevy, R. J. (1982). Adaptive EAP estimation of ability in a microcomputer environment. Applied Psychological Measurement, 6(4), 431– 444. https://doi.org/10.1177/014662168200600405
  • Boyd, M. A. (2003). Strategies for controlling testlet exposure rates in computerized adaptive testing systems [Unpublished Doctoral Thesis]. The University of Texas.
  • Boztunç Öztürk, N. (2014). Bireyselleştirilmiş bilgisayarlı test uygulamalarinda madde kullanım sıklığı kontrol yöntemlerinin incelenmesi [Investigatıon of item exposure control methods in computerized adaptive testing] [Unpublished Doctoral Dissertation]. Hacettepe University.
  • Boztunç Öztürk, N. & Doğan, N. (2015). Investigating item exposure control methods in computerized adaptive testing. Educational Sciences: Theory and Practice, 15(1), 85-98. https://doi.org/10.12738/estp.2015.1.2593
  • Brown, A. (2018). Item response theory approaches to test scoring and evaluating the score accuracy. In Irwing, P., Booth, T. & Hughes, D. (Eds.), The Wiley Handbook of Psychometric Testing. John Wiley & Sons.
  • Chang, H.-H. & Ying, Z. (1999). a-stratified multistage computerized adaptive testing. Applied Psychological Measurement, 23(3), 211-222. https://doi.org/10.1177/01466219922031338
  • Choi, S. W. & Swartz, R. J. (2009). Comparison of CAT item selection criteria for polytomous items. Applied Psychological Measurement, 33(6), 419–440. https://doi.org/10.1177/0146621608327801
  • Cheng, Y., Patton, J.M. & Shao, C. (2015). a-Stratified computerized adaptive testing in the presence of calibration error. Educational and Psychological Measurement, 75(2), 260-283. https://doi.org/10.1177/0013164414530719
  • Costa, D., Karino, C., Moura, F. & Andrade, D. (2009, June). A comparison of three methods of item selection for computerized adaptive testing [Paper Presentation]. The 2009 GMAC Conference on Computerized Adaptive Testing. Retrieved from www.psych.umn.edu/psylabs/CATCentral/
  • Çıkrıkçı-Demirtaşlı, N. (1999). Psikometride yeni ufuklar: Bilgisayar ortamında bireye uyarlanmış test [New horizons in psychometrics: Individualized test in computer environment]. Türk Psikoloji Bülteni, 5(13), 31-36.
  • Davis, L. L. (2002). Strategies for controlling item exposure in computerized adaptive testing with polytomously scored items. [Unpublished Doctoral Dissertation], The University of Texas.
  • Davis, L. L., & Dodd, B. G. (2005). Strategies for controlling item exposure in computerized adaptive testing with partial credit model. Pearson Educational Measurement Research Report 05-01.
  • Deng, H., Ansley, T. & Chang, H. (2010). Stratified and maximum information item selection procedures in computer adaptive testing. Journal of Educational Measurement, 47(2), 202-226. https://www.jstor.org/stable/20778948
  • Doğan, C.D. & Aybek, E.C. (2021). R-Shiny ile psikometri ve istatistik uygulamaları [Psychometric and statistical applications with R-Shiny]. Pegem Akademi.
  • Eggen, T.J.H.M. (2001). Overexposure and underexposure of items in computerized adaptive testing. Measurement and Research Department Reports, 2001-1. Citogroep.
  • Eggen, T.H.J.M. (2004). Contributions to the theory and practice of computerized adaptive testing. Print Partners Ipskamp B.V., Citogroup Arnhem.
  • Eggen, T.H.J.M. (2012). Computerized adaptive testing item selection in computerized adaptive learning systems. Psychometrics in Practice at RCEC, 11.
  • Eroğlu, M.G. (2013). Bireyselleştirilmiş bilgisayarlı test uygulamalarında farklı sonlandırma kurallarının ölçme kesinliği ve test uzunluğu açısından karşılaştırılması [Comparison of different test termination rules in terms of measurement precision and test length in computerized adaptive testing] [Unpublished Doctoral Dissertation]. Hacettepe University.
  • Gershon, R. C. (2005). Computer adaptive testing. Journal of Applied Measurement, 6(1), 109–127.
  • Green, B. F., Bock, R. D., Humphreys, L. G., Linn, R. L., & Reckase, M. D. (1984). Technical guidelines for assessing computerized adaptive tests. Journal of Educational Measurement, 21(4), 347-360. https://www.jstor.org/stable/1434586
  • Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. SAGE Publications.
  • Han, K.T. (2009). Gradual maximum information ratio approach to item selection in computerized adaptive testing. Council Research Reports, Graduate Management Admission.
  • Han, K.T. (2010). Comparison of Non-Fisher Information Item Selection Criteria in Fixed Length Computerized Adaptive Testing [Paper Presentation] The Annual Meeting of the National Council on Measurement in Education, Denver.
  • Han, K. T. (2011). User's Manual: SimulCAT. Graduate Management Admission Council.
  • Han, K.T. (2012). SimulCAT: Windows software for simulating computerized adaptive test administration. Applied Psychological Measurement, 36(1), 64-66.
  • Han, K. (2018). Components of item selection algorithm in computerized adaptive testing. J Educ Eval Health Prof, 15(7). https://doi.org/10.3352/jeehp.2018.15.7
  • Kaptan, F. (1993). Yetenek kestiriminde adaptive (bireyselleştirilmiş) test uygulaması ile geleneksel kağıt-kalem testi uygulamasının karşılaştırılması [Comparison of adaptive (individualized) test application and traditional paper-pencil test application in ability estimation] [Unpublished Doctoral Dissertation]. Hacettepe University
  • Keller, A.L. (2000). Ability estimation procedures in computerized adaptive testing. Technical Report, American Institute of Certified Public Accountants-AICPA Research Consortium-Examination Teams.
  • Kezer, F. (2013). Bilgisayar ortamında bireye uyarlanmış test stratejilerinin karşılaştırılması [Comparison of computerized adaptive testing strategies] [Unpublished Doctoral Dissertation]. Ankara University.
  • Kingsbury, G. G. & Zara, A. R. (1989). Procedures for selecting items for computerized adaptive tests. Applied Measurement in Education, 2(4), 359-375. https://doi.org/10.1207/s15324818ame0204_6
  • Lee, H., & Dodd, B. G. (2012). Comparison of exposure controls, item pool characteristics, and population distributions for CAT using the partial credit model. Educational and Psychological Measurement, 72(1), 159-175. https://doi.org/10.1177/0013164411411296
  • Lord, F. M. (1980). Applications of item response theory to practical testing problems. Lawrence Erlbaum Associates Publishers.
  • Moyer, E. L., Galindo, J. L., & Dodd, B. G. (2012). Balancing flexible constraints and measurement precision in computerized adaptive testing. Educational and Psychological Measurement, 72(4). https://doi.org/10.1177/0013164411431838
  • Ranganathan, K. & Foster, I. (2003). Simulation studies of computation and data scheduling algorithms for data grids. Journal of Grid Computing, 1, 53-62. https://doi.org/10.1023/A:1024035627870
  • Risk, N.M. (2010). The impact of item parameter drift in computer adaptive testing (CAT) [Unpublished doctoral dissertation]. University of Illinois.
  • Rudner, L.M. & Guo, F. (2011). Computer adaptive testing for small scale programs and instructional systems. Graduate Management Council (GMAC), 11(01), 6-10.
  • Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph Supplement, 34(4). https://doi.org/10.1002/j.23338504.1968.tb00153.x
  • Sulak, S. (2013). Bireyselleştirilmiş bilgisayarlı test uygulamalarında kullanılan madde seçme yöntemlerinin karşılaştırılması [Comparison of item selection methods in computerized adaptive testing] [Unpublished Doctoral Dissertation]. Hacettepe University.
  • Sulak, S. & Kelecioğlu, H. (2019). Investigation of item selection methods according to test termination rules in CAT applications. Journal of Measurement and Evaluation in Education and Psychology, 10(3), 315-326. https://doi.org/10.21031/epod.530528
  • Stahl, J. A. & Muckle, T. (2007, April). Investigating displacement in the Winsteps Rasch calibration application [Paper Presentation] The Annual Meeting of the American Educational Research Association, Chicago, IL.
  • Stocking, M. L. (1992). Controlling item exposure rates in a realistic adaptive testing paradigm. Research Report 93-2, Educational Testing Service.
  • Şahin, A. (2012). Madde tepki kuramında test uzunluğu ve örneklem büyüklüğünün model veri uyumu, madde parametreleri ve standart hata değerlerine etkisinin incelenmesi [An investigation on the effects of test length and sample size in item response theory on model-data fit, item parameters and standard error values] [Unpublished Doctoral Dissertation]. Hacettepe University.
  • Thompson, N. A. & Weiss, D. J. (2011). A framework for the development of computerized adaptive tests. Practical Assessment, Research & Evaluation, 1-9. https://doi.org/10.7275/wqzt-9427
  • Urry, V. (1977). Tailored testing: A successful application of latent trait theory. Journal of Educational Measurement, 14, 181-196. https://www.jstor.org/stable/1434014
  • van der Linden, W. (1998). Bayesian item selection criteria for adaptive testing. Psychometrika, 63(2), 201–216. https://doi.org/10.1007/BF02294775
  • Veerkamp, W. J. J. & Berger, M. P. F. (1997). Some New Item Selection Criteria for Adaptive Testing. Journal of Educational and Behavioral Statistics, 22(2), 203-226. https://doi.org/10.3102/10769986022002203
  • Wainer, H. (1993). Some practical considerations when converting a linearly administered test to an adaptive format. Educational Measurement: Issues and Practice, 12(1), 15–20. http://dx.doi.org/10.1111/j.1745-3992.1993.tb00519.x
  • Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54(3), 427-450. https://doi.org/10.1007/BF02294627
  • Weiss, D. J. (1982). Latent Trait Theory and Adaptive Testing. In David J. Weiss (Ed.). New horizons in testing: Latent trait test theory and computerized adaptive testing (pp. 5-7). Academic Press.
  • Weiss, D. J. & Kingsbury, G. G. (1984). Application of computerized adaptive testing to educational problems. Journal of Educational Measurement, 21(4), 361–375. https://www.jstor.org/stable/1434587
  • Wen, H., Chang, H. & Hau, K. (2000). Adaption of a-stratified Method in Variable Length Computerized Adaptive Testing. American Educational Research Association Annual Meeting, Seattle.
  • Yi, Q. & Chang, H. (2003). a-Stratified CAT design with content blocking. British Journal of Mathematical and Statistical Psychology, 56, 359–378. https://doi.org/10.1348/000711003770480084
  • Zwinderman, A. H., & van den Wollenberg, A. L. (1990). Robustness of marginal maximum likelihood estimation in the Rasch model. Applied Psychological Measurement, 14(1), 73-81. https://doi.org/10.1177/014662169001400107
APA SAHIN KÜRŞAD M (2023). The Effects of Different Item Selection Methods on Test Information and Test Efficiency in Computer Adaptive Testing. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, 14(1), 33 - 46. 10.21031/epod.1140757
Chicago SAHIN KÜRŞAD MERVE The Effects of Different Item Selection Methods on Test Information and Test Efficiency in Computer Adaptive Testing. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi 14, no.1 (2023): 33 - 46. 10.21031/epod.1140757
MLA SAHIN KÜRŞAD MERVE The Effects of Different Item Selection Methods on Test Information and Test Efficiency in Computer Adaptive Testing. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, vol.14, no.1, 2023, ss.33 - 46. 10.21031/epod.1140757
AMA SAHIN KÜRŞAD M The Effects of Different Item Selection Methods on Test Information and Test Efficiency in Computer Adaptive Testing. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi. 2023; 14(1): 33 - 46. 10.21031/epod.1140757
Vancouver SAHIN KÜRŞAD M The Effects of Different Item Selection Methods on Test Information and Test Efficiency in Computer Adaptive Testing. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi. 2023; 14(1): 33 - 46. 10.21031/epod.1140757
IEEE SAHIN KÜRŞAD M "The Effects of Different Item Selection Methods on Test Information and Test Efficiency in Computer Adaptive Testing." Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, 14, ss.33 - 46, 2023. 10.21031/epod.1140757
ISNAD SAHIN KÜRŞAD, MERVE. "The Effects of Different Item Selection Methods on Test Information and Test Efficiency in Computer Adaptive Testing". Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi 14/1 (2023), 33-46. https://doi.org/10.21031/epod.1140757