The estimation accuracy comparison of different types of rating designs under many-facet Rasch model
CHEN Qinglin1,2, YAN Desheng3, LI Guangming1
1 School of Psychology, Center for Studies of Psychological Application, South China Normal University, Guangzhou 510631; 2 Mingde Primary School of Baiyun District, Guangzhou 510407; 3 Inner Mongolia Minzu Preschool Education College, Ordos 017000
CHEN Qinglin, YAN Desheng, LI Guangming. The estimation accuracy comparison of different types of rating designs under many-facet Rasch model [J]. Psychological Research (心理研究), 2025, 18(5): 413-418.
[1] Yao, R. S., Zhao, B. N., Liu, Z., & Miao, Q. Y. (2013). Application of the many-facet Rasch model in leaderless group discussion. Acta Psychologica Sinica, (09), 1039-1049.
[2] Fishman, G. S. (1972). Bias considerations in simulation experiments. Operations Research, 20(4), 785-790.
[3] He, T. H., Gou, W. J., Chien, Y. C., Chen, I. S., & Chang, S. M. (2013). Multi-faceted Rasch measurement and bias patterns in EFL writing performance assessment. Psychological Reports, 112(2), 469-485.
[4] Hombo, C. M., Donoghue, J. R., & Thayer, D. T. (2001). A simulation study of the effect of rater designs on ability estimation. Educational Testing Service Research Report Series, 2001(1), 1-41.
[5] Ilhan, M. N. (2016). A comparison of the results of many-facet Rasch analyses based on crossed and judge pair designs. Educational Sciences Theory and Practice, 2016(2), 579-601.
[6] Linacre, J. M. (1989). Many-facet Rasch measurement. Chicago, IL: MESA Press.
[7] Lunz, M. E., & Stahl, J. A. (1993). The effect of rater severity on person ability measure: A Rasch model analysis. The American Journal of Occupational Therapy, 47(4), 311-317.
[8] Putka, D. J., Le, H., McCloy, R. A., & Diaz, T. (2008). Ill-structured measurement designs in organizational research: Implications for estimating interrater reliability. Journal of Applied Psychology, 93(5), 959-981.
[9] Schaefer, E. (2008). Rater bias patterns in an EFL writing assessment. Language Testing, 25(4), 465-493.
[10] Schumacker, R. E. (1999). Many-facet Rasch analysis with crossed, nested, and mixed designs. Journal of Outcome Measurement, 3(4), 323-338.
[11] Sinharay, S., & Holland, P. W. (2007). Is it necessary to make anchor tests miniversions of the tests being equated or can some restrictions be relaxed? Journal of Educational Measurement, 44(3), 249-275.
[12] Wang, W. C., & Qiu, X. L. (2013). A multidimensional and multilevel extension of a random-effect approach to subjective judgment in rating scales. Multivariate Behavioral Research, 48(3), 398-427.
[13] Wang, Z., & Yao, L. (2013). The effects of rater severity and rater distribution on examinees' ability estimation for constructed-response items. ETS Research Report Series, 2013(2), 1-22.
[14] Wind, S. A., & Jones, E. (2019). The effects of incomplete rating designs in combination with rater effects. Journal of Educational Measurement, 56(1), 76-100.
[15] Winke, P., Gass, S., & Myford, C. (2013). Raters' L2 background as a potential source of bias in rating oral performance. Language Testing, 30(2), 231-252.