An application of item response theory to language testing: Model-data fit studies
Choi, Inn-Chull
Permalink
https://hdl.handle.net/2142/20547
Description
Title
An application of item response theory to language testing: Model-data fit studies
Author(s)
Choi, Inn-Chull
Issue Date
1989
Doctoral Committee Chair(s)
Cziko, Gary A.
Department of Study
Education
Discipline
Education
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
Education, Tests and Measurements
Language
eng
Abstract
Even though the application of item response theory (IRT) to language testing has recently attracted much attention, no model-data fit research has been conducted to explore the appropriateness of IRT modeling in language testing. The tenability of the strong assumption of unidimensionality has not been studied systematically, and little is known about the effects of departure from unidimensionality on parameter estimation and model fit. Furthermore, no study has examined the adequacy of the Rasch model, which has been predominant in language testing.
The present study investigated the dimensionality of the reading and vocabulary sections of two widely used English-as-a-foreign-language proficiency tests, the University of Cambridge First Certificate of English (FCE) and the Test of English as a Foreign Language (TOEFL). It also compared the relative model fit of three IRT models: the one-, two-, and three-parameter models. The dimensionality of the tests was investigated using Stout's method, factor analyses, and Bejar's method. Second, employing fit statistics, invariance checks, and residual analyses, the study investigated the adequacy of the Rasch model and the effects of multidimensionality on parameter estimation and model fit.
The results of this study suggest the following: (1) Even the TOEFL reading subtest, developed using the three-parameter IRT model, was multidimensional; this appears to be due to underlying factors associated with the reading passages. (2) The FCE reading and vocabulary subtest, based on the traditional British examination system, was found to be essentially unidimensional. (3) Bejar's approach to checking dimensionality appears to be inadequate in that its results differ across the one-, two-, and three-parameter models. (4) The finding that the Rasch model clearly fails to provide an adequate fit for these data suggests that the prevailing use of the Rasch model in language testing needs to be re-evaluated. (5) The three-parameter model fit the data only marginally better than did the two-parameter model, suggesting that for language tests the discrimination parameter is more important than the guessing parameter. (6) A moderate departure from unidimensionality does not appear to invalidate IRT modeling with these data. This finding suggests the possibility of a more justified implementation of IRT modeling in language testing.
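The one-, two-, and three-parameter models compared in the abstract are nested logistic models of the probability of a correct item response. As a minimal illustrative sketch (not taken from the dissertation itself), the three-parameter logistic (3PL) function can be written so that fixing the guessing parameter at zero yields the 2PL, and additionally fixing discrimination at one yields the Rasch (1PL) model:

```python
import math

def irt_prob(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability; a: item discrimination; b: item difficulty;
    c: pseudo-guessing (lower asymptote).
    c = 0 gives the 2PL; a = 1 and c = 0 give the Rasch (1PL) model.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# An examinee of average ability (theta = 0) on an item of average difficulty (b = 0):
print(irt_prob(0.0))                  # Rasch/1PL -> 0.5
print(irt_prob(0.0, a=1.5))           # 2PL: still 0.5 when theta equals b
print(irt_prob(0.0, a=1.5, c=0.2))    # 3PL: 0.6 (guessing floor raises the probability)
```

This nesting is what makes the relative-fit comparison in finding (5) meaningful: each simpler model is a constrained version of the 3PL, so a marginal improvement from 2PL to 3PL indicates the guessing parameter adds little.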