Adaptive testing

Latest revision as of 12:04, 4 September 2015

Adaptive testing or computer-adaptive testing (CAT) is a method for administering tests that dynamically adapts to the examinee's performance level, varying the difficulty of presented items according to the examinee's previous answers. For this reason, it has also been called tailored testing.

How CAT works

CAT successively selects questions so as to maximize the precision of the exam based on what is known about the examinee from previous questions.^[1] From the examinee's perspective, the difficulty of the exam seems to tailor itself to their level of ability. For example, if an examinee performs well on an item of intermediate difficulty, they are then presented with a more difficult question; if they perform poorly, they are presented with a simpler one. Compared to the static multiple-choice tests that nearly everyone has experienced, in which a fixed set of items is administered to all examinees, computer-adaptive tests require fewer test items to arrive at equally accurate scores.^[1] (Nothing about the CAT methodology requires the items to be multiple-choice, but just as most exams are multiple-choice, most CAT exams also use this format.)

The basic computer-adaptive testing method is an iterative algorithm with the following steps:^[2]

1. The pool of available items is searched for the optimal item, based on the examinee's current ability estimate
2. The chosen item is presented to the examinee, who then answers it correctly or incorrectly
3. The ability estimate is updated, based upon all prior answers
4. Steps 1–3 are repeated until a termination criterion is met

Nothing is known about the examinee prior to the administration of the first item, so the algorithm is generally started by selecting an item of medium, or medium-easy, difficulty as the first item.
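The iterative steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of any particular testing program: it assumes a two-parameter logistic (2PL) IRT model, an item pool given as hypothetical `(a, b)` tuples (discrimination, difficulty), a grid-based Bayesian ability estimate with a standard normal prior, and a stopping rule based on the posterior standard deviation. The names `run_cat`, `answer_item`, `se_target` and `max_items` are invented for the example.

```python
import math
import random


def p_correct(theta, a, b):
    """2PL item response function: probability of a correct answer."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def eap_estimate(responses, grid):
    """Posterior mean and SD of theta on a grid, standard normal prior."""
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalised N(0, 1) prior
        for (a, b), correct in responses:
            p = p_correct(theta, a, b)
            w *= p if correct else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)


def run_cat(pool, answer_item, se_target=0.4, max_items=30):
    """Administer items from pool until a stopping rule is met."""
    grid = [x / 10.0 for x in range(-40, 41)]
    remaining = list(pool)
    responses = []
    theta, se = 0.0, float("inf")  # nothing known yet: assume average ability
    while remaining and len(responses) < max_items and se > se_target:
        # Step 1: search the pool for the most informative item.
        item = max(remaining, key=lambda ab: item_information(theta, *ab))
        remaining.remove(item)
        # Step 2: present the item and record a right/wrong answer.
        correct = answer_item(item)
        responses.append((item, correct))
        # Step 3: update the ability estimate from all answers so far.
        theta, se = eap_estimate(responses, grid)
    return theta, se, len(responses)


if __name__ == "__main__":
    rng = random.Random(0)
    pool = [(1.0 + 0.1 * (i % 5), -2.0 + 0.1 * i) for i in range(41)]
    true_theta = 1.0  # simulated examinee, for demonstration only
    simulate = lambda item: rng.random() < p_correct(true_theta, *item)
    print(run_cat(pool, simulate))
```

Because the starting estimate is 0 (average ability), the first item chosen is one of roughly medium difficulty, as described above.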

As a result of adaptive administration, different examinees receive quite different tests.^[3] The psychometric technology that allows equitable scores to be computed across different sets of items is item response theory (IRT). IRT is also the preferred methodology for selecting optimal items, which are typically chosen on the basis of information rather than difficulty per se.^[2]

The GRE General Test and the Graduate Management Admission Test are currently administered primarily as computer-adaptive tests. A list of active CAT programs is found at CAT Central, along with a list of current CAT research programs and a near-inclusive bibliography of all published CAT research.

A related methodology called multistage testing (MST) or CAST is used in the Uniform Certified Public Accountant Examination. MST avoids or reduces some of the disadvantages of CAT as described below. See the 2006 special issue of Applied Measurement in Education for more information on MST.

Advantages

Adaptive tests can provide uniformly precise scores for most test-takers.^[2] In contrast, standard fixed tests almost always provide the best precision for test-takers of medium ability and increasingly poorer precision for test-takers with more extreme test scores.

An adaptive test can typically be shortened by 50% and still maintain a higher level of precision than a fixed version.^[1] This translates into a time savings for the test-taker. Test-takers do not waste their time attempting items that are too hard or trivially easy. Additionally, the testing organization benefits from the time savings; the cost of examinee seat time is substantially reduced. However, because the development of a CAT involves much more expense than a standard fixed-form test, a large population is necessary for a CAT testing program to be financially fruitful.

Like any computer-based test, adaptive tests may show results immediately after testing.

Adaptive testing, depending on the item selection algorithm, may reduce exposure of some items because examinees typically receive different sets of items rather than the whole population being administered a single set. However, it may increase the exposure of others (namely the medium or medium/easy items presented to most examinees at the beginning of the test).^[2]

Disadvantages

The first issue encountered in CAT is the calibration of the item pool. In order to model the characteristics of the items (e.g., to pick the optimal item), all the items of the test must be pre-administered to a sizable sample and then analyzed. To achieve this, new items must be mixed into the operational items of an exam, with the responses recorded but not contributing to the test-takers' scores; this practice is called "pilot testing," "pre-testing," or "seeding."^[2] This presents logistical, ethical, and security issues. For example, it is impossible to field an operational adaptive test with brand-new, unseen items;^[4] all items must be pretested with a large enough sample to obtain stable item statistics. This sample may be required to be as large as 1,000 examinees.^[4] Each program must decide what percentage of the test can reasonably be composed of unscored pilot test items.

Although adaptive tests have exposure control algorithms to prevent overuse of a few items,^[2] the exposure conditioned upon ability is often not controlled and can easily become close to 1. That is, some items commonly appear on nearly every test given to examinees of the same ability. This is a serious security concern because groups sharing items may well have a similar functional ability level. In fact, a completely randomized exam is the most secure (but also the least efficient).

Review of past items is generally disallowed. Adaptive tests tend to administer easier items after a person answers incorrectly. Supposedly, an astute test-taker could use such clues to detect incorrect answers and correct them. Or, test-takers could be coached to deliberately pick wrong answers, leading to an increasingly easy test. After tricking the adaptive test into building a maximally easy exam, they could then review the items and answer them correctly, possibly achieving a very high score. Test-takers frequently complain about the inability to review.^[1]

CAT Components

There are five technical components in building a CAT (the following is adapted from Weiss & Kingsbury, 1984^[1]). This list does not include practical issues, such as item pretesting or live field release.

Calibrated item pool
Starting point or entry level
Item selection algorithm
Scoring procedure
Termination criterion

Calibrated Item Pool

A pool of items must be available for the CAT to choose from.^[1] The pool must be calibrated with a psychometric model, which is used as a basis for the remaining four components. Typically, item response theory is employed as the psychometric model.^[1] One reason item response theory is popular is that it places persons and items on the same metric (denoted by the Greek letter theta), which is helpful for issues in item selection (see below).

Starting Point

In CAT, items are selected based on the examinee's performance up to a given point in the test. However, the CAT is obviously not able to make any specific estimate of examinee ability when no items have been administered, so some other initial estimate of examinee ability is necessary. If some previous information regarding the examinee is known, it can be used,^[1] but often the CAT just assumes that the examinee is of average ability, which is why the first item is often of medium difficulty.

Item Selection Algorithm

As mentioned previously, item response theory places examinees and items on the same metric. Therefore, if the CAT has an estimate of examinee ability, it is able to select an item that is most appropriate for that estimate.^[4] Technically, this is done by selecting the item with the greatest information at that point.^[1] Information is a function of the discrimination parameter of the item, as well as the conditional variance and pseudoguessing parameter (if used).
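As an illustration of information-based selection, the sketch below assumes the three-parameter logistic (3PL) model, with discrimination `a`, difficulty `b` and pseudoguessing `c`, and omits the optional scaling constant sometimes written as D = 1.7. The function names and the representation of items as `(a, b, c)` tuples are assumptions made for the example. Under this model the item information is a^2 * (Q/P) * ((P - c)/(1 - c))^2, where P is the probability of a correct response and Q = 1 - P; with c = 0 this reduces to a^2 * P * Q.

```python
import math


def p_3pl(theta, a, b, c=0.0):
    """3PL probability of a correct response at ability theta."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def info_3pl(theta, a, b, c=0.0):
    """3PL item information: a^2 * (Q / P) * ((P - c) / (1 - c))^2."""
    p = p_3pl(theta, a, b, c)
    return a * a * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def most_informative(pool, theta):
    """Return the (a, b, c) item with the greatest information at theta."""
    return max(pool, key=lambda item: info_3pl(theta, *item))
```

With these definitions, a highly discriminating item whose difficulty is somewhat away from the ability estimate, for example `(2.0, 0.5, 0.0)` at theta = 0, carries more information than a weakly discriminating item located exactly at the estimate, such as `(1.0, 0.0, 0.0)`. This is one way to see why items are selected on information rather than on difficulty alone.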

Scoring Procedure

After an item is administered, the CAT updates its estimate of the examinee's ability level. If the examinee answered the item correctly, the CAT will likely estimate their ability to be somewhat higher, and vice versa. This is done by using the item response function from item response theory to obtain a likelihood function of the examinee's ability. Two methods for this are called maximum likelihood estimation and Bayesian estimation. The latter assumes an a priori distribution of examinee ability, and has two commonly used estimators: expected a posteriori (EAP) and maximum a posteriori (MAP). Maximum likelihood is equivalent to a Bayesian maximum a posteriori estimate if a uniform (f(x)=1) prior is assumed.^[4] Maximum likelihood is asymptotically unbiased, but cannot provide a theta estimate for a nonmixed (all correct or all incorrect) response vector, in which case a Bayesian method may have to be used temporarily.^[1]
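The difference between the two estimators can be sketched as follows. The example assumes a 2PL model and uses Newton-Raphson iteration on the log-likelihood (maximum likelihood) or on the log-posterior with a normal prior (maximum a posteriori). The function name `estimate_theta`, the `prior_sd` parameter and the step-size clamp are assumptions of this sketch rather than part of any published procedure.

```python
import math


def p_2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def estimate_theta(responses, prior_sd=None, iterations=50):
    """Newton-Raphson ability estimate from a list of ((a, b), correct) pairs.

    prior_sd=None gives maximum likelihood; a number gives the MAP estimate
    under a N(0, prior_sd^2) prior. Returns None when no finite ML estimate
    exists, i.e. for an all-correct or all-incorrect response vector.
    """
    scores = [correct for _, correct in responses]
    if prior_sd is None and (all(scores) or not any(scores)):
        return None
    theta = 0.0
    for _ in range(iterations):
        grad, hess = 0.0, 0.0
        for (a, b), correct in responses:
            p = p_2pl(theta, a, b)
            grad += a * ((1.0 if correct else 0.0) - p)  # d logL / d theta
            hess -= a * a * p * (1.0 - p)                # d2 logL / d theta2
        if prior_sd is not None:
            grad -= theta / prior_sd ** 2
            hess -= 1.0 / prior_sd ** 2
        # Clamp the Newton step to keep early iterations stable (an assumption).
        step = max(-1.0, min(1.0, grad / hess))
        theta -= step
        if abs(step) < 1e-8:
            break
    return theta
```

For an examinee who has answered every item correctly, `estimate_theta(responses)` returns `None`, whereas `estimate_theta(responses, prior_sd=1.0)` still returns a finite value, which is why a Bayesian estimate is often used until the response vector becomes mixed.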

Termination Criterion

The CAT algorithm is designed to repeatedly administer items and update the estimate of examinee ability. This will continue until the item pool is exhausted unless a termination criterion is incorporated into the CAT. Often, the test is terminated when the examinee's standard error of measurement falls below a certain user-specified value, hence the statement above that an advantage is that examinee scores will be uniformly precise or "equiprecise."^[1] Other termination criteria exist for different purposes of the test, such as when the test is designed only to determine whether the examinee should "Pass" or "Fail" the test, rather than to obtain a precise estimate of their ability.^[1]^[5]
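A standard-error stopping rule of this kind might look like the sketch below. It assumes a 2PL model and approximates the standard error of measurement as the reciprocal square root of the test information at the current ability estimate; the threshold `se_target=0.3` and all function names are illustrative choices, not recommended values.

```python
import math


def total_information(theta, administered):
    """Sum of 2PL item informations at theta over administered (a, b) items."""
    total = 0.0
    for a, b in administered:
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        total += a * a * p * (1.0 - p)
    return total


def standard_error(theta, administered):
    """Approximate standard error of measurement: 1 / sqrt(information)."""
    info = total_information(theta, administered)
    return float("inf") if info == 0.0 else 1.0 / math.sqrt(info)


def should_stop(theta, administered, items_left, se_target=0.3):
    """Stop when the target precision is reached or the pool is exhausted."""
    return items_left == 0 or standard_error(theta, administered) <= se_target
```

Because every examinee is tested until the same `se_target` is reached, scores end up with roughly equal precision, at the cost of test lengths that differ between examinees.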

Other Issues

Pass-Fail CAT

In many situations, the purpose of the test is to classify examinees into two or more mutually exclusive and exhaustive categories. This includes the common "mastery test" where the two classifications are "Pass" and "Fail," but also includes situations where there are three or more classifications, such as "Insufficient," "Basic," and "Advanced" levels of knowledge or competency. The kind of "item-level adaptive" CAT described in this article is most appropriate for tests that are not "Pass/Fail" (or for Pass/Fail tests where providing good feedback is extremely important). Some modifications are necessary for a Pass/Fail CAT, also known as a computerized classification test (CCT).^[5] For examinees with true scores very close to the passing score, computerized classification tests will result in long tests, while those with true scores far above or below the passing score will have the shortest exams.

For example, a new termination criterion and scoring algorithm must be applied that classifies the examinee into a category rather than providing a point estimate of ability. There are two primary methodologies available for this. The more prominent of the two is the sequential probability ratio test (SPRT).^[6]^[7] This formulates the examinee classification problem as a hypothesis test that the examinee's ability is equal to either some specified point above the cutscore or another specified point below the cutscore. Note that this is a point hypothesis formulation rather than the composite hypothesis formulation^[8] that is more conceptually appropriate. A composite hypothesis formulation would be that the examinee's ability is in the region above the cutscore or the region below the cutscore.
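The point-hypothesis SPRT can be sketched as follows, assuming a 2PL model and Wald's usual decision bounds. The half-width `delta` of the region around the cutscore, the nominal error rates `alpha` and `beta`, and the function names are assumptions chosen for illustration.

```python
import math


def p_2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def sprt_decision(responses, cutscore, delta=0.3, alpha=0.05, beta=0.05):
    """Classify from a list of ((a, b), correct) pairs.

    Tests theta = cutscore - delta against theta = cutscore + delta and
    returns "pass", "fail" or "continue".
    """
    theta_lo, theta_hi = cutscore - delta, cutscore + delta
    log_lr = 0.0  # log likelihood ratio of the upper point to the lower point
    for (a, b), correct in responses:
        p_hi = p_2pl(theta_hi, a, b)
        p_lo = p_2pl(theta_lo, a, b)
        if correct:
            log_lr += math.log(p_hi / p_lo)
        else:
            log_lr += math.log((1.0 - p_hi) / (1.0 - p_lo))
    upper = math.log((1.0 - beta) / alpha)  # Wald's upper bound
    lower = math.log(beta / (1.0 - alpha))  # Wald's lower bound
    if log_lr >= upper:
        return "pass"
    if log_lr <= lower:
        return "fail"
    return "continue"
```

The function is called after each response; as long as it returns "continue", another item is administered.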

A confidence interval approach is also used, where after each item is administered, the algorithm determines the probability that the examinee's true score is above or below the passing score.^[9]^[10] For example, the algorithm may continue until the 95% confidence interval for the true score no longer contains the passing score. At that point, no further items are needed because the pass-fail decision is already 95% accurate (assuming that the psychometric models underlying the adaptive testing fit the examinee and test).

As a practical matter, the algorithm is generally programmed to have a minimum and a maximum test length (or a minimum and maximum administration time). This approach was originally called "adaptive mastery testing"^[9] but it can be applied to non-adaptive item selection and classification situations of two or more cutscores (the typical mastery test has a single cutscore).^[10]
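A minimal sketch of the confidence interval rule with test-length limits is given below. It assumes a normal approximation (z = 1.96 for a 95% interval) around the current ability estimate, and it assumes that an examinee who reaches the maximum length is classified from the point estimate; that forced decision, the default limits and the function name are illustrative assumptions only.

```python
def classify(theta_hat, se, n_items, cutscore, z=1.96, min_items=5, max_items=40):
    """Return "pass", "fail" or "continue" after n_items have been given."""
    lower = theta_hat - z * se
    upper = theta_hat + z * se
    if n_items >= min_items:
        if lower > cutscore:
            return "pass"   # whole interval lies above the cutscore
        if upper < cutscore:
            return "fail"   # whole interval lies below the cutscore
    if n_items >= max_items:
        # Assumption: force a decision from the point estimate at maximum length.
        return "pass" if theta_hat >= cutscore else "fail"
    return "continue"
```

For instance, `classify(1.0, 0.3, 10, 0.0)` gives a pass decision because the interval lies entirely above the cutscore, while the same estimate after only three items returns "continue" because the minimum length has not been reached.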

The item selection algorithm utilized depends on the termination criterion. Maximizing information at the cutscore is more appropriate for the SPRT because it maximizes the difference in the probabilities used in the likelihood ratio.^[11] Maximizing information at the ability estimate is more appropriate for the confidence interval approach because it minimizes the conditional standard error of measurement, which decreases the width of the confidence interval needed to make a classification.^[10]

Practical Constraints of Adaptivity

ETS researcher Martha Stocking has quipped that most adaptive tests are actually barely adaptive tests (BATs) because, in practice, many constraints are imposed upon item choice. For example, CAT exams must usually meet content specifications;^[2] a verbal exam may need to be composed of equal numbers of analogies, fill-in-the-blank and synonym item types. CATs typically have some form of item exposure constraints,^[2] to prevent the most informative items from being over-exposed. Also, on some tests, an attempt is made to balance surface characteristics of the items such as gender of the people in the items or the ethnicities implied by their names. Thus CAT exams are frequently constrained in which items they may choose, and for some exams the constraints may be substantial and require complex search strategies (e.g., linear programming) to find suitable items.

A simple method for controlling item exposure is the "randomesque" or strata method. Rather than selecting the most informative item at each point in the test, the algorithm randomly selects the next item from the five or ten most informative items. This can be used throughout the test, or only at the beginning.^[2] Another method is the Sympson-Hetter method,^[12] in which a random number is drawn from U(0,1) and compared to a k_i parameter determined for each item by the test user. If the random number is greater than k_i, the next most informative item is considered.^[2]
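Both exposure-control ideas can be sketched briefly. The code assumes the candidate items have already been ranked from most to least informative at the current ability estimate, and that the k_i parameters are supplied in a dictionary keyed by item; the function names and the fallback used when every candidate is rejected are assumptions of this sketch.

```python
import random


def randomesque_select(ranked_items, n_best=5, rng=random):
    """Pick at random among the n_best most informative items."""
    return rng.choice(ranked_items[:n_best])


def sympson_hetter_select(ranked_items, k, rng=random):
    """Walk down the ranking, administering item i when a U(0,1) draw is <= k[i].

    k maps each item to its exposure control parameter in [0, 1].
    """
    for item in ranked_items:
        if rng.random() <= k[item]:
            return item
    # Assumption: if every candidate is rejected, fall back to the last one.
    return ranked_items[-1]


if __name__ == "__main__":
    ranked = ["item_17", "item_03", "item_42", "item_08", "item_21", "item_30"]
    k = {name: 0.5 for name in ranked}  # hypothetical exposure parameters
    print(randomesque_select(ranked), sympson_hetter_select(ranked, k))
```

An item with a small k_i is passed over most of the time even when it is the most informative candidate, which is how the method caps the exposure of popular items.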

Wim van der Linden and his coauthors^[13] have advanced an alternative approach called shadow testing, which involves creating entire shadow tests as part of selecting items. Selecting items from shadow tests helps adaptive tests meet selection criteria by focusing on globally optimal choices (as opposed to choices that are optimal for a given item).

References

1. Weiss, D. J., & Kingsbury, G. G. (1984). Application of computerized adaptive testing to educational problems. Journal of Educational Measurement, 21, 361-375.
2. Thissen, D., & Mislevy, R. J. (2000). Testing Algorithms. In Wainer, H. (Ed.) Computerized Adaptive Testing: A Primer. Mahwah, NJ: Lawrence Erlbaum Associates.
3. Green, B. F. (2000). System design and operation. In Wainer, H. (Ed.) Computerized Adaptive Testing: A Primer. Mahwah, NJ: Lawrence Erlbaum Associates.
4. Wainer, H., & Mislevy, R. J. (2000). Item response theory, calibration, and estimation. In Wainer, H. (Ed.) Computerized Adaptive Testing: A Primer. Mahwah, NJ: Lawrence Erlbaum Associates.
5. Lin, C.-J. & Spray, J. A. (2000). Effects of item-selection criteria on classification testing with the sequential probability ratio test. (Research Report 2000-8). Iowa City, IA: ACT, Inc.
6. Wald, A. (1947). Sequential analysis. New York: Wiley.
7. Reckase, M. D. (1983). A procedure for decision making using tailored testing. In D. J. Weiss (Ed.), New horizons in testing: Latent trait theory and computerized adaptive testing (pp. 237-254). New York: Academic Press.
8. Weitzman, R. A. (1982). Sequential testing for selection. Applied Psychological Measurement, 6, 337-351.
9. Kingsbury, G. G., & Weiss, D. J. (1983). A comparison of IRT-based adaptive mastery testing and a sequential mastery testing procedure. In D. J. Weiss (Ed.), New horizons in testing: Latent trait theory and computerized adaptive testing (pp. 237-254). New York: Academic Press.
10. Eggen, T. J. H. M., & Straetmans, G. J. J. M. (2000). Computerized adaptive testing for classifying examinees into three categories. Educational and Psychological Measurement, 60, 713-734.
11. Spray, J. A., & Reckase, M. D. (1994). The selection of test items for decision making with a computerized adaptive test. Paper presented at the Annual Meeting of the National Council for Measurement in Education (New Orleans, LA, April 5-7, 1994).
12. Sympson, B. J., & Hetter, R. D. (1985). Controlling item-exposure rates in computerized adaptive testing. Paper presented at the annual conference of the Military Testing Association, San Diego.
13. For example: van der Linden, W. J., & Veldkamp, B. P. (2004). Constraining item exposure in computerized adaptive testing with shadow tests. Journal of Educational and Behavioral Statistics, 29, 273-291.

Additional sources

Books

Drasgow, F., & Olson-Buchanan, J. B. (Eds.). (1999). Innovations in computerized assessment. Hillsdale, NJ: Erlbaum.
Van der Linden, W. J., & Glas, C.A.W. (Eds.). (2000). Computerized adaptive testing: Theory and practice. Boston, MA: Kluwer.
Wainer, H. (Ed.). (2000). Computerized adaptive testing: A Primer (2nd Edition). Mahwah, NJ: ELawrence Erlbaum Associates.
Weiss, D.J. (Ed.), New horizons in testing: Latent trait theory and computerized adaptive testing (pp. 237-254). New York: Academic Press.

Papers

Alfonseca, E., Rodriguez, P., & Perez, D. (2007). An approach for automatic generation of adaptive hypermedia in education with multilingual knowledge discovery techniques: Computers & Education Vol 49(2) Sep 2007, 495-513.
Alkhadher, O., Clarke, D. D., & Anderson, N. (1998). Equivalence and predictive validity of paper-and-pencil and computerized adaptive formats of the Differential Aptitude Tests: Journal of Occupational and Organizational Psychology Vol 71(3) Sep 1998, 205-217.
Almond, R. G., & Mislevy, R. J. (1999). Graphical models and computerized adaptive testing: Applied Psychological Measurement Vol 23(3) Sep 1999, 223-237.
Ariel, A., Veldkamp, B. P., & van der Linden, W. J. (2004). Constructing Rotating Item Pools for Constrained Adaptive Testing: Journal of Educational Measurement Vol 41(4) Win 2004, 345-359.
Armstrong, R. D., Jones, D. H., Koppel, N. B., & Pashley, P. J. (2004). Computerized Adaptive Testing With Multiple-Form Structures: Applied Psychological Measurement Vol 28(3) May 2004, 147-164.
Baek, S.-G. (1995). Computerized adaptive attitude testing using the partial credit model. Dissertation Abstracts International Section A: Humanities and Social Sciences.
Balluerka, N., Gorostiaga, A., Alonso-Arbiol, I., & Haranburu, M. (2007). Test adaptation to other cultures: A practical approach: Psicothema Vol 19(1) Feb 2007, 124-133.
Ban, J.-C., Hanson, B. A., Wang, T., Yi, Q., & Harris, D. J. (2001). A comparative study of on-line pretest item--calibration/scaling methods in computerized adaptive testing: Journal of Educational Measurement Vol 38(3) Fal 2001, 191-212.
Barrada, J. R., Mazuela, P., & Olea, J. (2006). Metodo de estratificacion por maxima informacion para el control de la exposicion en tests adaptativos informatizados: Psicothema Vol 18(1) Feb 2006, 156-159.
Barrada, J. R., Olea, J., & Ponsoda, V. (2007). Methods for Restricting Maximum Exposure Rate in Computerized Adaptative Testing: Methodology: European Journal of Research Methods for the Behavioral and Social Sciences Vol 3(1) 2007, 14-23.
Beckmann, J. F., Guthke, J., & Vahle, H. (1997). Analysis of item response latencies in computer-aided adaptive intelligence learning ability tests: Diagnostica Vol 43(1) 1997, 40-62.
Bennett, R. E., Steffen, M., Singley, M. K., Morley, M., & et al. (1997). Evaluating an automatically scorable, open-ended response type for measuring mathematical reasoning in computer-adaptive tests: Journal of Educational Measurement Vol 34(2) Sum 1997, 162-179.
Bergstrom, B. A., & Lunz, M. E. (1992). Confidence in pass/fail decisions for computer adaptive and paper and pencil examinations: Evaluation & the Health Professions Vol 15(4) Dec 1992, 453-464.
Bergstrom, B. A., & Lunz, M. E. (1999). CAT for certification and licensure. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Bergstrom, B. A., Lunz, M. E., & Gershon, R. C. (1992). Altering the level of difficulty in computer adaptive testing: Applied Measurement in Education Vol 5(2) 1992, 137-149.
Bernstein, J., & Barbier, I. (2000). Design and development parameters for a rapid automatic screening test for prospective simultaneous interpreters: Interpreting Vol 5(2) 2000-2001, 221-238.
Blanchard, G., Lugosi, G., Vayatis, N., Graepel, T., & Herbrich, R. (2004). On the Rate of Convergence of Regularized Boosting Classifiers: Journal of Machine Learning Research Vol 4(5) Jul 2004, 861-894.
Bloxom, B. (1989). Adaptive testing: A review of recent results: Zeitschrift fur Differentielle und Diagnostische Psychologie Vol 10(1) 1989, 1-17.
Bode, R. K., Lai, J.-s., Dineen, K., Heinemann, A. W., Shevrin, D., Von Roenn, J., et al. (2006). Expansion of a physical function item bank and development of an abbreviated form for clinical research: Journal of Applied Measurement Vol 7(1) 2006, 1-15.
Bowers, D. R. (1991). Computer-based adaptive testing in music research and instruction: Psychomusicology Vol 10(1) Spr 1991, 49-63.
Bowles, R. P. (2004). The Effect of Dropping Low Scores on Ability Estimates: Journal of Applied Measurement Vol 5(2) 2004, 178-188.
Boyd, A. M. (2004). Strategies for controlling testlet exposure rates in computerized adaptive testing systems. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Bradlow, E. T., & Weiss, R. E. (2001). Outlier measures and norming methods for computerized adaptive tests: Journal of Educational and Behavioral Statistics Vol 26(1) Spr 2001, 85-104.
Bringsjord, E. L. (2001). Computerized-adaptive versus paper-and-pencil testing environments: An experimental analysis of examinee experience. Dissertation Abstracts International Section A: Humanities and Social Sciences.
Bryant, D. U. (2005). A note on item information in any direction for the multidimensional three-parameter logistic model. Psychometrika, 70(1), 213 – 216.
Bryant, D. U., & Davis, L. (2011). Item vector plots for the multidimensional three-parameter logistic model. Applied Psychological Measurement, 35(5), 393 - 397.
Buyske, S. G. (1999). Optimal design for item calibration in computerized adaptive testing. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Byrne, B. M. (2001). Structural Equation Modeling With AMOS, EQS, and LISREL: Comparative Approaches to Testing for the Factorial Validity of a Measuring Instrument: International Journal of Testing Vol 1(1) 2001, 55-86.
Candell, G. L. (1989). Application of appropriateness measurement to a problem in adaptive testing: Dissertation Abstracts International.
Chang, H.-H., Qian, J., & Ying, Z. (2001). a-stratified multistage computerized adaptive testing with b blocking: Applied Psychological Measurement Vol 25(4) Dec 2001, 333-341.
Chang, H.-H., & van der Linden, W. J. (2003). Optimal stratification of item pools in alpha -stratified computerized adaptive testing: Applied Psychological Measurement Vol 27(4) Jul 2003, 262-274.
Chang, H.-H., & Ying, Z. (1996). A global information approach to computerized adaptive testing: Applied Psychological Measurement Vol 20(3) Sep 1996, 213-229.
Chang, H.-H., & Ying, Z. (1999). a-Stratified multistage computerized adaptive testing: Applied Psychological Measurement Vol 23(3) Sep 1999, 211-222.
Chang, H.-H., & Zhang, J. (2002). Hypergeometric family and item overlap rates in computerized adaptive testing: Psychometrika Vol 67(3) Sep 2002, 387-398.
Chang, S.-h. (1991). Inter-subtest branching in computerized adaptive testing: Dissertation Abstracts International.
Chang, S.-R. (2007). Computerized adaptive test item response times for correct and incorrect pretest and operational items: Testing fairness and test-taking strategies. Dissertation Abstracts International Section A: Humanities and Social Sciences.
Chang, S.-W., & Ansley, T. N. (2003). A comparative study of item exposure control methods in computerized adaptive testing: Journal of Educational Measurement Vol 40(1) Spr 2003, 71-103.
Chang, Y.-C. I. (2005). Application of Sequential Interval Estimation to Adaptive Mastery Testing: Psychometrika Vol 70(4) Dec 2005, 685-713.
Chen, S.-K. (1997). A comparison of maximum likelihood estimation and expected a posteriori estimation in computerized adaptive testing using the generalized partial credit model. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Chen, S.-K., Hou, L., & Dodd, B. G. (1998). A comparison of maximum likelihood estimation and expected a posteriori estimation in CAT using the partial credit model: Educational and Psychological Measurement Vol 58(4) Aug 1998, 569-595.
Chen, S.-K., Hou, L., Fitzpatrick, S. J., & Dodd, B. G. (1997). The effect of population distribution and method of theta estimation on computerized adaptive testing (CAT) using the rating scale model: Educational and Psychological Measurement Vol 57(3) Jun 1997, 422-439.
Chen, S.-Y., Ankemann, R. D., & Spray, J. A. (2003). The relationship between item exposure and test overlap in computerized adaptive testing: Journal of Educational Measurement Vol 40(2) Sum 2003, 129-145.
Chen, S.-Y., Ankenmann, R. D., & Chang, H.-H. (2000). A comparison of item selection rules at the early stages of computerized adaptive testing: Applied Psychological Measurement Vol 24(3) Sep 2000, 241-255.
Chen, S.-Y., & Lei, P.-W. (2005). Controlling Item Exposure and Test Overlap in Computerized Adaptive Testing: Applied Psychological Measurement Vol 29(3) May 2005, 204-217.
Cheng, P. E., & Liou, M. (2000). Estimation of trait level in computerized adaptive testing: Applied Psychological Measurement Vol 24(3) Sep 2000, 257-265.
Cheng, P. E., & Liou, M. (2003). Computerized adaptive testing using the nearest-neighbors criterion: Applied Psychological Measurement Vol 27(3) May 2003, 204-216.
Cheng, Y., Chang, H.-H., & Yi, Q. (2007). Two-phase item selection procedure for flexible content balancing in CAT: Applied Psychological Measurement Vol 31(6) Nov 2007, 467-482.
Chuah, S. C., Drasgow, F., & Luecht, R. (2006). How Big Is Big Enough? Sample Size Requirements for CAST Item Parameter Estimation: Applied Measurement in Education Vol 19(3) 2006, 241-255.
Cole, N. S., & Zieky, M. J. (2001). The new faces of fairness: Journal of Educational Measurement Vol 38(4) Win 2001, 369-382.
Cudeck, R. (1985). A structural comparison of conventional and adaptive versions of the ASVAB: Multivariate Behavioral Research Vol 20(3) Jul 1985, 305-322.
Cusick, G. M. (1989). Computer-assisted vocational assessment: Vocational Evaluation & Work Adjustment Bulletin Vol 22(1) Spr 1989, 19-23.
Daffinrud, S. (2007). Comparing the relative measurement efficiency of dichotomous and polytomous models in linear and adaptive testing conditions. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Davey, T., & Pitoniak, M. J. (2006). Designing Computerized Adaptive Tests. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Davis, L. L. (2003). Strategies for controlling item exposure in computerized adaptive testing with polytomously scored items. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Davis, L. L. (2004). Strategies for Controlling Item Exposure in Computerized Adaptive Testing With the Generalized Partial Credit Model: Applied Psychological Measurement Vol 28(3) May 2004, 165-185.
Davis, L. L., & Dodd, B. G. (2003). Item Exposure Constraints for Testlets in the Verbal Reasoning Section of the MCAT: Applied Psychological Measurement Vol 27(5) Sep 2003, 335-356.
Davis, L. L., Pastor, D. A., Dodd, B. G., Chiang, C., & Fitzpatrick, S. J. (2003). An examination of exposure control and content balancing restrictions on item selection in CATs using the partial credit model: Journal of Applied Measurement Vol 4(1) 2003, 24-42.
de Ayala, R., & Koch, W. R. (1985). ALPHATAB: A lookup table for Bayesian computerized adaptive testing: Applied Psychological Measurement Vol 9(3) Sep 1985, 326.
de Ayala, R. J. (1988). Computerized adaptive testing: A comparison of the nominal response model and the three parameter model: Dissertation Abstracts International.
de Ayala, R. J. (1989). A comparison of the nominal response model and the three-parameter logistic model in computerized adaptive testing: Educational and Psychological Measurement Vol 49(4) Win 1989, 789-805.
de Ayala, R. J. (1992). The influence of dimensionality on cat ability estimation: Educational and Psychological Measurement Vol 52(3) Fal 1992, 513-528.
de Ayala, R. J. (1992). The nominal response model in computerized adaptive testing: Applied Psychological Measurement Vol 16(4) Dec 1992, 327-343.
de Ayala, R. J., Dodd, B. G., & Koch, W. R. (1990). A simulation and comparison of flexilevel and Bayesian computerized adaptive testing: Journal of Educational Measurement Vol 27(3) Fal 1990, 227-239.
De Ayala, R. J., Dodd, B. G., & Koch, W. R. (1992). A comparison of the partial credit and graded response models in computerized adaptive testing: Applied Measurement in Education Vol 5(1) 1992, 17-34.
De Ayala, R. J., Schafer, W. D., & Sava-Bolesta, M. (1995). An investigation of the standard errors of expected a posteriori ability estimates: British Journal of Mathematical and Statistical Psychology Vol 48(2) Nov 1995, 385-405.
De Beer, M. (2001). The construction and evaluation of a dynamic computerised adaptive test for the measurement of learning potential. Dissertation Abstracts International: Section B: The Sciences and Engineering.
de la Torre Sanchez, R. (1992). The development and evaluation of a system for computerized adaptive testing: Dissertation Abstracts International.
Degraff, A. J. (2006). Monitoring growth in early reading skills: Validation of a Computer Adaptive Test. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Desmarais, M. C., & Pu, X. (2005). A Bayesian Student Model without Hidden Nodes and its Comparison with Item Response Theory: International Journal of Artificial Intelligence in Education Vol 15(4) 2005, 291-323.
Divgi, D. R. (1989). Estimating reliabilities of computerized adaptive tests: Applied Psychological Measurement Vol 13(2) Jun 1989, 145-149.
Dodd, B. G. (1990). The effect of item selection procedure and stepsize on computerized adaptive attitude measurement using the rating scale model: Applied Psychological Measurement Vol 14(4) Dec 1990, 355-366.
Dodd, B. G., De Ayala, R. J., & Koch, W. R. (1995). Computerized adaptive testing with polytomous items: Applied Psychological Measurement Vol 19(1) Mar 1995, 5-22.
Dodd, B. G., Koch, W. R., & de Ayala, R. J. (1989). Operational characteristics of adaptive testing procedures using the graded response model: Applied Psychological Measurement Vol 13(2) Jun 1989, 129-143.
Dodd, B. G., Koch, W. R., & de Ayala, R. J. (1993). Computerized adaptive testing using the partial credit model: Effects of item pool characteristics and different stopping rules: Educational and Psychological Measurement Vol 53(1) Spr 1993, 61-77.
Dorans, N. J. (2000). Scaling and equating. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Drasgow, F. (2002). The work ahead: A psychometric infrastructure for computerized adaptive tests. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Ebmeier, H., & Ng, J. (2005). Development and Field Test of an Employment Selection Instrument for Teachers in Urban School Districts: Journal of Personnel Evaluation in Education Vol 18(3) Sep 2005, 201-218.
Economides, A. A., & Roupas, C. (2007). Evaluation of computer adaptive testing systems: International Journal of Web-Based Learning and Teaching Technologies Vol 2(1) 2007, 70-87.
Eggen, T. J. H. M. (1999). Item selection in adaptive testing with the sequential probability ratio test: Applied Psychological Measurement Vol 23(3) Sep 1999, 249-261.
Eggen, T. J. H. M., & Verschoor, A. J. (2006). Optimal Testing With Easy or Difficult Items in Computerized Adaptive Testing: Applied Psychological Measurement Vol 30(5) Sep 2006, 379-393.
Embretson, S. E. (1992). Computerized adaptive testing: Its potential substantive contributions to psychological research and assessment: Current Directions in Psychological Science Vol 1(4) Aug 1992, 129-131.
Embretson, S. E. (1992). The Quiet Revolution in Test Theory: A Lucid Introduction to IRT? : PsycCRITIQUES Vol 37 (10), Oct, 1992.
Embretson, S. E. (2005). Measuring human intelligence with artificial intelligence: Adaptive item generation. New York, NY: Cambridge University Press.
Embretson, S. E., & Diehl, K. A. (2000). Item response theory: Kazdin, Alan E (Ed).
Fan, M. (1995). Assessment of scaled score consistency in adaptive testing from a multidimensional item response theory perspective. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Feng, X. (2003). Statistical detection and estimation of differential item functioning in computerized adaptive testing. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Ferdous, A. A., Plake, B. S., & Chang, S.-R. (2007). The effect of including pretest items in an operational computerized adaptive test: Do different ability examinees spend different amounts of time on embedded pretest items? : Educational Assessment Vol 12(2) 2007, 161-173.
Flaugher, R. (2000). Item pools. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Folk, V. G., & Green, B. F. (1989). Adaptive estimation when the unidimensionality assumption of IRT is violated: Applied Psychological Measurement Vol 13(4) Dec 1989, 373-389.
Forbey, J. D., & Ben-Porath, Y. S. (2007). Computerized Adaptive Personality Testing: A Review and Illustration With the MMPI-2 Computerized Adaptive Version: Psychological Assessment Vol 19(1) Mar 2007, 14-24.
Forbey, J. D., Handel, R. W., & Ben-Porath, Y. S. (2000). A real-data simulation of computerized adaptive administration of the MMPI-A: Computers in Human Behavior Vol 16(1) Jan 2000, 83-96.
Freeman, F. G., Mikulka, P. J., Scerbo, M. W., & Scott, L. (2004). An evaluation of an adaptive automation system using a cognitive vigilance task: Biological Psychology Vol 67(3) Nov 2004, 283-297.
Frey, A., & Moosbrugger, H. (2004). Avoiding the Confounding of Concentration Performance and Activation by Adaptive Testing with the FACT: Zeitschrift fur Differentielle und Diagnostische Psychologie Vol 25(1) 2004, 1-17.
Frick, T. W. (1990). A comparison of three decision models for adapting the length of computer-based mastery tests: Journal of Educational Computing Research Vol 6(4) 1990, 479-513.
Frick, T. W. (1992). Computerized adaptive mastery tests as expert systems: Journal of Educational Computing Research Vol 8(2) 1992, 187-213.
Frost, A. G. (1988). Adaptive attitude testing for personnel selection using an honesty test: Dissertation Abstracts International.
Garcia, D. A., Santa Cruz, C., Dorronsoro, J. R., & Rubio Franco, V. J. (2000). Item selection algorithms in computerized adaptive testing: Psicothema Vol 12(Suppl2) 2000, 12-14.
Gardner, W., Kelleher, K. J., & Pajer, K. A. (2002). Multidimensional adaptive testing for mental health problems in primary care: Medical Care Vol 40(9) Sep 2002, 812-823.
Garrison, W. M. (1985). Monitoring item calibrations from data yielded by an adaptive testing procedure: Educational Research Quarterly Vol 10(2) 1985-1986, 9-12.
Garrison, W. M., & Baumgarten, B. S. (1986). An application of computer adaptive testing with communication handicapped examinees: Educational and Psychological Measurement Vol 46(1) Spr 1986, 23-35.
Gershon, R. C. (2005). Computer Adaptive Testing: Journal of Applied Measurement Vol 6(1) 2005, 109-127.
Glas, C. A. W., & van der Linden, W. J. (2003). Computerized adaptive testing with item cloning: Applied Psychological Measurement Vol 27(4) Jul 2003, 247-261.
Gorin, J. S., Dodd, B. G., Fitzpatrick, S. J., & Shieh, Y. Y. (2005). Computerized Adaptive Testing With the Partial Credit Model: Estimation Procedures, Population Distributions, and Item Pool Characteristics: Applied Psychological Measurement Vol 29(6) Nov 2005, 433-456.
Green, B. F. (1991). Computer-based adaptive testing in 1991: Psychology & Marketing Vol 8(4) Win 1991, 243-257.
Green, B. F., et al. (1984). Technical guidelines for assessing computerized adaptive tests: Journal of Educational Measurement Vol 21(4) Win 1984, 347-360.
Greenwood, D., & Taylor, C. (1965). Adaptive testing in an older population: Journal of Psychology: Interdisciplinary and Applied Vol 60(2) 1965, 193-198.
Griffeth, R. W., Gaertner, S., & Sager, J. K. (1999). Taxonomic model of withdrawal behaviors: The adaptive response model: Human Resource Management Review Vol 9(4) Win 1999, 577-590.
Grodenchik, D. J. (2002). The implications of the use of non-optimal items in a Computer Adaptive Testing (CAT) environment. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Gu, L. (2007). Designing optimal item pools for Computerized Adaptive Tests with exposure controls. Dissertation Abstracts International Section A: Humanities and Social Sciences.
Guthke, J., & Beckmann, J. F. (2003). Dynamic assessment with diagnostic programs. Washington, DC: American Psychological Association.
Guthke, J., Rader, E., Caruso, M., & Schmidt, K.-D. (1991). Development of an adaptive computer-assisted learning test based on structural information theory: Diagnostica Vol 37(1) 1991, 1-28.
Guzman, E., Conejo, R., & Perez-de-la-Cruz, J.-L. (2007). Adaptive testing for hierarchical student models: User Modeling and User-Adapted Interaction Vol 17(1-2) Mar 2007, 119-157.
Halkitis, P. N. (1995). An examination of the precision of measurement of computerized adaptive tests with limited item pools. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Hambleton, R. K., & Xing, D. (2006). Optimal and Nonoptimal Computer-Based Test Designs for Making Pass-Fail Decisions: Applied Measurement in Education Vol 19(3) 2006, 221-239.
Hambleton, R. K., Zaal, J. N., & Pieters, J. P. M. (1991). Computerized adaptive testing: Theory, applications, and standards. New York, NY: Kluwer Academic/Plenum Publishers.
Handel, R. W., Ben-Porath, Y. S., & Watt, M. (1999). Computerized adaptive assessment with the MMPI-2 in a clinical setting: Psychological Assessment Vol 11(3) Sep 1999, 369-380.
Hankins, J. A. (1987). The effects of variable entry on bias and information of the Bayesian adaptive testing procedure: Dissertation Abstracts International.
Hankins, J. A. (1990). The effects of variable entry for a Bayesian adaptive test: Educational and Psychological Measurement Vol 50(4) Win 1990, 785-802.
Hansen, E. G. (1989). Validation of a computerized adaptive test of secondary school biological science: Dissertation Abstracts International.
Hau, K.-T., & Chang, H.-H. (2001). Item selection in computerized adaptive testing: Should more discriminating items be used first? : Journal of Educational Measurement Vol 38(3) Fal 2001, 249-266.
Hausler, J. (2006). Adaptive success control in computerized adaptive testing: Psychology Science Vol 48(4) 2006, 436-450.
Hendrickson, A. (2007). An NCME instructional module on multistage testing: Educational Measurement: Issues and Practice Vol 26(2) Sum 2007, 44-52.
Henly, S. J., Klebe, K. J., McBride, J. R., & Cudeck, R. (1989). Adaptive and conventional versions of the DAT: The first complete test battery comparison: Applied Psychological Measurement Vol 13(4) Dec 1989, 363-371.
Hetter, R. D., Segall, D. O., & Bloxom, B. M. (1994). A comparison of item calibration media in computerized adaptive testing: Applied Psychological Measurement Vol 18(3) Sep 1994, 197-204.
Hetter, R. D., Segall, D. O., & Bloxom, B. M. (1997). Evaluating item calibration medium in computerized adaptive testing. Washington, DC: American Psychological Association.
Hetter, R. D., & Sympson, J. B. (1997). Item exposure control in CAT-ASVAB. Washington, DC: American Psychological Association.
Hockemeyer, C. (2002). A comparison of non-deterministic procedures for the adaptive assessment of knowledge: Psychologische Beitrage Vol 44(4) 2002, 495-503.
Hol, A. M., Vorst, H. C. M., & Mellenbergh, G. J. (2001). Application of a computerised adaptive test procedure on personality data: Nederlands Tijdschrift voor de Psychologie en haar Grensgebieden Vol 56(3) Jun 2001, 119-133.
Hol, A. M., Vorst, H. C. M., & Mellenbergh, G. J. (2007). Computerized adaptive testing for polytomous motivation items: Administration mode effects and a comparison with short forms: Applied Psychological Measurement Vol 31(5) Sep 2007, 412-429.
Hontangas, P., Olea, J., Ponsoda, V., Revuelta, J., & Wise, S. L. (2004). Assisted Self-Adapted Testing: A Comparative Study: European Journal of Psychological Assessment Vol 20(1) 2004, 2-9.
Hornke, L. F. (1993). Potential gains of computer-assisted adaptive testing: Diagnostica Vol 39(2) 1993, 109-119.
Hornke, L. F. (1997). Investigating item response times in computerized adaptive testing: Diagnostica Vol 43(1) 1997, 27-39.
Hornke, L. F. (1999). Benefits from computerized adaptive testing as seen in simulation studies: European Journal of Psychological Assessment Vol 15(2) 1999, 91-98.
Hornke, L. F. (2000). Item response times in computerized adaptive testing: Psicologica Vol 21(1-2) 2000, 175-189.
Hornke, L. F. (2000). Optimization of adaptive testing: Psychologische Beitrage Vol 42(4) 2000, 634-644.
Hornke, L. F. (2001). Number of items in adaptive testing: Zeitschrift fur Differentielle und Diagnostische Psychologie Vol 22(3) 2001, 185-193.
Hsu, T.-c., & Shermis, M. D. (1989). The development and evaluation of a microcomputerized adaptive placement testing system for college mathematics: Journal of Educational Computing Research Vol 5(4) 1989, 473-485.
Hutt, M. L. (1947). A clinical study of "consecutive" and "adaptive" testing with the revised Stanford-Binet: Journal of Consulting Psychology Vol 11(2) Mar 1947, 93-103.
Jacobs-Cassuto, M. S. (2005). A comparison of adaptive mastery testing using testlets with the -parameter logistic model. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Jelinek, M., Kveton, P., & Denglerova, D. (2006). Adaptive testing--Basic concepts and principles: Ceskoslovenska Psychologie Vol 50(2) 2006, 163-173.
Jeng, H.-L. (1993). The effect of adaptive testing item selection methods on the precision of ability estimation and the examinee's motivation: Dissertation Abstracts International.
Jodoin, M. G. (2003). Psychometric properties of several computer-based test designs with ideal and constrained item pools. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Jun, Z., Dongming, O., Shuyuan, X., Haiqi, D., & Shuqing, Q. (2005). Item Characteristic Curve Equating under Graded Response Models in IRT: Acta Psychologica Sinica Vol 37(6) Nov 2005, 832-838.
Kaernbach, C. (1990). A single-interval adjustment-matrix (SIAM) procedure for unbiased adaptive testing: Journal of the Acoustical Society of America Vol 88(6) Dec 1990, 2645-2655.
Kalyuga, S., & Sweller, J. (2005). Rapid Dynamic Assessment of Expertise to Improve the Efficiency of Adaptive E-learning: Educational Technology Research and Development Vol 53(3) 2005, 83-93.
Kim, H.-O. (1994). Monte Carlo simulation comparison of two-stage testing and computerized adaptive testing. Dissertation Abstracts International Section A: Humanities and Social Sciences.
Kingsbury, G. G. (1985). Adaptive self-referenced testing as a procedure for the measurement of individual change due to instruction: A comparison of the reliabilities of change estimates obtained from conventional and adaptive testing procedures: Dissertation Abstracts International.
Kingsbury, G. G. (1990). Adapting adaptive testing: Using the MicroCAT testing system in a local school district: Educational Measurement: Issues and Practice Vol 9(2) Sum 1990, 3-6, 29.
Kingsbury, G. G. (1992). Issues in the Development of Computerized Adaptive Tests: PsycCRITIQUES Vol 37 (6), Jun, 1992.
Kingsbury, G. G., & Houser, R. L. (1993). Assessing the utility of item response models: Computerized adaptive testing: Educational Measurement: Issues and Practice Vol 12(1) Spr 1993, 21-27, 39.
Kingsbury, G. G., & Houser, R. L. (1999). Developing computerized adaptive tests for school children. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Kirisci, L. (1990). A predictive analysis approach to adaptive testing: Dissertation Abstracts International.
Klieger, D. M. (1990). Flexible testing without programming: Behavior Research Methods, Instruments & Computers Vol 22(2) Apr 1990, 138-141.
Kontsevich, L. L., & Tyler, C. W. (1999). Bayesian adaptive estimation of psychometric slope and threshold: Vision Research Vol 39(16) Aug 1999, 2729-2737.
Kubinger, K. D. (1986). Adaptive intelligence testing: Diagnostica Vol 32(4) 1986, 330-344.
Laatsch, L., & Choca, J. (1994). Cluster-branching methodology for adaptive testing and the development of the Adaptive Category Test: Psychological Assessment Vol 6(4) Dec 1994, 345-351.
Laatsch, L. K. (1992). The use of item analysis and adaptive testing to revise the Halstead Category Test: Dissertation Abstracts International.
Lairson, D. R., Newmark, G. R., Rakowski, W., Tiro, J. A., & Vernon, S. W. (2004). Development costs of a computer-generated tailored intervention: Evaluation and Program Planning Vol 27(2) May 2004, 161-169.
Laurier, M. (2004). Assessment and multimedia in learning an L2: ReCALL: Journal of Eurocall Vol 16(2) Nov 2004, 475-487.
Legree, P. J., Fischl, M. A., Gade, P. A., & Wilson, M. (1998). Testing word knowledge by telephone to estimate general cognitive aptitude using an adaptive test: Intelligence Vol 26(2) 1998, 91-98.
Lei, P.-W., Chen, S.-Y., & Yu, L. (2006). Comparing methods of assessing differential item functioning in a computerized adaptive testing environment: Journal of Educational Measurement Vol 43(3) Sep 2006, 245-264.
Leung, C.-K., Chang, H.-H., & Hau, K.-T. (2002). Item selection in computerized adaptive testing: Improving the a-stratified design with the Sympson-Hetter algorithm: Applied Psychological Measurement Vol 26(4) Dec 2002, 376-392.
Leung, C.-K., Chang, H.-H., & Hau, K.-T. (2003). Incorporation of content balancing requirements in stratification designs for computerized adaptive testing: Educational and Psychological Measurement Vol 63(2) Apr 2003, 257-270.
Leung, C.-K., Chang, H.-H., & Hau, K.-T. (2005). Computerized adaptive testing: A mixture item selection approach for constrained situations: British Journal of Mathematical and Statistical Psychology Vol 58(2) Nov 2005, 239-257.
Leutner, D., & Schumacher, G. (1990). The effects of different on-line adaptive response time limits on speed and amount of learning in computer assisted instruction and intelligent tutoring: Computers in Human Behavior Vol 6(1) 1990, 17-29.
Li, Y. H., & Schafer, W. D. (2005). Increasing the Homogeneity of CAT's Item-Exposure Rates by Minimizing or Maximizing Varied Target Functions While Assembling Shadow Tests: Journal of Educational Measurement Vol 42(3) Fal 2005, 245-269.
Li, Y. H., & Schafer, W. D. (2005). Trait Parameter Recovery Using Multidimensional Computerized Adaptive Testing in Reading and Mathematics: Applied Psychological Measurement Vol 29(1) Jan 2005, 3-25.
Lin, S.-h. (1987). Studies on latent trait theory and its application in adaptive testing: Bulletin of Educational Psychology Vol 20 May 1987, 131-182.
Liu, Q., Otter, T., & Allenby, G. M. (2007). Investigating endogeneity bias in marketing: Marketing Science Vol 26(5) Sep-Oct 2007, 642-650.
Luecht, R., Brumfield, T., & Breithaupt, K. (2006). A Testlet Assembly Design for Adaptive Multistage Tests: Applied Measurement in Education Vol 19(3) 2006, 189-202.
Luecht, R. M. (1996). Multidimensional computerized adaptive testing in a certification or licensure context: Applied Psychological Measurement Vol 20(4) Dec 1996, 389-404.
Luecht, R. M., Champlain, A. D., & Nungester, R. J. (1998). Maintaining content validity in computerized adaptive testing: Advances in Health Sciences Education Vol 3(1) 1998, 29-41.
Luecht, R. M., & Nungester, R. J. (1998). Some practical examples of computer-adaptive sequential testing: Journal of Educational Measurement Vol 35(3) Fal 1998, 229-249.
Lunz, M. E., & Bergstrom, B. A. (1994). An empirical study of computerized adaptive test administration conditions: Journal of Educational Measurement Vol 31(3) Fal 1994, 251-263.
Lunz, M. E., Bergstrom, B. A., & Wright, B. D. (1992). The effect of review on student ability and test efficiency for computerized adaptive tests: Applied Psychological Measurement Vol 16(1) Mar 1992, 33-40.
Macdonald, P. L. (2003). Computer-adaptive test for measuring personality factors using item response theory. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Macready, G. B., & Dayton, C. M. (1992). The application of latent class models in adaptive testing: Psychometrika Vol 57(1) Mar 1992, 71-88.
Marks, M. R. (1953). One- and two-tailed tests: Psychological Review Vol 60(3) May 1953, 207-208.
Martin, C. J., & Hoshaw, C. R. (1997). Policy and program management perspective. Washington, DC: American Psychological Association.
May, K. O. R. (1993). Measuring change conventionally and adaptively: Dissertation Abstracts International.
McBride, J. R. (1997). Dissemination of CAT-ASVAB technology. Washington, DC: American Psychological Association.
McBride, J. R. (1997). The Marine Corps Exploratory Development Project: 1977-1982. Washington, DC: American Psychological Association.
McBride, J. R. (1997). Research antecedents of applied adaptive testing. Washington, DC: American Psychological Association.
McBride, J. R. (1997). Technical perspective. Washington, DC: American Psychological Association.
McBride, J. R., Wetzel, C. D., & Hetter, R. D. (1997). Preliminary psychometric research for CAT-ASVAB: Selecting an adaptive testing strategy. Washington, DC: American Psychological Association.
McLeod, L., Lewis, C., & Thissen, D. (2003). A Bayesian method for the detection of item preknowledge in computerized adaptive testing: Applied Psychological Measurement Vol 27(2) Mar 2003, 121-137.
McLeod, L. D. (1999). Alternative methods for the detection of item preknowledge in computerized adaptive testing. Dissertation Abstracts International: Section B: The Sciences and Engineering.
McLeod, L. D., & Lewis, C. (1999). Detecting item memorization in the CAT environment: Applied Psychological Measurement Vol 23(2) Jun 1999, 147-160.
Mead, A. D. (2006). An Introduction to Multistage Testing: Applied Measurement in Education Vol 19(3) 2006, 185-187.
Meijer, R. R. (2002). Outlier detection in high-stakes certification testing: Journal of Educational Measurement Vol 39(3) Fal 2002, 219-233.
Meijer, R. R., & Gregoire, J. (2001). New developments in the area of computerized testing: Psychologie Francaise Vol 46(3) Sep 2001, 221-230.
Meijer, R. R., & Nering, M. L. (1999). Computerized adaptive testing: Overview and introduction: Applied Psychological Measurement Vol 23(3) Sep 1999, 187-194.
Miceli, R., & Molinengo, G. (2005). Administration of computerized and adaptive tests: An application of the Rasch Model: Testing Psicometria Metodologia Vol 12(3) 2005, 131-149.
Miettinen, M., Nokelainen, P., Kurhila, J., Silander, T., & Tirri, H. (2005). EDUFORM - A Tool for Creating Adaptive Questionnaires: International Journal on E-Learning Vol 4(3) Jul-Sep 2005, 365-373.
Mills, C. N. (1999). Development and introduction of a computer adaptive Graduate Record Examinations General Test. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Mills, C. N., & Stocking, M. (1996). Practical issues in large-scale computerized adaptive testing: Applied Measurement in Education Vol 9(4) 1996, 287-304.
Mislevy, R. J., & Chang, H.-H. (2000). Does adaptive testing violate local independence? : Psychometrika Vol 65(2) Jun 2000, 149-156.
Moreno, K. E. (1997). CAT-ASVAB operational test and evaluation. Washington, DC: American Psychological Association.
Moreno, K. E., & Segall, D. O. (1997). Reliability and construct validity of CAT-ASVAB. Washington, DC: American Psychological Association.
Moreno, K. E., Segall, D. O., & Hetter, R. D. (1997). The use of computerized adaptive testing in the military. Westport, CT: Greenwood Press/Greenwood Publishing Group.
Morey, L. C., Warner, M. B., Shea, M. T., Gunderson, J. G., Sanislow, C. A., Grilo, C., et al. (2003). The Representation of Four Personality Disorders by the Schedule for Nonadaptive and Adaptive Personality Dimensional Model of Personality: Psychological Assessment Vol 15(3) Sep 2003, 326-332.
Moshinsky, A., & Kazin, C. (2005). Constructing a Computerized Adaptive Test for University Applicants With Disabilities: Applied Measurement in Education Vol 18(4) 2005, 381-405.
Nandakumar, R., & Roussos, L. (2004). Evaluation of the CATSIB DIF Procedure in a Pretest Setting: Journal of Educational and Behavioral Statistics Vol 29(2) Sum 2004, 177-199.
Nering, M. L. (1997). The distribution of indexes of person fit within the computerized adaptive testing environment: Applied Psychological Measurement Vol 21(2) Jun 1997, 115-127.
Newman, L. S. (1995). Content validity of a computerized adaptive licensing and certification examination: A comparison of content-balancing methods. Dissertation Abstracts International Section A: Humanities and Social Sciences.
Nicewander, W. A., & Thomasson, G. L. (1999). Some reliability estimates for computerized adaptive tests: Applied Psychological Measurement Vol 23(3) Sep 1999, 239-247.
Oakland, T., & Hatzichristou, C. (2003). Issues to consider when adapting tests: Psychology: The Journal of the Hellenic Psychological Society Vol 10(4) Dec 2003, 437-448.
Olea, J., Ponsoda, V., Revuelta, J., & Belchi, J. (1996). Psychometric properties of a computerized adaptive test for the measurement of English vocabulary: Estudios de Psicologia No 55 1996, 61-73.
Olea, J., Revuelta, J., Ximenez, M. C., & Abad, F. J. (2000). Psychometric and psychological effects of review on computerized fixed and adaptive tests: Psicologica Vol 21(1-2) 2000, 157-173.
O'Neill, T., Lunz, M. E., & Thiede, K. (2000). The impact of receiving the same items on consecutive computer adaptive test administrations: Journal of Applied Measurement Vol 1(2) 2000, 131-151.
Papanastasiou, E. C. (2005). Item Review and the Rearrangement Procedure: Its process and its results: Educational Research and Evaluation Vol 11(4) Aug 2005, 303-321.
Park, H.-S., Pearson, P. D., & Reckase, M. D. (2005). Assessing the Effect of Cohort, Gender and Race on Differential Item Functioning (DIF) In an Adaptive Test Designed for Multi-Age Groups: Reading Psychology Vol 26(1) Jan-Mar 2005, 81-101.
Passos, V. L., Berger, M. P. F., & Tan, F. E. (2007). Test design optimization in CAT early stage with the nominal response model: Applied Psychological Measurement Vol 31(3) May 2007, 213-232.
Pastor, D. A., Dodd, B. G., & Chang, H.-H. (2002). A comparison of item selection techniques and exposure control mechanisms in CATs using the generalized partial credit model: Applied Psychological Measurement Vol 26(2) Jun 2002, 147-163.
Patsula, L. N. (2000). A comparison of computerized adaptive testing and multistage testing. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Penfield, R. D. (2006). Applying Bayesian Item Selection Approaches to Adaptive Tests Using Polytomous Items: Applied Measurement in Education Vol 19(1) 2006, 1-20.
Penfield, R. D. (2007). Estimating the standard error of the maximum likelihood ability estimator in adaptive testing using the posterior-weighted test information function: Educational and Psychological Measurement Vol 67(6) Dec 2007, 958-975.
Petersen, M. A., Groenvold, M., Aaronson, N., Fayers, P., Sprangers, M., & Bjorner, J. B. (2006). Multidimensional computerized adaptive testing of the EORTC QLQ-C30: Basic developments and evaluations: Quality of Life Research: An International Journal of Quality of Life Aspects of Treatment, Care & Rehabilitation Vol 15(3) Apr 2006, 315-329.
Ping, C., Shuliang, D., Haijing, L., & Jie, Z. (2006). Item Selection Strategies of Computerized Adaptive Testing based on Graded Response Model: Acta Psychologica Sinica Vol 38(3) May 2006, 461-467.
Pitkin, A. K., & Vispoel, W. P. (2001). Differences between self-adapted and computerized adaptive tests: A meta-analysis: Journal of Educational Measurement Vol 38(3) Fal 2001, 235-247.
Piton-Gonçalves, J., & Aluísio, S. M. (2015). Multidimensional Computer Adaptive test with educational purposes: principles and methods: Ensaio: Avaliação e Políticas Públicas em Educação Vol 23(87) 2015, 389-414. Retrieved September 04, 2015, from http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0104-40362015000200389&lng=en&tlng=pt. doi:10.1590/S0104-40362015000100016.
Piton-Gonçalves, J., Aluísio, S. M., Oliveira, L. H. M., & Oliveira, O. N. (2004). A Learning Environment for English for Academic Purposes Based on Adaptive Tests and Task-Based Systems: Proceedings of the 7th International Conference on Intelligent Tutoring Systems, Lecture Notes in Computer Science Vol 3220 2004, 1-11.
Piton-Gonçalves, J., & Aluísio, S. M. (2012). An architecture for multidimensional computer adaptive test with educational purposes: Proceedings of the 18th Brazilian Symposium on Multimedia and the Web (WebMedia '12). New York: ACM Press, 17-24.
Piton-Gonçalves, J. (2012). Desafios e perspectivas da implementação computacional de testes adaptativos multidimensionais para avaliações educacionais [Challenges and perspectives of the computational implementation of multidimensional adaptive tests for educational assessment]. Ph.D. thesis (in Portuguese). Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, São Carlos.
Plake, B. S. (1993). Applications of educational measurement: Is optimum optimal? : Educational Measurement: Issues and Practice Vol 12(1) Spr 1993, 5-10.
Plew, G. T. (1990). An empirical investigation of major adaptive testing methodologies and an expert systems approach: Dissertation Abstracts International.
Ponsoda, V. (2000). Overview of the computerized adaptive testing special section: Psicologica Vol 21(1-2) 2000, 115-120.
Ponsoda, V., Olea, J., Rodriguez, M. S., & Revuelta, J. (1999). The effects of test difficulty manipulation in computerized adaptive testing and self-adapted testing: Applied Measurement in Education Vol 12(2) 1999, 167-184.
Ponsoda, V., Wise, S. L., Olea, J., & Revuelta, J. (1997). An investigation of self-adapted testing in a Spanish high school population: Educational and Psychological Measurement Vol 57(2) Apr 1997, 210-221.
Potenza, M. T. (1995). The exploration of an alternative method for scoring computer adaptive tests. Dissertation Abstracts International Section A: Humanities and Social Sciences.
Potenza, M. T., & Stocking, M. L. (1997). Flawed items in computerized adaptive testing: Journal of Educational Measurement Vol 34(1) Spr 1997, 79-96.
Rafacz, B., & Hetter, R. D. (1997). ACAP hardware selection, software development, and acceptance testing. Washington, DC: American Psychological Association.
Raiche, G., & Blais, J.-G. (2006). SIMCAT 1.0: A SAS Computer Program for Simulating Computer Adaptive Testing: Applied Psychological Measurement Vol 30(1) Jan 2006, 60-61.
Rammsayer, T., & Brandler, S. (2003). Timing behavior in computerized adaptive testing: Response times for correct and incorrect answers are not related to general fluid intelligence: Zeitschrift fur Differentielle und Diagnostische Psychologie Vol 24(1) 2003, 57-63.
Reckase, M. D. (1989). Adaptive testing: The evolution of a good idea: Educational Measurement: Issues and Practice Vol 8(3) Fal 1989, 11-15.
Reise, S. P., & Henson, J. M. (2000). Computerization and adaptive administration of the NEO PI-R: Assessment Vol 7(4) Dec 2000, 347-364.
Revicki, D. A., & Cella, D. F. (1997). Health status assessment for the twenty-first century: Item response theory, item banking and computer adaptive testing: Quality of Life Research: An International Journal of Quality of Life Aspects of Treatment, Care & Rehabilitation Vol 6(6) Aug 1997, 595-600.
Revuelta, J. (2004). Estimating Ability and Item-Selection Strategy in Self-Adapted Testing: A Latent Class Approach: Journal of Educational and Behavioral Statistics Vol 29(4) Win 2004, 379-396.
Rizavi, S. M. (2002). The effect of test characteristics on aberrant response patterns in computer adaptive testing. Dissertation Abstracts International Section A: Humanities and Social Sciences.
Rocklin, T. (1997). Self-adapted testing: Improving performance by modifying tests instead of examinees: Anxiety, Stress & Coping: An International Journal Vol 10(1) 1997, 83-104.
Rocklin, T. R. (1994). Self-adapted testing: Applied Measurement in Education Vol 7(1) 1994, 3-14.
Rocklin, T. R., O'Donnell, A. M., & Holst, P. M. (1995). Effects and underlying mechanisms of self-adapted testing: Journal of Educational Psychology Vol 87(1) Mar 1995, 103-116.
Roos, L. L., Wise, S. L., & Plake, B. S. (1997). The role of item feedback in self-adapted testing: Educational and Psychological Measurement Vol 57(1) Feb 1997, 85-98.
Roos, L. L., Wise, S. L., Yoes, M. E., & Rocklin, T. R. (1996). Conducting self-adapted testing using MicroCAT: Educational and Psychological Measurement Vol 56(5) Oct 1996, 821-827.
Ruscio, J., & Ruscio, A. M. (2002). A structure-based approach to psychological measurement: Matching measurement models to latent structure: Assessment Vol 9(1) Mar 2002, 4-16.
Samejima, F. (1988). Comprehensive latent trait theory: Behaviormetrika Vol 24(24) Jul 1988, 1-24.
Sandknop, P. A., Schuster, J. W., Wolery, M., & Cross, D. P. (1992). The use of an adaptive device to teach students with moderate mental retardation to select lower priced grocery items: Education & Training in Mental Retardation Vol 27(3) Sep 1992, 219-229.
Sands, W. A., Gade, P. A., & Knapp, D. J. (1997). The Computerized Adaptive-Screening Test. Washington, DC: American Psychological Association.
Sands, W. A., & Waters, B. K. (1997). Introduction to ASVAB and CAT. Washington, DC: American Psychological Association.
Sands, W. A., Waters, B. K., & McBride, J. R. (1997). Computerized adaptive testing: From inquiry to operation. Washington, DC: American Psychological Association.
Schneider, R. J., Goff, M., Anderson, S., & Borman, W. C. (2003). Computerized adaptive rating scales for measuring managerial performance: International Journal of Selection and Assessment Vol 11(2-3) Jun-Sep 2003, 237-246.
Schnipke, D. L., & Green, B. F. (1995). A comparison of item selection routines in linear and adaptive tests: Journal of Educational Measurement Vol 32(3) Fal 1995, 227-242.
Schoonman, W. (1989). An applied study on computerized adaptive testing. Lisse, Netherlands: Swets & Zeitlinger Publishers.
Schreiber, M., Schneider, R., Schweizer, A., Beckmann, J. F., & Baltissen, R. (2000). Diagnostic programs in the early detection of dementia: The Adaptive Figure Series Learning Test (ADAFI): Zeitschrift fur Gerontopsychologie & -psychiatrie Vol 13(1) Mar 2000, 16-29.
Schwartz, C., Welch, G., Santiago-Kelley, P., Bode, R., & Sun, X. (2006). Computerized adaptive testing of diabetes impact: A feasibility study of Hispanics and non-Hispanics in an active clinic population: Quality of Life Research: An International Journal of Quality of Life Aspects of Treatment, Care & Rehabilitation Vol 15(9) Nov 2006, 1503-1518.
Segall, D. O. (1996). Multidimensional adaptive testing: Psychometrika Vol 61(2) Jun 1996, 331-354.
Segall, D. O. (1997). Equating the CAT-ASVAB. Washington, DC: American Psychological Association.
Segall, D. O. (1997). The psychometric comparability of computer hardware. Washington, DC: American Psychological Association.
Segall, D. O. (2001). General ability measurement: An application of multidimensional item response theory: Psychometrika Vol 66(1) Mar 2001, 79-97.
Segall, D. O. (2004). A Sharing Item Response Theory Model for Computerized Adaptive Testing: Journal of Educational and Behavioral Statistics Vol 29(4) Win 2004, 439-460.
Segall, D. O., & Moreno, K. E. (1997). Current and future challenges. Washington, DC: American Psychological Association.
Segall, D. O., & Moreno, K. E. (1999). Development of the Computerized Adaptive Testing version of the Armed Services Vocational Aptitude Battery. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Segall, D. O., Moreno, K. E., Bloxom, B. M., & Hetter, R. D. (1997). Psychometric procedures for administering CAT-ASVAB. Washington, DC: American Psychological Association.
Segall, D. O., Moreno, K. E., & Hetter, R. D. (1997). Item pool development and evaluation. Washington, DC: American Psychological Association.
Segall, D. O., Moreno, K. E., Kieckhaefer, W. F., Vicino, F. L., & McBride, J. R. (1997). Validation of the experimental CAT-ASVAB system. Washington, DC: American Psychological Association.
Shibayama, T., Noguchi, H., Shiba, S., & Kambara, M. (1987). An adaptive testing procedure for measuring verbal ability: Japanese Journal of Educational Psychology Vol 35(4) Dec 1987, 363-367.
Shourie, S., Conigrave, K. M., Proude, E. M., Ward, J. E., Wutzke, S. E., & Haber, P. S. (2006). The effectiveness of a tailored intervention for excessive alcohol consumption prior to elective surgery: Alcohol and Alcoholism Vol 41(6) Nov-Dec 2006, 643-649.
Simms, L. J. (2003). Development, reliability, and validity of a computerized adaptive version of the schedule for nonadaptive and adaptive personality. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Simms, L. J., & Clark, L. A. (2005). Validation of a Computerized Adaptive Version of the Schedule for Nonadaptive and Adaptive Personality (SNAP): Psychological Assessment Vol 17(1) Mar 2005, 28-43.
Singh, J., Howell, R. D., & Rhoads, G. K. (1990). Adaptive designs for Likert-type data: An approach for implementing marketing surveys: Journal of Marketing Research Vol 27(3) Aug 1990, 304-321.
Singh, J., Rhoads, G. K., & Howell, R. D. (1992). Adapting marketing surveys to individual respondents: Journal of the Market Research Society Vol 34(2) Apr 1992, 125-147.
Sireci, S. G., & Clauser, B. E. (2001). Practical issues in setting standards on computerized adaptive tests. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Stahl, J., Bergstrom, B., & Gershon, R. (2000). CAT administration of language placement examinations: Journal of Applied Measurement Vol 1(3) 2000, 292-302.
Stark, S., & Chernyshenko, O. S. (2006). Multistage Testing: Widely or Narrowly Applicable? : Applied Measurement in Education Vol 19(3) 2006, 257-260.
Steinberg, L., Thissen, D., & Wainer, H. (2000). Validity. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Stevens, R. F. (1985). An on-line version of the Personal Relations Index psychological test: International Journal of Man-Machine Studies Vol 23(5) Nov 1985, 563-585.
Stocking, M. L. (1987). Two simulated feasibility studies in computerised adaptive testing: Applied Psychology: An International Review Vol 36(3-4) Sep 1987, 263-277.
Stocking, M. L. (1996). An alternative method for scoring adaptive tests: Journal of Educational and Behavioral Statistics Vol 21(4) Win 1996, 365-389.
Stocking, M. L. (1997). Revising item responses in computerized adaptive tests: A comparison of three models: Applied Psychological Measurement Vol 21(2) Jun 1997, 129-142.
Stocking, M. L., & Lewis, C. (1998). Controlling item exposure conditional on ability in computerized adaptive testing: Journal of Educational and Behavioral Statistics Vol 23(1) Spr 1998, 57-75.
Stocking, M. L., & Swanson, L. (1998). Optimal design of item banks for computerized adaptive tests: Applied Psychological Measurement Vol 22(3) Sep 1998, 271-279.
Stocking, M. L., Ward, W. C., & Potenza, M. T. (1998). Simulating the use of disclosed items in computerized adaptive testing: Journal of Educational Measurement Vol 35(1) Spr 1998, 48-68.
Stone, G. E., & Lunz, M. E. (1994). The effect of review on the psychometric characteristics of computerized adaptive tests: Applied Measurement in Education Vol 7(3) 1994, 211-222.
Styles, I., & Andrich, D. (1993). Linking the standard and advanced forms of the Raven's Progressive Matrices in both the pencil-and-paper and computer-adaptive-testing formats: Educational and Psychological Measurement Vol 53(4) Win 1993, 905-925.
Taira, N., Takei, S., & Ogino, M. (1989). Measurement of infants' and toddlers' language comprehension by an adaptive questionnaire: Japanese Journal of Educational Psychology Vol 37(4) Dec 1989, 392-399.
Tatsuoka, K., & Birenbaum, M. (1981). Effects of instructional backgrounds on test performances: Journal of Computer-Based Instruction Vol 8(1) Aug 1981, 1-8.
Tatsuoka, K. K., & Tatsuoka, M. M. (1997). Computerized cognitive diagnostic adaptive testing: Effect on remedial instruction as empirical validation: Journal of Educational Measurement Vol 34(1) Spr 1997, 3-20.
Theunissen, T. J. (1986). Some applications of optimization algorithms in test design and adaptive testing: Applied Psychological Measurement Vol 10(4) Dec 1986, 381-389.
Thissen, D. (2000). Reliability and measurement precision. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Thomas, T. J. (1990). Item-presentation controls for multidimensional item pools in computerized adaptive testing: Behavior Research Methods, Instruments & Computers Vol 22(2) Apr 1990, 247-252.
Thomas, T. J. (1991). Computer-based adaptive mastery testing in multiple content areas: New procedures for mastery testing: Dissertation Abstracts International.
Thompson, N. A. (2007). A comparison of two methods of polytomous computerized classification testing for multiple cutscores. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Tonidandel, S. (2002). Computer adaptive testing: The impact of test characteristics on perceived performance and test takers' reactions. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Tonidandel, S., & Quinones, M. (2000). Psychological reactions to adaptive testing: International Journal of Selection and Assessment Vol 8(1) Mar 2000, 7-15.
Trumbly, J. E., Arnett, K. P., & Johnson, P. C. (1994). Productivity gains via an adaptive user interface: An empirical analysis: International Journal of Human-Computer Studies Vol 40(1) Jan 1994, 63-81.
Tseng, F.-L. (2001). Multidimensional adaptive testing using the weighted likelihood estimation. Dissertation Abstracts International Section A: Humanities and Social Sciences.
Unpingco, V., Hom, I., & Rafacz, B. (1997). Development of a system for nationwide implementation. Washington, DC: American Psychological Association.
van der Linden, W. (1999). Empirical initialization of the trait estimator in adaptive testing: Applied Psychological Measurement Vol 23(1) Mar 1999, 21-29.
van der Linden, W. (1999). "Empirical initialization of the trait estimator in adaptive testing": Errata: Applied Psychological Measurement Vol 23(3) Sep 1999, 248.
van der Linden, W. J. (1995). Advances in computer applications. New York, NY: Kluwer Academic/Plenum Publishers.
van der Linden, W. J. (1998). Bayesian item selection criteria for adaptive testing: Psychometrika Vol 63(2) Jun 1998, 201-216.
van der Linden, W. J. (1999). Multidimensional adaptive testing with a minimum error-variance criterion: Journal of Educational and Behavioral Statistics Vol 24(4) Win 1999, 398-412.
van der Linden, W. J. (2003). Some Alternatives to Sympson-Hetter Item-Exposure Control in Computerized Adaptive Testing: Journal of Educational and Behavioral Statistics Vol 28(3) Fal 2003, 249-265.
van der Linden, W. J. (2005). A Comparison of Item-Selection Methods for Adaptive Tests with Content Constraints: Journal of Educational Measurement Vol 42(3) Fal 2005, 283-302.
van der Linden, W. J. (2006). Model-Based Innovations in Computer-Based Testing. New York, NY: John Wiley & Sons Ltd.
van der Linden, W. J., Ariel, A., & Veldkamp, B. P. (2006). Assembling a computerized adaptive testing item pool as a set of linear tests: Journal of Educational and Behavioral Statistics Vol 31(1) 2006, 81-99.
van der Linden, W. J., & Chang, H.-H. (2003). Implementing content constraints in alpha-stratified adaptive testing using a shadow test approach: Applied Psychological Measurement Vol 27(2) Mar 2003, 107-120.
van der Linden, W. J., & Glas, C. A. W. (2000). Capitalization on item calibration error in adaptive testing: Applied Measurement in Education Vol 13(1) 2000, 35-53.
van der Linden, W. J., & Reese, L. M. (1998). A model for optimal constrained adaptive testing: Applied Psychological Measurement Vol 22(3) Sep 1998, 259-270.
van der Linden, W. J., Scrams, D. J., & Schnipke, D. L. (1999). Using response-time constraints to control for differential speededness in computerized adaptive testing: Applied Psychological Measurement Vol 23(3) Sep 1999, 195-210.
van der Linden, W. J., & van Krimpen-Stoop, E. M. L. A. (2003). Using response times to detect aberrant responses in computerized adaptive testing: Psychometrika Vol 68(2) Jun 2003, 251-265.
van der Linden, W. J., & Veldkamp, B. P. (2004). Constraining Item Exposure in Computerized Adaptive Testing With Shadow Tests: Journal of Educational and Behavioral Statistics Vol 29(3) Fal 2004, 273-291.
van Krimpen-Stoop, E. M. L. A., & Meijer, R. R. (1999). The null distribution of person-fit statistics for conventional and adaptive tests: Applied Psychological Measurement Vol 23(4) Dec 1999, 327-345.
van Krimpen-Stoop, E. M. L. A., & Meijer, R. R. (2001). CUSUM-based person-fit statistics for adaptive testing: Journal of Educational and Behavioral Statistics Vol 26(2) Sum 2001, 199-218.
van Krimpen-Stoop, E. M. L. A., & Meijer, R. R. (2002). Detection of person misfit in computerized adaptive tests with polytomous items: Applied Psychological Measurement Vol 26(2) Jun 2002, 164-180.
van Rijn, P. W., Eggen, T. J. H. M., Hemker, B. T., & Sanders, P. F. (2002). Evaluation of selection procedures for computerized adaptive testing with polytomous items: Applied Psychological Measurement Vol 26(4) Dec 2002, 393-411.
Vas, R. (2007). Educational ontology and knowledge testing: Electronic Journal of Knowledge Management Vol 5(1) Feb 2007, 123-130.
Veerkamp, W. J. J. (2000). Taylor approximations to logistic IRT models and their use in adaptive testing: Journal of Educational and Behavioral Statistics Vol 25(3) Fal 2000, 307-343.
Veerkamp, W. J. J., & Berger, M. P. F. (1997). Some new item selection criteria for adaptive testing: Journal of Educational and Behavioral Statistics Vol 22(2) Sum 1997, 203-226.
Veerkamp, W. J. J., & Berger, M. P. F. (1999). Optimal item discrimination and maximum information for logistic IRT models: Applied Psychological Measurement Vol 23(1) Mar 1999, 31-40.
Veerkamp, W. J. J., & Glas, C. A. W. (2000). Detection of known items in adaptive testing with a statistical quality control method: Journal of Educational and Behavioral Statistics Vol 25(4) Win 2000, 373-389.
Veldkamp, B. P., & van der Linden, W. J. (2002). Multidimensional adaptive testing with constraints on test content: Psychometrika Vol 67(4) Dec 2002, 575-588.
Vicino, F. L., & Moreno, K. E. (1997). Human factors in the CAT system: A pilot study. Washington, DC: American Psychological Association.
Vispoel, W. P. (1988). An adaptive test of musical memory: An application of item response theory to the assessment of musical ability: Dissertation Abstracts International.
Vispoel, W. P. (1992). Improving the measurement of tonal memory with computerized adaptive tests: Psychomusicology Vol 11(1) Spr 1992, 27-43.
Vispoel, W. P. (1998). Reviewing and changing answers on computer-adaptive and self-adaptive vocabulary tests: Journal of Educational Measurement Vol 35(4) Win 1998, 328-345.
Vispoel, W. P. (1999). Creating computerized adaptive tests of music aptitude: Problems, solutions, and future directions. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Vispoel, W. P., Clough, S. J., & Bleiler, T. (2005). A closer look at using judgments of item difficulty to change answers on computerized adaptive tests: Journal of Educational Measurement Vol 42(4) Win 2005, 331-350.
Vispoel, W. P., Clough, S. J., Bleiler, T., Hendrickson, A. B., & Ihrig, D. (2002). Can examinees use judgments of item difficulty to improve estimates on computerized adaptive vocabulary tests: Journal of Educational Measurement Vol 39(4) Win 2002, 311-330.
Vispoel, W. P., & Coffman, D. D. (1994). Computerized-adaptive and self-adapted music-listening tests: Psychometric features and motivational benefits: Applied Measurement in Education Vol 7(1) 1994, 25-51.
Vispoel, W. P., Rocklin, T. R., & Wang, T. (1994). Individual differences and test administration procedures: A comparison of fixed-item, computerized-adaptive, self-adapted testing: Applied Measurement in Education Vol 7(1) 1994, 53-79.
Vispoel, W. P., Rocklin, T. R., Wang, T., & Bleiler, T. (1999). Can examinees use a review option to obtain positively biased ability estimates on a computerized adaptive test? : Journal of Educational Measurement Vol 36(2) Sum 1999, 141-157.
Vispoel, W. P., Wang, T., & Bleiler, T. (1997). Computerized adaptive and fixed-item testing of music listening skill: A comparison of efficiency, precision, and concurrent validity: Journal of Educational Measurement Vol 34(1) Spr 1997, 43-63.
Vos, H. J. (2000). A Bayesian procedure in the context of sequential mastery testing: Psicologica Vol 21(1-2) 2000, 191-211.
Wainer, H. (1992). "A Harmless Necessary CAT." PsycCRITIQUES Vol 37 (2), Feb, 1992.
Wainer, H. (1993). Some practical considerations when converting a linearly administered test to an adaptive format: Educational Measurement: Issues and Practice Vol 12(1) Spr 1993, 15-20.
Wainer, H. (2000). CATs: Whither and whence: Psicologica Vol 21(1-2) 2000, 121-133.
Wainer, H. (2000). Computerized adaptive testing: A primer (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Wainer, H., Kaplan, B., & Lewis, C. (1992). A comparison of the performance of simulated hierarchical and linear testlets: Journal of Educational Measurement Vol 29(3) Fal 1992, 243-251.
Wainer, H., & Kiely, G. L. (1987). Item clusters and computerized adaptive testing: A case for testlets: Journal of Educational Measurement Vol 24(3) Fal 1987, 185-201.
Wainer, H., & Mislevy, R. J. (2000). Item response theory, item calibration, and proficiency estimation. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Walker, C. M., Beretvas, S. N., & Ackerman, T. (2001). An examination of conditioning variables used in computer adaptive testing for DIF analyses: Applied Measurement in Education Vol 14(1) 2001, 3-16.
Waller, N. G., & Reise, S. P. (1989). Computerized adaptive personality assessment: An illustration with the Absorption scale: Journal of Personality and Social Psychology Vol 57(6) Dec 1989, 1051-1058.
Walter, O. B., Becker, J., Fliege, H., Bjorner, J., Kosinski, M., Walter, M., et al. (2005). Developmental steps for a computer-adapted test for anxiety: Diagnostica Vol 51(2) 2005, 88-100.
Wang, K. (1997). Computerized adaptive testing: A comparison of item response theoretic approach and expert systems approaches in polychotomous grading. Dissertation Abstracts International Section A: Humanities and Social Sciences.
Wang, L., & Li, C.-S. (2001). Polytomous modeling of cognitive errors in computer adaptive testing: Journal of Applied Measurement Vol 2(4) 2001, 356-378.
Wang, S. (2000). The accuracy of ability estimation methods for computerized adaptive testing using the generalized partial credit model. Dissertation Abstracts International Section A: Humanities and Social Sciences.
Wang, S., & Wang, T. (2002). Precision of Warm's weighted likelihood estimation of ability for a polytomous model in CAT. Hauppauge, NY: Nova Science Publishers.
Wang, T., Hanson, B. A., & Lau, C.-M. A. (1999). Reducing bias in CAT trait estimation: A comparison of approaches: Applied Psychological Measurement Vol 23(3) Sep 1999, 263-278.
Wang, T., & Kolen, M. J. (2001). Evaluating comparability in computerized adaptive testing: Issues, criteria and an example: Journal of Educational Measurement Vol 38(1) Spr 2001, 19-49.
Wang, T., & Vispoel, W. P. (1998). Properties of ability estimation methods in computerized adaptive testing: Journal of Educational Measurement Vol 35(2) Sum 1998, 109-135.
Wang, W.-C., & Chen, P.-H. (2004). Implementation and measurement efficiency of multidimensional computerized adaptive testing: Applied Psychological Measurement Vol 28(5) Sep 2004, 295-316.
Ware, J. E., Jr., Gandek, B., Sinclair, S. J., & Bjorner, J. B. (2005). Item Response Theory and Computerized Adaptive Testing: Implications for Outcomes Measurement in Rehabilitation: Rehabilitation Psychology Vol 50(1) Feb 2005, 71-78.
Weiss, D. J. (1985). Adaptive testing by computer: Journal of Consulting and Clinical Psychology Vol 53(6) Dec 1985, 774-789.
Weiss, D. J. (1995). Improving individual differences measurement with item response theory and computerized adaptive testing. Palo Alto, CA: Davies-Black Publishing.
Weiss, D. J. (2004). Computerized Adaptive Testing for Effective and Efficient Measurement in Counseling and Education: Measurement and Evaluation in Counseling and Development Vol 37(2) Jul 2004, 70-84.
Weiss, D. J., & Kingsbury, G. G. (1984). Application of computerized adaptive testing to educational problems: Journal of Educational Measurement Vol 21(4) Win 1984, 361-375.
Weiss, D. J., & McBride, J. R. (1984). Bias and information of Bayesian adaptive testing: Applied Psychological Measurement Vol 8(3) Sum 1984, 273-285.
Weiss, D. J., & Vale, C. D. (1987). Adaptive testing: Applied Psychology: An International Review Vol 36(3-4) Sep 1987, 249-262.
Weissman, A. (2003). Assessing the efficiency of item selection in computerized adaptive testing. Dissertation Abstracts International Section A: Humanities and Social Sciences.
Weissman, A. (2006). A Feedback Control Strategy for Enhancing Item Selection Efficiency in Computerized Adaptive Testing: Applied Psychological Measurement Vol 30(2) Mar 2006, 84-99.
Weissman, A. (2007). Mutual Information Item Selection in Adaptive Classification Testing: Educational and Psychological Measurement Vol 67(1) Feb 2007, 41-58.
Welch, R. E., & Frick, T. W. (1993). Computerized adaptive testing in instructional settings: Educational Technology Research and Development Vol 41(3) 1993, 47-62.
Willse, J. T. (2002). Controlling computer adaptive testing's capitalization on chance errors in item parameter estimates. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Wise, L. L., Curran, L. T., & McBride, J. R. (1997). CAT-ASVAB cost and benefit analyses. Washington, DC: American Psychological Association.
Wise, S. L. (1994). Understanding self-adapted testing: The perceived control hypothesis: Applied Measurement in Education Vol 7(1) 1994, 15-24.
Wise, S. L., Finney, S. J., Enders, C. K., Freeman, S. A., & Severance, D. D. (1999). Examinee judgments of changes in item difficulty: Implications for item review in computerized adaptive testing: Applied Measurement in Education Vol 12(2) 1999, 185-198.
Wise, S. L., & Kingsbury, G. G. (2000). Practical issues in developing and maintaining a computerized adaptive testing program: Psicologica Vol 21(1-2) 2000, 135-155.
Wise, S. L., Plake, B. S., Johnson, P. L., & Roos, L. L. (1992). A comparison of self-adapted and computerized adaptive tests: Journal of Educational Measurement Vol 29(4) Win 1992, 329-339.
Wise, S. L., Roos, L. L., Plake, B. S., & Nebelsick-Gullett, L. J. (1994). The relationship between examinee anxiety and preference for self-adapted testing: Applied Measurement in Education Vol 7(1) 1994, 81-91.
Wiskoff, M. F. (1997). R&D laboratory management perspective. Washington, DC: American Psychological Association.
Wisniewski, D. R. (1986). An application of the Rasch model to computerized adaptive testing: The Binary Search Method: Dissertation Abstracts International.
Wolfe, J. H., Alderton, D. L., Larson, G. E., Bloxom, B. M., & Wise, L. L. (1997). Expanding the content of CAT-ASVAB: New tests and their validity. Washington, DC: American Psychological Association.
Wolfe, J. H., McBride, J. R., & Sympson, J. B. (1997). Development of the experimental CAT-ASVAB system. Washington, DC: American Psychological Association.
Wolfe, J. H., Moreno, K. E., & Segall, D. O. (1997). Evaluating the predictive validity of CAT-ASVAB. Washington, DC: American Psychological Association.
Xiao, B. (1993). Strategies for computerized adaptive testing: Golden section search, dichotomous search, and Z-score strategies: Dissertation Abstracts International.
Xiao, B. (1999). Strategies for computerized adaptive grading testing: Applied Psychological Measurement Vol 23(2) Jun 1999, 136-146.
Xu, X., & Douglas, J. (2006). Computerized adaptive testing under nonparametric IRT models: Psychometrika Vol 71(1) Mar 2006, 121-137.
Yan, D., Lewis, C., & Stocking, M. (2004). Adaptive Testing With Regression Trees in the Presence of Multidimensionality: Journal of Educational and Behavioral Statistics Vol 29(3) Fal 2004, 293-316.
Yi, Q., & Chang, H.-H. (2003). a-Stratified CAT design with content blocking: British Journal of Mathematical and Statistical Psychology Vol 56(2) Nov 2003, 359-378.
Yi, Q., Wang, T., & Ban, J.-C. (2001). Effects of scale transformation and test-termination rule on the precision of ability estimation in computerized adaptive testing: Journal of Educational Measurement Vol 38(3) Fal 2001, 267-292.
Yi, Q., Zhang, J., & Chang, H.-H. (2006). Assessing CAT Test Security Severity: Applied Psychological Measurement Vol 30(1) Jan 2006, 62-63.
Yuan, H. (1999). Comparative study of adaptive psychophysical procedures (threshold estimation, maximum likelihood). Dissertation Abstracts International: Section B: The Sciences and Engineering.
Zhao, J. C. (2000). The robustness of the unidimensional 3pl IRT model when applied to two-dimensional data in computerized adaptive testing. Dissertation Abstracts International: Section B: The Sciences and Engineering.
Zickar, M. J. (1998). Modeling item-level data with item response theory: Current Directions in Psychological Science Vol 7(4) Aug 1998, 104-109.
Zickar, M. J., Overton, R. C., Taylor, L. R., & Harms, H. J. (1999). The development of a computerized selection system for computer programmers in a financial services company. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Zuo, Y. (2003). Finite sample tail behavior of multivariate location estimators: Journal of Multivariate Analysis Vol 85(1) Apr 2003, 91-105.
Zwick, R. (1997). The effect of adaptive administration on the variability of the Mantel-Haenszel measure of differential item functioning: Educational and Psychological Measurement Vol 57(3) Jun 1997, 412-421.
Zwick, R., & Thayer, D. T. (2002). Application of an empirical Bayes enhancement of Mantel-Haenszel differential item functioning analysis to a computerized adaptive test: Applied Psychological Measurement Vol 26(1) Mar 2002, 57-76.

External links

CAT Central by David J. Weiss
GMAT Daily Tips: Introduction to the GMAT Computer Adaptive Test by Jeff Sackmann
Frequently Asked Questions about Computer-Adaptive Testing (CAT). Retrieved April 15, 2005.
An On-line, Interactive, Computer Adaptive Testing Tutorial by Lawrence L. Rudner. November 1998. Retrieved April 15, 2005.
Special issue: An introduction to multistage testing. Applied Measurement in Education, 19(3).

This page uses Creative Commons Licensed content from Wikipedia (view authors).

[WeissKingsbury-1] Weiss, D. J., & Kingsbury, G. G. (1984). Application of computerized adaptive testing to educational problems. Journal of Educational Measurement, 21, 361-375.

[ThissenMislevy-2] Thissen, D., & Mislevy, R.J. (2000). Testing Algorithms. In Wainer, H. (Ed.) Computerized Adaptive Testing: A Primer. Mahwah, NJ: Lawrence Erlbaum Associates.

[Green-3] Green, B.F. (2000). System design and operation. In Wainer, H. (Ed.) Computerized Adaptive Testing: A Primer. Mahwah, NJ: Lawrence Erlbaum Associates.

[WainerMislevy-4] Wainer, H., & Mislevy, R.J. (2000). Item response theory, calibration, and estimation. In Wainer, H. (Ed.) Computerized Adaptive Testing: A Primer. Mahwah, NJ: Lawrence Erlbaum Associates.

[LinSpray2000-5] Lin, C.-J. & Spray, J.A. (2000). Effects of item-selection criteria on classification testing with the sequential probability ratio test. (Research Report 2000-8). Iowa City, IA: ACT, Inc.

[Wald-6] Wald, A. (1947). Sequential analysis. New York: Wiley.

[Reckase-7] Reckase, M. D. (1983). A procedure for decision making using tailored testing. In D. J. Weiss (Ed.), New horizons in testing: Latent trait theory and computerized adaptive testing (pp. 237-254). New York: Academic Press.

[Weitzman-8] Weitzman, R. A. (1982). Sequential testing for selection. Applied Psychological Measurement, 6, 337-351.

[KingsburyWeiss-9] Kingsbury, G.G., & Weiss, D.J. (1983). A comparison of IRT-based adaptive mastery testing and a sequential mastery testing procedure. In D. J. Weiss (Ed.), New horizons in testing: Latent trait theory and computerized adaptive testing (pp. 237-254). New York: Academic Press.

[EggenStraetmans-10] Eggen, T. J. H. M, & Straetmans, G. J. J. M. (2000). Computerized adaptive testing for classifying examinees into three categories. Educational and Psychological Measurement, 60, 713-734.

[SprayReckase-11] Spray, J. A., & Reckase, M. D. (1994). The selection of test items for decision making with a computerized adaptive test. Paper presented at the Annual Meeting of the National Council for Measurement in Education (New Orleans, LA, April 5-7, 1994).

[SympsonHetter-12] Sympson, J. B., & Hetter, R.D. (1985). Controlling item-exposure rates in computerized adaptive testing. Paper presented at the annual conference of the Military Testing Association, San Diego.

[vanderLinden-13] For example: van der Linden, W. J., & Veldkamp, B. P. (2004). Constraining item exposure in computerized adaptive testing with shadow tests. Journal of Educational and Behavioral Statistics, 29, 273‑291.
