item analysis
arnold on Oct 05, 2008
The statistical properties of item scores are examined through a variety of procedures known collectively as item analysis. Item analysis furnishes statistical data on how subjects responded to each item and how each item relates to overall performance. If an item analysis is conducted on a preliminary form of a test, the next step is to select the “best” items for the “final” version. The most basic results are the proportions of responses to each alternative: 1) be suspicious of any item in which a distractor is chosen more often than the correct alternative; 2) distractors that are hardly ever chosen are too transparently incorrect; 3) the proportion choosing the correct alternative is the classical index of item difficulty (the p-value). Items with extreme p-values should generally be excluded, since they do not discriminate among individuals. (Information taken from Nunnally and Bernstein, 1994; Crocker and Algina, 1986)
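The three checks on response proportions can be sketched in a few lines of Python. This is only an illustration: the function names, the sample responses, and the cutoffs (`rare=0.05`, `p_low=0.10`, `p_high=0.90`) are assumptions made for the example, not values given by the sources cited above.

```python
from collections import Counter

def option_proportions(responses):
    """Proportion of examinees choosing each alternative of one item."""
    counts = Counter(responses)
    return {option: count / len(responses) for option, count in counts.items()}

def item_difficulty(responses, key):
    """Classical difficulty index p: proportion choosing the correct alternative."""
    return sum(1 for r in responses if r == key) / len(responses)

def flag_item(responses, key, options, rare=0.05, p_low=0.10, p_high=0.90):
    """Return warnings for one item. All three cutoffs are assumed values."""
    props = option_proportions(responses)
    p = props.get(key, 0.0)
    flags = []
    for option in options:
        if option == key:
            continue
        share = props.get(option, 0.0)
        if share > p:
            flags.append(f"distractor {option} chosen more often than the key")
        if share < rare:
            flags.append(f"distractor {option} rarely chosen")
    if p < p_low or p > p_high:
        flags.append("extreme p-value")
    return flags

# Hypothetical responses of ten examinees to one four-option item keyed "B".
responses = ["B", "B", "A", "B", "C", "B", "B", "A", "B", "B"]
p = item_difficulty(responses, "B")
flags = flag_item(responses, "B", "ABCD")
```

In this made-up sample seven of ten examinees choose the key, so the item is of moderate difficulty, and the only warning concerns alternative D, which nobody chose.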
An item analysis must also describe how each item relates to overall test performance and thereby provide discrimination indices, of which there are several (the best items are the most discriminating): 1) the ordinary product-moment (PM) item-total correlation; 2) the covariance between an item and the total score; 3) the average correlation between a given item and all other items; and 4) the proportion of people passing the item in the top half of the class minus the proportion passing it in the bottom half.
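A rough sketch of the four indices for 0/1 item scores follows. The score matrix and all function names are hypothetical. The total score here includes the item itself, and with an odd number of examinees the middle one is left out of both halves; both are simplifying assumptions of this sketch.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sample covariance (n - 1 in the denominator)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def pearson(xs, ys):
    """Product-moment correlation; undefined if either variable is constant."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

def discrimination_indices(scores, item):
    """scores: one row per examinee, one 0/1 entry per item."""
    x = [row[item] for row in scores]
    totals = [sum(row) for row in scores]  # total includes the item itself
    others = [[row[j] for row in scores]
              for j in range(len(scores[0])) if j != item]
    # Rank examinees by total score, then split into bottom and top halves.
    order = sorted(range(len(scores)), key=lambda i: totals[i])
    half = len(scores) // 2
    bottom, top = order[:half], order[-half:]
    return {
        "item_total_r": pearson(x, totals),
        "item_total_cov": covariance(x, totals),
        "mean_inter_item_r": mean([pearson(x, other) for other in others]),
        "upper_lower": mean([x[i] for i in top]) - mean([x[i] for i in bottom]),
    }

# Hypothetical data: six examinees, four dichotomously scored items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
first_item = discrimination_indices(scores, 0)
last_item = discrimination_indices(scores, 3)
```

In this toy data the first item is passed mostly by high scorers, while the last item is passed equally often in both halves, so its upper-lower index is zero.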
Another way to look at it:
The pool of items is administered to a screening sample composed of subjects similar to those for whom the scale is intended, who respond to each item on a scale indicating varying intensities of approval/disapproval, agreement/disagreement, and the like. The responses of the screening sample are subjected to an item analysis to determine the adequacy of the individual items, so that the ‘best’ ones may be selected for inclusion in the scale. Likert proposed two approaches to item selection: a) select items that discriminate between groups scoring ‘high’ and ‘low’; or b) select items whose correlations with total scores are relatively high (NB). (Pedhazur & Schmelkin, 1991)
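Approach (a) might be sketched as follows for items answered on a 1 to 5 scale. The data, the function names, and the choice of the top and bottom 25% of the screening sample as the ‘high’ and ‘low’ groups are all assumptions made for illustration; the passage above does not specify how the groups are formed.

```python
def high_low_differences(responses, fraction=0.25):
    """Mean item score in the high group minus that in the low group.

    responses: one row per respondent, one rating per item.
    fraction: assumed share of respondents placed in each extreme group.
    """
    n = len(responses)
    k = max(1, int(n * fraction))
    # Rank respondents by total scale score.
    order = sorted(range(n), key=lambda i: sum(responses[i]))
    low, high = order[:k], order[-k:]
    diffs = []
    for j in range(len(responses[0])):
        high_mean = sum(responses[i][j] for i in high) / k
        low_mean = sum(responses[i][j] for i in low) / k
        diffs.append(high_mean - low_mean)
    return diffs

def select_items(responses, n_keep, fraction=0.25):
    """Indices of the n_keep items with the largest high-low difference."""
    diffs = high_low_differences(responses, fraction)
    ranked = sorted(range(len(diffs)), key=lambda j: diffs[j], reverse=True)
    return sorted(ranked[:n_keep])

# Hypothetical screening sample: eight respondents, three items.
ratings = [
    [5, 4, 3],
    [5, 5, 2],
    [4, 4, 3],
    [4, 3, 3],
    [3, 3, 2],
    [2, 2, 3],
    [2, 1, 3],
    [1, 1, 2],
]
diffs = high_low_differences(ratings)
kept = select_items(ratings, 2)
```

In this sample the third item is rated about the same by high and low scorers, so it would be the one dropped.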
Item analysis is the computation and examination of any statistical property of an item response distribution. For dichotomously scored items, the best-known descriptor is probably item difficulty (p), which denotes the proportion of examinees answering the item correctly. For dichotomous items, p-values close to .50 are ideal; for multiple-choice items, p-values greater than .50 are ideal (Crocker and Algina; CA). Nunnally and Bernstein (NB) warn against using items with very high p-values, however, as such items have little power to distinguish among examinees. The best items on a test are the ones that discriminate best. The simplest discrimination index is the ordinary product-moment (PM) item-total correlation (r) between each item and the total test score (NB). (Crocker and Algina, 1986; Nunnally and Bernstein, 1994)