Improving the multiple-choice examination question: The assessment and use of partial knowledge

This study attempted to evaluate modifications of the multiple-choice examination question using the criteria of validity, practicality, and student acceptance. Validity was assessed with regard to the separate issues of relevance and reliability. Test relevance was examined in view of the possible influence of risk-taking tendency and the capacity of each procedure to represent levels of partial knowledge. Test reliability was measured directly using Hoyt’s (1941) procedure and indirectly by assessing the amount of guessing behavior. The time required to complete the quizzes served as the practicality measure. Student acceptance was measured by a questionnaire at the end of the experiment. Forty Ss were matched for academic ability and then assigned to four testing groups: conventional testing, correction testing, probability testing, and confidence weighting. The groups were defined in terms of two independent variables: penalty for incorrect responses and weighting of credit for correct responses. Prior to the start of the five-week series of quizzes, each S completed the Choice Dilemmas Questionnaire and the Risk-Taking on Objective Examinations test as measures of risk-taking tendency. In addition, all Ss were administered one practice quiz to allow some familiarity with the testing procedures. The Risk-Taking on Objective Examinations test was found to be superior to the Choice Dilemmas Questionnaire as a measure of risk-taking tendency. Level of risk-taking tendency did not influence testing behavior in any of the groups. Confidence weighting provided a better representation of partial knowledge than did probability testing. Direct comparison of reliability coefficients was not possible because of the small number of subjects used in this experiment. Confidence weighting proved to be a better suppressor of guessing than probability testing. There were no significant differences among the procedures in the time required for testing. Student acceptance was high for conventional testing and probability testing and low for correction testing and confidence weighting. A number of suggestions were made for further study.
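For readers unfamiliar with the reliability measure named above, the following is a minimal sketch of Hoyt's analysis-of-variance reliability coefficient as it is commonly stated: a persons-by-items score matrix is decomposed into person, item, and residual sums of squares, and reliability is taken as 1 - MS_residual / MS_persons. The function name `hoyt_reliability` and the toy score matrix are illustrative assumptions; they are not taken from the study, and the study's own computations may have differed in detail.

```python
def hoyt_reliability(scores):
    """Hoyt-style ANOVA reliability for a persons-by-items score matrix.

    scores: list of rows, one row per examinee, one score per item.
    Hypothetical helper for illustration; not the study's own code.
    """
    n = len(scores)        # number of examinees
    k = len(scores[0])     # number of items
    grand = sum(sum(row) for row in scores) / (n * k)

    person_means = [sum(row) / k for row in scores]
    item_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA without replication: total = persons + items + residual
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_persons = k * sum((m - grand) ** 2 for m in person_means)
    ss_items = n * sum((m - grand) ** 2 for m in item_means)
    ss_resid = ss_total - ss_persons - ss_items

    ms_persons = ss_persons / (n - 1)
    ms_resid = ss_resid / ((n - 1) * (k - 1))

    # Undefined when all examinees have the same mean score (ms_persons == 0)
    return 1 - ms_resid / ms_persons


# Toy data (assumed): 4 examinees, 3 items scored 0/1
toy_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
reliability = hoyt_reliability(toy_scores)
print(round(reliability, 4))
```

With so few examinees the coefficient is unstable, which is consistent with the abstract's remark that reliability coefficients could not be compared directly across groups of this size.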