Project

CAIT: a computer assisted instruction tool

Statement of Problem
Missing data in statistical analysis is a problem that the field of social research has failed to address adequately, despite its potential to significantly affect results and the substantive conclusions drawn from them. The purpose of this study is to evaluate the practical application of missing data techniques in reaching substantive sociological conclusions from statistical analyses of incomplete data sets. This study compares three methods for handling incomplete data: multiple imputation, direct maximum likelihood, and listwise deletion.
Sources of Data
The comparisons are conducted by reexamining a multiple regression analysis of the ECLS-K 1998-99 data set by Downey and Pribesh (2004), who used multiple imputation to handle missing data in their study of the effects of teacher and student race on teachers’ evaluations of students’ classroom behavior.
Conclusions Reached
After comparing the three methods for handling incomplete data, this study comes to the general conclusion that multiple imputation and direct maximum likelihood produce equivalent results and arrive at the same substantive sociological conclusions. The study also found that direct maximum likelihood shared more similarities with listwise deletion than with multiple imputation, which may result from differences in data handling between this author and Downey and Pribesh. In general, both direct maximum likelihood and listwise deletion produced increased significance levels, and therefore a greater number of statistically significant variables, compared with the multiple imputation results. Still, all three methods produced essentially equivalent results. The study also stresses the importance of carefully considering method choice and missing data before performing a statistical analysis and drawing substantive conclusions.

Project (M.S., Computer Science) -- California State University, Sacramento, 2009.
