Monday, August 4, 2008 - 2:50 PM

OOS 1-5: Quantifying student learning: Type I and type II errors of learning gains

Everett Weber, Murray State University

Background: Learning gains and normalized learning gains are commonly used in research to evaluate the success of different instructional interventions. Because students are usually not randomly assigned to courses, pre-existing differences among students may be larger than the treatment effects measured by learning gains, producing spurious results. Pre-test score is a strong covariate of learning gains. Incorporating pre-test score as a covariate can reduce the false positives and false negatives that result from non-random sampling of students. Methods: I used a model of student performance on a pre-test/post-test pair to empirically test the impact of pre-test score on ANOVAs of learning gains and of normalized learning gains, compared with ANCOVAs of post-test scores using pre-test score as a covariate. Five variables in the model were varied and 100 replicates were run per variable combination, for a total of 648,000 course comparisons.
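The abstract does not write out the quantities being compared. As a sketch under the commonly used definitions, and assuming scores scaled from 0 to 1 and a common pre-test slope across courses (the symbols G, g, α, β and ε are notation introduced here, not the author's), the learning gain, the normalized learning gain, and the ANCOVA model for student i in course j can be written as:

```latex
\begin{align*}
G_{ij} &= \text{post}_{ij} - \text{pre}_{ij} \\
g_{ij} &= \frac{\text{post}_{ij} - \text{pre}_{ij}}{1 - \text{pre}_{ij}} \\
\text{post}_{ij} &= \alpha_j + \beta\,\text{pre}_{ij} + \varepsilon_{ij} \\
\Longrightarrow\quad G_{ij} &= \alpha_j + (\beta - 1)\,\text{pre}_{ij} + \varepsilon_{ij}
\end{align*}
```

Under this model the difference in mean gains between two courses equals the intercept difference plus (β − 1) times the difference in mean pre-test scores. Unless β = 1 or the courses have equal mean pre-test scores, a comparison of gains therefore mixes the course effect with pre-existing differences, whereas the ANCOVA estimates the intercept difference directly.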
Results: The proportion of modeled course comparisons that were significantly different varied widely among ANOVAs of learning gains, ANOVAs of normalized learning gains, and ANCOVAs of post-test scores using pre-test score as a covariate. ANOVAs produced more type I and type II errors than did ANCOVAs. The largest differences among the three statistical methods occurred at a moderate difference in intercept as measured in the ANCOVA. At a moderate intercept difference of 0.1 (scale ranged from 0 to 1), ANCOVAs found significant differences among the courses 98% of the time, compared with 89% for learning gains and 49% for normalized learning gains (the missed differences are type II errors). When the difference in intercepts was 0, ANCOVAs found significant differences in 3% of the course comparisons, compared with 20% for learning gains and 9% for normalized learning gains (type I errors). Other impacts included significant learning gains in the direction opposite to that implied by the differences in the intercepts. Other factors that produced differences among the three analytical methods included the slope of the pre-test/post-test regression, the difference in pre-test score, and the lowest pre-test score. Conclusions: The results indicate that unless students are randomly assigned to the courses used in a study, comparisons should be made with ANCOVAs using pre-test score as a covariate rather than with ANOVAs or t-tests of learning gains or normalized learning gains. We are currently determining the boundary conditions under which pre-test scores are not a significant factor in distinguishing among these methods of analysis.
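The kind of comparison described in the Methods can be illustrated with a small simulation. The following is a minimal sketch, not the author's model: the linear pre-test/post-test relationship, the baseline intercept of 0.4, the pre-test and noise standard deviations of 0.1, the class size of 50, and all function names are hypothetical choices made for illustration. With two courses, the pooled two-sample test of gains is equivalent to a one-way ANOVA, and p-values use a normal approximation rather than the exact t distribution.

```python
import math
import numpy as np

def simulate_course(rng, n, pre_mean, intercept, slope, noise_sd=0.1):
    """Draw pre-test scores, then generate post-test scores from a linear model.
    Scores are not clipped to [0, 1]; all parameter values are illustrative."""
    pre = rng.normal(pre_mean, 0.1, size=n)
    post = intercept + slope * pre + rng.normal(0.0, noise_sd, size=n)
    return pre, post

def p_from_t(t):
    """Two-sided p-value from a normal approximation (reasonable for large n)."""
    return math.erfc(abs(t) / math.sqrt(2.0))

def two_group_p(y_a, y_b):
    """Pooled-variance comparison of two groups (same test as a two-group ANOVA)."""
    n_a, n_b = len(y_a), len(y_b)
    pooled = ((n_a - 1) * y_a.var(ddof=1) + (n_b - 1) * y_b.var(ddof=1)) / (n_a + n_b - 2)
    t = (y_a.mean() - y_b.mean()) / math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return p_from_t(t)

def ancova_p(pre_a, post_a, pre_b, post_b):
    """Test the course effect in post ~ intercept + course + pre (common slope)."""
    pre = np.concatenate([pre_a, pre_b])
    post = np.concatenate([post_a, post_b])
    course = np.concatenate([np.zeros(len(pre_a)), np.ones(len(pre_b))])
    X = np.column_stack([np.ones_like(pre), course, pre])
    beta, *_ = np.linalg.lstsq(X, post, rcond=None)
    resid = post - X @ beta
    sigma2 = resid @ resid / (len(post) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return p_from_t(beta[1] / math.sqrt(cov[1, 1]))

def rejection_rates(intercept_diff, slope=0.5, pre_means=(0.3, 0.5),
                    n=50, reps=200, alpha=0.05, seed=0):
    """Fraction of simulated course comparisons declared significant by each method."""
    rng = np.random.default_rng(seed)
    hits = {"gain": 0, "normalized_gain": 0, "ancova": 0}
    for _ in range(reps):
        pre_a, post_a = simulate_course(rng, n, pre_means[0], 0.4, slope)
        pre_b, post_b = simulate_course(rng, n, pre_means[1], 0.4 + intercept_diff, slope)
        gain_a, gain_b = post_a - pre_a, post_b - pre_b
        hits["gain"] += two_group_p(gain_a, gain_b) < alpha
        hits["normalized_gain"] += two_group_p(gain_a / (1 - pre_a), gain_b / (1 - pre_b)) < alpha
        hits["ancova"] += ancova_p(pre_a, post_a, pre_b, post_b) < alpha
    return {name: count / reps for name, count in hits.items()}

# No true course effect (intercept difference of 0) but unequal pre-test means:
# every rejection here is a type I error.
rates = rejection_rates(intercept_diff=0.0)
print(rates)
```

Calling `rejection_rates(intercept_diff=0.1)` instead gives the share of comparisons in which each method detects a true course effect, so one minus each rate estimates the type II error rate under these assumed settings. The rates produced by this sketch depend on the illustrative parameters and should not be expected to reproduce the percentages reported above.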