Whenever I talk to educators and mention effect sizes, someone inevitably complains. "We don't understand effect sizes," they say. I always explain that you don't have to understand exactly what an effect size is. If you know that bigger effect sizes are good and smaller ones are bad, assuming the research they came from is of equal quality, why do you need to know precisely how they are calculated? Sometimes I mention the car reliability ratings Consumer Reports uses, with full red circles at the top and full black circles at the bottom. Does anyone understand how they arrive at those ratings? I certainly don't, but I don't care, because like everyone else, what I do know is that I don't want a car with a reliability rating in the black.
People always tell me that they would like it better if I used "additional months of gain." I do this when I have to, but I really do not like it, because these "months of gain" do not mean very much, and they work very differently in the early elementary grades than they do in high school.

So here is an idea that some people might find useful. The National Assessment of Educational Progress (NAEP) uses reading and math scales that have a theoretical standard deviation of 50. So an effect size of, say, +0.20 can be expressed as a gain of 10 NAEP points (0.20 × 50 = 10). That's not really interesting yet, because most people also don't know what NAEP scores mean.
But here's another way to use such data that might be more fun and easier to understand. I think people could understand and care about their state's rank on NAEP scores. For example, the highest-scoring state on 4th grade reading is Massachusetts, with a NAEP reading score of 231 in 2019. What if the 13th-ranked state, Nebraska (222), adopted a great reading program statewide and gained an average effect size of +0.20? That's equivalent to 10 NAEP points, which would put Nebraska one point ahead of Massachusetts (if Massachusetts didn't change). Number 1!
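For anyone who wants to check the arithmetic, here is a minimal Python sketch of the conversion, assuming the theoretical NAEP standard deviation of 50 and the 2019 scores quoted above (the function name is mine, not anything official):

```python
# Convert an effect size (in standard deviation units) to NAEP points,
# assuming NAEP's theoretical standard deviation of 50.
NAEP_SD = 50

def effect_size_to_naep_points(effect_size: float) -> float:
    """Translate an effect size into NAEP scale points."""
    return effect_size * NAEP_SD

nebraska_2019 = 222        # 13th-ranked state, 4th grade reading
massachusetts_2019 = 231   # highest-scoring state

gain = effect_size_to_naep_points(0.20)   # 0.20 * 50 = 10 points
print(nebraska_2019 + gain)               # 232.0: one point ahead of 231
```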
If we learned to speak in terms of how many ranks states would gain from a given effect size, I wonder if that would give educators more understanding of and respect for the findings of experiments. Even fairly small effect sizes, if replicated across a whole state, could propel a state past its traditional rivals. For example, 26th-ranked Wisconsin (220) could equal neighboring 12th-ranked Minnesota (222) with a statewide reading gain of only +0.04, since a two-point gap divided by the standard deviation of 50 is an effect size of 0.04. As a practical matter, Wisconsin could raise its fourth-grade scores by an effect size of +0.04, perhaps by using a program with an effect size of +0.20 with (say) the lowest-achieving fifth of its fourth graders, since 0.20 × 1/5 = 0.04.
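Here is the same idea run in reverse: a sketch of the effect size a state would need in order to catch a rival, again using the figures quoted above.

```python
# How big an effect size does a state need to match a rival's NAEP score?
# The answer is just the score gap divided by the standard deviation of 50.
NAEP_SD = 50

def effect_size_needed(my_score: float, rival_score: float) -> float:
    """Effect size required to close the gap with a rival's NAEP score."""
    return (rival_score - my_score) / NAEP_SD

print(effect_size_needed(220, 222))   # Wisconsin vs. Minnesota: 0.04

# One route: a +0.20 program for only the lowest-achieving fifth of
# fourth graders averages out statewide to 0.20 * (1/5) = +0.04.
print(0.20 * (1 / 5))                 # ~0.04 (plus floating-point noise)
```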
If only one could get states thinking this way, the meaning and importance of effect sizes would soon become clear. And as a side benefit, perhaps if Wisconsin invested its enthusiasm and money in a “Beat Minnesota” reading campaign, as it does to try to beat the University of Minnesota’s football team, Wisconsin’s students might actually benefit. I can hear it now:
On Wisconsin, On Wisconsin,
Raise effect size high!
We are not such lazy loafers
We can beat the Golden Gophers
Point-oh-four or point-oh-eight
We’ll surpass them, just you wait!
Well, a nerd can dream, can’t he?
_______
Note: No states were harmed in the writing of this blog.
This blog was developed with support from Arnold Ventures. The views expressed here do not necessarily reflect those of Arnold Ventures.
Note: If you would like to subscribe to Robert Slavin’s weekly blogs, just send your email address to thebee@bestevidence.org.