Quasi-experiments and Education
In Bryan Caplan’s book The Case Against Education (which I’ve written about before here), he argues that the wage premium from education exists mostly because educational qualifications signal that someone is likely to be a good employee. This is the argument against the human capital thesis: the claim that people actually learn useful skills at school and university, and that this is why people with educational qualifications earn more money. You can read a useful review of the book here for a more detailed summary of Caplan’s argument.
At the beginning of the book, he confesses to a decision: when there isn’t experimental evidence available, he will put his trust in Ordinary Least Squares with control variables over ‘high-tech alternatives’. What does this mean? ‘Ordinary Least Squares’ (OLS) refers to a technique for measuring the relationship between two variables. For instance, imagine a scatter plot of weight against height: OLS fits the straight line that minimises the sum of the squared vertical distances between the line and the points.
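If it helps to see this concretely, here’s a tiny sketch in Python of what OLS is doing, fitted to made-up height/weight numbers rather than real data:

```python
# A minimal OLS sketch on invented height/weight data (not real measurements).
import numpy as np

rng = np.random.default_rng(0)
height_cm = rng.normal(170, 10, size=200)                      # made-up heights
weight_kg = 0.9 * height_cm - 90 + rng.normal(0, 5, size=200)  # noisy linear relationship

# OLS finds the intercept and slope minimising the sum of squared residuals.
X = np.column_stack([np.ones_like(height_cm), height_cm])
beta, *_ = np.linalg.lstsq(X, weight_kg, rcond=None)
print(f"intercept = {beta[0]:.1f}, slope = {beta[1]:.2f} kg per cm")
```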
So, OLS/linear regression helps you find an association between two variables, and adding ‘control’ variables helps you try to determine whether the relationship is causal. Suppose I want to figure out whether going to the opera causes someone to have a high IQ. I might run a linear regression and find that there is a correlation - but when I control for income, the correlation disappears. Rich people tend to go to the opera, and they also tend to have higher IQs. So the relationship between going to the opera and IQ wasn’t causal at all: both variables are correlated with income, and once I take income into account, the relationship goes away.
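Here’s a toy simulation of that opera example - every number is invented, but it shows the mechanics of how a correlation can vanish once a confounder is added as a control:

```python
# Toy confounding example: income drives both opera-going and IQ, so opera
# and IQ correlate - until income is included as a control. Invented numbers.
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
income = rng.normal(0, 1, n)                 # standardised income
opera = income + rng.normal(0, 1, n)         # opera-going driven by income
iq = 100 + 5 * income + rng.normal(0, 5, n)  # IQ driven by income, not opera

def ols(y, *regressors):
    """OLS coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

print(f"opera coefficient, no controls:    {ols(iq, opera)[1]:.2f}")
print(f"opera coefficient, income control: {ols(iq, opera, income)[1]:.2f}")
```

The first coefficient comes out clearly positive; the second is roughly zero, because income explains the whole association.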
The Caplan strategy is to look for correlations and try to control for everything we can think of that might be correlated both with being highly educated and with being richer/more successful. For example, here’s a fairly typical passage from The Case Against Education:
“When researchers correct for scores on the Armed Forces Qualification Test (AFQT), an especially high-quality IQ test, the education premium typically declines by 20–30%. Correcting for mathematical ability may tilt the scales even more; the most prominent researchers to do so report a 40–50% decline in the education premium for men and a 30–40% decline for women. Internationally, correcting for cognitive skill cuts the payoff for years of education by 20%, leaving clear rewards of mere years of schooling in all 23 countries studied.”
But is OLS with controls really the best way to make causal inferences about education when there isn’t experimental evidence available? Probably not. Instead of looking at a correlation and trying (and possibly failing) to control for everything we can think of, we can use quasi-experiments. How do these work? One of the most common quasi-experimental designs is the Regression Discontinuity Design (RDD). The basic idea is that, in certain circumstances, an arbitrary cut-off can give you the equivalent of the treatment and control groups that you get in a randomised controlled trial.
For example, suppose you want to measure the causal effect of becoming a Member of Parliament on wealth at the time of death. One way to figure this out would be to compare people who won an election by a tiny number of votes to people who lost an election by a tiny number of votes. Someone who became an MP by only 5 votes probably isn’t particularly different to someone who failed to become an MP after losing by only 5 votes, so if you have enough examples of people who won or lost by only a small number of votes, you can isolate the causal effect of becoming an MP on a person’s wealth. If you’re interested, there’s a famous study by Eggers and Hainmueller that does exactly this, finding that becoming an MP almost doubled the wealth of Conservative MPs (but had no effect for Labour MPs).
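To make the logic concrete, here’s a toy RDD in Python - the numbers are invented, not the Eggers and Hainmueller data. Units whose ‘vote margin’ lands just above zero get the treatment, and fitting separate lines on either side of the cut-off recovers the effect:

```python
# Toy regression discontinuity with an invented treatment effect of 2.0.
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
margin = rng.uniform(-0.5, 0.5, n)    # running variable: vote margin, cut-off at 0
treated = (margin > 0).astype(float)  # winners get the 'treatment'
outcome = margin + 2.0 * treated + rng.normal(0, 1, n)

def intercept_at_cutoff(mask):
    """Fit a line to one side of the cut-off; return its value at margin = 0."""
    X = np.column_stack([np.ones(mask.sum()), margin[mask]])
    beta, *_ = np.linalg.lstsq(X, outcome[mask], rcond=None)
    return beta[0]

bandwidth = 0.1  # only use observations near the cut-off
below = (margin > -bandwidth) & (margin < 0)
above = (margin > 0) & (margin < bandwidth)
print(f"RDD estimate: {intercept_at_cutoff(above) - intercept_at_cutoff(below):.2f}")
```

The estimate lands close to the true effect of 2.0, and notice that it only uses observations near the cut-off - the close winners and the close losers.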
But Caplan is surprisingly negative about quasi-experimental designs, claiming that they’re often messy and liable to be manipulated. I find this odd: many quasi-experiments really do replicate pretty well, and probably more often than OLS with controls. In an article detailing how he won $10k by predicting which social science papers would replicate, Alvaro De Menard notes that he assigned a higher chance of replication to papers that used Regression Discontinuity Designs, as well as to those that used another quasi-experimental design called Difference-in-Differences (although he notes that certain quasi-experimental designs, such as Instrumental Variables, are less likely to replicate).
And Regression Discontinuity Designs seem especially well-suited to isolating the causal effects of education. There are several instances where new educational laws have meant that people born before or after some arbitrary date received different amounts of schooling - meaning that you can perform an RDD by comparing the outcomes of those born just before the cut-off with those born just after it. RDDs are also good for isolating the effect of getting a qualification: you can compare people who just failed their final exams to people who just passed them.
Here is a small sample of studies that use RDD:
Banks and Mazzonna (2012) exploit a 1947 change to the minimum school leaving age in the United Kingdom, and find a large and significant effect of one year of education on male memory and executive functioning in old age.
Ozier (2018) exploits the fact that the probability of admission to a government secondary school in Kenya jumps sharply when a child’s test score crosses a cutoff close to the national mean, finding that completing secondary school increases scores on a cognitive ability test by 0.6 standard deviations. Education also seems to increase employment rates and decrease the chance of teen pregnancy among women - although this would be consistent with the signalling thesis.
Clark and Martorell (2014) use an RDD to compare the career earnings of students who barely passed high school exit exams with those who barely failed - those who failed did not get a diploma, while those who passed did (and since these exams came at the very end of their education, a big effect here would give credence to the signalling thesis). They find little evidence of a diploma signalling effect.
I’m not trying to claim that I’ve gone through the literature and read most of the studies that use RDDs - these were just the first few that I found (excluding several others that, like the first study above, used the 1947 legislation; I left those out because it would be repetitive). And I’m pretty sure there are studies using RDDs that do find a strong signalling component to the wage premium.
But I do claim that these studies are useful - specifically, that they’re useful for isolating the causal effect of education. For the signalling effect especially, you can look at students who just failed their final high school exams and compare them to those who just passed.
So I find it odd that Caplan decides to put more weight on OLS with controls because, in his words, ‘it’s easy to understand, easy to compare, and hard to manipulate’. There isn’t any mention (in this part of the book) of the problems with using OLS with controls to make causal inferences: omitted variable bias, collider bias, post-treatment conditioning, and so on.
These all seem relevant here! For instance, imagine you find that the correlation between years of education and wages decreases significantly when you control for cognitive ability. It may be that education increases cognitive ability, and cognitive ability increases wages - in that case, cognitive ability is a consequence of education rather than a confounder, and controlling for it would strip out part of education’s genuine effect.
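Here’s a toy simulation of exactly that scenario - all coefficients are invented for illustration, but it shows how controlling for a variable that sits downstream of education hides part of education’s real effect:

```python
# Post-treatment conditioning: education raises ability, ability raises wages.
# The true total effect of a year of education is 1.0 + 2.0 * 0.5 = 2.0.
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
education = rng.normal(12, 3, n)                 # years of schooling
ability = 0.5 * education + rng.normal(0, 1, n)  # education raises ability
wages = education + 2.0 * ability + rng.normal(0, 1, n)

def ols(y, *regressors):
    """OLS coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

print(f"education coefficient, no controls:     {ols(wages, education)[1]:.2f}")
print(f"education coefficient, ability control: {ols(wages, education, ability)[1]:.2f}")
```

The uncontrolled regression recovers the full effect of 2.0; controlling for ability cuts it in half, even though nothing about education’s causal effect has changed - exactly the kind of over-controlling that could bite in the ability-corrected regressions Caplan cites.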
And I’m slightly sceptical that OLS with controls is hard to manipulate; I suspect that p-hacking is actually harder with quasi-experimental designs. The cynical part of me notes that the OLS papers suggest a larger role for signalling than the RDD papers do, and wonders whether the decision to take the OLS papers more seriously is really being made purely on methodological grounds - but I have no real evidence for that cynicism. If you think there really are good reasons to prefer OLS to RDD, do let me know in the comments!