November 30, 2005

The Freakonomics Fiasco summarized

I'm Shocked, Shocked to See This ...

The most celebrated nonfiction book of the year is Freakonomics: A Rogue Economist Explores the Hidden Side of Everything by U. of Chicago superstar economist Steven D. Levitt and journalist Stephen J. Dubner. The most admired aspect of the book has been Levitt's theory that legalizing abortion cut the crime rate. Now, it turns out, according to two economists at the Boston Fed who have finally checked Levitt's calculations in detail, that Levitt's theory is based on two mistakes Levitt made. So far, Levitt admits to making one error, saying it "is personally quite embarrassing."


Ever since my 1999 debate with Levitt in Slate.com, Levitt's fans have been telling me that my simpleminded little graphs and ratios of national-level crime trends showing, for example, that the teen homicide rate tripled in the first cohort born after Roe v. Wade couldn't possibly be right because Levitt's state-level analysis was so much more gloriously, glamorously, incomprehensibly complicated than mine, and Occam's Butterknife says that the guy with the most convoluted argument wins.


This fiasco reveals much about what's wrong with public policy discourse in modern America. Fifteen minutes of Googling would have shown that the abortion-cut-crime theory hadn't come close to meeting the burden of proof, but, instead, much of America's intellectual elite fell head over heels for it. Being largely innumerate and unenterprising, the punditariat is unable or unwilling to apply simple reality checks to complex models. It's easier to simply engage in intellectual hero-worship and take a guru figure like Levitt on faith.

Now, two economists have redone Levitt's work and found two fatal mistakes in it. The WSJ reports:

'Freakonomics' Abortion Research Is Faulted by a Pair of Economists
By JON E. HILSENRATH
Staff Reporter of THE WALL STREET JOURNAL
November 28, 2005; Page A2

Prepare to be second-guessed.

That would have been useful advice for Steven Levitt, the University of Chicago economist and author of the smash-hit book "Freakonomics," which uses statistics to explore the hidden truths of everything from corruption in sumo wrestling to the dangers of owning a swimming pool.

The book's neon-orange cover title advises readers to "prepare to be dazzled," and its sales have lived up to the hype. A million copies of the book are in print. The book, which was written with New York Times writer Stephen Dubner, has been on the New York Times best-seller list for 31 weeks and is atop The Wall Street Journal's list of bestsellers in the business category.

But now economists at the Federal Reserve Bank of Boston are taking aim at the statistics behind one of Mr. Levitt's most controversial chapters. Mr. Levitt asserts there is a link between the legalization of abortion in the early 1970s and the drop in crime rates in the 1990s. Christopher Foote, a senior economist at the Boston Fed, and Christopher Goetz, a research assistant, say the research behind that conclusion is faulty.

Long before he became a best-selling author, Mr. Levitt, 38 years old, had established a reputation among economists as a careful researcher who produced first-rate statistical studies on surprising subjects. In 2003, the American Economic Association named him the nation's best economist under 40, one of the most prestigious distinctions in the field. His abortion research was published in 2001 in the Quarterly Journal of Economics, an academic journal. (He was the subject of a page-one Wall Street Journal story in the same year.)

The "Freakonomics" chapter on abortion grew out of statistical studies Mr. Levitt and a co-author, Yale Law School Prof. John Donohue, conducted on the subject. The theory: Unwanted children are more likely to become troubled adolescents, prone to crime and drug use, than are wanted children. When abortion was legalized in the 1970s, a whole generation of unwanted births were averted, leading to a drop in crime nearly two decades later when this phantom generation would have come of age.

The Boston Fed's Mr. Foote says he spotted a missing formula in the programming of Mr. Levitt's original research. He argues the programming oversight made it difficult to pick up other factors that might have influenced crime rates during the 1980s and 1990s, like the crack wave that waxed and waned during that period. He also argues that in producing the research, Mr. Levitt should have counted arrests on a per-capita basis. Instead, he counted overall arrests. After he adjusted for both factors, Mr. Foote says, the abortion effect disappeared. [Emphasis mine.]

"There are no statistical grounds for believing that the hypothetical youths who were aborted as fetuses would have been more likely to commit crimes had they reached maturity than the actual youths who developed from fetuses and carried to term," the authors assert in the report. Their work doesn't represent an official view of the Fed.

Mr. Foote, 40, taught in Harvard's economics department between 1996 and 2002; served stints as an economist on the Council of Economic Advisers in 1994, 1995, 2002 and 2003; and served as an economic adviser to the Coalition Provisional Authority in Baghdad, Iraq, in 2003 and 2004.

Mr. Levitt counters that Mr. Foote is looking only at a narrow subset of his overall work on abortion and crime, so his results are of limited value, and not grounds for dismissing the whole theory. He acknowledges the programming error, but says taken by itself, that error doesn't put much of a dent in his work. (Mr. Foote's result depends on changing that formula and on the adjustment for per-capita arrests.) Moreover, Mr. Levitt says the abortion theory has held up when examined in other countries, like Canada and Australia, and when applied to other subjects, like drug use.

"Does this change my mind on the issue? Absolutely not," Mr. Levitt says.

Levitt and Donohue's abortion-cut-crime theory was put together and tested in a quick and dirty fashion in late 1998, and when their draft paper leaked to the Chicago Tribune in August 1999, they hadn't yet done the needed reality checks on their idea.

In our August 1999 debate in Slate, I pointed out to Levitt that the national-level homicide data easily available on federal government websites showed that his theory had radically failed the test of history: the first cohort born after the legalization of abortion had a homicide rate as 14-17 year olds triple that of the last cohort born before legalization.

Rather than damage his nascent career by expressing doubts about his signature theory, Levitt decided to dig his heels in and rely on his extremely complicated state-level analyses to buffalo people into ignoring my easy-to-understand national-level analyses that raised serious doubts about whether he'd come anywhere near meeting the burden of proof.

Hey, it worked. He's now rich and famous.

I told Levitt last month during the Bill Bennett Brouhaha, in which the former Education Secretary was widely denounced for making a reductio ad absurdum argument based on the racial aspect of Levitt's theory, that he should just walk away now from his most famous theory -- just admit that it's too hard to tell what actually happened. He's now a celebrity so he hardly needs this theory anymore to go on being a celebrity. Otherwise, someday, some little-known economist was going to make his reputation by taking the Freakonomist down. Well, Levitt's nemesis has arrived.


A reader writes:

Will this really matter? I guess I have my doubts. We have moved into an era when facts matter less than myths.

Indeed. Virtually nobody will admit they were wrong about this. Way too many important people have too much invested in Levitt's celebrity. This is a fiasco for the economics profession -- the most famous young economist's most famous theory has been exposed after six years of adulation as being based on something approaching malpractice -- but the likelihood that the economics profession will stage an inquiry must range between zero and negative infinity. I wonder how many economics professors have book proposals in right now for that next bestseller "Berserkonomics"? (By the way, Levitt and Dubner are working on a sequel with a title that reflects their characteristic elegant taste: Superfreakonomics.)

In the general media as well, too many influential people publicly endorsed the theory when a small amount of due diligence with Google would have shown them it was deeply dubious. And too many people want his abortion-cut-crime theory to be true for personal or political reasons. I've noticed, for example, that in online discussions, pro-lifers tend to want Levitt's theory to be true. They appear to want to be able to boast, "Even though legal abortion reduces the likelihood of me being a victim of crime, I'm still against it. That's how idealistic I am."

Here's the abstract of Foote and Goetz's paper:

Testing Economic Hypotheses with State-Level Data: A Comment on Donohue and Levitt (2001)

Working Paper 05-15
by Christopher L. Foote and Christopher F. Goetz

State-level data are often used in the empirical research of both macroeconomists and microeconomists. Using data that follows states over time allows economists to hold constant a host of potentially confounding factors that might contaminate an assignment of cause and effect. A good example is a fascinating paper by Donohue and Levitt (2001, henceforth DL), which purports to show that hypothetical individuals resulting from aborted fetuses, had they been born and developed into youths, would have been more likely to commit crimes than youths resulting from fetuses carried to term. We revisit that paper, showing that the actual implementation of DL’s statistical test in their paper differed from what was described. (Specifically, controls for state-year effects were left out of their regression model.) We show that when DL’s key test is run as described and augmented with state-level population data, evidence for higher per capita criminal propensities among the youths who would have developed, had they not been aborted as fetuses, vanishes. Two lessons for empirical researchers are, first, that controls may impact results in ways that are hard to predict, and second, that these controls are probably not powerful enough to compensate for the omission of a key variable in the regression model. (Data and programs to support this comment are available on the web site of the Federal Reserve Bank of Boston.)
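To make the per-capita point concrete, here is a minimal Python sketch. The cohort sizes and arrest counts are invented for illustration and come from neither Foote and Goetz nor Donohue and Levitt. It shows only that total arrests can fall when a cohort shrinks, even if each member of the cohort is exactly as likely to be arrested as before.

```python
# Hypothetical illustration only: every figure below is invented, not real data.
def arrests_per_100k(arrests, population):
    """Arrest rate per 100,000 people in the cohort."""
    return arrests / population * 100_000

# Two hypothetical cohorts with the same per-person propensity to be arrested.
before = {"population": 1_000_000, "arrests": 5_000}
after = {"population": 900_000, "arrests": 4_500}

# Relative change in the raw arrest count between the two cohorts.
total_change = after["arrests"] / before["arrests"] - 1

# Per-capita rates for each cohort.
rate_before = arrests_per_100k(before["arrests"], before["population"])
rate_after = arrests_per_100k(after["arrests"], after["population"])

print(f"Change in total arrests: {total_change:.0%}")
print(f"Rate before: {rate_before:.0f} per 100k; rate after: {rate_after:.0f} per 100k")
```

In this made-up case the raw count drops while the per-capita rate does not move, which is why a test of "criminal propensity" needs arrests divided by population.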


Levitt's reply on his Freakonomics blog is here.


All my postings on this issue are at http://www.iSteve.com/abortion.htm



Levitt's response to the Freakonomics abortion-cut-crime theory fiasco

Levitt blogs:

Everything in Freakonomics is wrong!

Or at least that is the impression you might get if you read this article in today’s Wall Street Journal.

I will post a longer blog entry once I have had time to fully digest the working paper by Foote and Goetz which is the basis for the article.

For now, I will say just a few things:

1) It is not at all clear from the WSJ article that Foote and Goetz are talking about only one of the five different pieces of evidence we put forth in our paper. They have no criticisms of the other four approaches, all of which point to the same conclusion.

2) There was a coding error that led the final table of my paper with John Donohue on legalized abortion to have specifications that did not match what we said we did in the text. (We’re still trying to figure out where we went wrong on this.) This is personally quite embarrassing because I pride myself on being careful with data. Still, that embarrassment aside, when you run the specifications we meant to run, you still find big, negative effects of abortion on arrests (although smaller in magnitude than what we report). The good news is that the story we put forth in the paper is not materially changed by the coding error.

3) Only when you make other changes to the specification that Foote and Goetz think are appropriate, do the results weaken further and in some cases disappear. The part of the paper that Foote and Goetz focus on is one that is incredibly demanding of the data. For those of you who are technically minded, our results survive if you include state*age interactions, year*age interactions, and state*year interactions. (We can include all these interactions because we have arrest data by state and single year of age.) Given how imperfect the abortion data are, I think most economists would be shocked that our results stand up to removing all of this variation, not that when you go even further in terms of demands on the data things get very weak.

Again, as I said, I will post again on this subject once I have had a chance to carefully study the details of what they have done, and after I have been able to go back to the raw data and understand why the results change when one does what Foote and Goetz do.

Posted by Steven D. Levitt @ 2:46 pm on Monday, November 28, 2005 in General

In contrast, economist John R. Lott, a longtime critic of Levitt's theory who came in for a half page of ad hominem abuse in Freakonomics, is feeling better than Levitt is today. He blogged:

Christopher L. Foote and Christopher F. Goetz's paper can be found here. Personally, I think calling this a "programming oversight" is being much too nice. More importantly everyone who works with panel data knows that you use fixed effects.

My own work concentrated on murder rates, but I also included fixed effects. Donohue and Levitt never provided us with all their data or their regressions and would never answer any questions that we had so I just assumed that they had included fixed effects from the beginning. It would have been nice if they had provided us with this same information years ago.
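For readers who don't work with panel data, here is a rough Python sketch of what state-year fixed effects do. It is a simplified stand-in, not the regression in either paper: the panel values are invented, and subtracting each state-year cell's mean is used here only as the simplest way to show the idea. A shock that hits every age group in one state in one year (a local crack epidemic, say) drops out, leaving only differences between ages within that state and year.

```python
from collections import defaultdict

# Hypothetical panel: (state, year, age) -> arrest rate. All numbers invented.
# State "A" in 1990 gets a +50 shock that hits every age group equally.
panel = {
    ("A", 1985, 16): 100.0, ("A", 1985, 20): 120.0,
    ("A", 1990, 16): 150.0, ("A", 1990, 20): 170.0,
    ("B", 1985, 16): 90.0,  ("B", 1985, 20): 110.0,
    ("B", 1990, 16): 90.0,  ("B", 1990, 20): 110.0,
}

def demean_by_state_year(data):
    """Subtract each state-year cell's mean from the observations in that cell."""
    groups = defaultdict(list)
    for (state, year, _age), value in data.items():
        groups[(state, year)].append(value)
    means = {key: sum(vals) / len(vals) for key, vals in groups.items()}
    return {
        (state, year, age): value - means[(state, year)]
        for (state, year, age), value in data.items()
    }

adjusted = demean_by_state_year(panel)
for key in sorted(adjusted):
    print(key, adjusted[key])
```

After the adjustment, state A's 1990 values look just like its 1985 values, so the shock can no longer be mistaken for an effect of anything else that varies by state and year.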

Financial economist and blogger Mahalanobis (Michael Stastny) writes:

Levitt's response is on his website (see here) where he notes

The part of the paper that Foote and Goetz focus on is one that is incredibly demanding of the data. For those of you who are technically minded, our results survive if you include state*age interactions, year*age interactions, and state*year interactions.

3 interaction variables are necessary to get the right sign and significance? I think that is very technically demanding. In my experience, interaction variables are kitchen sink type regressors that induce severe multicollinearity and give spurious results. It's like an economist saying his results only appear after doing 3-stage least squares. I have to think something's not really there if you can't normalize the data somehow and show in a simple graph that the pattern is there (in this case, say, by showing the change in arrest rates for abortion and non-abortion states for the relevant age cohort).

I'm partial to the opposite theory, that abortion would, if anything, increase the proportion of evil-doers: abortion is more common among forward-thinking moms who would be good moms, less common among bad moms who view life as a series of random events that happen to them.

Right. The reason that in his theory of American crime trends, Levitt cites European studies claiming that women who have abortions would make less organized mothers than the ones who went ahead and had their children is because the American studies of who gets an abortion came to the opposite conclusion.

This undermines Levitt's only argument these days about how abortion would cut crime (now that Levitt has hushed up his earlier racial eugenic/eucultural argument that because more blacks get abortions and more blacks commit murders, more abortions should mean fewer murders). These American studies were pointed out to Levitt by CCNY economist Ted Joyce in his response to Levitt & Donohue in the Journal of Human Resources, which was entitled "Did Legalized Abortion Lower Crime?" Joyce summed up two reasons why Levitt's theory didn't work. The second was:

"Second, analysts, I being one, have tended to overestimate the selection effects associated with abortion. A careful examination of studies of pregnancy resolution reveals that women who abort are at lower risk of having children with criminal propensities than women of similar age, race and marital status who instead carried to term. For instance, in an early study of teens in Ventura County, California between 1972 and 1974, researchers demonstrated that pregnant teens with better grades, more completed schooling, and not on public assistance were much more likely to abort than their poorer, less academically oriented counterparts (Leibowitz, Eisen, and Chow 1986).

"Studies based on data from the National Health and Social Life Survey (NHSLS) and the National Longitudinal Survey of Youth (NLSY) make the same point (Michael 2000; Hotz, McElroy, and Sanders 1999). Indeed, Hotz, McElroy, and Sanders (1999) found that teens who abort are similar along observed characteristics to teens that were never pregnant, both of whom differ significantly from pregnant teens that spontaneously abort or carry to term.

"Nor is favorable selection limited to teens. Unmarried women that abort have more completed schooling and higher AFQT [the military's IQ test for applicants for enlistment] scores than their counterparts that carry the pregnancy to term (Powell-Griner and Trent 1987; Currie, Nixon, and Cole 1995).

"In sum, legalized abortion has improved the lives of many women by allowing them to avoid an unwanted birth. I found little evidence to suggest, however, that the legalization of abortion had an appreciable effect on the criminality of subsequent cohorts."


My earlier response to the latest Freakonomics fiasco is here.


All my blog postings on the controversy can be found at http://www.iSteve.com/abortion.htm




Abortion and crime: So, Levitt was wrong. But, what actually happened?

Now that Freakonomics author Steven D. Levitt's mishandling of his abortion-crime data has been exposed by economist Christopher Foote, I'd like to review what actually happened in America over those decades.

As I tried to explain to Dr. Levitt when we debated in Slate in 1999, what happened, simplifying greatly, was that the vast youth crack crime wave took off first in the later 1980s in the socially liberal states where legal abortion also had taken off first about 17 years earlier, most notably New York and California, which legalized abortion in 1970, three years before Roe v. Wade.

In other words, there was at the state level, a positive correlation (when appropriately weighted by population of state), between the legal abortion rate in the early 1970s and the homicide offending rate in the late 1980s and early 1990s among those youths born after legalization, not the negative correlation asserted by Dr. Levitt. Unfortunately, Dr. Levitt initially only looked at crime rates in the years 1985 and 1997 (and only looked at the overly crude age groups of over and under 25), so he completely missed how his theory had catastrophically failed its most obvious historical test.
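For anyone who wants to run this kind of check, a population-weighted correlation can be sketched in a few lines of Python. The four "states" and all their values below are invented purely to show the mechanics; they are not the figures behind the claim above, and the function name is my own.

```python
import math

def weighted_correlation(x, y, weights):
    """Pearson correlation of x and y, with each observation given a weight."""
    total = sum(weights)
    mean_x = sum(w * a for w, a in zip(weights, x)) / total
    mean_y = sum(w * b for w, b in zip(weights, y)) / total
    cov = sum(w * (a - mean_x) * (b - mean_y)
              for w, a, b in zip(weights, x, y)) / total
    var_x = sum(w * (a - mean_x) ** 2 for w, a in zip(weights, x)) / total
    var_y = sum(w * (b - mean_y) ** 2 for w, b in zip(weights, y)) / total
    return cov / math.sqrt(var_x * var_y)

# Invented values for four hypothetical states (not real data).
abortion_rate = [30.0, 25.0, 10.0, 8.0]
homicide_rate = [40.0, 35.0, 20.0, 30.0]
population = [18.0, 24.0, 3.0, 1.0]  # millions, used as weights

unweighted = weighted_correlation(abortion_rate, homicide_rate, [1.0] * 4)
weighted = weighted_correlation(abortion_rate, homicide_rate, population)
print(f"unweighted r = {unweighted:.2f}, population-weighted r = {weighted:.2f}")
```

Weighting by population keeps a handful of small states from counting as much as New York or California, which matters when the big states are where both trends started.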

Second, and also contrary to Levitt's theory, this vast youth murder wave took off first specifically in the demographic group that had the highest legal abortion rate: urban blacks. The non-white abortion rate peaked in 1977, well before the peak of the white abortion rate. The peak years for homicide among 14-17 year old black males were 1993 and 1994 -- i.e., the cohort born at the peak of the black usage of legal abortion. As Donohue and Levitt wrote in 2001, under their theory, the opposite was supposed to happen:

"Fertility declines for black women are three times greater than for whites (12 percent compared to 4 percent). Given that homicide rates of black youths are roughly nine times higher than those of white youths, racial differences in the fertility effects of abortion are likely to translate into greater homicide reductions."

Instead, the murder rate among 14-17 year old black males born in the late 1970s was four times higher than among those born in the late 1960s, before the legalization of abortion. The black to white teen murder rate ratio almost doubled after legalization. So, the Levitt-Donohue theory failed its first two historical tests in a disastrous fashion.

Then, two things happened historically that helped create the state-level negative correlation between later 1970s abortion rates and later 1990s crime rates (presuming Foote's new technical critique doesn't completely eliminate it) that Levitt and Donohue have emphasized so repeatedly, while trying to cover up the earlier positive correlation. (They imply that the longer the time lag between presumed cause and effect, the more trust we should put in it!)

1. From NY and CA, crack spread to more socially conservative states, where the abortion rate had also gone up later. So crime was higher in the mid to late 1990s in socially conservative states where abortion rates didn't go up until the late 1970s or early 1980s.

2. And, the crack wave burned out first in the places where it started first, most famously New York City.

We've all heard a million arguments about why crime fell in NYC in the 1990s, but an overlooked explanation was brought up by Knight-Ridder reporter Jonathan Tilove recently: there are today in NYC, 36% more black women alive than black men. Nationally, among all races, there are 8% more women than men alive.

Obviously, this gigantic black male shortage in NYC wasn't caused by abortion -- there was virtually no sex selective abortion at the time. No, it was mostly caused by an enormous increase in imprisonment and by the most dangerous black men murdering each other in large quantities in the late 1980s and early 1990s. (AIDS played a role too.) Levitt has never written, as far as I know, about the impact of these "selective post-natal abortions," as it were, on the crime rate, but it's clearly a substantial factor in a number of big cities that were hit hard by crack. (NYC is by no means unique in terms of the current black male shortage.)

Moreover, as I pointed out to Levitt in 1999, and as his deservedly famous chapter in "Freakonomics" on how dealing crack pays so badly confirmed, a lot of the next cohort of urban youths, those born more than a half decade after abortion was legalized in their state, figured out that dealing crack was a stupid career choice. Seeing how their older brothers and cousins were winding up in prisons, wheelchairs, and cemeteries, they became less likely to commit murder. Participating in the crack wars was, for the vast majority of the gangstas, an extremely bad life choice, and it's hardly surprising that the later cohort born in the early 1980s did a better job of figuring this out.

But these anti-crime trends in the 1990s happened first where crack happened first, which tended to also be where legal abortion happened first, thus creating the correlation, most likely spurious, between legal abortion and the crime decline in the later 1990s that Freakonomics focuses upon.

So, for this controversy, the crucial issue is The Burden of Proof. Dr. Levitt has tried hard to hand the burden of proof off to his skeptics, claiming that he's looked at all other possible causes of the 1990s crime decline, and they aren't adequate to explain it, so abortion must be the cause of the remainder. That's a weak and irresponsible argument.

Of course, in reality, he hasn't looked at all the causes -- for example, I've never seen him take into account "selective post-natal abortions" of the most dangerous gangstas by other gangstas, nor the social learning impact on the next cohort of seeing their older brothers die or go to prison.

But, moreover, there's an old saying that large assertions require large evidence. And Levitt's abortion-cut-crime theory is one of the largest assertions in the social sciences in recent years. Clearly, the burden of proof rests on Dr. Levitt.

There's also an old idea in science called Occam's Razor, which more or less says that scientists should be biased toward simplicity in explanations. Throughout this six year controversy, Dr. Levitt has consistently gone for the most complicated, hard-to-understand, and (as we've seen this week, to Dr. Levitt's embarrassment) hard-to-check-up-on statistical models.

In contrast, he's combined statistical incomprehensibility with the most simple-minded behavioral models -- he has repeatedly assumed, despite all the evidence from American studies cited above, that ghetto women decide whether or not to engage in unprotected sex and whether or not to get an abortion or have an illegitimate child for the same reasons that would appeal to highly educated women of his own class. While Levitt's style of thinking about how women respond to legalized abortion has proven highly persuasive to the nonfiction book purchasing class, it doesn't explain much at all about the behavior of the class in which potential criminals are typically raised.

Maybe the technical opacity of Dr. Levitt's analysis was necessary -- social phenomena are terribly complicated. But the impact of his behavior on the public and on much of his profession has been to encourage among his numerous fans not a critical engagement with the historical and sociological record, but an attitude of faith, a warm feeling that this really smart guy has Figured It All Out using Really Complicated Statistics and we should just take his word for it.

As a marketing strategy, the oracular approach of "Freakonomics" has been mind-bogglingly successful, but perhaps I may be forgiven for wondering whether it advances the cause of good social science.

All the data cited above can be found documented at http://isteve.com/abortion.htm



My published articles are archived at iSteve.com -- Steve Sailer


1 comment:

Anonymous said...

I think the part about him not sharing the raw data and models and regressions with Lott is interesting. The errors might have been found much earlier if the methods had been shared. It takes a lot to redo an analysis if all the information (data AND models) is not shared. Often papers are not validated because the descriptions in the text are ambiguous.