November 20, 2009

Predicting baseball performance

In the 1970s and 1980s, Bill James put a lot of effort into predicting how well young players would do. In 1988, he summed up 15 things he'd learned, and three of them related to forecasting young players' development:
  1. Minor league batting statistics will predict major league batting performance with essentially the same reliability as previous major league statistics.
  2. Players taken in the June draft coming out of college (or with at least two years of college) perform dramatically better than players drafted out of high school.
  3. The chance of getting a good player with a high draft pick is substantial enough that it is clearly a disastrous strategy to give up a first round draft choice to sign a mediocre free agent.
James essentially found, unsurprisingly, that the closer players got to the majors, the easier it was to predict their major league performance. Minor league hitters can be predicted reasonably well from statistics alone. Drafting college players was usually safer than drafting high school players. High school pitchers taken with high draft picks, I believe, were especially likely to flame out.

Partly this effect was maturity and injuries, but, I imagine, it also had to do with the usefulness of college statistics v. high school statistics. If a prospect hits .350 for the Rice Owls, you can conveniently look up how that compares to what Lance Berkman hit at Rice (.385) and what Berkman is hitting in the majors (.299). But if the prospect hits .500 for Horace Mann High School, which hasn't sent anybody past Rookie League ball in decades, how do you know how good that number is?

One problem with predicting quarterbacks' performance relative to other kinds of athletes is that, typically, only one gets to play at a time. Baseball teams have five starting pitchers. Baseball hitters can typically get squeezed in for a look at multiple positions. Football running backs generally substitute in and out so that they get a breather. But second and third string quarterbacks can get stuck for years with very little opportunity in games to show what they can do. If Joe Montana had been as durable as Peyton Manning, Steve Young might have been a career backup.

Kurt Warner didn't get to start until his senior year in college, then went undrafted. He was invited to the Green Bay Packers camp, where he competed for a job with Brett Favre, Mark Brunell, and Heisman-winner Ty Detmer. Not surprisingly, he wasn't as ready for the NFL as those guys. It took him four years of stocking shelves, playing Arena football, and playing for the Amsterdam Admirals to make it to the NFL. By his second year in the NFL, he was the MVP.

Because the quarterback is so central to the offense, quarterback changes are a big deal, hashed out endlessly on sports talk radio. Moreover, because coaches don't like to change quarterbacks, starters play banged up a lot. If a big league pitcher is at 85% physically, so that his fastball drops from 93 to 79 mph, he's out of the rotation until he gets better. But if a quarterback is at 85% physically, he probably continues to start to maintain team continuity. This means his per-play performance (which Berri measured) drops.

This also means an overlooked quarterback can be healthier than a heralded starter of the same age, boosting his per-play performance when he finally gets in the game. For example, Matt Cassel didn't start a game between his senior year in high school in 2000 (?) and his taking over successfully for the injured Tom Brady in 2008. (He was stuck behind two Heisman Trophy-winning quarterbacks at USC.) That's a lot of punishment he didn't absorb.

My published articles are archived at -- Steve Sailer


Simon Oliver Lockwood said...

How did Cassel even get drafted despite playing nothing but garbage time in college? He must have done great at the Combine.

Chuck said...

So what this would imply is that quarterbacks are allocated less efficiently than other positions or players in other sports. It's much less a free market operation.

This being the case, many non-football related qualities are observed in guys who try out for quarterback; much more than in other sports. While other positions and sports do analyze the character and quality of the players, QBs are subject to analysis of other factors. Ricky Williams smoked a lot of pot, but he's still in the league. Lots of other players get into trouble with the law, but besides Michael Vick, most quarterbacks are more straight and narrow and exhibit better leadership and character.

Body language and alpha leadership capabilities are important. A boy is sifted out for that role in high school. His QB demeanor sticks with him through college.

Anonymous said...

One of Malcolm's sources responds to Pinker in his blog:

No mention of you, but you might like to know about it.

Couchscientist said...

Your future stolen generation prediction is making you look more like a prophet every day:

TechBlogger said...

I think you're guilty of a little "projecting" there, Steve.

You see, EVERYTHING *YOU* write is about race.

This leads you to believe that the football debate with Malcolm is a proxy for race.

Maybe, maybe not.

But you hypocritically accuse blacks of being all about race when that is exactly what you are about.

John Seiler said...

One way Bill James "proved" his theories was by getting hired by the Red Sox and ridding the team of the "Curse of the Bambino."

Malcolm could do the same by signing on with Detroit, which has a curse of its own: "The Curse of Bobby Layne." When Detroit got rid of the immortal QB in 1958, after he had brought them 3 championships and owned most QB records in the book, he guaranteed they "wouldn't win a championship for 50 years." They haven't.

(By the way, Layne was the 3rd pick in the 1948 draft.)

Detroit also hasn't had a Pro Bowl QB since Greg Landry in 1972.

Detroit, 0-16 last year and with just one win so far this year, obviously could use Malcolm's talents.

David Davenport said...

... Moving to Western Africa, the professors say, could be just what’s needed for some children at risk of getting caught up in gangs or violence. They would see the world, get away from bad influences and be in a controlled setting focused on academics.

via The Indianapolis Star

I have a proposal of my own: penal colonies in Afghanistan for American felons. Prisons in the USA are too expensive.

David Davnport said...


To correct for this bias, we focused on per-play statistics. And here is a sample of what we found. After a quarterback has played five seasons in the NFL (minimum 500 career plays), here are the correlation coefficients between draft position and various career statistics:

Completion Percentage: -0.01

Passing Yards per Pass Attempt: -0.02

Touchdowns per Pass Attempt: -0.12

Interceptions per Pass Attempt: 0.00

QB Score per Play: -0.01

Net Points per Play: -0.02

Wins per Play: -0.02

QB Rating: -0.06

Our data set runs from 1970 to 2007 (adjustments were made for how performance changed over time). We also looked at career performance after 2, 3, 4, 6, 7, and 8 years. In addition, we also looked at what a player did in each year from 1 to 10. And with each data set our story looks essentially the same. The above stats are not really correlated with draft position.
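
For anyone wondering what a number like the ones quoted above actually measures, here is a minimal Python sketch of a Pearson correlation between draft position and one per-play statistic (completion percentage). The records are invented purely for illustration, and the names `pearson_r`, `quarterbacks`, and `MIN_PLAYS` are hypothetical. Only the 500-play minimum comes from the quoted passage; the sketch does not reproduce the study's data, its adjustments for era, or its actual methodology.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up records, NOT real players:
# (overall draft pick, career completions, career attempts, career plays)
quarterbacks = [
    (1, 1450, 2400, 2650),
    (12, 980, 1700, 1900),
    (33, 1210, 2000, 2200),
    (85, 300, 540, 610),
    (150, 95, 180, 210),    # under the play minimum, so excluded below
    (199, 1600, 2550, 2800),
]

MIN_PLAYS = 500  # the minimum stated in the quoted passage

# Keep only quarterbacks who clear the play minimum.
eligible = [qb for qb in quarterbacks if qb[3] >= MIN_PLAYS]

picks = [qb[0] for qb in eligible]
completion_pct = [qb[1] / qb[2] for qb in eligible]

# A value near zero means pick number tells you little about the statistic;
# a negative value means earlier (lower-numbered) picks tend to score higher.
r = pearson_r(picks, completion_pct)
print(f"correlation between draft pick and completion pct: {r:.2f}")
```

Note that the play minimum is itself a filter: only quarterbacks who got enough playing time appear in the calculation at all, which is worth keeping in mind when reading the coefficients.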

Those correlations approaching zero are fishy. I tend not to believe 'em.

Maybe dberri could explain his methodology here?

By chance, is Mr. Berri associated with the global warming scientists at U. East Anglia?

Dan said...

"Drafting college players was usually safer than drafting high school players."

The number of high school players drafted relative to college players in the first round was strangely high in the early years of the draft (say from 1965-1980). (I wonder if a large part of that was just a bias old-timers and scouts had against college players in those days, i.e. they couldn't mold them as well into big leaguers, or the college guys were seen as not serious enough about baseball if they chose college over pro out of high school, etc.) By 1984, when James analyzed the draft, this had largely already changed to about what's been the case since (a slight majority of college players taken in the first round, probably an average of about 17/30 over the past 25 years).

"High draft pick high school pitchers, I believe, were especially likely to flame out."

My hunch is that this is largely a myth. Pitchers in general are a bit more high risk/high reward, but if anything, more high-profile college pitchers drafted high (like top 10-ish I'm thinking) have "flamed out" (or at least had very short stays in the bigs) in the last 20 years.

Of what I'd consider (roughly) the 18 best pitchers in the majors now (in terms of who'd be drafted highest in fantasy, more or less) that come to mind, a majority of them (10) were not only picked in the first round of the draft (top 30), but OUT OF HIGH SCHOOL. That's amazing I think. These 10 are: Greinke, Wainwright, Beckett, Hamels, Cain, Billingsley, Halladay, Sabathia, Kershaw, Carpenter.

If we exclude Felix Hernandez and Johan Santana (non-US so not draftable) we are really dealing with a sample of 16 and not 18. Of my remaining six, Verlander and Lincecum were first round out of college, Haren was 2nd round out of college, and Peavy, Lee, and Webb were lower round picks.

"it also had to do with the usefulness of college statistics v. high school statistics. If a prospect hits .350 for the Rice Owls, you can conveniently look up how that compares to what Lance Berkman hit at Rice (.385) and what Berkman is hitting in the majors (.299)."

Still, compared to minor league stats, I'd say college stats have a much lower correlation with big league stats, mostly because of the metal bats and how much players vary in their ability to adjust. Lots of the heaviest college hitters have no future in the pros and all the scouts know it at the time.