November 22, 2012

The (hopefully) ultimate post on Trout v. Cabrera: alternative universes v. what actually happened

I'm sure everybody is sick of the baseball debate over the American League Most Valuable Player award going to 29-year-old veteran Miguel Cabrera of the Detroit Tigers over 20-year-old wunderkind Mike Trout of the Los Angeles Angels. But I think I've come up with a subtle but useful distinction.

Personally, I would have voted for Trout. But I think I can come up with a better defense of the sportswriters voting for Cabrera than they can.

Ironically, Trout is a classic Five Tool Player that the pre-Moneyball old school scouts would have drooled over because he Looks Good in a Uniform. Cabrera is the kind of pudgy Ken Phelps-like power hitter whom Bill James drooled over.

But, leave that aside because here's something that I've never really grasped before in all the years I've been thinking about baseball statistics (since 1965 when I was six).

A pervasive distinction between sabermetric statistics and traditional statistics is that the new statistics (such as Wins Above Replacement [WAR], in which Trout did best) are generally intended to predict the future better by removing as much as possible the impact of luck, while the old statistics (such as Runs Batted In [RBI], which favored Cabrera) are intended to describe the past, which includes the impact of luck. MVP awards are handed out based on performance in the season just past, so a case can be made that the backward-looking statistics make sense in MVP voting.

Think of it as the difference between scientists and historians. The former are obsessed with replicability, the latter with narrative.

To illustrate this, compare Cabrera's 2012 season not to Mike Trout's 2012 season, but to Cabrera's own 2011 season. Cabrera has been highly consistent as a hitter over his ten-year career, peaking over the last three years.
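Year Age Tm  Lg   G  PA  AB   R   H 2B 3B HR RBI SB CS  BB SO   BA  OBP  SLG   OPS OPS+  TB GDP HBP SH SF IBB Pos   Awards
2011  28 DET AL 161 688 572 111 197 48  0 30 105  2  1 108 89 .344 .448 .586 1.033  179 335  24   3  0  5  22 *3/D  AS,MVP-5
2012  29 DET AL 161 697 622 109 205 40  0 44 139  4  1  66 98 .330 .393 .606  .999  165 377  28   3  0  6  17 *5/D3 AS,MVP-1,SS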

Cabrera actually had a higher WAR in 2011 (7.3) than in 2012 (6.9), but he only finished fifth in the MVP voting a year ago. Why? Because his RBI total in 2011 was only 105, compared to 139 in 2012.

In the 20th Century, the RBI championship notoriously correlated with winning the MVP award, although that connection has faded in this century as the sabermetricians have increasingly had their say.

Sabermetricians have long argued that RBIs are over-emphasized in discerning excellence because they are so context sensitive (you want guys ahead of you in the batting order getting on base, but not hitting homers that clear the bases) and dependent upon luck.

Moreover, past clutch hitting performance seldom accurately predicts future clutch hitting performance. The whole notion of clutch hitting in baseball seems pretty dubious: trying hard in four at bats per day just isn't all that physically or mentally debilitating, so it seems likely that major league baseball players try pretty hard most times they come up to bat. Moreover, the typical major leaguer has come up to bat in clutch situations thousands of times since he was a small boy and if he were inclined to choke when the pressure is on, he probably wouldn't have made it to the majors.

So, maybe Cabrera's relatively low RBI total in 2011 was just bad luck, and regression toward the mean would suggest it was likely to go up in 2012, which it did. 

And, he's likely to drive in fewer than 139 runs in 2013 due to regression toward the mean. Heck, if they replayed the 2012 season in a computer a million times, Cabrera probably wouldn't average 139 RBIs. He had to be lucky in 2012 to drive in that many. Maybe he only "deserved" to drive in, say, 125, and then he wouldn't have won the RBI race and thus wouldn't have won the Triple Crown and probably wouldn't have won the MVP award. You could run a million computer simulations of the season and check this out.
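The replay-the-season idea can be sketched in a few lines of Python. This is only a toy model under loudly stated assumptions, not a real season simulator: it takes the hypothetical "deserved" total of 125 RBIs from the paragraph above, spreads it evenly over Cabrera's 697 plate appearances in 2012, and treats every plate appearance as an independent coin flip that yields either one RBI or none. Real RBIs come in bunches and depend on who is on base, so the true spread is probably wider. The function names and the seed are made up for illustration.

```python
import random


def simulate_rbi_seasons(n_seasons, plate_appearances=697,
                         expected_rbi=125.0, seed=2012):
    """Replay a season n_seasons times and return the RBI total of each replay.

    Assumption: every plate appearance independently produces exactly one RBI
    with probability expected_rbi / plate_appearances, and zero otherwise.
    """
    rng = random.Random(seed)
    p_rbi = expected_rbi / plate_appearances
    totals = []
    for _ in range(n_seasons):
        season_total = sum(
            1 for _ in range(plate_appearances) if rng.random() < p_rbi
        )
        totals.append(season_total)
    return totals


def share_at_least(totals, threshold=139):
    """Fraction of simulated seasons with at least `threshold` RBIs."""
    return sum(t >= threshold for t in totals) / len(totals)


if __name__ == "__main__":
    # 10,000 replays keeps pure Python fast; a million works but takes a while.
    totals = simulate_rbi_seasons(10_000)
    print("average RBIs per replay:", sum(totals) / len(totals))
    print("share of replays with 139 or more:", share_at_least(totals))
```

Under these assumptions the average replay lands near 125 by construction, and the interesting number is how often luck alone pushes the total to 139 or beyond.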

One of Cabrera's sabermetric critics, Keith Law of ESPN, raised the question of alternative universes, Twittering:
@keithlaw
No. #narrative RT @theknapsackkid: do you think in an alternate universe where Hamilton hits 2 more homers, Cabrera still wins mvp?

Indeed, much of what sabermetricians do is try to estimate what would happen in alternative universes.

But, here's the thing: Cabrera really did drive in 139 runs in 2012. That is what happened in this universe. That doesn't mean he was the best player of 2012, or that he would have been the most valuable player if you could average across infinite alternative universes, but it does suggest that he was a really valuable player in this universe.

WAR is slanted toward inputs, while RBIs are a measure of outputs. Famously, one of the inputs valued by sabermetricians is walks. Cabrera only walked 66 times in 2012, down sharply from 108 in 2011. All else being equal, across a million alternative universes, that big decline (which was reflected in his On Base Percentage) is a bad thing.

But, that decline in walks and on-base percentage was actually part of the Tiger management’s grand strategy. In 2011, Cabrera had batted fourth (clean-up), but hadn’t cleaned up as much as they’d hoped because other teams had pitched around him because they weren’t all that afraid of the #5 hitter. Cabrera made the best of this situation where he wasn't getting that many pitches that he could really drive, accepting a lot of walks, hitting 48 doubles (but only 30 homers) and leading the league in On Base Percentage. Sabermetricians love On Base Percentage because in random situations, it's very valuable on average. But the Tiger management didn't think Cabrera was as valuable to them in 2011 as he ought to be because he was walking and doubling too much and homering and driving in runs too little.

The Tigers figured that they weren't really paying Cabrera $21 million to deliver power statistics of 30 homers and 105 RBIs. So, they spent $23 million in 2012 salary to land Prince Fielder so they could move Cabrera up to the #3 spot in the line-up and protect him with a famous home run hitter in the clean-up spot.

Fielder is even fatter than Cabrera, so he would need to play first base. (The Tigers’ designated hitter spot was filled by Delmon Young, who is a complete oaf.) This, by the way, reflects the influence of the sabermetrics revolution of the 20th Century: Cabrera is listed at 240 pounds, Young 240, and Fielder 275. Before Bill James' time, it was rare for a team to put out a lineup with 3 guys who look more like semipro slow-pitch softball players, but the first generation of sabermetricians proved that baseball was overrating elegant defense, baserunning, and line-drive hitting compared to homers and walks. So, now, baseball is full of guys who look like offensive linemen.

So, Cabrera lost weight over the offseason and worked hard on fielding and throwing so he could move back to third base to open up first for the poor-fielding Fielder.

And this strategy worked well. Free to swing away, Cabrera upped his homers from 30 to 44 and his RBIs from 105 to 139. His On Base Percentage dropped from .448 to .393 and his Runs scored from 111 to 109. But, all told, Cabrera delivered exactly what the Tigers had been hoping for.

Now, you could say that if you used your computer to randomly assign Cabrera to a different team, on average in your alternative universe simulations, his 2011 season would be more valuable than his 2012 season. But we don't live in infinite alternative universes, we live in this highly contingent single universe.

You can see the difference between an MVP Award and a statistically sound analysis of ability more easily when thinking about World Series MVP Awards. Consider the famous 1986 World Series between the Mets and Red Sox. Out of all the good players on those two teams (Roger Clemens, Gary Carter, Jim Rice, Darryl Strawberry, Dwight Evans, Keith Hernandez, Doc Gooden, Don Baylor, etc.) was World Series MVP Ray Knight really the best one?

Of course not. Indeed, the Mets let their World Series hero go during the offseason. But, he really did have a valuable World Series.

Say a player in the World Series crushes a lot of balls, but most of them right at somebody, and winds up batting .231 as his team gets swept (a little bit like Cabrera in the 2012 World Series). A statistical system even better than WAR would predict that he would do much better if that World Series were replayed a million times. It might even predict he'd be the MVP more often than anybody else.

But, they don't play the WS a million times, they just play it once, and in the World Series that was actually played, Cabrera wasn't the WS MVP.

Conversely, it's not ridiculous to argue that Cabrera was the most valuable player in the AL in the 2012 season, even if Trout was the best.

P.S., Also, there's the Career Achievement aspect: Cabrera is 29 and has come close to the MVP before, finishing in the top 5 five times. He's headed toward the decline phase of a highly respectable career, the kind that usually wins an MVP award.

Trout is only 20 and if he's really as good as he appeared to be in 2012 (i.e., like a mid-career Mickey Mantle), he ought to win several when he's older and even better.

Career Achievement isn't supposed to play a role in MVP voting, but it's reasonable that it does to some extent, especially since the advent of steroids.

In short, 29-year-old Miguel Cabrera has passed more PED tests than 20-year-old Mike Trout has.

That doesn't mean he's clean, but Cabrera's career arc looks reasonable. And that may well be unfair to Trout, but that's the world we live in.

22 comments:

Anonymous said...

(The Tigers’ designated hitter spot was filled by Delmon Young, who is a complete oaf.)

"Oaf" means foolish, not fat. Which did you mean?

cmcoct said...

"Oaf" is the right term for Delmon Young.

From his early-season arrest in NY for drunkenly shouting anti-semitic slurs in the wee hours, to his four month slump when the Tigers were trying to catch the White Sox, to his grand finale: the worst throw in baseball history from left field during the World Series - a 20 footer straight into the ground that rolled 50 feet wide of third base. That throw would have been an embarrassment for any adult male, let alone a MLB player.

Of course, his MLB brother, Dmitri Young, was partially nuts for a few teams, so HBD fans should point out the genetic implications here.

Anonymous said...

Personally, I'm not very interested in this MLB stuff, but the guy being named Trout reminds me of Kilgore Trout.

Happy Thanksgiving Everyone

Anonymous said...

Trout was better, therefore more valuable. You said it.

Don't waste your time coming up with better defenses of the guys who you know are wrong. What's next, coming up with a spirited defense of the values of a diverse student body in higher education?

sunbeam said...

I've never been much of a baseball fan, but I'm not getting this hullabaloo about sabermetrics.

Okay, I get the gist of the story, "sexy rebels" (cough), stick it to the man and upend the system in the name of truth. It's a meme, and you can sell a book, and make a movie, which is the important thing after all.

But these guys think WAY too small.

A couple of years ago I read some articles about software that got oodles of data about chemical reactions. The program spit out equations describing chemical reactions that appeared to be correct, that the experimenters had no idea what to do with exactly.

Similar software (if not the same; I'm not spending all morning reading this) derived Newton's Laws of Motion from experimental data.

Here is a link to where I think I read of this first:

http://www.wired.com/wiredscience/2009/04/newtonai/

The title of the page, from 2009, is "Computer Program Self-Discovers Laws of Physics."

I urge you to read this; it's quite fascinating, and the implications are pretty far reaching.

Hell, you've got 100 and some change worth of years of data to put into it, shove it in and see what it kicks out.

The Gatekeepers of this probably would do it for shits and giggles, plus it would probably get them at least a minor article in a mainstream media and an attaboy from their SWPL peers.

Actually the software is freely available (at least the source code), so if you are comfortable with Open Source and knew stats it shouldn't be too much trouble to whip one up.

There is another way related to computing in which these guys think too small. With computers you can crunch a lot more data. You can crunch it for ALL CASES, and I mean every one, every last at bat Reggie Jackson or anyone else ever made in any situation, ever.

You can put in humidity, temperature, time of year, time of day, runners on base, who was managing for each side, who the particular pitcher was. If the data existed you could track which pitch the pitcher made against each batter.

These guys seem to me like they are going a little further than the guys that went before, but don't seem to realize they could go a heck of a lot farther without too much trouble.

If I owned a major league club, I think I would keep a lot more records than just the normal stats teams keep. I'd keep which pitch was thrown, the velocity, etc.

Actually I think I'd hire a couple of starving grad students to whip me up a machine vision system to id the pitches and give me lots more data than a radar gun gives. Nothing really stops me from getting accelerations, release points, release velocity, arc, decay of velocity, rotation numbers, etc.

Something similar could give me real bat speed, and the arc the batter uses.

So in essence, these guys, including Nate Silver (how did this guy get to be famous?) need to listen to that Daft Punk song, "Harder, Better, Faster, Stronger."

Anonymous said...

Oaf means an a$$hole, just like his brother Dmitri.

Walks? Similarly, look at David Ortiz's walk totals with and without Manny Ramirez. How can you take seriously a statistic that depends a whole lot on who is hitting behind you?

countenance said...

Steve, from how you describe them, sabermetricians seem to be the same kind of dorks as are people who fan over alternate history. Of course, alternate history is snake oil because there are just way too many variables in real history to say for sure that this is how things would have turned out changing only one past variable.

Anonymous said...

When will the world ever stop discriminating against whites?

When, when when?

Piraeus said...

You're wrong about WAR. It is not predictive, it is descriptive.

Anon87 said...

I would have thought that Trout could bridge the gap between the Holy War of Stats vs. Scouts. But the arbitrary allure of winning 3 statistical categories was preferred by the "old school" writers, who as always make up their reasons for voting after they decided who to pick. They are paid to get people to read their articles after all, so a unanimous vote for Trout is boring and won't keep you employed when you can cause ESPN-bait "controversy".

Here are the actual criteria as defined by the BBWAA:

Dear Voter:

There is no clear-cut definition of what Most Valuable means. It is up to the individual voter to decide who was the Most Valuable Player in each league to his team. The MVP need not come from a division winner or other playoff qualifier.

The rules of the voting remain the same as they were written on the first ballot in 1931:

1. Actual value of a player to his team, that is, strength of offense and defense.

2. Number of games played.

3. General character, disposition, loyalty and effort.

4. Former winners are eligible.

5. Members of the committee may vote for more than one member of a team.

You are also urged to give serious consideration to all your selections, from 1 to 10. A 10th-place vote can influence the outcome of an election. You must fill in all 10 places on your ballot. Only regular-season performances are to be taken into consideration.

Keep in mind that all players are eligible for MVP, including pitchers and designated hitters.


1. Trout (defense especially)
2. Cabrera
3. Trout (going by character, based upon Cabrera's issues with the bottle)
4. Push
5. Push

I'd weight the first the most, considering this has most to do with actually playing baseball, and actually happened on Earth in 2012 not in a parallel universe. I have a hard time putting value in the opinion of anyone who chose Cabrera. If the Triple Crown is so impressive, I'm sure I can find 3 stats of Trout's that have occurred less often than leading AVG/HR/RBI, if ever. "First 20 year old to have X number of this and Y number of that and leads the league with Z". I wouldn't base my MVP vote on that though.

Maybe we just like to argue, like everyone does over politics. And speaking of.....Happy Thanksgiving! Keep it clean at the dinner table.

Maguro said...

The main reason Cabrera had a lot more RBIs this year is that Austin Jackson, the Tigers leadoff hitter, got on base a lot more and provided Cabrera with a lot more RBI opportunities that he cashed in at about the same rate he always has.

In 2011, Tiger leadoff hitters had a .311 OBP and scored 101 runs. In 2012, Tiger leadoff hitters had a .364 OBP and scored 123 runs. This alone accounts for most of the 34 RBI difference between Cabrera's 2011 and 2012 campaigns.

This points to the problem with RBIs in general - you're giving an individual credit for what amounts to a team performance. Basically, the voters gave Cabrera the MVP because Austin Jackson got a lot better at hitting.

Maguro said...

I've always found it interesting that RBIs have such an unbelievable amount of sportswriter mystique compared to runs scored when they're really two sides of the same coin. No one rides a big runs scored total to the MVP award.

Anonymous said...

There is a sabermetric statistic that measures how much a player contributed to their team within the confines of a season. It's called win probability added (WPA).

It measures the difference in historical win probability between static states. Let's pretend that the visiting team down a run to start the 8th inning wins 33% of the time, and the visiting team down a run with one out in the eighth with none on wins 30% of the time. If a batter on the aforementioned visiting team struck out to lead off the eighth, they'd be assigned -0.03 WPA for that at bat.
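The arithmetic in that example can be written out as a short Python sketch. The two win probabilities are the pretend figures from the paragraph above, not real historical values, and the `GameState` fields, the `WIN_PROB` table, and the function names are hypothetical choices made for illustration rather than anyone's published method.

```python
from typing import NamedTuple


class GameState(NamedTuple):
    inning: int
    outs: int
    runners: str    # "---" means bases empty
    run_diff: int   # batting team's runs minus the opponent's


# Pretend win probabilities for the visiting (batting) team.
# A real table would cover every state and be built from historical results.
WIN_PROB = {
    GameState(8, 0, "---", -1): 0.33,  # down a run, start of the 8th
    GameState(8, 1, "---", -1): 0.30,  # same, but one out and none on
}


def wpa_for_play(before, after, table=WIN_PROB):
    """Win probability added by one play: probability after minus before."""
    return table[after] - table[before]


def season_wpa(plays, table=WIN_PROB):
    """Sum WPA over a sequence of (before, after) game-state pairs."""
    return sum(wpa_for_play(before, after, table) for before, after in plays)


if __name__ == "__main__":
    start_of_eighth = GameState(8, 0, "---", -1)
    after_strikeout = GameState(8, 1, "---", -1)
    print(round(wpa_for_play(start_of_eighth, after_strikeout), 2))
```

A season total is just the sum of these per-play differences, which is why WPA credits the timing of a hit as well as the hit itself.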

Here are the 2012 leaders.

Trout still comes out on top with an MLB best 5.32 WPA. Cabrera was fifth (4.82) and not even the best player on his team as judged by WPA, with Prince Fielder finishing third (4.93).

WPA isn't the end all be all, but it's a better measure of what you're trying to get at than RBI.

Jack said...

I don't follow baseball that much, but I do recall the Tigers were in the World Series, and the Angels weren't. Isn't this as good an argument as any?

josh said...

Steve,

If you like "what actually happened" stats, FanGraphs has a great one. If you've ever watched Poker on TV you remember how the odds of winning a hand change after each new card is revealed. A similar stat for baseball calculates the odds of a team winning before and after a player's at bat (based on the historical winning percentage for, say, a home team down by two with two outs and a runner on second in the eighth inning). Add up the difference between the before and after for every at bat and you will have a season's worth of expected wins added above average. This is an offensive stat only, but it accounts more directly for clutchness than RBI. A bottom of the ninth three run homer when down two is worth more than a three run homer with a five run lead in the 5th.

Interestingly, both Trout and Prince Fielder out-"Win Probability Added" Cabrera.

http://www.fangraphs.com/leaders.aspx?pos=all&stats=bat&lg=all&qual=y&type=3&season=2012&month=0&season1=2012&ind=0&team=0&rost=0&age=0&filter=&players=0

Anon87 said...

Jack,

The Tigers won 88 games. The Angels won..............89 games. Cabrera shouldn't get credit for playing in a worse division.

Anonymous said...

Jack,
No, that is a stupid reason. First, it's a regular season award voted on before the playoffs. Second, the Angels won more games in a more difficult division than the Tigers, so it's debatable they were even the better team.

Both had great offensive seasons, with Trout slightly better at getting on base and slightly worse in slugging (a gap that goes away if you look at ballpark effects). Trout was an elite defender at a premium position, an elite base runner, while Cabrera was bottom of the barrel. It's not really a discussion if there's no triple crown, as the Keith Law tweet notes.

rwcg said...

It's been hilarious watching the self-anointed saberkidz complain how dumb it is to slave an award to transparent counting statistics like 'home runs' and 'RBI' by advocating that we...instead slave these awards to an opaque black-box statistic that virtually no one stirring up this ginned-up 'controversy' is capable of replicating!

Yes, that's so much 'smarter'.

Make no mistake, Trout had a very nice Rickey Henderson-like year. By my count this 'WAR' stat gave him some 4-5 'wins' for achievements related to his running speed. It's okay to be a skeptic of this. Or at least, it should be.

My vote goes to Cabrera.

SomeRandomGuy said...

@Anon87,

How can you knock Cabrera for his defense and actually say Trout was more valuable?

It may be no question which one is the better defender, but this is like getting Adam Dunn and then complaining he strikes out too much.

Defense isn't a part of Cabrera's skill set, unlike Trout, yet he still managed to play a passable 3rd base for a team in NEED of a third basemen without complaint or diva-ish behavior. After getting hit in the face in spring training, he could have easily gone to management and demanded the big star treatment of putting him at 1st and having Fielder DH; instead, he worked awfully hard to give them the best defense possible.

Trout, on the other hand, for all his defensive skills (and again, he really seems to be a true five-tool Mickey Mantle type player) was a BONUS to an Angels team that already had a player who could not only play center field at a very high level; but was often used as a defensive replacement in center late in games.

He may be a world-class defender; but for this year, for his team, his defensive value has been highly overstated or shouldn't be some sort of factor in this race.

As for Mr. Sailer: I think this is one of the better defenses of Miguel Cabrera I've read; what it comes down to, for me, is that statistically, you have two players who are so close as to be almost indistinguishable. Unless there are metrics used to isolate, for example, say, walks and the factor of Prince Fielder hitting behind Cabrera or Trout hitting in the 7th, 8th, and leadoff spots....you got yourself a virtual tie with the sabermetrics guys crapping themselves over the fact that people aren't totally convinced that they've "done the math" (even though I sometimes get the impression that they don't all know how "the math" applies to certain situations) while you have some "old school guys" who are so obstinate that they refuse to even consider anyone who is associated with sabermetrics, as if their egos can't take that their narration heavy eyes aren't the only way of doing things.

I don't know which one is more obnoxious; but this is really becoming a tedious argument.

Steve Sailer said...

"There is a sabermetric statistic that measures how much a player contributed to their team within the confines of a season. It's called win probability added (WPA)."

Thanks.

It's interesting that that history-oriented stat doesn't show up in these debates, however, compared to the science-oriented WAR.

Anon87 said...

SomeRandomGuy

Hey, I love Adam Dunn. I think he's unfairly been the whipping boy for various factions through the years, he's the poster boy Three True Outcome guy, and it seems he has a genuine sense of humor (not the fake look-at-me humor of Brian Wilson), but his defense is terrible and I can't overlook that. Even if "he worked awfully hard" to give his team the best defense possible, like you say Cabrera did. It doesn't change the fact that they both stink with the glove.

I don't put a ton of stock in the latest defensive metrics, so it may be possible Cabrera is being treated harshly by UZR, WAR, etc. etc. but it's obvious he brings nowhere near the VALUE that Trout does defensively. Are you actually saying Trout is a world class defender but that shouldn't be a factor in this race? Defense is specifically called out when making a MVP vote! You can't just hand wave away the huge advantage Trout has defensively or with baserunning. Trout and Cabrera are not too close to be distinguishable; Trout has more value. Stats and scouts agree on that, but somehow sportswriters and apparently many others don't.

Brian Kenny on MVP vote

Anonymous said...

Sometimes what is important cannot be counted whereas what is unimportant can.

In other words there is a reason
~Mister October~ was not Mister September and the Broncos got the better pick with Tebow over the Raiders taking Russell.