Newsweek - National News, World News, Health, Technology, Entertainment and more... | Newsweek.com
Posted Thursday, July 10, 2008 5:46 PM

Juiced: Guilt by Graph?

Sharon Begley

Here’s one of those phrases that The New Yorker would label as “sentences we never read past”:

 

"I was skimming the program for the annual meeting of the American Statistical Association . . ."

 

But really, where else can you find not only research on “Modeling Sparse Generalized Longitudinal Observations with Latent Gaussian Processes” but also on managerial strategies in baseball, parity in the NFL and the accuracy of sports predictions? It’s striking how many statisticians who study weighty matters—how to tell if a cancer drug works or a compound is dangerous—got their start studying sports statistics.

 

“A lot of us really enjoyed baseball statistics when we were growing up, and that’s how we got into the field,” biostatistician Michael Schell of the Moffitt Cancer Center in Tampa told me.

 

So I got in touch with Jack O’Gara, who wrote the book on using statistical techniques to spot chicanery in business (that would be the 2004 “Corporate Fraud: Case Studies in Detection and Analysis”). Now retired, O’Gara has put his statistical skills to use analyzing baseball, especially cheating.

 

In the business world, he focused on what he calls inflection points: sudden discontinuities in data. That is what he saw, galore, when he analyzed the career stats of pitcher Roger Clemens.

 

Clemens, of course, was named in the Mitchell Report, which last December reported that an alarming number of baseball players had taken performance-enhancing drugs such as steroids. (Clemens' section starts on p. 215.) Clemens and his camp deny it. O’Gara decided to see if stats could tell us anything.

 

One of the most telling is ERA Margin, which compares a pitcher’s earned run average in a given year to the league average. It’s more informative than ERA alone because it controls for weird things like hitters league-wide being in a slump (which would reduce every pitcher’s ERA but not ERA Margin), or the use of a juiced ball that year, which would raise pitchers’ ERAs but, again, not the margin. The ERA Margin tells you how one hurler is doing compared to his peers.
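As a rough illustration of the calculation, here is a minimal Python sketch. It assumes ERA Margin is the league-average ERA minus the pitcher's ERA, so that a positive number means better than the league; the article does not spell out the formula, and the function names and sample figures are made up for illustration.

```python
def era(earned_runs: float, innings_pitched: float) -> float:
    """Standard earned run average: earned runs allowed per nine innings."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9.0 * earned_runs / innings_pitched


def era_margin(pitcher_era: float, league_era: float) -> float:
    """Assumed definition: league ERA minus pitcher ERA (positive = better than league)."""
    return league_era - pitcher_era


# Hypothetical season, not real data: 60 earned runs over 200 innings
# in a league whose average ERA is 4.20.
pitcher = era(60, 200)
margin = era_margin(pitcher, 4.20)
print(f"ERA {pitcher:.2f}, margin {margin:.2f}")
```

Under this assumed definition, a league-wide swing (a juiced ball, say) that adds the same amount to both the pitcher's ERA and the league ERA leaves the margin unchanged, which is the property the paragraph above describes.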

 

O’Gara compared Clemens’ ERA Margins to those of the 20 post-World War II pitchers with the most wins, a group that includes legends such as Warren Spahn, Tom Seaver and Bob Gibson. Through age 34, Clemens’ margin was 1.09, notably better than the others’ 0.6. Fine, the guy was an ace.

 

But from age 35 to 40, when most pitchers fade, Clemens’ margin was 1.18, compared to 0.43 for the other greats. Here's where it gets weird: from age 41 to 45, it was 1.30, while the others’ was a negative 0.01. That is, the other great pitchers’ margin shrank as they got older, falling more in line with the league average and normal aging patterns, but Clemens’ soared. As O’Gara put it, “Clemens is the only pitcher who gets progressively better as he ages into the post-40 category.”
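To make the widening gap concrete, the short Python snippet below subtracts the peer group's margin from Clemens' in each age bracket, using only the figures quoted above. The variable names are illustrative, and the bracket labels are shorthand.

```python
# ERA Margins quoted in the article: (Clemens, other top-20 winners)
margins_by_age = {
    "through 34": (1.09, 0.60),
    "35 to 40": (1.18, 0.43),
    "41 to 45": (1.30, -0.01),
}

gaps = {}
for bracket, (clemens, peers) in margins_by_age.items():
    # Round to two decimals to match the precision of the quoted figures.
    gaps[bracket] = round(clemens - peers, 2)
    print(f"{bracket}: Clemens {clemens:+.2f}, peers {peers:+.2f}, gap {gaps[bracket]:+.2f}")
```

The gap grows in every bracket, from 0.49 to 0.75 to 1.31 runs. That shows how unusual the trajectory is, not why it happened.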

 

When the ERA Margins for baseball’s top 10 or top 20 pitchers each year are graphed, Clemens is better than the rest when he was 29 and 30, then twice more: three performance peaks, while none of the top 20 had more than two. “More significantly, the second two peaks were higher than the initial peak, which occurred in the presumed prime of his life, contrary to normal aging patterns,” O’Gara says. “At age 43, Clemens had the seventh-best season [measured by ERA Margin] since World War II.”

 

Of the 20 best ERA Margins since 1945, all came when the pitcher was 34 or younger (average age: 28), with the exception of Clemens, who did it when he was 35 and again when he was 43. The best two-year average ERA Margins cluster when pitchers were in their late 20s (Sandy Koufax: 29-30; Greg Maddux: 28-29), and again Clemens stands out, with his best coming when he was 43-44. Clemens’ ERA Margin at age 43 was the best in the majors that year and the best ever for a 43-year-old.

 

Testimony taken for the Mitchell Report and given to Congress this spring included accusations from a trainer that he injected Clemens, which the pitcher denies. As it happens, the three periods when the trainer said he administered shots “correspond to performance bursts by Clemens,” says O’Gara. “The ERA for these three periods totaled 1.92 over 183 innings, significantly better than his career average ERA of 3.12.”

 

As has been widely reported, in 1996 Clemens, then 34, was coming off a sub-par 1995 season and struggling through the first months of the '96 season, his last of 14 with Boston. “Then he suddenly went from being mired in the worst multiple year performance of his career (the preceding one and 2/3 years) to his best two-year-plus performance of his career,” says O’Gara. “He averaged a 2.91 ERA margin for the remainder of 1996, better than for any single calendar year.”

 

One baseball statistician I asked about this analysis warned me against “guilt by graph”—that is, concluding that someone was juiced based on stats alone. “Stats can tell you if someone’s performance is unusual, but by definition a great player has an unusual performance,” he said. See, for instance, this post by another stats guru.

So in Clemens’ case, do the stats lie—or expose a lie?


Member Comments

Posted By: BulldogJack (July 30, 2008 at 3:59 PM)

Posted by Jack OGara:

Sorry, I've been on travel and am only now getting back to the site.

In response to Mr. Birnbaum's article that was posted as a counterpoint to my thesis, let me point out that I had stated, “Clemens is the only pitcher who gets progressively better as he ages into the post-40 category.”

In his rebuttal of an article prepared for the New York Times by four Wharton professors, sabermetrician Phil Birnbaum stated, “There are three pitchers with similar career trajectories, and nobody is saying ‘they’ took steroids.”  These three are Curt Schilling, Randy Johnson, and Nolan Ryan.

My point was the issue of getting progressively better as the pitcher ages into the post-40 category.  None of his three came close to achieving this.  

Mr. Schilling’s career appears to be over at 40; and Mr. Johnson’s ERA Margins have fallen markedly – from an outstanding average of 1.57 for the ages 35 to 40 period to .06 after age 40 (through 2007).  Mr. Ryan averaged .52 from ages 35 to 40 and .49 thereafter – quite good, but not progressively better.

Mr. Clemens averaged an outstanding 1.18 from ages 35 through 40 – and then raised it to 1.30 for the five-year period following.

Mr. Birnbaum says, “And the answer is:  if you acknowledge that Schilling, Ryan, and Johnson have roughly a similar career trajectory as Clemens, and you believed that none of them took steroids, then … your first estimate of the probability Clemens cheated should be approximately ‘zero’.”

Obviously, I do not think his threesome have similar post-40 trajectories.  As to the latter condition, my vote is still out.

Thanks,  Jack OGara  July 30, 2008


Posted By: hop2171 (July 12, 2008 at 5:08 PM)

Well, bucsfan01, the point here that you failed to grasp is that in 1997 he never even KNEW McLiar and put up one of the greatest seasons of all time (pitching triple crown), yet I am supposed to believe that the next year (as McLiar claims) he decided he wasn't good enough in '97 and needed a helper. Yeah, okay, believe that one and I've got a bridge for sale! Now, the year you are talking about was 2004, when he had a 1.87 ERA. Well, there are a few things you left out there, kid, so I will correct you and educate your sorry a$$. A. He was mainly a 6-inning pitcher that year; he was no longer going deep into games like he once did. B. If he had posted that 1.87 ERA in the AL, I would agree something was unusual, but let's be honest, the NL suks and they have no hitting, period. The biggest bat in the Central was Pujols in St. Louis; after him, who did he have to face most of the time? If he had pitched in the AL that same season, his ERA would have been around 3.50-3.80. Yeah, that's how much better the AL is than the NL. Educate yourself before calling me out next time or I will leave you in the dust again.


Posted By: 1poohbear (July 11, 2008 at 6:55 PM)

I don't know if the man is guilty or not.  I would love to think he didn't use steroids or HGH or anything else because I really enjoyed watching him pitch while he was in Houston--simply, I don't want that joy to be tarnished.  By the same token, he has not made it easy to believe him.

Now, here's the thing:  As hard as it may be to put aside the emotional aspect of this story, the man is innocent until proven guilty in the eyes of the law.

And another thing, I wonder if O'Gara normalized Clemens' ERA for the fact he posted his best ever number in a year when the Astros were one of the best defensive teams in the Majors and most of the NL Central teams were struggling to score runs.  I couldn't tell you offhand which teams Clemens faced that year.  But I would just about bet that if you took any pitcher that was dominant against the high-powered, run-scoring AL teams and put him on a very good defensive NL team in a division where offense is not exactly a strength, he's probably going to allow fewer runs over the course of the year than he did playing for the Yankees or Red Sox.  It is a team game after all.

Now don't go slamming me because I only looked at one year.  I have no idea what would happen if you normalized every pitcher's ERAs to account for their teammates' and opponents' play.  My point is be careful how much you read into statistics.  Numbers can be sliced and diced to say whatever you want them to say and rarely can a single stat tell the whole story.

