
June 03, 2005


jj mollo

From these numbers, assuming the same underlying distribution of home runs each year and using a t distribution with 6 degrees of freedom, there is a more than 20% chance of a result occurring as far from the mean as 5,006. In other words, it's not that unusual.

In the seven years specified, the average is 5,351. If the number falls below 4,750 then it would be safe to say there was a significant effect.

There is, by the way, virtually no trend in the data for the 7 years displayed.

jj mollo

Here's an interesting article from the Boston Globe which tries to compare historical hitting results by making statistical corrections for era, stadium and talent pool. This is from April, but it concludes there is no steroid effect. I'm not sure whether the author tries to make adjustments for weather. From what you're saying, production should be down in cooler years.

(sorry: also posted by mistake on "suckers" thread)

Frank Warner

Two more points, JJ:

1. Yes, there is no trend from 1998 to 2004. That's because those home run totals reflect hitting under the influence of steroids. The fluctuations can be attributed principally to random luck. But the totals always are over 5,000.

2. Dropping under 5,000 would have some significance, particularly since we can assume that several Major League Baseball players still are using steroids. The spotlight might be on them, but the penalties remain small.


It looks similar to the numbers for the warmest years in the past ten years. The trend isn't there (yet). Too small a sample.


And remember the latest years, and certainly this year, don't include McGwire and now Bonds. The big hitters are sidelined. Sammy Sosa's runs went down when he squabbled with the Cubs management. My performance always went down when management messed with me. I forget even where Sosa is now, but I'm sure it's an adjustment if he's changed teams and it will take time to get back in form, if he ever does again. He's not the man/hitter he used to be, steroids or not.

jj mollo


I was just talking about the statistical significance, i.e. 5% likelihood or less, of a number that distance from the mean. We could expect a deviation at the 5% level to occur about once every 20 years without any specific cause. It was only based on your 7 data points. It would take a number that extreme for a statistician to conclude that the new number was the result of a new cause or causes.

By the way, where do you get your numbers? It's really very interesting.

jj mollo

5,950 would be the significant cutoff on the high side. That is, there is 95% confidence that the number of home runs will fall between 4,750 and 5,950 in any given year, given the data you supplied and no changes in the underlying distribution.
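One way to see how an interval like this could arise (an editorial sketch, not the spreadsheet mentioned later in the thread): assume the interval is the seven-year mean plus or minus the two-sided 95% critical value of a t distribution with 6 degrees of freedom, multiplied by the sample standard deviation. Under that assumption, the implied standard deviation can be backed out from the quoted cutoffs, and the 5,006 figure can be checked against it. The helper names below are illustrative, and only the Python standard library is used.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_two_sided_p(t, df, steps=2000):
    """P(|T| >= t), via Simpson's rule on the density over [0, |t|]."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3  # P(-t <= T <= t), using symmetry
    return 1 - central

def t_critical(df, alpha=0.05):
    """Two-sided critical value, found by bisection on t_two_sided_p."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_two_sided_p(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Figures quoted in the thread.
mean = 5351               # seven-year average
low, high = 4750, 5950    # quoted 95% cutoffs
observed = 5006           # the total under discussion
df = 6                    # 7 data points minus 1

# Assumption: interval = mean +/- t_crit * s, so s can be backed out.
t_crit = t_critical(df)
implied_sd = (high - low) / 2 / t_crit

# How unusual is 5,006 under the same assumption?
t_stat = (mean - observed) / implied_sd
p_value = t_two_sided_p(t_stat, df)

print(f"critical t (df={df}): {t_crit:.3f}")
print(f"implied standard deviation: {implied_sd:.1f}")
print(f"t statistic for {observed}: {t_stat:.3f}")
print(f"two-sided probability: {p_value:.3f}")
```

Under these assumptions the implied standard deviation is roughly 245 home runs, and a total of 5,006 sits about 1.4 standard deviations below the mean, giving a two-sided probability a little above 20%, in line with the figure quoted earlier in the thread.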

Frank Warner

I'm getting my numbers from all sorts of sites, and I've plugged them in to Excel to work out percentages.

I might be off a run or two here and there, but the numbers are pretty good. Last year's numbers, till June 5, I got from newspaper microfilm for June 6, 2004. Sunday newspapers have home run totals.

And by the way, Mark McGwire is retired. His absence is normal. Barry Bonds' absence is perfect. I'm looking for the effects of reduced steroid use in Major League Baseball.


And where do you think those big numbers in previous years came from when McGwire was hitting home runs? Is that how you use things statistically: just dump a major home run hitter out of the previous years?


And don't you think that home run hitter Bonds being absent skews this year's numbers? Of course it does.

And again, Sammy Sosa isn't hitting worth a flip either. But I doubt that's because he is or isn't using steroids.

Frank Warner

Baseball players get hurt every year. It's part of the game, and generally evens out.

Major League Baseball is 242 home runs behind last year's pace. I doubt Bonds and Sosa would have hit 242 home runs by now.


>> generally evens out.

There's a bad statistical statement. We're talking about three of the guys who are aberrations and would be points on a curve or a line that would be thrown out of meaningful statistics. Can you name three big hitters who were dominant in the 10 years before McGwire, Sosa, and Bonds dominated? They are aberrations.

And when they were hitting, I would venture to say they inspired other hitters to try for the seats. A little psychological boost by the big hitters to the wanna-bes is a great motivator. That motivator is not as big in today's environment.

And don't discount the umpires making the strike zone smaller in specific years. They do that too often. I wish they were consistent year to year, but they're not.

jj mollo

Well, a normal distribution assumes that there is an extremely large number of small effects which are random in nature from year to year. McGwire was special, but he wasn't really that much different than the big home run hitters in other years. There are a lot of sluggers in MLB. There are new batters coming up every year who hit some home runs themselves.

There are enough random differences from year to year -- managers leaving, slumping stars, heat waves, intestinal flu, big pitchers changing leagues, new stadiums, expansion teams, lighting changes, funny bats, scandals, wars, cicada infestations -- that the variance is high, but it's hard to predict which way it's going to go. It's perfectly reasonable to say that this year is the same as any other.

Frank is making a bold prediction that this year is going to be unusually different because of a specific cause. OK, maybe he's not 100% behind his prediction. I'm just giving him the parameters. If the number of home runs at the end of the year is between 4,750 and 5,950, then it's safe to say, by traditional statistical standards, that nothing important has changed. The background noise has drowned out any clear effect.

Frank Warner

JJ, could you show me the calculations you used to get that spread? It makes sense. I just haven't done statistics -- real statistics -- in a while.

It does look as if this year's home run total will end up "in the ballpark" of reasonable expectations. Nevertheless, it also appears the total will be the fewest ever for a 30-team season.

One season isn't a trend, I know. But I predict it will be the start of a downward trend if Major League Baseball imposes its proposed penalties for steroid use, or if Congress itself imposes the penalties.

jj mollo

Frank, I sent you a spreadsheet attachment.
