Wednesday, July 04, 2007

When Data and Decisions Don't Match--Little League Baseball

Maybe it's because I used to pitch in Little League when I was a kid, but this article in the July 1 Union Tribune really struck me. It describes how injuries to Little League pitchers have increased significantly over the past 10 years, from one elbow or shoulder injury a week to 3-4 a day. What's the cause? Apparently, as the article indicates, it is "overuse" (i.e., pitchers pitching too much). And here is the key statistic:
young pitchers who pitch more than 8 months a year are 5 times as likely to need surgery as those who pitch 5 1/2 months a year.

In San Diego, where I'm located, this can be a big problem because there is baseball going on all year round (even in Little League, where there are summer and fall leagues, plus the ever-present year-round traveling teams).

So what's the solution? A year or so ago they instituted an 85-pitch limit per game. Now, this may be a good thing to do, but I have great difficulty seeing a direct connection. Here's why.

With any decision inference (classification), there are two questions to ask:
1) What patterns are related to the outcome of interest?
2) Are there differences between the patterns related to the outcome of interest and those related to another outcome?

Here's my problem: I have seen no data (in the article) to indicate that pitchers today throw more pitches per game than boys did 10 years ago. And I see no evidence in particular that boys today throw more than 85 pitches more frequently than boys did 10 years ago. If this isn't the case, then why would the new limit have any effect at all? It can only be due to a cause that is not directly addressed here. If, by limiting pitches in a game (and therefore in any given week), the boys end up throwing fewer pitches in a year, there might be an effect.

But based on the evidence that is known, rather than speculation, wouldn't it make more sense to limit pitchers to five months of pitching per calendar year? That, after all, has direct empirical evidence of tangible results behind it.
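To be concrete about what "5 times as likely" means, the statistic is a relative risk: the ratio of surgery rates between the two groups of pitchers. The counts below are purely hypothetical (the article gives only the ratio, not the underlying numbers); they are chosen just so the arithmetic comes out to 5.

```python
# Hypothetical counts, invented only to illustrate the relative-risk arithmetic.
# The article reports the ratio (5x), not these underlying frequencies.
surgeries_long, pitchers_long = 25, 100     # pitched more than 8 months/year
surgeries_short, pitchers_short = 5, 100    # pitched ~5.5 months/year

risk_long = surgeries_long / pitchers_long      # 0.25
risk_short = surgeries_short / pitchers_short   # 0.05
relative_risk = risk_long / risk_short

print(f"Relative risk of surgery: {relative_risk:.1f}")  # 5.0
```

Note that the relative risk alone says nothing about the absolute surgery rates, which is exactly the kind of missing detail that makes it hard to translate the statistic into a sensible policy.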

I see this happen in the business world as well, where despite empirical evidence that indicates "Procedure A", the decision makers go with "Procedure B" for a variety of reasons unrelated to the data. Sometimes there is good reason to do so despite the data, but at least we should recognize that in these cases we are ignoring the data.

I suspect one reason this strikes me is that I used to pitch on traveling teams in my Little League years, back before anyone cared about pitch counts (30+ years ago). I'm sure I regularly threw well over 85 pitches in a game, and probably 100+. One difference was that I lived in New England, where you were fortunate to play March through August, so we all had a good period of time to recover.


Anonymous said...

After a few Google searches, it appears that Pitch Count is a Big Deal in Little League Baseball.

I didn’t find the study the article referenced, but I did find a link to several studies in medical journals that studied the relationship between pitch counts and pitch types to injuries.

The abstract I read concluded that “There was a significant association between the number of pitches thrown in a game and during the season and the rate of elbow pain and shoulder pain.” (n=476). And joint pain is thought to be an early sign of injury.

So the article is positing this relationship:

pitch type/number of pitches thrown -> joint pain -> injury

Note also that the pitch-count rule is based on the pitcher’s age, where (I suppose) age is being used as a proxy to muscular/skeletal development.

Dean Abbott said...

Thanks for contributing the article--very interesting. I'll have to think hard about spending the $25 to get the full text, but from the abstract, I have a couple of questions:

1) Are these subjects controlled for number of years pitched? Could the pitch count be correlated but not causal? I'm really not saying yes or no to this question, but it is unclear from the write-up.

Assume that the cause was how many months one pitches a year (so the arm doesn't recover in the off season). Would it not also be plausible that the best pitchers, the ones with the largest pitch counts, would also be the ones that pitch more months out of the year?

2) It appears that the types of pitches are related to arm injuries (curves and sliders). When I was a kid, our league forbade the throwing of breaking pitches for this reason (though I doubt they had statistically valid results to reach this conclusion).

What they say about pitch counts in the results is this:
"There was a significant association between the number of pitches thrown in a game and during the season and the rate of elbow pain and shoulder pain." I have no way of making decisions about what to do about pitch counts based on this, and I presume that if they had firm results related to specific pitch counts they would have indicated as such. (They did indicate specific results with curves and sliders).

There is a related article on the web site you provided here. It was a self-report study, but groups many of these effects together.

Let me also say that the whole problem I am addressing may be due to the news article I cited being poorly written, not including the evidence that a pitch count of 85 has a significant effect. I really have no dog in this fight--I was just struck by the article having mismatched evidence and conclusions. That said, if there is a good article citing pitch counts (especially if the number 85 shows up), I'll be very interested.

Van Scott said...

You're welcome for the article, and you raise some good questions on the design of the experiment(s) (are subjects controlled?) and some of the key constructs.

The "months pitched" hypothesis sounds plausible to me, though I know almost nothing about baseball (I did read Moneyball).

It seems to me there would be a strong correlation between "pitch count" (in a game) and "months pitched" (in a year). And geography, as you posted earlier, may also be relevant. But "causality" would depend on random selection and assignment (I think).

I'll check to see if I have access to the full text of any of the articles through our local library and EBSCOhost (especially one on pitch counts).

This discussion is very interesting to me in terms of:

1. What is the issue? (Is there an issue?)

2. How is it studied? (adequate data? good methodology?)

3. What decision (policy) results from the conclusion?

4. How is it reported?

It appears to me that there is a cluster of studies that address various factors of baseball injury. The news story probably used just one for underlying evidence.

I have no dog in this fight either, and I appreciate your eye for parsing out data and decisions in news articles.

Van Scott said...

I checked our local library's Academic Journal Premier (from EBSCO), but I don't have easy access to any of the sports medicine journals that contain the articles--just abstracts.

But it was an interesting discussion. Thanks for posting.

Will Dwinnell said...

I've noticed the frequent use of another type of pseudo-evidence.

Here's how it works: Some predictive indicator occurs or does not occur. Likewise, some outcome of interest occurs or does not occur. The pseudo-evidence takes the form of an argument which focuses on the probability of the outcome, given the presence of the predictive indicator.

Here's an example: "Studies show that 80% of automobile accidents occur when at least one driver is doing something illegal, so you should avoid driving illegally." Notice the focus on "how high" this probability is. A more interesting question, in my opinion, is: "What is the probability of an automobile accident, with and without illegal driving?" It may turn out that accidents are more probable when drivers drive legally.

Any situation with a single binary predictor and a single binary outcome may be described by a 2x2 table of frequencies. It's important to know which fraction(s) of that table measure what one's interested in.
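The distinction can be made concrete with a tiny script. The counts below are hypothetical, invented purely for illustration; the point is that P(illegal | accident) and P(accident | illegal) come from different slices of the same 2x2 table and answer very different questions.

```python
# Hypothetical 2x2 table of frequencies (all counts invented for illustration):
#                    accident   no accident
# illegal driving        80          920
# legal driving          20        19980

a, b = 80, 920      # illegal & accident, illegal & no accident
c, d = 20, 19980    # legal & accident,   legal & no accident

# The headline-style figure: "80% of accidents involve illegal driving"
p_illegal_given_accident = a / (a + c)

# The more useful comparison: accident rate with vs. without illegal driving
p_accident_given_illegal = a / (a + b)
p_accident_given_legal = c / (c + d)

print(f"P(illegal | accident) = {p_illegal_given_accident:.2f}")   # 0.80
print(f"P(accident | illegal) = {p_accident_given_illegal:.3f}")   # 0.080
print(f"P(accident | legal)   = {p_accident_given_legal:.3f}")     # 0.001
```

With these made-up numbers the headline figure (80%) and the actual risk comparison (8% vs. 0.1%) happen to point the same way, but nothing guarantees that: whether illegal driving raises accident risk depends on the row-wise rates, not on the column the headline quotes.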