Linking Laconically has some must-see statistical wonkery for you

[Not so much with the laconicism on the first two links.]

Must-See Statistical Wonkery of the Basketball Variety

I've had an idea bouncing around in my head for a couple years about building a statistical model to predict the future performance of college basketball teams based on the percentage of minutes they return from the previous season and the quality of their incoming recruits.  The problem has been that I do not possess the methodological prowess to actually, you know, implement this idea.  Thankfully, Dan Hanner does.  The results:

As expected, teams with the most freshman possessions are the most likely to improve. This isn’t so much a, "freshman work hard in the off-season" effect as a "boy last year sucked because we had to give the freshman so many shots" effect. Returning juniors have a slightly bigger effect than sophomores, but the difference is not statistically significant.

For departing players, the higher the individual efficiency or ORtg of the departing player, the more the team’s offensive efficiency is expected to drop. In fact, losing highly inefficient players is not harmful at all. If a departing player’s efficiency rating is below 91.2, his departure benefits the team’s offense. But the departure of players above this level hurts the offense, and the departure of highly efficient players is very costly.

Top 10 and Top 100 Freshmen have a small impact. For every Kentucky this year, there is a North Carolina this year. Each Top 10 Freshman recruit increases team efficiency by about 1.13 points. Each Top 100 Freshman recruit increases offensive efficiency by about 0.26 points. I’ve been playing around with recruiting data off and on for over three years and I have never gotten a huge effect. I’m to the point now where I really believe the average effect is minimal. For every John Wall this year, there is a Lance Stephenson this year.

New coaches (first-time or school-changers) have a negative impact, on average. While there are obviously many turnaround stories, an equal number of coaches inherit disasters where simply treading water can be viewed as a success in year one.
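To get a feel for just how small those recruiting numbers are, here's a back-of-the-envelope sketch using only the figures quoted above. Two big assumptions that Hanner's write-up doesn't confirm: that the effects simply add up, and that a Top 10 freshman isn't also counted in the Top 100 bucket. The function names are mine, not his, and this is nowhere near his actual model.

```python
# Figures quoted from Hanner's write-up above.
TOP_10_BOOST = 1.13      # efficiency points per Top 10 freshman
TOP_100_BOOST = 0.26     # offensive efficiency points per Top 100 freshman
BREAK_EVEN_ORTG = 91.2   # departing players below this ORtg help by leaving


def recruiting_boost(top_10, other_top_100):
    """Rough expected bump from a recruiting class.

    ASSUMPTION: effects are additive and the two buckets don't overlap.
    """
    return top_10 * TOP_10_BOOST + other_top_100 * TOP_100_BOOST


def departure_hurts_offense(ortg):
    """True if losing a player with this ORtg is expected to hurt the offense."""
    return ortg > BREAK_EVEN_ORTG


if __name__ == "__main__":
    # A hypothetical class: one Top 10 freshman plus three other Top 100 guys.
    print(round(recruiting_boost(1, 3), 2))
    print(departure_hurts_offense(85.0), departure_hurts_offense(110.0))
```

Even a loaded class like that hypothetical one works out to less than two points of efficiency, which is Hanner's point about recruiting rankings having a minimal average effect.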

The list of schools for which Hanner's model would have predicted large improvement from last season to this one makes intuitive sense: teams returning a lot of talent (Kansas, Ohio State), teams returning a lot of depth (Minnesota--which hasn't followed through on what the numbers/intuition would have predicted), and teams that were really, really bad last year (Indiana).

No model is going to be perfect, of course; Syracuse was expected to experience a significant decline.  So was Wisconsin (to a lesser extent).  I'm pretty sure it would be impossible to build a statistical model that explains why the Badgers never suffer when their stars depart (outside of including a dummy variable for "team is coached by William 'Bo' Francis Ryan, Jr.").

So what does the model say about how efficiently Michigan State should be playing this season based on what they did last season?

                   Offense   Defense   Margin
Actual 2009          115.0      88.4     26.6
Predicted Change       0.7     (0.8)      1.4
Predicted 2010       115.7      87.6     28.0
Actual 2010          113.1      90.4     22.7
Difference           (2.6)       2.8    (5.3)
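If you want to check the arithmetic in the table above, here's a quick sketch. The numbers are copied straight from the table; the variable names are mine, and I'm treating the parenthesized entries as negatives. Because the inputs are already rounded to one decimal, the recomputed margins can land 0.1 off the table's margin column (presumably the table was built from unrounded model output).

```python
# Values copied from the table above (points per 100 possessions).
# Parentheses in the table are read as negative numbers.
actual_2009 = {"offense": 115.0, "defense": 88.4}
predicted_change = {"offense": 0.7, "defense": -0.8}
actual_2010 = {"offense": 113.1, "defense": 90.4}


def margin(eff):
    """Efficiency margin: points scored minus points allowed."""
    return round(eff["offense"] - eff["defense"], 1)


# Prediction = last year's number plus the model's predicted change.
predicted_2010 = {
    k: round(actual_2009[k] + predicted_change[k], 1) for k in actual_2009
}

# Miss = what actually happened minus what the model expected.
# Negative on offense is bad; positive on defense is bad (more points allowed).
miss = {k: round(actual_2010[k] - predicted_2010[k], 1) for k in actual_2009}

if __name__ == "__main__":
    print("Predicted 2010:", predicted_2010, "margin", margin(predicted_2010))
    print("Actual 2010:   ", actual_2010, "margin", margin(actual_2010))
    print("Miss:          ", miss)
```

The offense and defense rows reproduce the table exactly; the takeaway is that MSU came in about two and a half points short on offense and nearly three points worse on defense than the model expected.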


Basically, MSU should have gotten a little better on both ends of the court--but not so much that they would have gone from being the 8th most efficient team in the country (their final 2009 ranking) to the second most efficient team (as implied by the preseason polls).  And given that the model doesn't know just how good Travis Walton's departing minutes on defense were, you probably have to discount the expected improvement on that end.

That leaves us pretty much where we knew we were: The offense hasn't improved enough to maintain MSU's status as a legitimate top-ten team.  And that's simply a function of not enough returning players showing improvement from last season to this one.  While two freshmen from last year (Draymond Green and Delvon Roe) have indeed become more efficient as sophomores (that's an understatement in terms of the role Green has played this year), the third freshman has not--and the sophomores/juniors from last year (Lucas/Summers/Morgan) have stagnated or regressed in terms of offensive efficiency.  (The exception is Chris Allen, but he's improved his efficiency by substantially reducing his role in the offense.)

There's more to come in Hanner's series.  Be sure to follow along.

Must-See Statistical Wonkery of the Football Variety

College sports statistical graphic of the year:


The graphic is brought to you by Brian Fremeau, of Football Outsiders fame.  The bottom-line takeaway is that, relative to the size of the Division 1-A/FBS universe, the number of games played between top-25 teams on an annual basis has fallen by 40% over the last 20 years.

Brian Cook on this phenomenon:

Fremeau suggests one of the main culprits is conference expansion, which seriously limits the ability of independent teams to act as bridges between conferences. There are a few others: games against I-AA teams have also doubled, and I-A has added 14 teams, none of them any good, in the interim. When you've got so many more options for easy wins and the incentives are all aligned towards those wins, the results write themselves.

There are plenty of ways to fix this: ban I-AA games, force teams to play at least five true road games a season, banish teams that have no business in I-A back to where they belong. Since the men in charge regard this change as a feature, not a bug, none of them will be undertaken. As you were.

My take: The major American sport that has the smallest amount of data available on how its teams compare to one another based on regular season performance is the one sport that has chosen to explicitly rely on that data to determine who participates in its postseason championship event, while simultaneously limiting the size of that event to the smallest number of teams that the laws of mathematics will allow.

So there.

Other Stuff You Should Probably Have a Look At

[Worked my way back down to laconicism at the end there.]