Essentially, I'm figuring out how much regression is needed for the shooting rates of the 12 Big Ten teams using the method Tom Tango laid out in this blog post. This is kind of an update on the first of my three posts last March. I'm using full-season data and not conference-only, so keep that in mind. I've found that over the course of a single season, a team's three-point shooting should be regressed anywhere from 50% to 75%. That is to say that anywhere from half to three quarters of a team's observed three-point percentage is some sort of variance and the other half to one quarter is true skill.
We'll use last year's Michigan State squad as an example. I've found that the point at which a team should be regressed 50% to the mean is 809 three-point attempts. In my 123 team-season data set, the average Big Ten team fired 549 treys. In 2011, Michigan State took 595 three-pointers on the year, meaning they should be regressed about 58% to the mean (809 / (809 + 595)). In other words, it was about a 60-40 split between variance and actual skill in their observed shooting.
They shot 0.353 from deep and the average Big Ten team in my sample connected at a 0.352 clip. So basically, you take 58% average shooting and 42% of MSU's observed shooting: (0.353*0.42)+(0.352*0.58), which churns out Michigan State as a regressed 0.352 three-point shooting team.
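Here's a minimal Python sketch of that calculation. The 809-attempt constant and the Michigan State inputs come from the numbers above. The function names are made up for illustration, and the k / (k + n) form is my reading of the method, inferred from the fact that it reproduces the 58% figure.

```python
# Sketch of the regression step, not a definitive implementation.
# K_THREES is the 50% regression point for threes quoted in the post.
# The k / (k + attempts) form is an assumption inferred from the examples.
K_THREES = 809


def regression_fraction(attempts, k=K_THREES):
    """Share of the observed rate that gets replaced by the league mean."""
    return k / (k + attempts)


def regress_rate(observed, league_mean, attempts, k=K_THREES):
    """Blend the observed shooting rate with the league mean."""
    r = regression_fraction(attempts, k)
    return (1 - r) * observed + r * league_mean


# Michigan State 2011: 595 attempts, 0.353 observed, 0.352 league mean.
msu_r = regression_fraction(595)
msu_3p = regress_rate(0.353, 0.352, 595)
print(f"MSU: regress {msu_r:.0%}, r3P% = {msu_3p:.3f}")
```

A team with exactly 809 attempts gets regressed exactly halfway, and the fewer threes a team takes, the harder it gets pulled toward the league mean.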
Obviously, this doesn't mean much for a team as close to average as Michigan State. It's for the outliers -- either in a high percentage of makes or a low number of threes attempted -- that it makes a difference.
Let's run through another team from 2011: the Iowa Hawkeyes. Iowa took 465 threes last year (compared to MSU's 595) and the average team in my sample took 549. Because of this, we regress Iowa's woeful 0.314 shooting from behind the arc 64% towards the league mean of 0.352, which bumps them up to a 0.338 team from deep.
I don't want to belabor the method any more at the risk of this becoming far more boring than it already is. So, here's a graph of the amount of regression against the number of three-point attempts. Click all images to enlarge.
And for two-pointers
And for free throws
Now, what does this all mean? Well, I've used this to build a sort of 'expected' points total for each team this year. Given that we're still only about a third of the way through the basketball season, we're on the very high end of all of the regression graphs. I used the regressed shooting rates to produce an expected point total for each team given the number of each type of shot they've taken.
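As a rough sketch of that expected point total in Python: multiply each regressed rate by the attempts and the value of the shot, then add it all up. The function name and the sample team below are hypothetical and only there to show the arithmetic. They are not from my data set.

```python
# Sketch of an expected point total built from regressed shooting rates.
# All inputs below are hypothetical, chosen only to illustrate the arithmetic.

def expected_points(r3p, att3, r2p, att2, rft, att_ft):
    """Expected points = regressed rate * attempts * shot value, summed."""
    threes = 3 * r3p * att3
    twos = 2 * r2p * att2
    free_throws = 1 * rft * att_ft
    return threes + twos + free_throws


# Hypothetical team: 200 threes, 400 twos, 250 free throws.
total = expected_points(0.350, 200, 0.480, 400, 0.700, 250)
print(f"expected points: {total:.0f}")
```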
The chart is sorted by the point differential, which is just the expected point total subtracted from the actual point total. It makes a big dent on the ends, showing that Indiana has shot remarkably well this year. Can they keep it up? Not likely. Right now they're at about 5 extra points per game due to their shooting, and the highest full-season figure I have is about 3 points per game, which was the 2004 Michigan State squad. On a per-game level it doesn't sound that bad, but right now Indiana's +81 is the third most in my data set, and the highest full-season totals I have are by Illinois in 2005 and Ohio State in 2011. Ohio State makes sense given the unusual number of terrific shooters they had on last year's team, led by Jon Diebler. Illinois' 2005 squad had Deron Williams, Luther Head and Dee Brown on it. Christian Watford is going to cool down from his 49% clip from deep -- barring some Derrick Williams-like season this year where he continually hits EVERYTHING -- and given that he's taken about 20% of Indiana's three balls this year, it'll affect the entire team.
Below are all of the shooting rates -- both raw and regressed -- that I have for 2012, sorted alphabetically.
|TEAM||3P REG||3P%||r3P%||3P Delta||2P REG||2P%||r2P%||2P Delta||FT REG||FT%||rFT%||FT Delta|
3P REG is the amount of regression on three-point shooting, 3P% is the raw percentage of threes made, r3P% is the regressed three-point percentage and the Delta is the difference between the regressed and raw shooting rates. The same structure applies to two-point shooting and free throw shooting.
Here, we've got the number of points gained or lost by each type of shot due to regression:
Here, PtsL is the number of points the team loses to regression in each category. The Total is the sum of all the points lost (or gained).
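A small Python sketch of how a per-category figure like PtsL could be computed: take the difference between the regressed and raw rate, then multiply by attempts and the value of the shot. The sign convention here (negative means the team has scored more than expected) is an assumption on my part, and the team numbers are hypothetical.

```python
# Sketch of points lost (or gained) to regression by shot type.
# Sign convention is an assumption: negative = team is shooting above
# its regressed rate. The sample team is hypothetical.
POINT_VALUE = {"3P": 3, "2P": 2, "FT": 1}


def points_lost(raw, regressed, attempts, value):
    """Point change from swapping the raw rate for the regressed rate."""
    return (regressed - raw) * attempts * value


# (raw rate, regressed rate, attempts) for each shot type.
team = {
    "3P": (0.400, 0.370, 200),
    "2P": (0.500, 0.490, 400),
    "FT": (0.700, 0.705, 250),
}

pts_l = {
    shot: points_lost(raw, reg, att, POINT_VALUE[shot])
    for shot, (raw, reg, att) in team.items()
}
total_pts = sum(pts_l.values())

for shot, pts in pts_l.items():
    print(f"{shot} PtsL: {pts:+.2f}")
print(f"Total: {total_pts:+.2f}")
```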
Apologies for the charts not being amazing but I feel they get the job done. You can download my entire data set here. Just File > Download > format of your choice.
Edit before posting: the 'total' column in my last chart shows Indiana at -81, whereas in the chart before they were +81. It's saying the same thing. One is saying they're overperforming by 81 points and the other is saying they should have scored 81 fewer points, essentially. Just a mistake on my part, presentation-wise.