The 2021 Men’s NCAA Basketball Tournament is now over, and the Baylor Bears are your 2021 champions. With the 2021 season now officially complete, it is time to look back and reflect on the year that was. The subject for today is one that has always fascinated me: how to quantify the performance of coaches and teams in the NCAA Tournament.
There are dozens of different ways to measure performance in March. The most obvious ones include stats like total wins, overall win percentage, Final Fours, and national titles. The Spartans and head coach Tom Izzo are in the top 10 in all four of these categories in the modern era (which I define as 1979 onward, the first year that teams were seeded and that the tournament expanded beyond 32 teams).
That said, not all paths in the NCAA Tournament are equal, and therefore making comparisons based on these raw metrics does not provide a complete picture. In fact, the structure of the tournament is specifically designed to give the better teams the easiest possible paths to the Final Four. Is there a way to normalize this performance and put teams on a more level playing field?
One metric that attempts to do this is called Performance Against Seed Expectation or “PASE.” This metric dates back to at least 2008 and is straightforward to calculate. It is designed to measure the number of games that a team or coach wins in each tournament relative to the average number of games that a team with the same seed has won historically.
For example, No. 1 seeds since 1985 are 484-121 all-time in the NCAA Tournament. There have been a total of 36 tournaments played since 1985, each with four No. 1 seeds. So, this means that No. 1 seeds historically win 3.36 games per tournament (484 divided by 36 and then divided by four).
In 2021, Baylor won six total games, or 2.64 games more than an average No. 1 seed. Therefore, Baylor’s PASE in 2021 was +2.64. Gonzaga won five games, or 1.64 more than average, for a PASE of +1.64. Michigan won only three games, or 0.36 fewer than average, for a PASE of -0.36, while Illinois won only one game to finish with a PASE of -2.36.
This same calculation can be performed on the No. 2 seeds (who on average win 2.35 games per tournament), No. 3 seeds (who win 1.85 games per tournament), and so on down to the No. 16 seeds who win only 0.01 games per tournament, not counting the more recent play-in games.
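As a rough illustration, the PASE arithmetic for the four No. 1 seeds in 2021 can be sketched in a few lines of Python. The function and variable names here are hypothetical, and the only inputs are the numbers quoted above (484 wins across 36 tournaments with four No. 1 seeds each).

```python
# Historical record quoted in the text: No. 1 seeds have 484 wins
# across 36 tournaments, with four No. 1 seeds per tournament.
NO1_SEED_WINS = 484
TOURNAMENTS = 36
SEEDS_PER_TOURNAMENT = 4


def expected_wins(total_wins, tournaments, teams_per_tournament):
    """Average number of wins per tournament for one team at a given seed."""
    return total_wins / tournaments / teams_per_tournament


def pase(actual_wins, expected):
    """Performance Against Seed Expectation for a single tournament run."""
    return actual_wins - expected


# Round to two decimals to match the figures used in the text.
expected_no1 = round(
    expected_wins(NO1_SEED_WINS, TOURNAMENTS, SEEDS_PER_TOURNAMENT), 2
)

# Tournament wins by each No. 1 seed in 2021, as listed in the text.
wins_2021 = {"Baylor": 6, "Gonzaga": 5, "Michigan": 3, "Illinois": 1}
pase_2021 = {
    team: round(pase(wins, expected_no1), 2) for team, wins in wins_2021.items()
}

for team, score in pase_2021.items():
    print(f"{team}: {score:+.2f}")
```

A team's or coach's career PASE is then just the sum of these single-tournament values, each computed against the expectation for the seed held that year.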
The PASE metric can be calculated based on either team or coach. Figure 1 below shows the current top-15 teams and coaches going back to 1979.
As we can see, Michigan State and Tom Izzo both score very highly in both metrics. Over a 40-year span, MSU ranks fourth overall in PASE (behind only Louisville, Kentucky, and North Carolina), having won over 14 tournament games more than expected based on the historical averages. I should also note that those calculations do not discount vacated tournament wins for teams who have been caught cheating, such as Louisville or our friend down in Ann Arbor.
Coach Izzo does even better. As Figure 1 indicates, Tom Izzo’s current PASE score of +13.51 ranks No. 1 among all 649 coaches to have appeared in the NCAA Tournament since 1979. This score is actually down from the value of +15.24 following the 2019 tournament and Coach Izzo’s score of +16.38 following the 2015 tournament, which is the highest score of any coach ever.
Simply having a score over +12 is a very significant accomplishment. Based on my records, only three other coaches in history have ever achieved a score that high: Mike Krzyzewski of Duke (who peaked in 2001 and has been in decline, on average, ever since), Denny Crum from Louisville, and Rick Pitino also of Louisville (and currently at Iona).
A trip to PARIS
While PASE is a useful and well-established metric, it does have its flaws and limitations. It works very well in the years when the tournament included 64 or 65 teams (from 1985 to 2010), but for the timeframes before and after that, things get messy.
For example, Michigan State was a No. 11 seed in 2021. Most No. 11 seeds will play a No. 6 seed in the first round, but as a participant in the First Four, MSU instead played No. 11 UCLA. Basically, in the current tournament format, not all No. 11 seeds take the same path, and this causes inconsistencies in the PASE calculation.
Fortunately, I have developed additional metrics that do not have these limitations. Back in 2015, before I had heard of PASE, I created two new metrics that attempt to do almost the same thing. However, I approached the problem slightly differently.
The first metric I now call PARIS (for Performance Against Round-Independent Seed). Mathematically, PARIS is very similar to PASE, but it compares a team’s performance in each individual game instead of in each set of games that make up one tournament run. I perform the same calculation of actual wins versus expected wins, except that the comparison is made based on each seed’s performance, round by round.
To again return to the example of the No. 1 seeds in 2021, these seeds historically win 99.3 percent of first round games, 85 percent of second round games, 80 percent of third round games, and then roughly 60 percent of games in rounds four, five, and six.
For Illinois in 2021, the Fighting Illini won their first round game and lost their second round game. In those two games, a No. 1 seed is expected to win 1.84 games (0.993 plus 0.850), but since Illinois only won one game, the Illini’s PARIS score for 2021 is -0.84 (1.00 minus 1.84).
For Michigan, in the four games that the Wolverines played, the expected win total for a No. 1 seed is 3.24 (0.993 plus 0.850 plus 0.796 plus 0.602). As the Wolverines only won three games, their 2021 PARIS score is -0.24 (3.00 minus 3.24).
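The two worked examples above can be reproduced with a short Python sketch. The round-by-round win rates are the No. 1 seed figures quoted in the text for rounds one through four; the function name and data layout are hypothetical choices made for illustration.

```python
# Historical win rates for a No. 1 seed by round, as quoted in the text.
NO1_WIN_RATE_BY_ROUND = {1: 0.993, 2: 0.850, 3: 0.796, 4: 0.602}


def paris(results, win_rate_by_round):
    """Actual wins minus expected wins, summed over only the games played.

    results is a list of (round_number, won) pairs, one per game.
    """
    actual = sum(1 for _, won in results if won)
    expected = sum(win_rate_by_round[rnd] for rnd, _ in results)
    return actual - expected


# Illinois 2021: won in round one, lost in round two.
illinois_2021 = [(1, True), (2, False)]
# Michigan 2021: won three games, lost in round four.
michigan_2021 = [(1, True), (2, True), (3, True), (4, False)]

illinois_paris = paris(illinois_2021, NO1_WIN_RATE_BY_ROUND)
michigan_paris = paris(michigan_2021, NO1_WIN_RATE_BY_ROUND)

print(f"Illinois: {illinois_paris:+.2f}")
print(f"Michigan: {michigan_paris:+.2f}")
```

Because only games that were actually played enter the sum, a team that loses early is not charged for later rounds, which is the key difference from PASE.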
As you would expect, PARIS and PASE are strongly correlated, but do differ slightly. PASE tends to reward teams more for long tournament runs and punish teams more for losing early. A team like Illinois is punished for games that it never even had a chance to play (like a Sweet 16 game).
But, PARIS is more flexible in the way that it can handle play-in games and byes. Each game and round is handled independently, so the structure of the tournament doesn’t matter as much. In addition, the sum of PARIS for all teams and coaches over all years is equal to zero, meaning that it is truly a measure of above (positive) or below (negative) average performance.
Tom Izzo is also currently first of all coaches in the PARIS metric with a score of +8.39, while Michigan State overall moves up into second place all-time in PARIS, with a score of +9.15.
PADing the Stats
As I was developing the PARIS metric, it occurred to me that while comparing teams based on performance per round has some value, not all paths in the NCAA Tournament are equal, even for the same seeds. In 2021, for example, Baylor reached the Final Four by defeating a No. 16, No. 9, No. 5, and No. 3 seed, while its national semifinal opponent, No. 2 Houston, faced a No. 15, No. 10, No. 11, and No. 12 seed. Clearly, Baylor faced a tougher path.
Michigan State fans are familiar with this effect as well. MSU has made the Final Four taking both a fairly tough route (for example in 2000, 2009, and 2019) but also a fairly easy route (in 2001 and 2010).
In order to try to account for these differences, it occurred to me that it is possible to make a more specific calculation of the historical odds for the better seed to win in any given seed combination. For example, when a No. 3 seed plays a No. 6 seed in the second round, the No. 3 seed wins 57.6 percent of the time. However, when the No. 3 seed plays a No. 11 seed, the No. 3 seed wins 67.8 percent of the time.
No. 3 seeds therefore should get more “credit” when they beat No. 6 seeds and they should be punished more if they lose to a No. 11 seed. I created the Performance Against exact seed Differential or “PAD,” to account for this effect using the expected values derived using this methodology.
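To make the "credit" idea concrete, here is a minimal Python sketch of a per-game PAD contribution. It uses only the two matchup win rates quoted above; the lookup table layout and the function name are assumptions made for illustration, and a full calculation would need a historical rate for every seed pairing.

```python
# Historical win rate of the better seed in specific matchups, from the text.
# Key: (seed, opponent_seed) -> probability that `seed` wins.
MATCHUP_WIN_RATE = {(3, 6): 0.576, (3, 11): 0.678}


def pad_credit(seed, opponent_seed, won, table=MATCHUP_WIN_RATE):
    """Actual result (1 or 0) minus the expected win rate for this exact matchup."""
    expected = table[(seed, opponent_seed)]
    return (1.0 if won else 0.0) - expected


# A No. 3 seed earns more credit for beating a No. 6 than for beating a No. 11 ...
beat_six = pad_credit(3, 6, won=True)
beat_eleven = pad_credit(3, 11, won=True)
# ... and is penalized more for losing to a No. 11 than for losing to a No. 6.
lose_six = pad_credit(3, 6, won=False)
lose_eleven = pad_credit(3, 11, won=False)

print(f"beat No. 6: {beat_six:+.3f}, beat No. 11: {beat_eleven:+.3f}")
print(f"lose to No. 6: {lose_six:+.3f}, lose to No. 11: {lose_eleven:+.3f}")
```

Summing these per-game values over every tournament game a team or coach has played gives the overall PAD score.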
Interestingly, not only does Izzo also lead all coaches with the highest PAD (+7.97), but Michigan State University also has the highest PAD (+8.77) of all 352 Division One teams back to 1979.
While the individual PARIS and PAD scores are interesting, we can learn something new by comparing the results from each calculation. The PAD metric was designed to correct for the fact that some teams get easier draws than others due to upsets elsewhere in the bracket. So, if a team or coach has a higher PARIS score than PAD score, that team or coach on average has received an easier than expected NCAA Tournament path relative to all other teams.
The opposite is also true, as a PAD score that is higher than a PARIS score implies that team or coach has on average traversed a more difficult NCAA Tournament path. In other words, subtracting the PAD score from the PARIS score provides a measure of the amount of “luck” that a team or coach has had in NCAA Tournament draws relative to all other teams and coaches.
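As a quick check of this definition, the sketch below applies it to the PARIS and PAD scores quoted earlier for Tom Izzo and for Michigan State. The function name is a hypothetical one chosen for illustration.

```python
def draw_luck(paris_score, pad_score):
    """Positive means an easier-than-expected path; negative means a harder one."""
    return paris_score - pad_score


# Scores quoted in the text.
izzo_luck = draw_luck(paris_score=8.39, pad_score=7.97)
msu_luck = draw_luck(paris_score=9.15, pad_score=8.77)

print(f"Izzo: {izzo_luck:+.2f} games")
print(f"Michigan State: {msu_luck:+.2f} games")
```

Both values come out positive and under half a game, which means the paths have been slightly easier than average but not by much.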
With all that said, let’s look at some more data. Figure 2 compares this new “luck” metric (PARIS minus PAD) to just the PAD metric for all 649 coaches who have appeared in the NCAA Tournament.
This figure tells us almost everything we need to know about NCAA Tournament performance relative to seed expectation. The vast majority of coaches are simply clustered near the center of the graph. But, several notable coaches form a halo around this center.
These outliers can be grouped into four categories based on how lucky they have been (the y-axis, PARIS minus PAD) and how well they have done overall, relative to their seed (the x-axis, PAD).
From the Figure we can once again see that Izzo is really, really good in March and also slightly lucky, but by less than half of a game. The only two coaches that approach Izzo’s PAD score are the 20th century legends Denny Crum (Louisville) and the very unlucky Rollie Massimino (Villanova).
As for Coach Izzo’s contemporaries, we can see that Rick Pitino and John Calipari have had luck similar to Izzo. Jim Boeheim, John Beilein, Jim Calhoun, and Coach K are all very good, but they all also have won at least a game more than they should have just due to schedule luck in March.
No coach in NCAA Tournament history has been luckier than Billy Donovan, who benefited from a No. 2 seed losing to a No. 15 seed in his half of the bracket not once, but twice, in 2012 and 2013. On the flip side, recently retired Kansas and North Carolina coach Roy Williams has been the least lucky successful coach. Gary Williams and Lute Olson were also both fairly unlucky on balance.
Figure 2 also gives a sample of some of the coaches who have notably underachieved relative to their seed. Bill Self, Bob Huggins, Kelvin Sampson, and Tony Bennett all have negative PAD scores, but this analysis suggests that all four coaches have actually been more lucky than average in their tournament draws. It actually could have been worse.
This Figure also highlights the coach who has underachieved the most overall in the history of the tournament: Rick Barnes, previously of Texas and now at Tennessee. The math suggests that he should have over six more tournament wins than he actually has, based on the draws that he has been given over time. Gene Keady (PAD of -4.23) and Fran Dunphy (-4.18) both should send Rick a nice Christmas card each year.
PAKE My Day
All of the analysis shown above is based one way or another on the seed of each team. The metrics all assume that teams with the same seed are equal, which is also not true. In order to get the most accurate measure of a team’s or coach’s NCAA Tournament performance relative to expectation, it would be best to simply use the win probability for all tournament games based on the Vegas betting line.
A “Performance against Vegas expectation” (PAVE) metric would achieve just this goal, and if I had all of that data back to 1979, I would certainly calculate it. Unfortunately, getting a complete set of that data is difficult, and I simply do not have it.
Fortunately, there is a way to estimate Vegas lines and win probabilities using Kenpom efficiency data. While this data is only reliable back to 2002, it does provide a solid 19 years of data to crunch in order to create a “Performance Against Kenpom Efficiency” or PAKE metric, as others have referred to it previously. Figure 3 below compares the PAD metric as calculated only back to 2002 to the PAKE metric in the same timeframe.
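The exact conversion from Kenpom efficiency data to win probabilities is not spelled out here, so the sketch below is only one common approach and not necessarily the method behind Figure 3. It assumes that the gap in adjusted efficiency margin (points per 100 possessions) scales to a point spread by an assumed game pace, and that final margins are roughly normally distributed around that spread. The pace of 68 possessions, the standard deviation of 11 points, all function names, and the sample ratings are assumptions made for illustration.

```python
import math

# Both constants are assumptions for this sketch, not values from the article.
ASSUMED_POSSESSIONS = 68.0  # possessions per game
ASSUMED_MARGIN_STD = 11.0   # standard deviation of final margin, in points


def projected_spread(adj_em_team, adj_em_opp, possessions=ASSUMED_POSSESSIONS):
    """Projected margin from efficiency margins given per 100 possessions."""
    return (adj_em_team - adj_em_opp) * possessions / 100.0


def win_probability(spread, margin_std=ASSUMED_MARGIN_STD):
    """Normal CDF of the projected spread: chance the team wins outright."""
    return 0.5 * (1.0 + math.erf(spread / (margin_std * math.sqrt(2.0))))


def pake(games):
    """Sum of (actual result minus win probability) over all games.

    games is a list of (adj_em_team, adj_em_opp, won) tuples.
    """
    total = 0.0
    for adj_em_team, adj_em_opp, won in games:
        prob = win_probability(projected_spread(adj_em_team, adj_em_opp))
        total += (1.0 if won else 0.0) - prob
    return total


# Made-up efficiency margins for a three-game tournament run (two wins, one loss).
example_games = [(25.0, 10.0, True), (25.0, 22.0, True), (25.0, 28.0, False)]
example_pake = pake(example_games)
print(f"Example PAKE: {example_pake:+.2f}")
```

The structure mirrors PARIS and PAD: only the source of the expected win probability changes, from historical seed results to a rating-based estimate for each specific matchup.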
As we can see, PAD and PAKE are highly correlated, but there are a few notable data points that do deviate from the best-fit line, and Coach Izzo is one of them. Izzo’s PAD since 2002 is still the best of all coaches, but his PAKE is a bit lower and only fifth overall, behind Jim Boeheim, Roy Williams, John Calipari, and John Beilein.
My interpretation of this effect is that it relates to the accuracy of the seeding done by the committee, on average. Based on the relative seeds of teams in Coach Izzo’s path since 2002, Michigan State has won six more tournament games than expected.
But, the Kenpom efficiency projected point spreads suggest that this number is a little high. Based on Kenpom (a more accurate estimate of relative team strengths), Michigan State has actually only won about four-and-a-half more games than expected. This can mean only one of two things. Either MSU’s opponents, on average, were given better seeds than they deserved (which is unlikely, as this is almost certainly random) or MSU, on average, has been given a worse seed than it deserves.
If my interpretation is correct, Coach Izzo has been the most under-seeded coach in the modern history of the tournament, and it is not close. While most other coaches fall very close to the trend line, a few also stand out as being consistently given a better seed than they deserve. Most notable in this group are Jim Boeheim, Roy Williams, and especially Bill Self.
While that completes this particular dive into the data of the NCAA Tournament, in the coming days and weeks I will be continuing to dig through the numbers, looking for interesting stories to tell. Stay tuned and Go Green.