Hoops Analysis: Easy roads and hard paths in the NCAA Tournament

Not all paths in March are created equal. I found a way to quantify them.

Brandon Dawson cut down the nets to make the Final Four in 2015. How tough was MSU’s road that year?
Photo by Maddie Meyer/Getty Images

The 2021 NCAA Basketball Tournament and season have come to a close, but a new season always provides new data and new stories to tell about that data. In the 2021 NCAA Tournament, one storyline was the apparent ease with which No. 2 seed Houston was able to reach the Final Four. The Cougars’ path to the final weekend went through a No. 15 seed (Cleveland State), a No. 10 seed (Rutgers), a No. 11 seed (Syracuse), and finally a No. 12 seed (Oregon State).

This marked the first time in history that a team had reached the Final Four without facing a single-digit seed. By some measure, this implies that Houston had the easiest path in history to the Final Four. But, for me, this type of discussion always raises the question of how to quantify something like the difficulty of an NCAA Tournament path.

My approach to try to answer this question is to define a benchmark or reference team and to then calculate the odds that this hypothetical team would reach the Final Four given any arbitrary tournament path. Fortunately, tempo-free metrics such as Kenpom efficiency margins provide just such an opportunity to quantify these odds.

Historical data suggests that an average Final Four team since 2002 has a pre-tournament Kenpom adjusted efficiency margin of around +25.4. (This means that an average Final Four team would be expected to beat an average Division I team by about 25 points in a game in which each team has 100 possessions.) This value is very similar to the efficiency of MSU’s 2005 Final Four team. So, this is effectively the reference team.

Using efficiency data, it is possible to project a point spread, and therefore a victory probability, for any arbitrary team versus this reference team as long as the efficiency margin data is available. This is generally the case for all teams back to 2002 on
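The projection described above can be sketched in a few lines of Python. This is only an illustration, not the exact calculation behind the figures in this article: the tempo of 68 possessions per game, the 11-point standard deviation, the normal model, and the function names are all assumptions chosen for the example.

```python
from statistics import NormalDist

# Assumed values, not taken from the article.
POSSESSIONS_PER_GAME = 68.0  # assumed typical tempo
SPREAD_STD_DEV = 11.0        # assumed spread of final margins around the projection


def projected_spread(margin_a, margin_b, possessions=POSSESSIONS_PER_GAME):
    """Point spread for team A over team B from margins quoted per 100 possessions."""
    return (margin_a - margin_b) * possessions / 100.0


def win_probability(margin_a, margin_b):
    """Chance that team A beats team B under a normal model of the final margin."""
    spread = projected_spread(margin_a, margin_b)
    return NormalDist(mu=0.0, sigma=SPREAD_STD_DEV).cdf(spread)


REFERENCE_MARGIN = 25.4      # average Final Four team, from the article
AVERAGE_ONE_SEED = 28.9      # average No. 1 seed, from the article

print(f"spread: {projected_spread(REFERENCE_MARGIN, AVERAGE_ONE_SEED):+.1f}")
print(f"win probability: {win_probability(REFERENCE_MARGIN, AVERAGE_ONE_SEED):.1%}")
```

Under these assumptions the reference team is a slight underdog against an average No. 1 seed; a different tempo or standard deviation would shift the exact number.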

Calibrating the effect of bracket position

As a first step, I wanted to understand the general benefit that teams get from earning a higher seed. To achieve this, I set up a simulation of sorts involving a theoretical bracket made up entirely of teams with the historically average efficiency margin for teams of that seed.

For example, the average efficiency margin of all No. 1 seeds back to 2002 is +28.9. This corresponds to a team such as Michigan State’s No. 1 seeded team in 2012. An average No. 2 seed historically has an efficiency margin of +23.5, which is similar to MSU’s 2009 team, and so on. These teams make up the theoretical bracket.

I then calculated the odds that the reference team (MSU’s 2005 team) would make the Final Four if it were inserted into this theoretical bracket in any of the possible seed positions from No. 1 all the way to No. 16. I also assumed that in every round the reference team faces the highest-seeded possible opponent (i.e., that there are no upsets). The result of this set of calculations is shown below in Figure 1.

Figure 1: Odds of a reference, average Final Four team making the Final Four in a bracket of historically average teams if the reference team were inserted as any seed and no upsets occur.

This figure shows us the true benefit of being a top seed. In this scenario, the team in the No. 1 seed position has a shade over a one-in-five chance to win the four games needed to make the Final Four. For teams with the path of the No. 2 seed, those odds fall by five percentage points to 15 percent. The odds continue to drop to 12 percent for the No. 3 seed path, 11 percent for a No. 4 seed, and 10 percent for a No. 5 seed.

Interestingly, once a team drops to a No. 6 seed, the odds for the reference team to reach the Final Four are essentially equal (eight to nine percent) for all paths from the six-line down to the 16-line. As a reminder, this calculation assumes that the efficiency of the reference team is fixed. So, whether it is a No. 6 seed or a No. 11 seed, the team is equally good.
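One plausible reading of this plateau comes from listing the no-upset opponents for each seed line. The sketch below uses the standard 16-team regional layout and a hypothetical helper name, chalk_opponents; it only enumerates opponents and does not reproduce the odds in Figure 1.

```python
# Standard top-to-bottom order of seeds in one 16-team region.
REGION_ORDER = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]


def chalk_opponents(seed):
    """Seeds faced in rounds 1 through 4 if every other game goes to the better seed."""
    index = REGION_ORDER.index(seed)
    opponents = []
    for round_number in range(1, 5):
        block_size = 2 ** round_number           # teams feeding into this game
        half = block_size // 2
        block_start = (index // block_size) * block_size
        in_first_half = (index - block_start) < half
        other_start = block_start + half if in_first_half else block_start
        other_half = REGION_ORDER[other_start:other_start + half]
        opponents.append(min(other_half))        # best seed survives with no upsets
    return opponents


for seed in (1, 2, 3, 5, 6, 11, 16):
    print(f"No. {seed:>2} path: {chalk_opponents(seed)}")
```

In this layout, every path from the six-line down runs through both the No. 1 and the No. 2 seed, plus either a No. 3 or a No. 4 seed, which may be why those paths grade out so similarly.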

This analysis already gives valuable insight. Basically, there is a clear advantage to the path of a No. 1 or a No. 2 seed, independent of the real strength of that team. There is a slight advantage to the path of a No. 3, No. 4 or No. 5 seed, but after that the seed really doesn’t matter with regard to the odds of making a Final Four.

The history of the NCAA Tournament is filled with examples of teams that cycle up right before the tournament, either due to players returning from injuries or simply figuring things out as the season progresses. These teams are likely better than the seed that they have been given and the average efficiency that the metrics assign to them. The good news for teams in this position is that whether they are given a No. 3 seed or a No. 11 seed, their odds to make the Final Four are roughly the same, all other things (such as upsets and either over-seeded or under-seeded opponents) being equal.

Easy Paths and Hard Paths

The problem is, not all paths are equal. Using a similar method, it is possible to estimate the relative ease or difficulty of any of the paths that previous Final Four teams have actually traveled on their way to the final weekend. In this case, the same reference team (approximately as good as MSU in 2005) is used, but instead of calculating that team’s Final Four odds against a theoretical, average bracket, the efficiencies of the teams from the actual NCAA Tournament paths are used.

For example, to compare the paths of both Baylor and Houston in the 2021 Tournament, I first looked up the pre-tournament efficiency margins for the four opponents of each of those teams en route to their meeting in the 2021 Final Four. As mentioned above, for Houston, these teams are Cleveland State, Rutgers, Syracuse, and Oregon State. For Baylor, these teams are Hartford, Wisconsin, Villanova, and Arkansas.

Then, I estimated the odds that the reference team (Michigan State in 2005) would have to win a game against each of the four teams in each set. For each path, the product of the four single-game odds gives the odds to make it to the Final Four on that path.
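As a rough sketch of that product, the Python below multiplies single-game win probabilities along a path. The win-probability model (68 possessions per game, an 11-point standard deviation, a normal distribution) and the opponent margins in the example are placeholders assumed for illustration; they are not the actual Kenpom values for any team's path.

```python
from math import prod
from statistics import NormalDist


def win_probability(ref_margin, opp_margin, possessions=68.0, std_dev=11.0):
    """Assumed model: projected spread scaled by tempo, then a normal CDF."""
    spread = (ref_margin - opp_margin) * possessions / 100.0
    return NormalDist(mu=0.0, sigma=std_dev).cdf(spread)


def path_odds(ref_margin, opponent_margins):
    """Odds of winning every game on a path: the product of single-game odds."""
    return prod(win_probability(ref_margin, m) for m in opponent_margins)


REFERENCE_MARGIN = 25.4                       # reference team, from the article
illustrative_path = [-2.0, 12.0, 15.0, 18.0]  # placeholder margins, not real data

print(f"illustrative path: {path_odds(REFERENCE_MARGIN, illustrative_path):.1%}")
# Four straight coin-flip games against equally good teams: 0.5 ** 4 = 6.25 percent.
print(f"four equal opponents: {path_odds(REFERENCE_MARGIN, [REFERENCE_MARGIN] * 4):.2%}")
```

Because the function accepts any number of opponents, a five-game path for a First Four participant is handled the same way as a four-game path.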

I made the same calculation for each Final Four team’s path to the Final Four back to 2002. I also pulled the numbers for MSU’s Final Four teams in 1999, 2000, and 2001 for reference. For comparison, I also calculated the Final Four odds for each path assuming that each opponent was an average team for that seed and not the actual opponent.

For example, in the case of Houston in 2021, I calculated both the odds for the reference team to beat an average No. 15, No. 10, No. 11, and No. 12 seed and the odds for the reference team to beat Houston’s actual opponents. The results of this calculation are shown below in Figure 2.

Figure 2: Comparison of the difficulty of different paths to the Final Four, based on the odds that a reference team would reach the Final Four using the path of each team

All 76 teams to play in a Final Four since 2002 (plus the three additional MSU Final Four teams) are shown in this figure, and there is a lot to observe.

The x-axis shows the actual odds, or true, normalized difficulty, of each team’s path. Based on this analysis, the 2021 Houston team did, in fact, have the easiest Final Four path of any team since 2002. The reference team had a 39 percent chance to win those four games.

The most difficult tournament path in recent history belongs to the 2019 Texas Tech squad (the team that beat Michigan State in the Final Four that year). In this case, the reference team only had a seven percent chance to reach the final weekend. Houston’s path was five-and-a-half times easier than Texas Tech’s path two years prior.

Interestingly, Texas Tech’s path in 2019 (7.0 percent) was actually slightly harder than the path traveled by both UCLA in 2021 (7.4 percent) and VCU in 2011 (7.8 percent). As participants in the First Four, UCLA and VCU had to play five games instead of four to make the Final Four, but Texas Tech’s four-game slate still graded out to be harder.

Note that the dotted vertical orange line in Figure 2 represents the median of the data set. So, the teams to the left of this line had a path that was easier than average, while the teams on the right side of the graph had a harder than average path.

As for Michigan State, Tom Izzo has clearly experienced both some of the easiest, as well as some of the most difficult tournament paths in history. Half of Izzo’s Final Four teams fall to the right of the orange line, while the other half are on the left.

MSU’s most difficult path in the Izzo era was in 2015 when the Spartans faced No. 10 Georgia, No. 2 Virginia, No. 3 Oklahoma, and No. 4 Louisville. The Spartans’ softest Final Four path was in 2001 when MSU faced No. 16 Alabama State, No. 9 Fresno State, No. 12 Gonzaga, and No. 11 Temple. Coach Izzo’s other six paths are closer to the median.

The y-axis on Figure 2, which gives the odds for the reference team if it were to face an average version of each seed, gives us some additional insight. If a data point falls above the diagonal line, this implies that the path that team took in reality is actually harder than it appears based simply on the seeds of the opponents. The opposite is also true. Data points that fall below the diagonal line represent teams whose Final Four path was easier than expected, based on the seeds of the opponents.
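The diagonal comparison amounts to checking which of the two odds is larger. Here is a minimal sketch with a hypothetical function name, using the rounded Houston values quoted in this article (39 percent against the actual opponents, 41 percent against average teams of the same seeds); rounding could matter for points this close to the line.

```python
def compare_with_seed_expectation(actual_odds, seed_average_odds):
    """Classify a path relative to the diagonal where both odds are equal."""
    if actual_odds < seed_average_odds:
        return "harder than the seeds suggest"   # point sits above the diagonal
    if actual_odds > seed_average_odds:
        return "easier than the seeds suggest"   # point sits below the diagonal
    return "as expected from the seeds"          # point sits on the diagonal


# Houston in 2021, using the rounded percentages quoted in the article.
print(compare_with_seed_expectation(actual_odds=0.39, seed_average_odds=0.41))
```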

These differences can be more easily understood by looking at a selection of the Final Four paths in more detail. Table 1 below gives the opponent details for the teams that took the 20 easiest paths to the Final Four.

Table 1: Detailed opponent data for the 20 easiest paths to the Final Four

In the case of Michigan State in 2001, based on just the seeds that the Spartans faced (No. 16, No. 9, No. 12 and No. 11) this path should be the easiest path in history. If those four teams were merely average teams of that seed, the reference team’s odds to make the Final Four would be about 43 percent, which is slightly easier than the 41 percent odds that the reference team would have using a set of average seeds equivalent to Houston’s path in 2021.

But, if the efficiencies of the actual opponents are considered, MSU’s path in 2001 drops to seventh place, as shown in Table 1. In this case, for each opponent, the Kenpom efficiency margin relative to the average margin for that seed is shown in the table.

For MSU in 2001, once the Spartans got past the first round, the next three opponents were above average for their seed. Specifically, No. 12 Gonzaga’s Kenpom efficiency margin was 2.3 points better than that of an average No. 12 seed. The Zags’ efficiency was more similar to that of an average No. 10 seed than a No. 12 seed.

Furthermore, No. 11 Temple’s efficiency margin was 6.8 points better than average. That would make the 2001 Temple team more similar to a No. 4 seed. Unfortunately for the Spartans, their national semifinal opponent in 2001, No. 2 Arizona, was also significantly above average for its seed...and it showed.

MSU’s 1999 Final Four team also had a path that was harder in reality than it might look on paper. Sweet 16 opponent No. 13 Oklahoma and regional final opponent No. 3 Kentucky both had Kenpom efficiency margins significantly above average for their seeds.

I should note here that in one of my previous analyses, I suggested that there was data to support the idea that Michigan State, on average, is the most under-seeded team in recent history of the NCAA Tournament. But, part of my reasoning was that it seems unlikely that MSU’s Tournament opponents were, on average, under-seeded.

However, the data in Figure 2 suggest that for the 1999 and 2001 Final Four teams, that was certainly the case, as they are two clear outliers on the figure. It seems that the committee might have under-seeded both MSU and the teams in MSU’s path in the past.

On the other side of the coin, there are several notable teams whose NCAA Tournament paths were actually quite a bit easier than they appear just based on seeding. For example, the 2005 Illinois team (No. 1 seed), the 2004 UConn team (No. 2 seed), and especially the 2006 UCLA team (No. 2 seed) all had paths that grade out as weak, not just due to the seeds that they faced, but also due to a set of opponents that appear to have been over-seeded.

UCLA’s 2006 path to the Final Four was particularly soft. All four of the Bruins’ opponents were notably below average, based on Kenpom data. UCLA’s Sweet 16 and regional final opponents (No. 3 Gonzaga, -7.6 and No. 1 Memphis, -6.6) were particularly weak. The data suggests that those two teams graded out closer to a No. 11 seed and a No. 3 seed, respectively.

Finally, using the same method, it is also straightforward to quantify the level of difficulty for each team to both reach the championship game and to win the national title. Briefly, the top-five easiest paths to the championship game (with the normalized odds for the reference team) are:

  1. North Carolina in 2016 (22.8 percent)
  2. UCLA in 2006 (21.7 percent)
  3. Texas in 2003 (21.7 percent)
  4. Michigan in 2018 (20.2 percent)
  5. Illinois in 2005 (21.1 percent)

Here are the top-five overall easiest NCAA Tournament paths (including the title game, for those that made it):

  1. UCLA in 2006 (11.7 percent)
  2. North Carolina in 2016 (10.6 percent)
  3. Florida in 2006 (8.9 percent)
  4. Villanova in 2018 (8.0 percent)
  5. Louisville in 2013 (7.8 percent)

That’s all for today. Until next time, enjoy, and Go Green.