Bad Betting Advice, NCAA Tournament Edition: Using Math to Make the Picks

Need a little help filling out your bracket? In part two of this series, I break down the full bracket and apply a little math to find those pesky upsets, dark horses, and the eventual champion.

As tip-off to the 2021 NCAA Basketball Tournament approaches, serious and casual fans of the sport have likely already made at least one attempt to fill out a bracket for their office pools. Before you make that final submission, I have a bit of advice for you.

In the first post of this two-part series, we reviewed the historical patterns in NCAA Tournament upsets and explained the mathematical underpinnings of why those patterns exist. With this information, we know why the chaos exists and what a “typical” NCAA Tournament bracket should look like. But, we still do not know how to apply that know-how in order to truly dominate our office pools.

Well, you are in luck, dear reader, as I am about to give you the keys to the castle.

As I alluded to in part one, NCAA Tournament results can be predicted to some extent based on point spreads, and point spreads can be estimated using efficiency data, such as the values published by Kenpom. If we combine this knowledge with the baseline information of how a typical tournament usually plays out, it gives us a chance to identify where the most likely upsets will happen.
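To make the chain from efficiency data to point spread to win odds concrete, here is a minimal sketch of one common way to do it. It is not necessarily the exact formula behind the numbers in this post. The 68-possession tempo, the 11-point standard deviation, and the function names are all illustrative assumptions.

```python
from math import erf, sqrt

# Assumed constants (not taken from the article): a typical game has roughly
# 68 possessions, and final margins scatter around the projected spread with
# a standard deviation of about 11 points.
POSSESSIONS = 68.0
SPREAD_SIGMA = 11.0


def projected_spread(margin_a, margin_b, possessions=POSSESSIONS):
    """Expected winning margin of team A over team B, in points.

    Efficiency margins are expressed in points per 100 possessions,
    so the difference is scaled down to the length of one game.
    """
    return (margin_a - margin_b) * possessions / 100.0


def win_probability(spread, sigma=SPREAD_SIGMA):
    """Probability that a team favored by `spread` points wins, treating
    the final margin as normally distributed around the spread."""
    return 0.5 * (1.0 + erf(spread / (sigma * sqrt(2.0))))


if __name__ == "__main__":
    # Hypothetical efficiency margins, for illustration only.
    spread = projected_spread(25.0, 15.0)
    print(f"spread: {spread:.1f} points")
    print(f"win probability: {win_probability(spread):.3f}")
```

A negative spread simply means the first team is the underdog, and the same function then returns a probability below 50 percent.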

To this end, I will now take a closer look at the 2021 bracket from two angles. First, I will give an overview of each region. Second, I will go round-by-round and provide recommendations on the most likely picks, based on the data. Let’s start with the West Region.

West Region

For each region, I will present one figure and one table to provide an overview of what we can expect to see on the hardwood. Figure 1 below compares the Kenpom adjusted efficiency margin for the 18 teams in the West relative to the historical average Kenpom efficiency of teams with the same seed. This tool is very useful to quickly see which seeds in the region are weak or strong, relative to historical averages.

Figure 1: Analysis of the 2021 West Region based on Kenpom adjusted efficiency margins, compared to the historical averages

At first glance, a few key things stick out from Figure 1. First, Gonzaga is really, really good, and Iowa is also a very strong No. 2 seed relative to historical averages. As for the rest of the region, USC stands out as a strong No. 6 seed and No. 15 Grand Canyon has a very strong efficiency margin for a team seeded that low. It likely won’t matter, but that is interesting.

Table 1 below gives the Monte Carlo simulation results for the 18 teams in the West Region.

Table 1: Simulation results for the 2021 West Region, compared to historical averages

This table (and the one that will follow) contains a ton of information. In the left-most section is the current Kenpom efficiency margin for each team. Next to that column is the Kenpom efficiency margin relative to the historical average efficiency margin for teams with that seed. This is essentially the same data plotted in Figure 1.

For example, Gonzaga is actually entering this tournament with the best efficiency margin on record since 2002, the first year for which reliable Kenpom data is available. The Zags’ margin of +38.05 is 9.31 larger than the historical average efficiency margin of all previous No. 1 seeds.

In the middle section of the table are the results of the Monte Carlo simulation of the full tournament, which uses Kenpom efficiencies to project point spreads and single-game odds. The number in each cell is the odds for that team to advance to the round shown in the column heading.
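For readers who want to see the mechanics, the sketch below shows how a Monte Carlo bracket simulation of this general kind can be put together. It is a simplified illustration, not the simulation used for the tables here: the normal win-probability model, the 68-possession tempo, the 11-point standard deviation, and the four-team pod with made-up margins are all assumptions.

```python
import random
from math import erf, sqrt


def matchup_win_probability(margin_a, margin_b, possessions=68.0, sigma=11.0):
    """Chance that team A beats team B (assumed normal model and constants)."""
    spread = (margin_a - margin_b) * possessions / 100.0
    return 0.5 * (1.0 + erf(spread / (sigma * sqrt(2.0))))


def simulate_bracket(teams, n_sims=10000, seed=0):
    """Estimate each team's odds of winning each round.

    `teams` is a list of (name, efficiency_margin) pairs in bracket order,
    so entries 0 and 1 meet first, then 2 and 3, and so on. The number of
    teams must be a power of two.
    Returns {name: [P(win round 1), P(win round 2), ...]}.
    """
    rng = random.Random(seed)
    n_rounds = len(teams).bit_length() - 1
    wins = {name: [0] * n_rounds for name, _ in teams}
    for _ in range(n_sims):
        alive = list(teams)
        for rnd in range(n_rounds):
            survivors = []
            # Pair off neighbors and flip a weighted coin for each game.
            for i in range(0, len(alive), 2):
                a, b = alive[i], alive[i + 1]
                p_a = matchup_win_probability(a[1], b[1])
                winner = a if rng.random() < p_a else b
                wins[winner[0]][rnd] += 1
                survivors.append(winner)
            alive = survivors
    return {name: [w / n_sims for w in counts] for name, counts in wins.items()}


if __name__ == "__main__":
    # Hypothetical four-team pod; the margins are invented for illustration.
    pod = [("Seed 1", 30.0), ("Seed 4", 18.0), ("Seed 2", 24.0), ("Seed 3", 20.0)]
    for name, odds in simulate_bracket(pod).items():
        print(name, [f"{p:.2f}" for p in odds])
```

Each entry in the returned lists plays the same role as a cell in the middle section of the table: the fraction of simulated tournaments in which that team survived the given round.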

My simulation predicts that the Gonzaga Bulldogs have an 80 percent chance to reach the regional final, a 63 percent chance to reach the Final Four, and a 37 percent chance to win the National Championship. Note that these odds are the best of any team on record and slightly better than the odds for the 2015 Kentucky team (35 percent), which held the previous record.

That said, these numbers are somewhat meaningless without context. That is where the final section of the table comes in on the far right. These are the odds for each team to advance to each round relative to the odds for an average team of that seed in a tournament filled with other historically average teams.

For example, according to Kenpom, the Iowa Hawkeyes are a strong No. 2 seed. As a result, Iowa’s odds to make the Sweet 16 are 8.5 percent better than an average No. 2 seed and their odds to make the regional final are 11.5 percent better than average. However, due to the presence of Gonzaga in the other half of the region, the Hawkeyes have a below average (by two percent) chance to make the Final Four.
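The "relative to average" columns are simple differences, and a short sketch may help make that explicit. The function names are hypothetical. The Gonzaga margin comes from the text above, and the 28.74 seed average is just 38.05 minus 9.31. The round-by-round odds in the second example are invented purely for illustration.

```python
def relative_to_seed_average(team_value, seed_average):
    """Difference between a team's number and the average for its seed."""
    return team_value - seed_average


def relative_odds(team_odds, baseline_odds):
    """Round-by-round odds minus the seed baseline, in percentage points."""
    return {
        rnd: round(100.0 * (team_odds[rnd] - baseline_odds[rnd]), 1)
        for rnd in team_odds
    }


if __name__ == "__main__":
    # Gonzaga's margin from the text; 28.74 is implied by 38.05 - 9.31.
    print(round(relative_to_seed_average(38.05, 28.74), 2))

    # Hypothetical odds for an unnamed team, for illustration only.
    team = {"Sweet 16": 0.70, "Elite Eight": 0.45, "Final Four": 0.20}
    average = {"Sweet 16": 0.65, "Elite Eight": 0.40, "Final Four": 0.22}
    print(relative_odds(team, average))
```

A team can be above average in the early rounds and below average later, as in the hypothetical Final Four entry, when a very strong opponent is waiting deeper in the bracket.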

Note that I have also sorted the teams in Table 1 according to their odds to win the region (i.e. advance to the Final Four). Already, we can start to see the main dark horse team emerge: USC. Despite being just the No. 6 seed and having only the fourth best Kenpom efficiency in the region, the Trojans have the third best odds to make it to the final weekend.

South Region

Figure 2 below gives a visual summary of the 16 teams that make up the South Region, and it is followed by the simulation results table.

Figure 2: Analysis of the 2021 South Region based on Kenpom adjusted efficiency margins, compared to the historical averages
Table 2: Simulation results for the 2021 South Region, compared to historical averages

As Figure 2 and Table 2 show, almost every team in the South Region, with the exception of No. 12 seed Winthrop, is above average for their seed. A few teams, such as No. 5 seed Villanova, No. 6 Texas Tech, No. 11 Utah State, No. 13 North Texas, No. 14 Colgate, and especially No. 8 and No. 9 seeds North Carolina and Wisconsin are noticeably above average.

On balance, the South looks like a strong region, which is depressing Baylor’s odds to make it to the Final Four. In particular, Baylor’s second round game against the winner of the Wisconsin/North Carolina game could be a potential upset. In addition, on paper, Villanova looks to be the most likely dark horse in this region.

That said, this is the perfect time to mention the obvious caveat to this analysis and methodology. Kenpom efficiencies are simple averages of a team’s performance over the entire year. In the case of Villanova, they had a key injury late in the season when Collin Gillespie tore his MCL, and the Wildcats have not played well since. As a result, we cannot trust the numbers associated with Villanova in Table 2, in my opinion.

Midwest Region

Continuing on to the Midwest Region, Figure 3 and Table 3 summarize the key metrics for the teams in that region.

Figure 3: Analysis of the 2021 Midwest Region based on Kenpom adjusted efficiency margins, compared to the historical averages.
Table 3: Simulation results for the 2021 Midwest Region, compared to historical averages

According to Figure 3 and Table 3, the teams that look to be more dangerous than usual are No. 8 Loyola, No. 9 Georgia Tech, No. 2 Houston, No. 6 San Diego State, and No. 16 seed Drexel.

On the flip side, this bracket also seems to have a few potential weak links in No. 3 West Virginia and No. 4 Oklahoma State. Also, No. 12 Oregon State and No. 14 Morehead State are below average for their seeds as well.

In general, this looks like a sneaky tough road for the Fighting Illini. The boys from Champaign are an above average No. 1 seed, but the odds for them to make the Final Four are barely above average. In the first round, the Illini draw an above average No. 16 seed in Drexel, and after that, they have to face the winner of the No. 8/No. 9 seed game between Loyola and Georgia Tech, who are both well above average for their seed.

If the Illini survive until the regional final, it is most likely that they will meet No. 2 seed Houston, which Kenpom currently rates as the strongest of the No. 2 seeds and the most likely one to win the National Title. In addition, No. 5 seed Tennessee also looks to be above average and a possible dark horse.

In a normal year when all teams have played a full non-conference schedule, I think that the analysis above would be pretty solid. However, I do have some doubts as to whether mid-major teams like Houston and Loyola are actually as good as their current Kenpom averages suggest. If they are, Illinois might be in some trouble.

East Region

Finally, let’s take a look at the East Region.

Figure 4: Analysis of the 2021 East Region based on Kenpom adjusted efficiency margins, compared to the historical averages
Table 4: Simulation results for the 2021 East Region, compared to historical averages

Similar to the South Region, most of the teams in the East look to be above average, relative to the historical values. The main exception is No. 3 seed Texas.

As shown in Figure 4, it is the teams in the middle of the seed list, notably UCONN, Saint Bonaventure, LSU, and Maryland, along with No. 14 seed Abilene Christian, who are prime candidates to cause trouble.

Looking at the Final Four odds in this region, Michigan has the best odds on paper, which are actually better than those of an average No. 1 seed. However, with the injury to Isaiah Livers and potentially tough matchups in the second and third rounds, I have my doubts that the Wolverines will live up to expectations.

Furthermore, No. 2 seed Alabama and No. 3 Texas also look to have lower than usual odds to survive the region. In total, this tells me that the East Region is the most likely one to descend into Madness, resulting in a team seeded lower than No. 3 making it to the Final Four. Right now, I like Florida State’s odds.

As for Michigan State’s chances, their odds to advance are clearly depressed by the fact that the Spartans have to play an extra game in the play-in round (which I do not explicitly handle in the comparison to historical averages). That said, I project that MSU has only a 14 percent chance to make it past BYU, only a five percent chance to make it to the Sweet 16 and less than a one percent chance for a magical run to the Final Four.

That said, these simulations are assuming that Michigan State is only as good as their Kenpom averages suggest. Michigan State has seen long odds before this season, and made the tournament anyway. If MSU can play close to their ceiling of potential, I do believe that the Spartans can make a run. But, the numbers do not currently support that idea.

Picking the First Round

Now that we have taken the full tour of each region, it is time to start making picks. In order to inform these decisions, I have taken the data presented above and converted it to a different format. In this case, I am calculating the odds of an upset for each first round matchup and comparing those odds to the simulated odds based on the historical averages. The result is shown below in Figure 5.

Figure 5: Projected upset odds for each first round seed combination relative to the historical averages.

Now we are getting somewhere useful. From Figure 5, we can clearly see which upsets in each pairing are more likely than average (the ones below the blue line) and the ones that are less likely (the ones above the line).

Starting with the top seeds, as expected there are no clear first round upset picks for the No. 1 and No. 2 seeds. For the No. 3 seed, however, things get interesting, as both Texas and Arkansas have been paired with stronger than usual No. 14 seeds in Abilene Christian and Colgate.

In the interest of full disclosure, I should say that I have done some math, which suggests that it is never a good idea to pick a team seeded No. 3 or higher to get upset in the first round. After all, the odds for Texas to win are still 75 percent. In the case of Colgate, the Raiders played such an abbreviated schedule that I simply don’t trust the validity of their Kenpom efficiency anyway.

But Abilene Christian over Texas? I am very, very tempted to make that pick, especially since my analysis suggests that Texas would likely lose to No. 6 BYU in the next round anyway (if they make it that far). If you are feeling bold and want a good “big” first round upset, that is the one that this analysis suggests.

As for the No. 4 seeds, there is no clear upset recommendation on this line this year. Purdue and Oklahoma State have slightly lower odds than usual, but not by much. If I were to make a recommendation, however, this is a case where the intangibles might play a leading role. Virginia had to back out of the ACC Tournament due to COVID issues and their roster status seems to be in question. If a No. 13 seed is going to win this year, my pick is for Ohio to beat the Cavaliers.

As for the No. 5 seeds, this frankly looks like a bad year for the famous No. 5/No. 12 upset. The No. 12 seeds all look weak this year and the No. 5 seeds are relatively strong. It might make sense to pick Villanova to lose without Gillespie, but Winthrop is a very weak No. 12 seed. My pick is for Big East Tournament Champs No. 12 Georgetown to upset No. 5 Colorado. The current Vegas lines also support this pick as the most likely.

There are also no clear upset picks on the No. 6 seed line. USC looks safe, but the other three teams could run into trouble. Of course, the best outcome for Spartans fans would be for MSU to knock off BYU after defeating UCLA. That projects to be the second most likely upset on this line, second only to Utah State over Texas Tech. History tells us that one or two of these games is a likely upset.

On the No. 7 seed line, there is a very clear upset recommendation: No. 10 Rutgers to take out No. 7 Clemson. The current Vegas line also has the Scarlet Knights favored, so this is perhaps the best upset pick of the entire first round. As for the other games, Kenpom has the Oregon/VCU game as the next most likely upset pick, but the Vegas line suggests otherwise. The other two games do not inspire me to select an upset winner, but it would not shock me to see Maryland beat UCONN.

Finally, the No. 8/No. 9 games are always a coin toss and the Kenpom data and Vegas lines are just causing confusion. Based on Figure 5 (from Kenpom), taking Wisconsin and Saint Bonaventure as “upset winners” makes the most sense. But, the Vegas lines favor the No. 8 seeds across the board, so this is a tough call. Personally, I like Wisconsin and Georgia Tech to win as No. 9 seeds, but that is more of a gut feeling.

In summary, I like the following first round upset picks:

  • No. 14 Abilene Christian over No. 3 Texas
  • No. 13 Ohio over No. 4 Virginia
  • No. 12 Georgetown over No. 5 Colorado
  • No. 11 Utah State over No. 6 Texas Tech
  • No. 11 MSU over No. 6 BYU (total homer pick)
  • No. 10 Rutgers over No. 7 Clemson
  • No. 9 Wisconsin over No. 8 UNC
  • No. 9 Georgia Tech over No. 8 Loyola-Chicago

As I demonstrated in part one of this series, eight total first round upsets is exactly on the historical average.
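The link between per-game upset odds and a typical upset count is just addition: the expected number of upsets is the sum of the individual upset probabilities. The sketch below illustrates that, along with the full distribution of the upset count if the games are treated as independent. The function names are hypothetical, and apart from the 25 percent figure implied by Texas's 75 percent win odds, the probabilities in the example are placeholders.

```python
def expected_upsets(upset_probs):
    """Expected number of upsets: by linearity of expectation, this is the
    sum of the per-game upset probabilities."""
    return sum(upset_probs)


def upset_count_distribution(upset_probs):
    """P(exactly k upsets) for k = 0..n, assuming the games are independent."""
    dist = [1.0]  # before any game is played, zero upsets is certain
    for p in upset_probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)   # favorite wins, count unchanged
            nxt[k + 1] += mass * p       # underdog wins, count goes up by one
        dist = nxt
    return dist


if __name__ == "__main__":
    # 0.25 follows from Texas's 75 percent win odds quoted above;
    # the other values are placeholders for illustration.
    probs = [0.25, 0.40, 0.50]
    print(f"expected upsets: {expected_upsets(probs):.2f}")
    for k, p in enumerate(upset_count_distribution(probs)):
        print(f"P({k} upsets) = {p:.3f}")
```

Run over all 32 first round games, the same sum gives the average number of upsets that a bracket should aim for.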

Picking the Second, Third, and Fourth Rounds

An analysis of the second, third, and fourth round games can be done using the same general strategy. Figure 6 below compares the odds for each projected match-up, relative to historical averages. In this case, I still make the comparison to the seed combinations assuming that the higher seeds won in the first round.

Figure 6: Projected upset odds for the second (left panel), third, and fourth round (right panel) seed combinations relative to the historical averages.

The seed combinations for the second round games are shown in the left panel. Based on this analysis, this is where the Madness of March is likely to rear its beautiful head. As discussed above, all of the No. 1 seeds, with the exception of Gonzaga, may have a problem in the second round. Based on this analysis, Illinois is the team that is most likely to lose.

That said, Illinois is playing pretty well right now, I am not convinced that Loyola is as good as Kenpom suggests, and I think Illinois can handle Georgia Tech. If a No. 1 seed is to lose in the second round, I think that either Baylor (with the program’s lingering COVID issues) or Michigan (with its injury situation) is the most likely candidate. In my bracket, I have senior-laden Wisconsin knocking out Baylor, and Michigan barely escaping LSU.

Figure 6 above suggests to me that we will see multiple upsets of No. 2 and No. 3 seeds in the second round in 2021. History suggests that one No. 2 seed usually fails to make it to the second weekend and this analysis suggests that Alabama is the most likely victim in 2021. I like the winner of the UCONN/Maryland game a lot in this upset.

As for the No. 3 seeds, this analysis suggests that they all might fail to make the Sweet 16. In my bracket, I have already knocked Texas out in the first round, and I think West Virginia and Kansas will follow in the second round. The only saving grace is that I have Arkansas facing No. 11 Utah State instead of No. 6 Texas Tech; otherwise, I would knock out the Razorbacks as well.

The No. 4/No. 5 game is also basically a coin toss, but in this case, I already have No. 4 Virginia and No. 5 Colorado eliminated, so No. 5 Creighton and No. 4 Florida State should advance easily over their double-digit-seeded opponents. I don’t see short-handed No. 5 Villanova beating No. 4 Purdue, so that just leaves No. 4 Oklahoma State versus No. 5 Tennessee. Kenpom actually projected that the Volunteers would be favored in this matchup, so that is my pick as well.

In summary, my recommended second round upsets are:

  • No. 9 Wisconsin over No. 1 Baylor
  • No. 7 UCONN over No. 2 Alabama
  • No. 6 USC over No. 3 Kansas
  • No. 6 San Diego State over No. 3 West Virginia
  • No. 5 Tennessee over No. 4 Oklahoma State

Once again, I am hitting the average number of upsets in round two (five) exactly on the nose. I should also note that in my bracket, I have No. 11 Michigan State taking care of upstart No. 14 Abilene Christian to make it to the Sweet 16 where the Spartans would face No. 7 seed UCONN. You’re welcome.

Turning now to the Sweet 16 and Elite Eight, if the brackets went all chalk, the right panel of Figure 6 shows the relative odds for each matchup. Basically, once we get to the Sweet 16, the 2021 tournament is expected to be better behaved.

If Baylor were to make it this far, the figure suggests that Purdue would have the best chance to eliminate a No. 1 seed. Since I already knocked out the Bears, I have No. 4 Purdue beating No. 9 Wisconsin to reach the regional final.

The math suggests that a second No. 1 seed often loses in the Sweet 16 round and as Figure 6 suggests, the next most likely upset is for No. 4 Florida State to eliminate the Wolverines. The remaining Sweet 16 games would project to go to the highest remaining seed (including MSU losing and Ohio State winning), but I will throw a small veto into the math. I have No. 6 USC upsetting No. 2 Iowa.

In my bracket, this leaves:

  • West Regional Final: No. 1 Gonzaga versus No. 6 USC
  • South Regional Final: No. 2 Ohio State versus No. 4 Purdue
  • Midwest Regional Final: No. 1 Illinois versus No. 2 Houston
  • East Regional Final: No. 4 Florida State versus No. 7 UCONN

In general, I think that chalk would prevail here, as is often the case. But, the closest call is the Midwest region final, where Houston would only be a slight underdog to the Illini. As I mentioned in part one, Kenpom data suggests that the three most likely champions are Gonzaga, Illinois, and Houston, and I think that the National Title game will most likely involve two of those three teams.

As for me, I still like the Illini to advance and beat Ohio State (again) in the Final Four. I then see Gonzaga beating the Illini in an exciting finish to the 2021 season.

So, there you have it. Based on all the available data, when I turn the mathematical crank, that is the answer that I get. That said, any model and analysis is only as good as the data that feeds it. This year, we have less data than usual, and we simply don’t know what we don’t know.

The last time we had an NCAA Tournament, a similar analysis made some very accurate predictions. There is no guarantee that will happen again in 2021. So, please take everything above with a grain of salt. As I stated in part one, the Madness is predictable, on average. However, 2021 is no average year. But, the Madness is back, and that is all that matters.