In a normal year, I look forward to the month of July as a time to start focusing on the upcoming college football season. I start looking at MSU’s roster, and I start to update my football spreadsheets and databases in preparation to perform both preseason and in-season analyses.
In a normal year, if I were to write a story containing the title “hopelessly optimistic” it would likely be focused on MSU’s odds to win the Big Ten or make the College Football Playoffs. While I still plan to cover both of those topics eventually, my hopeless optimism in 2020 refers simply to the hope that we get any college football at all.
At this point, we still do not know how this is all going to play out. Right now, the odds of the full schedule being played (starting, for example, in the spring of 2021) seem extremely low. The odds of a shorter schedule in the fall are better, but still may not be great. The Big Ten and a few other conferences have already announced that they will only play a conference schedule, but we still do not know what that is going to look like. The future is very much unknown, and I have no way to simulate the probability of any of those scenarios.
So, as we wait for this all to shake out, I thought that I would provide an update on some of the things that I have been working on and what kind of analysis that I plan to provide once we get back to playing football again, whenever that may be. For now, I am going to present the results of some of my usual preseason football analysis, doing so with the full knowledge that that version of the season almost certainly will never happen. I believe that there still are a few things to be learned.
My approach to analyzing and simulating the college football season hinges on a power ranking system that I developed over 10 years ago and have been refining ever since. It is honestly a very simple algorithm which uses only the final score and location (home, away, or neutral) of each game as its input, but which, after a few weeks of data, can predict the Vegas spread with surprising accuracy.
I used this power ranking system last fall in my “Bad Betting Advice” series. For the entire season, my metric outperformed ESPN’s FPI against the spread (54 vs. 48 percent). It hasn’t done that well every year overall, but I have developed a rule of thumb that allows me to make selected picks against the spread. This strategy consistently performs at a level of 55 percent (a record of 578-475) against the spread over a period of 10 years.
The other key component of my methodology is the ability to use either my power rankings or the Vegas line to calculate the probability that the favored team will win straight-up. This effectively gives me the ability to assign a probability to any and all games on the schedule, including potential conference championship games, playoffs games, and other bowl games. Once I have these probabilities, I can use a random number generator to set up a Monte Carlo simulation of any and all potential match-ups for an entire season.
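The core of that probability calculation can be sketched in a few lines. This is my own minimal illustration, not the author's code: it assumes the standard modeling trick of treating the final margin as normally distributed around the point spread, with a standard deviation of about 13.5 points (a commonly cited college football figure, not a value from this article).

```python
import random
from statistics import NormalDist

def win_probability(spread, sigma=13.5):
    """Chance the favorite wins outright, modeling the final margin as
    Normal(spread, sigma). The 13.5-point sigma is a common college
    football assumption, not a number taken from the article."""
    return 1 - NormalDist(mu=spread, sigma=sigma).cdf(0)

def simulate_game(p_favorite, rng=random):
    """One Monte Carlo draw: the favorite wins with probability p_favorite."""
    return "favorite" if rng.random() < p_favorite else "underdog"
```

A pick'em (spread of zero) comes out to a coin flip, and a touchdown favorite lands around 70 percent, which is roughly in line with how Vegas lines are usually read.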
As the season progresses, my power ranking accumulates enough data that it becomes self-consistent and requires no other external input. However, before the season starts, I must rely on various preseason rankings in order to seed my preseason and early season calculations and simulations. In a typical year, I gather preseason rankings of all 130 FBS teams from several sources including Athlons, Lindy’s, Phil Steele, ESPN’s FPI, and the S&P+. I take the average ranking of each team from all the sources in order to generate a “consensus” ranking for each team. Over the years of collecting this data, I have learned a few things.
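The consensus step is just an average of ranks across sources, which can be sketched as follows (the source names and ranks here are placeholders, not real preseason data):

```python
def consensus_ranking(rankings):
    """Average each team's rank across all sources and sort ascending.
    `rankings` maps source name -> {team: rank}. Only teams ranked by
    every source are included; ties fall back to alphabetical order."""
    sources = list(rankings.values())
    teams = set(sources[0]).intersection(*sources[1:])
    avg = {t: sum(s[t] for s in sources) / len(sources) for t in teams}
    return sorted(avg, key=lambda t: (avg[t], t))
```

With 130 FBS teams and a handful of magazines, this collapses five or six slightly different opinions into one ordered list to seed the early-season calculations.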
Grading the Preseason Rankings
An obvious question to ask is, “how accurate are the preseason rankings?” One way that I have attempted to answer this is to look at the final rankings of all 130 teams (as measured by my power rankings) and compare them to the preseason consensus rankings. Table 1 below summarizes this average difference for each of the sources that I used for each of the past eight seasons. For every year I highlighted the source that showed the best performance.
Phil Steele always claims that his rankings are the best, and based on this analysis, I can’t really argue. His predictions came the closest to my final rankings in five of the eight years that I measured, and in two of the remaining three years, Steele was only edged by a tenth of a point. In general though, there is not a big difference between the accuracy of Phil Steele’s rankings and everyone else’s rankings.
The biggest takeaway from Table 1 is the sheer magnitude of the deviations. On average, the preseason rankings are off by about 17 spots, when averaged over the entire field of FBS teams. Looking at this another way, Figure 1 below presents the entire data set (2012 to 2019) for just Phil Steele in histogram form.
The distribution does not appear to be Gaussian, but the standard deviation calculates to 22.5. Just to throw out a few more numbers, Phil Steele gets the rankings “correct” within 10 slots about 40 percent of the time and within 20 slots about 66 percent of the time. The other side of the coin is that a full third of the teams in any given preseason magazine power ranking will end the season at least 20 slots above or below where they are on the list in August. As it turns out, understanding this uncertainty is important if one wants to construct an accurate model of the season.
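Those summary numbers are straightforward to compute from the paired rankings. Here is a small sketch of the calculation (the ranks in the usage example are made up, not the actual 2012-2019 data):

```python
import statistics

def deviation_stats(preseason, final):
    """Deviation = final rank minus preseason rank for each team.
    Returns (standard deviation of the deviations,
             fraction 'correct' within 10 slots,
             fraction 'correct' within 20 slots)."""
    devs = [f - p for p, f in zip(preseason, final)]
    sd = statistics.pstdev(devs)
    frac = lambda k: sum(abs(d) <= k for d in devs) / len(devs)
    return sd, frac(10), frac(20)
```

Run over eight years of 130-team fields, this is the calculation that produces the 22.5 standard deviation and the 40/66 percent hit rates quoted above.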
Heading off to Monte Carlo
I am certainly not the only nerd out there who likes to use probabilities to simulate the college football season. Most notably, ESPN’s FPI website has a table that gives their probabilities that each team will win their division, conference, make the playoff, etc. They currently give Clemson a 36 percent chance to win the National Title. Ohio State and Alabama have a 19 and 18 percent chance respectively. As for MSU, the FPI gives the Spartans a 0.4 percent chance of just getting to six wins. ESPN lists MSU’s odds for anything better than that to be zero.
If that depresses you, don’t worry. It’s also not true. Well, I suppose that it might be true, from a certain point of view. That point of view is the one where the FPI’s preseason rankings are correct for all teams. But, I have already shown that they are not. The preseason rankings themselves have a variance that is measurable, and if a model does not take that into account, it is not going to give an accurate result. I would argue that the FPI overestimates the odds of highly ranked teams and underestimates the odds of lower ranked teams. They have nowhere to go but down (or up), and odds are some of them will.
In my model, I accounted for this variance by first analyzing the relationship between the preseason rankings and the average year-end power index that I calculate for each team. My power indices are the raw scores for each team that I use for my power ranking. Each team earns a performance score for each game, based on the final score and strength of the opponent. The straight average of those performance scores for all completed games is the power index for that team. The magnitude of the power indices is arbitrary (it is the difference between them which matters), but I set them such that the average for all teams is around two. That correlation between the preseason rankings and the final power indices is shown below in Figure 2.
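Structurally, the power index works something like the sketch below. To be clear, the per-game scoring formula here is a stand-in that I made up for illustration; the article only says that it uses the final score, the location, and the opponent's strength. The 2.5-point home-field value and 28-point blowout cap are assumptions, not the author's numbers.

```python
def performance_score(margin, opponent_index, home):
    """A hypothetical per-game score: opponent strength plus a capped,
    venue-adjusted scoring margin. NOT the author's actual formula;
    it only mirrors the stated inputs (score, location, opponent)."""
    adj_margin = margin - (2.5 if home else -2.5)  # assumed home-field value
    capped = max(-28.0, min(28.0, adj_margin))     # assumed blowout cap
    return opponent_index + capped / 14.0

def power_index(game_scores):
    """Per the article: the power index is the straight average of a
    team's per-game performance scores across all completed games."""
    return sum(game_scores) / len(game_scores)
```

Note that winning by a touchdown on the road grades out better than the same margin at home, which is the kind of location adjustment the article describes.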
The data is pretty noisy (high variance) but the average follows a pretty noticeable trend. As the preseason rank decreases, the power index also decreases. The relationship is fairly linear in the middle of the graph, but it also tails up for the very good teams (roughly the Top 10) and down for the very bad teams (roughly the bottom 10). The standard deviation varies between about 0.1 and 0.2.
I took this data and fit it to a third order polynomial. I then took the average of the standard deviations and got a value of 0.143. Using this information, I generated a best-fit as shown below in Figure 3.
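In Python, that fit-then-sample procedure might look like the following sketch. The cubic fit and the 0.143 sigma come from the article; the use of NumPy's polynomial tools is my assumption.

```python
import numpy as np

def fit_rank_to_index(ranks, indices):
    """Third-order polynomial fit of final power index vs. preseason
    consensus rank (the Figure 2 -> Figure 3 procedure)."""
    return np.polynomial.Polynomial.fit(ranks, indices, deg=3)

def sample_final_index(poly, rank, sigma=0.143, rng=None):
    """One simulated 'true' final index for a team: the fitted mean at
    its preseason rank, plus Gaussian noise with the article's averaged
    standard deviation of 0.143."""
    rng = rng or np.random.default_rng()
    return poly(rank) + rng.normal(0.0, sigma)
```

Sampling from the fitted curve plus noise is what reproduces the shape and spread of the real data in Figure 3, rather than just its average trend.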
As you can see, this figure captures well the shape and variance of the real data. I was now ready to set up the full season Monte Carlo simulation, which is essentially a simulation within a simulation. For each cycle, I first used the data presented in Figure 3 to fix the real/final power index for each team in that cycle, using the 2020 consensus power rankings and a set of random numbers. Some teams project to be better than expected, while others project to be worse. This calculation fixes the point spreads for all regular season games in a given cycle.
I then used another series of random number generators to pick the winner of each game. As an added feature this year, I also went through the logic of the conference tiebreakers and created a rough algorithm to select the four Playoff participants (based on the power rankings of any conference champions and other 11-win or 12-win teams). This allowed me to simulate a complete season, all the way to the National Champion. Each cycle (full season) takes about one second to calculate.
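The nested structure of the simulation can be sketched as below. The outer draw fixes each team's "true" strength for a cycle; the inner draws play out the games. The conversion from index difference to point spread (14 points per index unit) and the 13.5-point margin sigma are illustrative assumptions of mine, not the author's calibration, and the tiebreaker/Playoff selection logic is omitted.

```python
import random
from statistics import NormalDist

# Illustrative calibration, not the author's: one unit of power-index
# difference ~ 14 points of spread; margins vary with sigma ~ 13.5.
PTS_PER_INDEX = 14.0
MARGIN_SIGMA = 13.5

def simulate_season(preseason_index, schedule, n_cycles=1000,
                    rank_sigma=0.143, rng=None):
    """Simulation within a simulation: each cycle first perturbs every
    team's 'true' index (outer randomness), which fixes all spreads,
    then plays each game with a weighted coin flip (inner randomness).
    Returns total wins per team across all cycles."""
    rng = rng or random.Random()
    wins = {t: 0 for t in preseason_index}
    for _ in range(n_cycles):
        true = {t: x + rng.gauss(0.0, rank_sigma)
                for t, x in preseason_index.items()}
        for home, away in schedule:
            spread = PTS_PER_INDEX * (true[home] - true[away])
            p_home = 1 - NormalDist(mu=spread, sigma=MARGIN_SIGMA).cdf(0)
            winner = home if rng.random() < p_home else away
            wins[winner] += 1
    return wins
```

Because the "true" index is redrawn every cycle, a mid-pack team occasionally turns out to be a top-20 team for that cycle, which is exactly how outcomes like the hypothetical 9-3 MSU season below get their probability mass.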
Just to give an example, MSU’s consensus ranking is No. 50 in the preseason. But, maybe MSU finds a quarterback, the offensive line is healthy and effective, Elijah Collins has a breakout sophomore year, and a few new stars on the defense emerge. Maybe MSU is actually the 20th best team in the country and goes 9-3, good for second place in the Big Ten East.
Maybe the offense struggles and MSU is actually only the 73rd best team in the country. But, MSU steals a few upset wins over Iowa and Minnesota to finish 7-5. Or, maybe the Spartans are slightly better at No. 69 overall, yet can only eke out five wins total and a 2-7 Big Ten campaign. Those are three consecutive outcomes that my simulation just spit out for me as I typed this. All seem feasible. I can estimate the probability of each similar outcome.
This year, due to various COVID-19 related issues, I was only able to get the preseason rankings from ESPN’s FPI, Athlons, and Lindy’s. While it would have been better to have Phil Steele as well, Table 1 above shows that overall, all sources have a very similar level of error. The season that I am simulating will likely never happen in its entirety anyway, so I am treating this mostly as an exercise to test my new methodology. At the same time, I think that it tells us a few things about how to set expectations based on the preseason rankings.
Initial Season Simulation Results
According to the ESPN FPI website, they run their simulation for 20,000 cycles. I decided to let my home desktop crank on my simulation for a bit longer: a full one million cycles. (It took my computer about a week and a half, broken into ~12-hour segments.) But, I now have a very good set of reference data.
This simulation allows me to calculate the odds for all sorts of outcomes, including the odds for each team to win its division and/or conference, make the playoffs, and win the National Title. I plan to dig deeper into those results in a later post. For now, here is the curve that I generated for the odds to make the playoffs based on the preseason consensus ranking.
Based on my calculations, the consensus No. 1 team for 2020 (Clemson) has about a 40 percent chance to make the playoffs. Note that this is lower than the FPI prediction of 81.2 percent (which is too high, in my opinion). But the best-fit line suggests that a generic consensus No. 1 team would only have about a 34 percent chance. Clemson’s higher percentage this year is a function of their relatively easy schedule compared to other Power Five teams.
In the current era, it seems that the “real” odds are likely somewhere between my prediction of 40 percent and the FPI’s prediction of 80 percent. My correlations essentially assume that the baseline for each team is a historically average team of that strength. It is safe to say that teams like Clemson, Alabama, and maybe Ohio State are likely all a bit better than an average team of their preseason rank, historically. If that is true, my numbers are a bit low.
In any event, this figure gives us a good rule of thumb to estimate any team’s odds to make the Playoffs based simply on their consensus preseason rank. Teams in the top five have a roughly 30 percent chance to make the playoffs. For the team ranked No. 10, the odds fall to around 17 percent. The odds for the team ranked 25th drop to just four percent.
The FPI essentially stops there, but my calculations suggest that the curve extends down smoothly almost all the way to the worst team in the FBS. I was able to fit the data to a simple exponential function, and plotting the data on a log plot shows how good the fit is. The correlation is shown below in Figure 5.
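An exponential that looks like a straight line on a log plot can be fit with a simple linear regression on the logarithm of the odds. This is my own sketch of that step; the specific coefficients used in the usage example are made up, not the article's fitted values.

```python
import numpy as np

def fit_playoff_odds(ranks, odds):
    """Fit odds ~ a * exp(b * rank) by linear regression on log(odds),
    i.e. the straight line the simulation data traces on a log plot."""
    b, log_a = np.polyfit(ranks, np.log(odds), 1)
    return float(np.exp(log_a)), float(b)

def playoff_odds(rank, a, b):
    """Rule-of-thumb playoff probability for a given consensus rank."""
    return a * np.exp(b * rank)
```

The deviation for teams ranked below No. 100 makes sense here: with playoff berths occurring only a handful of times in a million cycles, the simulated frequencies down there are too sparse to follow the smooth curve.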
The correlation remains very good all the way down to the teams ranked around No. 100, where it appears to fall off. For the teams ranked right around the century mark, my simulation did observe around 10 berths into the College Football Playoff over the one million simulations. After that, however, it drops to zero. So, I think it is safe to say that teams such as San Jose State (ranked No. 111), Rice (No. 114), and Northern Illinois (No. 112) have less than a one-in-a-million chance to make the College Football Playoff.
With this introduction in place, it will soon be time to take a look at the Big Ten in more detail. As I said before, the details will almost certainly be moot, but I think that we can still learn something through the analysis. Who knows, maybe a vaccine will be discovered tomorrow and leagues will cut a deal to start a full football season in March. The odds of that happening feel better than Utah State’s playoff odds. Sorry, Aggies....
Until next time, stay tuned, and Go Green.