If you want to read from the start of the series, check out last week’s column on Week One:
The quick background for anyone new is that I’ve built a machine learning (ML) model to predict the winners of NFL games. The model utilizes about 400 data points per game and is trained on 12 seasons of NFL games (~3,000 games).
In addition to assessing the results and the picks being made, I’ll provide more background about the data pipelines and model design over time. We’ll do that here upfront before getting to the recap from last week and the picks for next week.
So if you want to hear a bit more about the data and process of modeling, read on. If not, skip ahead to the Week One Recap.
The Data
We’ll start at the top of the pipeline: what data are we using to build this model?
The data used largely comes from four sources:
Pro Football Reference (www.pro-football-reference.com)
Sportsbook Review Online (www.sportsbookreviewsonline.com)
538 ELO Data (data.fivethirtyeight.com)
The nfl_data_py python package (pip install nfl_data_py)
Now, I’ve already had some issues between last year and this year:
Sportsbook Review (which I use for historical gambling line data) has stopped providing that historical data.
FiveThirtyEight has been largely cut by Disney and thus no one is maintaining their ELO dataset for 2023.
So… that’s fun. The first I’ve gotten around by keeping my own line data for 2023, which isn’t too bad to do. The second is more complicated. For now, I’ve removed that data from the model, and I’ll consider some way to replace it whenever I get a chance to do another round of broader data enhancements.
I’ve processed this data in Databricks for a few reasons, and have been able to largely fit everything within their free Community Edition. The best part is that I can then do SQL for the data pipelines and Python for the modeling all in one spot.
Broadly (and there will be more to say about this next time), I load each data feed into its own bronze-level table that stores the raw data, and then create several persisted silver-level data sets that sit atop those. From there, I do some manipulations and create many derived fields before moving on to the modeling.
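To make the bronze/silver idea a bit more concrete, here is a minimal sketch of the pattern. It is not my actual Databricks code: it uses Python's built-in sqlite3 as a stand-in, and the table names, column names, and placeholder rows are all made up for illustration.

```python
import sqlite3

# In-memory database as a stand-in for the real warehouse (assumption).
conn = sqlite3.connect(":memory:")

# Bronze: the raw feed stored as-is, everything kept as text.
# Table and column names here are hypothetical.
conn.execute(
    "CREATE TABLE bronze_games "
    "(season TEXT, week TEXT, home TEXT, away TEXT, home_pts TEXT, away_pts TEXT)"
)
conn.executemany(
    "INSERT INTO bronze_games VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("2023", "1", "HOME_A", "AWAY_B", "20", "21"),  # placeholder rows,
        ("2023", "1", "HOME_C", "AWAY_D", "27", "17"),  # not real results
    ],
)

# Silver: a persisted, typed table built on top of bronze,
# with one derived field (the home team's scoring margin).
conn.execute(
    """
    CREATE TABLE silver_games AS
    SELECT CAST(season AS INTEGER) AS season,
           CAST(week AS INTEGER) AS week,
           home,
           away,
           CAST(home_pts AS INTEGER) - CAST(away_pts AS INTEGER) AS home_margin
    FROM bronze_games
    """
)

rows = conn.execute(
    "SELECT home, away, home_margin FROM silver_games ORDER BY home"
).fetchall()
print(rows)
```

The point of the split is that the bronze table never changes shape, so the silver layer can be rebuilt from it whenever the cleaning or derived-field logic changes.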
I then have master scripts that, assuming the data is there, update all the bronze-level tables and any data that sits atop them. But again, more on that next time! For now, let’s see how this all worked out in week one:
Week One Recap
It’s worth noting here that in addition to the picks we make here, I am entering picks from this model in a season-long pick’em contest that I run. I entered these under the name Al Mumma, which I figured was a fun shortening of Algorithm that also looks like “AI”.
As such, we’re going to personify the picks here from Al. In this column, Al picked six underdogs and went a very respectable 3-3 given the deep dog strategy:
Detroit Lions (+190) - WIN
Indianapolis Colts (+184) - LOSS
Tampa Bay Buccaneers (+220) - WIN
Houston Texans (+398) - LOSS
Arizona Cardinals (+270) - LOSS
LA Rams (+199) - WIN
Using our approach of betting $100 per game, Al put down $600 and came out $309 ahead, for a very respectable 52% ROI!
Assuming we started with a $1,000 bankroll, we’re now at $1,309.
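For anyone who wants to check the math, here is a small sketch of how those numbers fall out of the money lines. The helper name profit_on_win is my own for illustration; the lines and results are the six picks above, at a flat $100 stake.

```python
def profit_on_win(moneyline: int, stake: float = 100.0) -> float:
    """Profit (not counting the returned stake) on a winning bet at American odds."""
    if moneyline > 0:
        # Underdog: a +190 line pays $190 profit per $100 staked.
        return stake * moneyline / 100
    # Favorite: a -153 line requires $153 staked to win $100.
    return stake * 100 / abs(moneyline)


# (team, money line, did the pick win?) for Al's week one picks
week_one = [
    ("Lions", 190, True),
    ("Colts", 184, False),
    ("Buccaneers", 220, True),
    ("Texans", 398, False),
    ("Cardinals", 270, False),
    ("Rams", 199, True),
]

STAKE = 100.0
staked = STAKE * len(week_one)
# A win adds the profit; a loss costs the full stake.
net = sum(profit_on_win(ml, STAKE) if won else -STAKE for _, ml, won in week_one)
roi = net / staked
bankroll = 1000 + net

print(f"staked ${staked:.0f}, net ${net:.0f}, ROI {roi:.1%}, bankroll ${bankroll:,.0f}")
```

The three wins pay $190, $220, and $199, and the three losses cost $100 each, which nets out to the $309 above. The exact ROI is 51.5%, which I’ve rounded to 52%.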
I should add that in the pick’em contest the incentives are slightly different and Al took the following:
Titans - Loss
Jaguars - Win
Steelers - Loss
Packers - Win (bonus for underdog pick)
Dolphins - Win (bonus for underdog pick)
Ravens (Key Pick) - Win (worth double as key pick)
This put Al in a tie for 2nd place out of 15 people. It’s still early, but this can be another measure of Al’s success.
But for now, we’ll have to see if this success continues in week two:
Week Two Picks
Al is seeing a lot of mismatched lines in week two!
Of course, here is where we are going to start to see some challenges with an algorithmic approach.
Al LOVES the Jets this week against the Cowboys. But as I have no player level or injury data in my model, it has no idea that Aaron Rodgers is out for the year. All it sees in the Jets is a 1-0 team that beat the Bills, who were a good team last year.
And sadly, Al doesn’t even get to enjoy the memes that resulted from such a sad injury!
With that said, here are the picks from Al, using current money lines (as of Tuesday afternoon):
Vikings +263 over the Eagles
Raiders +380 over the Bills
Seahawks +232 over the Lions
Chiefs -153 over the Jaguars
Bears +142 over the Buccaneers
Cardinals +204 over the Giants
Rams +308 over the 49ers
Jets +398 over the Cowboys
So that is eight picks, putting 800 (fake) dollars on the line for week two. Risking ~60% of your bankroll in one week (and on a bunch of underdogs at that) isn’t sound portfolio management when it comes to gambling, but let’s roll with it.
Just don’t do that at home.
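To put a number on that exposure, here is a quick sketch using the week two lines above and the $1,309 bankroll. The variable names are my own, and the best and worst cases are just the two extremes (all eight hit, or none do), not a prediction.

```python
# Week two money lines from the picks above
week_two = {
    "Vikings": 263, "Raiders": 380, "Seahawks": 232, "Chiefs": -153,
    "Bears": 142, "Cardinals": 204, "Rams": 308, "Jets": 398,
}

STAKE = 100
BANKROLL = 1309  # bankroll after week one


def profit_on_win(moneyline: int, stake: float) -> float:
    """Profit on a winning bet at American odds, excluding the returned stake."""
    if moneyline > 0:
        return stake * moneyline / 100
    return stake * 100 / abs(moneyline)


total_staked = STAKE * len(week_two)
fraction = total_staked / BANKROLL

# Extremes only: every pick loses, or every pick wins.
worst_case = BANKROLL - total_staked
best_case = BANKROLL + sum(profit_on_win(ml, STAKE) for ml in week_two.values())

print(f"staked ${total_staked} = {fraction:.0%} of bankroll")
print(f"bankroll range after week two: ${worst_case:,.0f} to ${best_case:,.0f}")
```

An 0-8 week would take the bankroll from $1,309 down to $509, which is why this is not a staking plan to copy.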
As for my take, Al has some questionable picks here, certainly not ones that I would make! The Jets pick is quite likely a loss, and it is also hard to believe the Bills will start 0-2.
That said, many of these picks do a good job of not overreacting to week one. Teams like the Vikings, Bears, and Seahawks scream “bounceback” assuming they aren’t as bad as they looked in week one. And the Rams and Cardinals picks? I can honestly see the value there as the Rams always play the 49ers tough and the Giants may just be really bad. I’d still say the Giants and 49ers are going to win, but for a 2x or 3x return? It could be worth a shot!
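One way to think about whether a long shot is "worth a shot" is the break-even win rate a money line implies: how often the pick has to hit for a flat bet to come out even. Here is a rough sketch of that conversion. The function name is my own, and it ignores the sportsbook's built-in margin, so treat the output as a ballpark rather than a true probability.

```python
def break_even_probability(moneyline: int) -> float:
    """Win rate at which a flat bet at this American money line breaks even."""
    if moneyline > 0:
        # Underdog: risk 100 to win `moneyline`
        return 100 / (moneyline + 100)
    # Favorite: risk |moneyline| to win 100
    return abs(moneyline) / (abs(moneyline) + 100)


# A few of the week two lines from above
for team, line in [("Rams", 308), ("Cardinals", 204), ("Chiefs", -153)]:
    print(f"{team} ({line:+d}): break-even at {break_even_probability(line):.1%}")
```

So the Rams at +308 only need to win roughly one time in four for the bet to break even, and the Cardinals at +204 about one time in three. If you think their real chances are better than that, the pick has value even if the other side is still the likelier winner.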