The NBA season is about to start and so I thought I'd write a blog post detailing my adventure setting estimated win totals for the upcoming season.
This is a fun project because it gets you thinking in a disciplined way about the prospects of your favourite team and others.
An added bonus is the element of competition, since the oddsmakers are happy and eager to publish their own results from their much more sophisticated (but not necessarily much more accurate) models.
My basic approach is to sum a forecast of a team’s adjusted +/- into an aggregate total, then use Daryl Morey’s application of Bill James’ Pythagorean Wins estimate to determine how many wins each team will come away with over the season's 82 games.
This approach isn’t perfect because +/- is built at the player level rather than the team level (unlike metrics like ORtg and DRtg), but it certainly serves my purpose of thinking more rigorously about basketball and so I figure the concept has some use.
There are plenty of different ways to go about this, so I won’t get too bogged down in the details. But the basic approach I followed is:
1. I pulled out the starting 5 and first sub at each position from ESPN Depth Charts:
3. I projected this year’s values by inflating T-1 BPM for each player by (admittedly arbitrary) Talent, Coach, and Culture adjustments. Rookies were generally given a value of -2.0, which is the value of a replacement-level player. Ad hoc adjustments were made in a few cases where I thought additional adjustment was necessary (eg Wemby).
4. I built a minutes forecast primarily on T-1 data (with some adjustments for injuries, load management, etc), then normalized it across the 10 players so that the total minutes played = 240 per game.
5. Multiplying the forecast OBPM and DBPM values by the forecast minutes gave me an aggregate adjusted sum. Adjusting this by pace (to convert the per-100-possessions BPM metrics to a per-game output; mostly I just used the T-1 value) and adding the league-average score (here I relied on the T-1 value) gives me per-game points for (using OBPM) and points against (using DBPM).
6. Plugging points for and against into the Morey/James Pythagorean formula provides win expectations. I also normalized the Pythagorean results to equal the NBA season’s 1,230 total wins (otherwise I’d have more wins and losses than the schedule allows). This isn’t an ideal step, but the alternative is to use Excel’s Goal Seek to set the exponent to a number that provides the proper total. This can be done pretty straightforwardly, but for now I don’t want to depart too far from Morey’s 13.91 value (though already 538 and others depart from it slightly).
7. Comparing these win totals against the Over/Unders set by the sharps gives a good target to aim at:
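For anyone who would rather code this than wrestle with a spreadsheet, steps 4 to 6 can be roughed out in Python. This is a minimal sketch, not the spreadsheet itself: the minutes / 48 weighting, the function names, the team codes `AAA` and `BBB`, the league-average score of 114, and every player number in the toy example are assumptions made up for illustration.

```python
GAMES = 82
TEAM_MINUTES = 240.0  # 5 players on the floor x 48 minutes
EXPONENT = 13.91      # Morey's exponent, as quoted in step 6
LEAGUE_AVG = 114.0    # hypothetical league-average points per game


def normalize_minutes(raw_minutes):
    """Rescale forecast minutes so the rotation totals 240 per game."""
    total = sum(raw_minutes)
    return [m * TEAM_MINUTES / total for m in raw_minutes]


def team_rating(bpm, minutes):
    """Minutes-weighted sum of player BPM, per 100 possessions.

    Assumption: each player is weighted by minutes / 48.
    """
    return sum(b * m / 48.0 for b, m in zip(bpm, minutes))


def points_for_and_against(obpm, dbpm, minutes, pace, league_avg):
    """Convert per-100-possession ratings into per-game points."""
    per_game = pace / 100.0
    points_for = league_avg + team_rating(obpm, minutes) * per_game
    # A positive DBPM means better defense, so it lowers points against.
    points_against = league_avg - team_rating(dbpm, minutes) * per_game
    return points_for, points_against


def pythagorean_wins(points_for, points_against, exponent=EXPONENT, games=GAMES):
    """Pythagorean expectation scaled to a full season of games."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return games * pf / (pf + pa)


def normalize_wins(raw_wins, total_wins):
    """Scale every team's wins so the league adds up to total_wins."""
    scale = total_wins / sum(raw_wins.values())
    return {team: wins * scale for team, wins in raw_wins.items()}


# Toy two-team "league" with made-up 10-man rotations.
rosters = {
    "AAA": {
        "obpm": [4.0, 2.0, 1.0, 0.5, 0.0, -0.5, -1.0, -1.0, -1.5, -2.0],
        "dbpm": [1.0, 0.5, 0.5, 0.0, 0.0, -0.5, -0.5, -0.5, -1.0, -1.0],
        "minutes": [34, 33, 32, 30, 28, 20, 18, 16, 14, 12],
        "pace": 100.0,
    },
    "BBB": {
        "obpm": [1.0, 0.0, -0.5, -1.0, -1.0, -1.5, -2.0, -2.0, -2.0, -2.0],
        "dbpm": [0.0, -0.5, -0.5, -0.5, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0],
        "minutes": [35, 34, 30, 30, 26, 22, 18, 16, 14, 12],
        "pace": 98.0,
    },
}

raw = {}
for team, r in rosters.items():
    minutes = normalize_minutes(r["minutes"])
    pf, pa = points_for_and_against(r["obpm"], r["dbpm"], minutes, r["pace"], LEAGUE_AVG)
    raw[team] = pythagorean_wins(pf, pa)

# Every game produces exactly one win, so there are teams * 82 / 2 wins to
# hand out. For the real 30-team league that is 1,230.
wins = normalize_wins(raw, total_wins=len(rosters) * GAMES / 2)
for team, w in wins.items():
    print(f"{team}: {w:.1f} wins")
```

The final rescaling is the same blunt normalization described in step 6: it leaves the exponent alone and simply stretches or shrinks every team's total by the same factor.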
A few thoughts that came out of this process:
* It would sure be great if all the different data sources (NBA, Basketball Reference, ESPN, etc) used the same nomenclature (eg PHI vs PHL). Data cleaning is a pain, especially in a spreadsheet rather than a coding environment (like Stata).
* Bill James’ Pythagorean tool is incredibly handy. And shout-out to Wayne Winston for making all of this business understandable.
* Working through the numbers helps you think clearly about simple applications of the model. One of my favourite revelations is how a slower pace makes sense as a *strategy* for bad teams. If things go sideways every time you touch the ball, don’t chuck but clog things up and try to stop the bleeding.
* BPM is -2.0 for a replacement-level player. The implication is that this is a star-heavy league, where top talent is *expected* to light you up, but you can still stay in the league if you limit the damage.
* I clearly don’t think enough about how much minutes profiles matter. Tinkering with the model shows that if you take minutes away from even an incredibly stacked team, you have real problems. This is also a good way to think about how young can outperform old, how we probably don’t assign enough weight to how catastrophic injuries are to a team’s fortunes, etc. Good minutes aren’t everything. But they’re almost.
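The pace point is easy to check with a few lines of Python. This is a rough sketch under stated assumptions: the function name `expected_wins`, the league-average score of 114, and the +/-4 ratings are made up for illustration, and league-average scoring is held fixed while pace changes, which is a simplification.

```python
def expected_wins(off_rating, def_rating, pace,
                  league_avg=114.0, exponent=13.91, games=82):
    """Pythagorean wins for a team with the given per-100-possession ratings.

    off_rating and def_rating are relative to league average; positive is good.
    league_avg is a hypothetical value, held fixed as pace changes.
    """
    points_for = league_avg + off_rating * pace / 100.0
    points_against = league_avg - def_rating * pace / 100.0
    pf = points_for ** exponent
    pa = points_against ** exponent
    return games * pf / (pf + pa)


# A bad team (-4 on offense, -4 on defense) at a slow and a fast pace.
bad_slow = expected_wins(-4.0, -4.0, pace=95.0)
bad_fast = expected_wins(-4.0, -4.0, pace=105.0)

# A good team (+4 / +4) at the same two paces.
good_slow = expected_wins(4.0, 4.0, pace=95.0)
good_fast = expected_wins(4.0, 4.0, pace=105.0)

print(f"bad team:  slow {bad_slow:.1f} wins, fast {bad_fast:.1f} wins")
print(f"good team: slow {good_slow:.1f} wins, fast {good_fast:.1f} wins")
```

Fewer possessions means fewer chances for the per-possession gap to show up on the scoreboard, so under these assumptions the bad team projects better at the slower pace and the good team projects better at the faster one.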
In any case, if you have a bit of time, interest in stats, and love of basketball, this is a good project for you to try out on your own.