SSAC Thoughts: Three More Questions for Analytics

The Sloan Sports Analytics Conference is invariably a great place to exercise one's basketball brain, and this year's conference was no exception.  Kevin Pelton got the conversation started before the conference was even underway with his outstanding article addressing his 10 biggest questions for basketball analytics going forward.  All of Pelton's choices were extremely worthy nominees (far be it from me to quibble with conference star Stan Van Gundy, who termed the article "excellent"), but the conference panels and my discussions throughout brought me to three big questions I would add to his list.* Warning: Hardcore Nerd Content Ahead.

* It should be noted that many teams have very likely studied some if not all of these aspects, but sadly we are limited to discussing what is in the public domain at the moment.


HOW DOES PERFORMANCE VARY BY CONTEXT?

One of the key purposes of sports analytics is to measure performance.  Teams, coaches, players, writers, and fans want to know how well players and teams are doing, how well they are likely to do in the future, and what needs to be improved.  A key frontier in these efforts is context-dependent statistics.  As Blazers coach Terry Stotts noted in this podcast, basketball is perhaps the most context-dependent of the major American sports due to the high frequency of scoring and the quickness with which teams change ends.

*Said Stotts:  "One thing I think is very difficult for basketball in comparison to football and baseball is those two sports have static events. There's an at-bat and something happens on that at-bat and you can quantify each at-bat and same thing with football with, even though there's 11 players on the field, you have a down and yardage each time and it's a very static event. Basketball is a free-flowing, with different matchups and different set of circumstances, perhaps every time down the floor."

Take team defense as an example.  The current standard for measuring defense is the defensive efficiency statistic, which measures a team's points allowed per 100 possessions.  But most would acknowledge that it is far easier to get a stop after a made basket or a dead-ball situation than after a live-ball turnover or a missed field goal.  As a general proposition, then, it is harder to defend when a team has a bad offense, because that offense hands the opponent more misses and turnovers to run off.  Similarly, teams that force more misses and (especially) turnovers should have an easier time of it on offense.  While complicated, it should be possible to arrive at adjusted defensive and offensive rankings based on the situation a team faces each time down the court.
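To make the idea concrete, here is a minimal sketch of one way such an adjustment could work.  It compares a team's raw defensive efficiency to what a league-average defense would be expected to allow against the same mix of possession starts.  Everything here is an assumption for illustration: the four start categories, the league-average values in LEAGUE_PPP, the function names, and the two made-up teams.  It is not an established metric.

```python
# Hypothetical league-average points allowed per possession, keyed by how the
# possession started. Illustrative numbers only, not measured values.
LEAGUE_PPP = {
    "made_basket": 1.00,
    "dead_ball": 1.02,
    "missed_fg": 1.08,
    "live_turnover": 1.25,
}


def raw_defensive_rating(possessions):
    """Points allowed per 100 possessions, ignoring context."""
    points = sum(pts for _, pts in possessions)
    return 100.0 * points / len(possessions)


def expected_rating(possessions, league_ppp=LEAGUE_PPP):
    """Rating a league-average defense would post against the same mix of starts."""
    expected = sum(league_ppp[start] for start, _ in possessions)
    return 100.0 * expected / len(possessions)


def context_adjusted_rating(possessions, league_ppp=LEAGUE_PPP):
    """Raw minus expected rating; negative means better than average for the context."""
    return raw_defensive_rating(possessions) - expected_rating(possessions, league_ppp)


def build(start, n, baskets):
    """n possessions of one start type, `baskets` of which allowed a 2-point score."""
    return [(start, 2)] * baskets + [(start, 0)] * (n - baskets)


# Two invented teams that allow the same raw points per 100 possessions,
# but Team B's offense turns the ball over far more often.
team_a = build("made_basket", 80, 40) + build("live_turnover", 20, 12)
team_b = build("made_basket", 50, 23) + build("live_turnover", 50, 29)

for name, poss in (("A", team_a), ("B", team_b)):
    print(name, raw_defensive_rating(poss), context_adjusted_rating(poss))
```

Both invented teams allow 104 points per 100 possessions, but Team B does so against a much tougher mix of starts.  On this adjusted scale it comes out 8.5 points better than expected, versus 1.0 for Team A.  A real version would need far finer context (time on the shot clock, lineup, score) and measured league averages.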

The universe of context-dependent statistics seems nearly limitless.  Kirk Goldsberry (who submitted a fine paper on interior defense at Sloan) examined one such metric in his article on the Kobe Assist, in which he found that Bryant's misses were more valuable than average because teammates were able to rebound a higher percentage of his missed shots than normal.  John Huizinga and Sandy Weil, in their seminal paper evaluating the Hot Hand, also looked at how likely teams were to score depending on how they got the ball.*

*For some reason I could not find a copy of this paper online anymore, as it seems like their site has been taken down.

State-of-the-art player evaluation statistics may soon move well beyond even advanced metrics such as PER to credit players for their types of misses as well as makes, or for blocking a high-value close shot as opposed to an inefficient midrange jumper.  For another example, PER credits a steal with simply the value of a possession, when research shows that a steal usually leads to a more efficient subsequent possession than the average one.  New metrics could account for this context as well.*

*It seems almost certain that savvy teams like the Rockets are already doing this.  Rockets GM Daryl Morey let slip during one of his panels (in what was admittedly a hypothetical) that a player's steal could be worth as much as 3 points, presumably from the value of stopping the opposition and leading to a sure basket.
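The arithmetic behind the steal example can be sketched in a few lines.  The function names and every number below are my own assumptions for illustration.  The "gross swing" is simply my reading of how a hypothetical like Morey's could reach 3 points, while the "net value" subtracts the ordinary possession the team would have received anyway once the opponent's trip ended.

```python
def gross_swing(opp_points_denied, points_scored_after):
    """Scoreboard swing if a steal wipes out the opponent's trip and leads to a score."""
    return opp_points_denied + points_scored_after


def net_steal_value(opp_ppp, ppp_after_steal, baseline_ppp):
    """Expected value of a steal relative to an ordinary change of possession.

    opp_ppp:         expected points the opponent loses on the ended possession
    ppp_after_steal: expected points on the possession that follows a steal
    baseline_ppp:    expected points on an ordinary possession
    """
    transition_bonus = ppp_after_steal - baseline_ppp
    return opp_ppp + transition_bonus


# Hypothetical inputs, not measured league values.
print(gross_swing(1.0, 2.0))                # stop worth about a point, then a sure basket
print(net_steal_value(1.05, 1.30, 1.05))    # possession value plus a transition bonus
```

With these invented inputs, crediting only "the value of a possession" would give the steal 1.05 points, while the context-aware version gives it about 1.30.  The gap is entirely the transition bonus, which is the piece a new metric would need to estimate from real data.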

Thus, while basketball people have long understood that the outcome of a shot or possession is intrinsically related to what came before, publicly available analytics have only reached the tip of the iceberg in accounting for context.



WHAT NEW DATA IS PREDICTIVE OF PLAYER AND TEAM PERFORMANCE?



New analytics could predict Thomas Robinson's chances to do more of this in future years

My analysis of Thomas Robinson's struggles this year brought me to the question of how likely he is to improve his interior finishing.  While I posited that it seemed unlikely to improve significantly because finishing seems more dependent on innate abilities than learned skills, I do not know whether this is true.  However, we now have shot location data going back approximately 10 years to potentially answer how players' interior shooting develops.  In a more macro sense, all of the new data being developed could eventually be plugged into player and team projection systems.  These could include player shooting effectiveness from different distances (or even specific spots on the floor), number of dunks, points per possession on post-ups, on/off court effectiveness, weight gain over the course of a career, vertical leap, wingspan, ad infinitum.  With the advent of Stats Inc.'s SportVU player tracking system,* even more metrics may soon be available, such as shooting percentage off the dribble, defensive misses forced (as outlined in Goldsberry's paper "The Dwight Effect"), or shooting percentage on contested shots, to name only a few.

*The SportVU system, a camera system that tracks the location of all players and the ball 25 times per second, was one of the stars of the Sloan conference for the second straight year, with three of the most interesting papers based on its data.  A future article will examine the utility of the SportVU system and what useful metrics analysts might glean from it.

Although any model that includes too many variables runs the risk of overfitting, particularly when a metric has no rational relationship to player performance, nearly any of these new metrics could be tested for statistical significance in projecting player development or team performance.  If Thomas Robinson is a poor offensive player because of his interior finishing, knowing whether that could improve would prove invaluable in determining whether to deal him, keep him, or sign him to a lucrative extension.
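As a rough sketch of what "testing a metric for significance" might look like, the snippet below checks whether an early-career number is associated with a later one using a permutation test on the Pearson correlation.  The technique is a generic one I am swapping in for illustration, not something drawn from any of the papers mentioned.  The data (rookie-year and year-three at-rim shooting percentages for ten imaginary players) and the function names are invented.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Share of random re-pairings whose |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # The +1 terms count the observed pairing itself, so p is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


# Invented data: at-rim FG% as a rookie and in year three for ten imaginary players.
rookie = [0.52, 0.55, 0.58, 0.60, 0.61, 0.63, 0.65, 0.66, 0.68, 0.70]
year_three = [0.56, 0.55, 0.61, 0.60, 0.65, 0.63, 0.66, 0.70, 0.69, 0.72]

print("r =", round(pearson_r(rookie, year_three), 3))
print("p =", round(permutation_p_value(rookie, year_three), 4))
```

The overfitting worry shows up here directly: test enough candidate metrics this way and some will clear a 0.05 threshold by chance alone, so a stricter threshold or a held-out set of seasons would be needed before trusting any one of them.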


WHAT IS THE OPTIMAL PLAYER USAGE TO PREVENT INJURIES AND DECREASED PERFORMANCE DUE TO FATIGUE?

I first took interest in the argument over minutes played last year in response to Henry Abbott's study which found that teams with a player who exceeded 3000 minutes per year had not won the championship since 2004.  While this was certainly a worthy study, it was my opinion that the data showed correlation rather than causation.  That is to say, giving star players lots of minutes likely correlated with championship disappointment for other reasons (such as that truly elite teams would have more blowouts and need to play stars less) rather than actually causing the team to play worse in the playoffs.  Since then, I have constantly debated the Bulls Twitter ballosphere on the topic of high minutes accumulated by the Bulls' stars under Tom Thibodeau.  While it is considered basketball gospel that "excessive" minutes either cause players to break down or play worse, nobody has publicly released any evidence showing this is actually the case.*

* The closest I have seen is Pelton's research examining fatigue and ACL injuries.



Luol Deng is at the forefront of the minutes debate

The first question is what kind of a study could "prove" this is indeed the case.  At the outset, it would seem that there are three time spans over which more minutes could adversely affect a player:  during a game, during a season, and over the course of a career.  Additionally, a player might suffer the effects of excessive minutes in one of two ways, either through reduced effectiveness as the game/season/career goes on, or through acute injury and missed games.  So, six separate studies would need to be undertaken to definitively answer these questions:

1.  Are players more likely to get hurt the longer they play in an individual game?  This could be measured in much the same way Pelton examined the ACL tears, except it would include all acute injuries that caused a player to a) leave the game or b) miss the next game.*  There may also be some value in parsing out the types of injuries.  For example, an injury such as being hit in the head would not seem to be dependent on being tired (although perhaps the study would show that it is!), whereas a pulled hamstring or sprained ankle would seem more likely to be the result of exhaustion.

*Another small subset of this study could examine whether players break down in the games following a high-minute game.

2.  Are players more likely to get hurt late in a season in which they have played heavy minutes?  This would simply look at the injury risk for each additional minute played during the season and whether it rises.  Another way of examining this would be to look at the number of high minute GAMES accumulated during the season.  It could be that individual high minute games are detrimental, much as baseball research has shown that high pitch counts in certain games are perhaps more predictive of injuries than overall number of pitches or innings thrown throughout the season.

3.  Are players more likely to get injured over the course of their career as they accumulate a) more total minutes, b) more heavy minute seasons, or c) more heavy minute games?  It should be noted that all of these studies would have to be normalized for the player's age, as we do know that young players tend to get hurt less often than older players.
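A bare-bones version of the bookkeeping these injury studies would need might look like the sketch below.  It groups playing time by age band and by how many minutes a player had already logged that season, then reports injuries per 1,000 minutes within each group, so that older and younger players are only compared with their peers.  The record layout, the band cutoffs, and the sample records are all hypothetical.

```python
from collections import defaultdict


def age_band(age):
    """Coarse age grouping so older and younger players are not pooled together."""
    if age < 25:
        return "under 25"
    if age < 30:
        return "25-29"
    return "30+"


def workload_band(minutes_before, edges=(1000, 2000)):
    """Bucket for season minutes already played when a stretch of games began."""
    for edge in edges:
        if minutes_before < edge:
            return f"<{edge}"
    return f"{edges[-1]}+"


def injury_rates(stints):
    """Injuries per 1,000 minutes for each (age band, workload band) group."""
    exposure = defaultdict(float)
    injuries = defaultdict(int)
    for s in stints:
        key = (age_band(s["age"]), workload_band(s["minutes_before"]))
        exposure[key] += s["minutes"]
        injuries[key] += s["injuries"]
    return {k: 1000.0 * injuries[k] / exposure[k] for k in exposure if exposure[k] > 0}


# Invented records: each is one stretch of games for one player.
stints = [
    {"age": 23, "minutes_before": 0, "minutes": 500, "injuries": 1},
    {"age": 23, "minutes_before": 500, "minutes": 500, "injuries": 0},
    {"age": 31, "minutes_before": 2200, "minutes": 400, "injuries": 2},
]

for group, rate in sorted(injury_rates(stints).items()):
    print(group, round(rate, 2))
```

The same grouping could be rerun with in-game minutes or career minutes as the workload measure, which covers the other two time spans.  Simple bands are the crudest form of age normalization, and a real study would more likely use a regression or survival model.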

Finally, each of these three studies could be repeated to determine whether a player suffered reduced effectiveness through whatever measurement is preferred, be it PER, WARP, or even much more subtle parsing such as rebound rate.*

*In addition to the potential of overall reduced effectiveness, perhaps research would also show that rebound rate decreases as minutes go up, but field goal percentage on jumpers remains constant.  This could be a reason to, say, avoid playing big men too many minutes while taking a less restrictive approach with guards.

If these studies tended to indicate that heavy minute players are more likely to get injured or lose effectiveness over a given time horizon, another question to consider is whether those players get injured more often on a per minute basis as their minutes increase, or whether the increased likelihood of injury is simply the result of being out there more often.  I would posit that heavy minute players actually get injured LESS often on a per minute basis than reserves.  I would expect this to be because reserves are often coming off the bench without a good warmup,* and because there is selection bias among the sample of heavy minute players.  Essentially, heavy minute players are just that because they've shown they can handle it.

*Another interesting issue is whether NBA teams should provide some mechanism for players to warm up in the bench area before they are due to enter the game.  Dennis Rodman famously rode an exercise bike to stay warm when he came off the bench at times for the Bulls in the 90s, but I am unaware of other players trying this approach.  I also think that teams might investigate giving players shorter but more frequent rests, which may better correlate with actual recovery times.
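The distinction between getting hurt more often and getting hurt more often per minute is easy to see with a little arithmetic.  The totals below are invented purely to show how the two rates can point in opposite directions.  They are not real injury data, and the function names are my own.

```python
def injuries_per_game(injuries, player_games):
    """Injury frequency measured per game appearance."""
    return injuries / player_games


def injuries_per_1000_minutes(injuries, minutes):
    """Injury frequency measured per 1,000 minutes on the floor."""
    return 1000.0 * injuries / minutes


# Invented group totals over the same number of game appearances.
groups = {
    "heavy minute players": {"injuries": 24, "player_games": 800, "minutes": 30000},
    "reserves": {"injuries": 12, "player_games": 800, "minutes": 12000},
}

for name, g in groups.items():
    per_game = injuries_per_game(g["injuries"], g["player_games"])
    per_minute = injuries_per_1000_minutes(g["injuries"], g["minutes"])
    print(f"{name}: {per_game:.3f} per game, {per_minute:.2f} per 1,000 minutes")
```

In this made-up sample the heavy minute group is hurt twice as often per game (0.030 versus 0.015) yet less often per 1,000 minutes (0.80 versus 1.00).  That is the pattern I am positing, though only real data could confirm it.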


For the second straight year, the Sloan Conference expanded my intellectual horizons and inspired a whole host of article topics.  I hope to examine these further in the coming weeks.