Following the introduction of high-resolution player tracking technology, a new range of statistical analyses has emerged in sports, specifically in basketball. However, such high-dimensional data are often challenging for statistical inference and decision making. In this article we employ a state-of-the-art Bayesian mixture model that allows the estimation of heterogeneous intrinsic dimension (ID) within a dataset, and we propose some theoretical enhancements. Informally, the ID can be seen as an indicator of the complexity and dependence of the data at hand, and it is usually assumed to be unique. Our method can reveal valuable insights about the hidden dynamics of sports interactions in space and time, which helps to translate complex patterns into more coherent statistics. The application of this technique is illustrated using NBA basketball players’ tracking data, allowing effective classification and clustering. In movement data the analysis identified key stages of offensive actions, such as creating space for passing, preparation/shooting, and following through, which are relevant for invasion sports. We found that the ID value spikes, reaching a peak between four and eight seconds in the offensive part of the court, after which it declines. In shot charts we obtained groups of shots that produce substantially higher and lower success rates. Overall, game-winners tend to have a larger intrinsic dimension, indicative of greater unpredictability and unique shot placements. Similarly, we found higher ID values in plays when the score margin is smaller rather than larger. The exploitation of these results can bring clear strategic advantages in sports games.
This research was supported by the Australian Research Council (ARC) Laureate Fellowship Program, the Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS), and the project “Bayesian Learning for Decision Making in the Big Data Era” (ID: FL150100150), First Investigator: Prof. Kerrie Mengersen. We thank Wade Hobbs and Timothy Macuga for their comments and suggestions. During the development of this article, F. Denti was funded as a postdoctoral scholar by NIH grant R01MH115697. Previously, he was also supported as a Ph.D. student by the University of Milano-Bicocca, Milan, Italy, and Università della Svizzera italiana, Lugano, Switzerland.
We thank the two anonymous reviewers, the Editor, and the Associate Editor for carefully reading the manuscript and for their insightful and constructive comments, which helped to improve the manuscript substantially. All computations and visualizations were carried out in R using the packages mcclust (Fritsch (2012)), superheat (Barter and Yu (2017)), tidyverse (Wickham (2017)), gganimate (Pedersen and Robinson (2019)) and ggrepel (Slowikowski (2019)).
"The role of intrinsic dimension in high-resolution player tracking data—Insights in basketball." Ann. Appl. Stat. 16 (1) 326 - 348, March 2022. https://doi.org/10.1214/21-AOAS1506