*Data courtesy of Puckalytics.com. The sample covers every forward and defenseman season with 400+ 5v5 minutes played, from 2007-08 to 2014-15.*

**INTRODUCTION**

A common goal in modern hockey statistics is to find repeatable patterns that best predict future results. In other words, you want two things: repeatability and predictive ability. Understanding the importance of these two concepts has led to the popularization of shot-attempt-based measures like Corsi and Fenwick. Consequently, on-ice percentages have often been deemed useless at the player level due to their lack of repeatability. Therein lies the inefficiency. A metric’s high variance doesn’t by itself warrant its complete dismissal. In the case of shooting percentage and save percentage, it comes down to regressing toward the mean and extracting whatever useful information remains. The question is, how far do we regress? The point of this piece is to answer that exact question and show how regressed percentages can markedly improve on the predictive ability of Corsi at the player level.

**REGRESSING ON-ICE CORSI SHOOTING PERCENTAGE (FORWARDS)**

The idea is to use a method that regresses on-ice Corsi shooting percentage (CSh%) according to sample size. To do so, we need to determine the number of shot attempts (at league average CSh%) that must be added to a player’s existing sample to produce a regressed CSh% that, when multiplied by Corsi For / 60 (CF60), best predicts future Goals For / 60 (GF60). In this case, “Future GF60” is a player’s GF60 in his subsequent season.

The chart below shows how the predictive ability of regressed Goals For / 60 (rGF60 = regressed CSh% * CF60) changes as varying amounts of shot attempts (at league average CSh%) are added to a forward’s sample…
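The search described here can be sketched as follows. This is a minimal illustration, not the code behind the chart: the function names (`pearson_r`, `rgf60`, `best_added_cf`) and the tuple layout are hypothetical, and it assumes plain Pearson correlation as the predictive measure, since the piece does not say which statistic is plotted.

```python
from math import sqrt

LEAGUE_AVG_CSH = 0.0425  # league-average Corsi shooting % quoted in this piece


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rgf60(gf, cf, cf60, added_cf):
    """Regressed GF60 after padding the sample with league-average attempts."""
    regressed_csh = (gf + added_cf * LEAGUE_AVG_CSH) / (cf + added_cf)
    return regressed_csh * cf60


def best_added_cf(player_seasons, candidates):
    """Return (added_cf, r) for the candidate that best predicts next-season GF60.

    player_seasons: list of (gf, cf, cf60, next_season_gf60) tuples
    (a hypothetical layout chosen for this sketch).
    """
    future = [row[3] for row in player_seasons]
    scored = []
    for added_cf in candidates:
        predicted = [rgf60(gf, cf, cf60, added_cf)
                     for gf, cf, cf60, _ in player_seasons]
        scored.append((added_cf, pearson_r(predicted, future)))
    return max(scored, key=lambda pair: pair[1])
```

Calling `best_added_cf(rows, range(0, 5001, 100))` on a list of player seasons returns the padding amount with the highest correlation, together with that correlation.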

The vertex at (1100, 0.5030) shows where the predictive value of regressed GF60 peaks. This means that in order to best predict future goals, we need to add 1100 Corsi For and 46.75 Goals For (1100 CF * league average conversion rate of 4.25% = 46.75 GF) to a forward’s existing sample.
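As a worked example of that arithmetic, the sketch below pads a forward’s sample with 1100 league-average attempts and converts the result to rGF60. The player line (45 GF on 900 CF in 1000 minutes) is made up purely for illustration, and the function names are hypothetical.

```python
LEAGUE_AVG_CSH = 0.0425   # league-average conversion rate from this piece
FORWARD_ADDED_CF = 1100   # padding amount found for forwards


def regressed_csh(gf, cf, added_cf=FORWARD_ADDED_CF, league_avg=LEAGUE_AVG_CSH):
    """Regressed on-ice Corsi shooting % after adding league-average attempts."""
    added_gf = added_cf * league_avg  # 1100 * 0.0425 = 46.75 for forwards
    return (gf + added_gf) / (cf + added_cf)


def regressed_gf60(gf, cf, toi_minutes, added_cf=FORWARD_ADDED_CF):
    """rGF60 = regressed CSh% * CF60."""
    cf60 = cf / toi_minutes * 60
    return regressed_csh(gf, cf, added_cf) * cf60


# Made-up forward: 45 GF on 900 CF in 1000 minutes of 5v5 play.
raw = 45 / 900
reg = regressed_csh(45, 900)
print(f"raw CSh% {raw:.2%} -> regressed CSh% {reg:.2%}")
print(f"rGF60 = {regressed_gf60(45, 900, 1000):.2f}")
```

For this made-up forward, a raw CSh% of 5.00% regresses to roughly 4.59%, a little under halfway back to the league average, because his 900 attempts are outweighed by the 1100 added ones.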

The following graph demonstrates how far a forward’s CSh% is regressed to the mean when adding 1100 CF and 46.75 GF…

*Notice how a forward’s actual on-ice Corsi shooting % carries more weight as his sample size increases.*

How rGF60 (regressed CSh% * CF60) compares to CF60 and GF60 when predicting future goals…

**REGRESSING ON-ICE CORSI SHOOTING PERCENTAGE (DEFENSEMEN)**

The same method can also be used to regress on-ice Corsi shooting percentage for defensemen…

In this case, the predictive ability of rGF60 peaks at the point (2900, 0.2480). In other words, to best predict future GF60, you need to add 2900 Corsi For and 123.25 Goals For (2900 CF * league average conversion rate of 4.25% = 123.25 GF) to a defenseman’s existing sample.

The following chart demonstrates how adding 2900 CF and 123.25 GF regresses a defenseman’s observable CSh% to the mean (vs. adding 1100 CF and 46.75 GF for forwards)…
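The difference between the two positions can also be expressed as a weight. Adding a fixed number of league-average attempts is algebraically the same as blending a player’s observed CSh% with the league average, where the observed rate gets a weight of CF / (CF + added CF). The short sketch below prints that weight at a few sample sizes; the function name and the sample sizes are chosen only for illustration.

```python
FORWARD_ADDED_CF = 1100   # padding amount for forwards (from this piece)
DEFENSE_ADDED_CF = 2900   # padding amount for defensemen (from this piece)


def observed_weight(cf, added_cf):
    """Share of the regressed CSh% that comes from the player's own sample."""
    return cf / (cf + added_cf)


# Illustrative sample sizes, not taken from the data set.
for cf in (500, 1000, 2000, 4000):
    fwd = observed_weight(cf, FORWARD_ADDED_CF)
    dfn = observed_weight(cf, DEFENSE_ADDED_CF)
    print(f"{cf:>5} CF  forward weight {fwd:.2f}  defenseman weight {dfn:.2f}")
```

By construction, a forward’s own CSh% reaches a 50% weight at 1100 CF, while a defenseman needs 2900 CF to reach the same point.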

How rGF60 (regressed CSh% * CF60) compares to CF60 and GF60 when predicting future GF60 for defensemen…

**REGRESSING ON-ICE CORSI SAVE PERCENTAGE (FORWARDS + DEFENSEMEN)**

Performing the same test on on-ice Corsi save percentage (CSv%) does result in a slight bump in predictive ability for both forwards and defensemen. However, persistence in goaltending is likely responsible for that small increase. Because of this, on-ice CSv% is regressed 100% to the mean for both forwards and defensemen.

**CALCULATING rGF% (FORWARDS + DEFENSEMEN)**

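A rough sketch of how the pieces could be combined is given below. It rests on assumptions that the text does not confirm: that rGF% takes the usual goals-for percentage form, rGF60 / (rGF60 + rGA60), and that fully regressing CSv% means charging every shot attempt against at the same 4.25% league-average rate. The function name and argument layout are hypothetical.

```python
LEAGUE_AVG_CSH = 0.0425             # league-average conversion rate from this piece
ADDED_CF = {"F": 1100, "D": 2900}   # padding amounts for forwards and defensemen


def rgf_pct(gf, cf, ca, toi_minutes, position):
    """Regressed GF% for one player sample (assumed form, see lead-in)."""
    added_cf = ADDED_CF[position]
    # Shooting side: regress on-ice CSh% by position, then scale by CF60.
    regressed_csh = (gf + added_cf * LEAGUE_AVG_CSH) / (cf + added_cf)
    rgf60 = regressed_csh * cf / toi_minutes * 60
    # Save side: CSv% is regressed 100% to the mean, so goals against depend
    # only on CA60 (assumption: the same 4.25% rate applies to attempts against).
    rga60 = LEAGUE_AVG_CSH * ca / toi_minutes * 60
    return 100 * rgf60 / (rgf60 + rga60)
```

Under these assumptions, a player with a league-average on-ice CSh% has an rGF% equal to his CF%, and any gap between the two comes entirely from the regressed shooting percentage.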

**TESTING THE PREDICTIVE ABILITY OF GF%, CF% & rGF%**

Forwards…

Defensemen…

The predictive ability of rGF% is a marked improvement over CF% for forwards; the same cannot be said for defensemen. This makes sense given that on-ice shooting percentages are generally more repeatable for forwards.

Excel spreadsheet containing all rGF60, rGA60, rGF% data from 2007-08 to 2014-15