Friday 20 June 2014

Puckerings archive: Estimating Ice Time (14 Mar 2001)

What follows is a post from my old hockey analysis site puckerings.com (later hockeythink.com). It is reproduced here for posterity; bear in mind this writing is over a decade old and I may not even agree with it myself anymore. This post was originally published on March 14, 2001 and was updated on November 26, 2002.
 

Estimating Ice Time
Copyright Iain Fyffe, 2002


The NHL’s Real-Time Scoring System statistics have shed much light upon hockey statistics since their inception in 1998. One of the most interesting stats that is now finally tracked is player ice time. We now have actual, official numbers for how much time each player spends on the ice. This is fine and good for stats from the past few seasons, but what about all those seasons before 1998? This essay outlines a method for estimating ice times for all players since the 1967-68 season. Unfortunately, the data needed for this method is not available for seasons before the Great Expansion.

The method is based on the idea that the number of goals a player is on the ice for, relative to the total goals his team is involved in, should give a good indication of the proportion of total time that a player is on the ice. Of course, it's not really that simple. Individual players affect the rate at which goals are scored, both for and against. Goals are also scored at a much higher rate when a team is on the power-play. This method attempts to account for such factors as much as possible.

To test the accuracy of the method, I use actual ice time data from the 1998-99 Mighty Ducks, Bruins and Sabres (a sample of 93 players). The following terms are used:

TGF - Total goals scored by a player’s team when that player is on the ice
PGF - Total power-play goals among TGF
TGA - Total goals against a player’s team when that player is on the ice
PGA - Total power-play goals among TGA

A starting point is estimating ice time based upon TGF. If a team played 5,000 minutes and scored 250 goals, then each TGF for a player would imply 20 (5,000/250) minutes of ice time. So if a player was on the ice for 100 goals (that is, his TGF stat is 100), this would imply 2,000 minutes of playing time. We can make the same simple estimate using TGA figures as well. Since there is no sound reason to prefer one estimate over the other, the average of the two is used.
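
As a rough illustration, the simple estimate described above can be sketched in Python. The function and parameter names are my own labels for this sketch, not notation from the method itself, and the goals-against figures in the usage note are hypothetical.

```python
def simple_ice_time(team_minutes, team_gf, team_ga, tgf, tga):
    """Average of the goals-for and goals-against ice time estimates.

    team_minutes: total minutes the team played
    team_gf, team_ga: team goals for and against
    tgf, tga: goals for and against with the player on the ice
    """
    minutes_per_gf = team_minutes / team_gf   # e.g. 5,000 / 250 = 20
    minutes_per_ga = team_minutes / team_ga
    gf_estimate = tgf * minutes_per_gf        # estimate from goals for
    ga_estimate = tga * minutes_per_ga        # estimate from goals against
    # No sound reason to prefer either estimate, so average the two.
    return (gf_estimate + ga_estimate) / 2
```

With the figures from the text (5,000 minutes, 250 team goals, a TGF of 100) and a hypothetical 200 team goals against with a TGA of 80, `simple_ice_time(5000, 250, 200, 100, 80)` returns 2000.0, since both halves of the average come to 2,000 minutes.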

The above estimates ignore one very important fact. Goals are scored at a much higher rate on the power-play than at even strength. Power-play goals and power-play time must be considered. Unfortunately, the actual amount of time teams spend on the power-play and short-handed is not readily available, so estimates must be used.

Team power-play and short-handed time can be estimated using the following formulae:

PPMINT = 2 x PPO x [(PPO - PPG) / PPO]
SHMINT = 2 x TSH x [(TSH - PPA) / TSH]

Where PPMINT is estimated team power-play minutes, SHMINT is estimated team short-handed minutes, PPO is power-play opportunities, TSH is times short-handed, PPG is team power-play goals and PPA is team power-play goals against.
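
A minimal Python sketch of these two formulae follows. The function names are illustrative only, and the guard against zero opportunities is my own addition to avoid dividing by zero; the formulae themselves are transcribed as written above.

```python
def estimate_pp_minutes(ppo, ppg):
    """PPMINT = 2 x PPO x [(PPO - PPG) / PPO]."""
    if ppo == 0:
        return 0.0  # assumption: no opportunities means no power-play time
    return 2 * ppo * ((ppo - ppg) / ppo)


def estimate_sh_minutes(tsh, ppa):
    """SHMINT = 2 x TSH x [(TSH - PPA) / TSH]."""
    if tsh == 0:
        return 0.0  # assumption: never short-handed means no short-handed time
    return 2 * tsh * ((tsh - ppa) / tsh)
```

Algebraically, each expression reduces to 2 x (PPO - PPG) or 2 x (TSH - PPA). For a hypothetical team with 400 power-play opportunities and 60 power-play goals, the estimate works out to about 680 power-play minutes.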

To demonstrate the accuracy of these formulae, I selected a random sample of team situations from the 2000-01 NHL season. This season is used because ice time data for players is available broken down by situation. This allows me to calculate a very close approximation of each team's minutes in each situation. I cannot be 100 percent precise, for two reasons: the time breakdowns are not split out for traded players, and there are not always the same number of players on the ice at all times. I used an average of 4.9 players on the ice for power-plays, and 3.9 for short-handed situations. The following table shows the team and situation, the "actual" minutes for that situation, and the estimated minutes calculated using the appropriate formula.

 Team  Situation  Actual  Estimate
 Cgy   SH         604     610
 Col   PP         566     612
 Clb   PP         650     667
 Clb   SH         586     541
 Edm   SH         640     622
 Nsh   SH         564     534
 NJ    SH         542     552
 Phi   PP         590     554
 SJ    PP         698     688
 Van   PP         690     702
 MEAN             608     608

The standard error of the estimate is 30 minutes, or approximately 5 percent of the mean. This demonstrates the high degree of accuracy given by the formulae.
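
For readers who want to check the figure, here is a short Python sketch that computes a standard error from the ten rows of the table. The divisor is an assumption on my part: the sketch uses n - 2, as is conventional for the standard error of a regression estimate, because the text does not say which divisor was used.

```python
import math

# (actual, estimate) pairs copied from the table above
pairs = [(604, 610), (566, 612), (650, 667), (586, 541), (640, 622),
         (564, 534), (542, 552), (590, 554), (698, 688), (690, 702)]


def standard_error(pairs, dof_lost=2):
    """Root of the summed squared errors over (n - dof_lost).

    dof_lost=2 is an assumed convention, not something the text specifies.
    """
    sse = sum((estimate - actual) ** 2 for actual, estimate in pairs)
    return math.sqrt(sse / (len(pairs) - dof_lost))


ste = standard_error(pairs)
```

Under that assumption the result is about 30 minutes, in line with the figure quoted. Dividing by n instead (`dof_lost=0`) gives roughly 27.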

Using the above estimates, a team’s time can now be split into two types: power-play and non-power-play, for both goals for and against. By splitting TGF and TGA into power-play (PGF and PGA) and non-power-play components, the accuracy of the ice time estimate is improved. For example, say a team has 500 power-play minutes and 4,500 non-power-play minutes. This team scores 50 power-play goals, and 200 non-power-play goals. For every power-play goal a player is on the ice for, he should have 10 minutes of ice time (500/50). For every non-power-play goal, he should have 22.5 minutes of ice time (4,500/200). So if a player has 150 TGF (remember, this includes PGF) and 50 PGF, his ice time estimate would be 2,750 minutes (500 for power-play, 2,250 for non-power-play).
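
The worked example above can be reproduced with a small Python sketch. The function and parameter names are illustrative only. I am assuming that the goals-against side works the same way, with short-handed minutes and power-play goals against substituted in, since the text says the split applies to both.

```python
def split_estimate(pp_minutes, non_pp_minutes,
                   team_pp_goals, team_non_pp_goals,
                   tgf, pgf):
    """Ice time estimate from goals for, split by situation.

    tgf includes pgf, so the non-power-play goals are tgf - pgf.
    """
    minutes_per_pp_goal = pp_minutes / team_pp_goals              # 500 / 50 = 10
    minutes_per_non_pp_goal = non_pp_minutes / team_non_pp_goals  # 4,500 / 200 = 22.5
    pp_time = pgf * minutes_per_pp_goal
    non_pp_time = (tgf - pgf) * minutes_per_non_pp_goal
    return pp_time + non_pp_time
```

Calling `split_estimate(500, 4500, 50, 200, 150, 50)` returns 2750.0, matching the 500 power-play minutes plus 2,250 non-power-play minutes in the example.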

Introducing this factor improves the accuracy of the method. However, even with the adjustment for power-play time, there is still distortion caused by superior and inferior players. For instance, Wayne Gretzky and Mario Lemieux will have their ice times overestimated because they drastically raise the rate at which their teams score goals when they are on the ice. Conversely, their lower-scoring teammates will, on the whole, have their ice times underestimated.

This problem cannot be completely eliminated, but its effects can be limited. By introducing a normalizing factor, we can improve the estimates of all players. This factor represents the average ice time a typical player would have. It serves to draw all players toward the average, thereby reducing the distortion caused by extreme statistics. For the test group, I used the following factors, which were determined to minimize the standard error of the estimate:

Forwards: 14 minutes per game (MPG) for first-, second-, or third-line players, 10 MPG for fourth-line players.

Defence: 22 MPG for the first pair, 18 for the second, 17 for the third.

The determination of who is a first-pair defenceman as opposed to a second-pair defenceman (for instance) can be fairly subjective, but is usually fairly obvious based on inspection.

The ice time estimate is the average of this normalizing factor and the estimates based on goals for and goals against (considering power-play and non-power-play situations), as follows:

MIN = (GFMIN + GAMIN + NORM) / 3

Where MIN is the estimated total minutes, GFMIN is the estimate based on goals for, GAMIN is the estimate based on goals against, and NORM is the estimate based on the normalizing factor.
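
The final step might be sketched in Python as follows. Two things here are assumptions on my part: that NORM is the normalizing minutes-per-game figure multiplied by games played (the text gives the factors per game but does not spell out the conversion), and the lookup-table layout and names, which are purely illustrative.

```python
# Normalizing factors in minutes per game, as listed in the text.
# Keys are (position, line or pair); the key layout is hypothetical.
NORM_MPG = {
    ("F", 1): 14, ("F", 2): 14, ("F", 3): 14, ("F", 4): 10,
    ("D", 1): 22, ("D", 2): 18, ("D", 3): 17,
}


def combined_estimate(gf_min, ga_min, position, unit, games_played):
    """MIN = (GFMIN + GAMIN + NORM) / 3."""
    # Assumption: NORM in total minutes = MPG factor x games played.
    norm = NORM_MPG[(position, unit)] * games_played
    return (gf_min + ga_min + norm) / 3
```

For a hypothetical first-pair defenceman who played 80 games, with goals-for and goals-against estimates of 1,500 and 1,300 minutes, `combined_estimate(1500, 1300, "D", 1, 80)` returns 1520.0.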

The following table presents the results of the test group, including the mean actual ice time (MEAN), the standard error of the estimate (STE), and the STE expressed as a percentage of the mean (%).

 Group        MEAN  STE  %
 All players  779   89   11.4
 Forwards     748   80   10.7
 Defencemen   826   92   11.1

The method presented above provides an accurate means of estimating the amount of time players spend on the ice, with a standard error of about 11 percent of mean ice time in the test group. Having these numbers at our disposal will aid in any analysis of statistical information for players before 1997-98, when actual ice time numbers were first recorded.

4 comments:

  1. This is interesting stuff. Just out of curiosity, if you "may not even agree with it myself anymore", have you come across a better method of what you're trying to do here? How much accuracy is lost when simpler methods are opted for, such as estimating TOI based purely on a player's percentage of a team's goal events or % of even-strength goal events?

    1. I have refined the method since this was published, making small improvements by incorporating PP and ES points as well, data which was not available for all post-expansion seasons at the time. I've never published the revised method, but may someday.

    2. Is there a "normalizing factor" in your new method -- and if so, does it remain unchanged? I'm curious if you think it's possible to reasonably estimate the ice time of "bottom of the roster" players by some other means. It seems all of the methods I've seen to estimate ice time of players using goal events have this issue with fourth liners.

    3. No, the additional sophistication obviated the need for the normalizing factor.
