Friday 27 June 2014

Puckerings archive: Playoff Scoring Levels (16 Mar 2001)

What follows is a post from my old hockey analysis site puckerings.com (later hockeythink.com). It is reproduced here for posterity; bear in mind this writing is over a decade old and I may not even agree with it myself anymore. This post was originally published on March 16, 2001 and was updated on April 8, 2002.


Playoff Scoring Levels
Copyright Iain Fyffe, 2002


Playoff hockey is often seen as being tight-checking and low-scoring, and there is some truth to that. Playoff hockey does not suffer from the same bullshit that the regular season brings; for instance, goons do not play during the playoffs, and fighting is therefore extremely rare. But the question is, how much lower are scoring levels in the playoffs?

The answer to this question has been sought before, by Klein and Reif (KR). Unfortunately, their analysis is flawed. Their point is valid, but they make a mathematical error. Here is the passage:
“The most pervasive belief surrounding hockey is that it is much tighter than the regular-season variety of the game. “Playoff hockey” is the phrase, and it automatically triggers the image of a close-checking, conservatively played 2-1 or 1-0 game. The truth is, though, that the average playoff game features only about 13 per cent fewer goals than the average regular-season game.” (pp.196-198).

A bit overstated, if you ask me, but the point is valid. There is a belief that playoff games are lower-scoring, and it is indeed true. KR mention that playoff games have about 13 per cent fewer goals than regular-season games, although the numbers they present actually suggest the figure is 12 per cent. How do they calculate this? The same way that everyone calculates this figure: they compare regular-season goals-per-game averages to playoff goals-per-game averages. This is a very common error. This method is too simplistic.

The problem is that not all teams make the playoffs. Therefore, if we include all teams in one group (regular-season), but limit the other group to only some teams (playoffs), bias can occur. I will show KR’s data, with additional data that excludes non-playoff teams from the regular-season goals-per-game calculations.

I have limited my analysis to the years used by KR. The table presents the following figures:

RSGPG: The regular-season goals-per-game average, as calculated by KR.
PLGPG: The playoff average, per KR.
PLRat: The ratio of playoff GPGA to regular-season GPGA.
FRGPG: The regular-season goals-per-game average, excluding teams that did not make the playoffs. To be honest, I did not calculate these directly, because I’m too lazy (but this extremely small inaccuracy does not affect my point). I used KR’s numbers for GPGA and the total goals in the NHL for that year to compute total minutes, and then calculated an average minutes per team and used that as my minutes for playoff teams in the regular season. The distortion this would cause is negligible.

FRRat: The same ratio as above, calculated using the Fair GPGA.

 Year  RSGPG  PLGPG  PLRat  FRGPG  FRRat
 1936-37  4.75  3.48  .733  4.62  .753
 1937-38  4.88  3.81  .781  4.89  .779
 1938-39  4.91  3.54  .721  4.98  .711
 1939-40  4.84  4.04  .828  4.78  .839
 1940-41  5.12  4.16  .811  5.03  .827
 1941-42  6.05  5.24  .866  6.02  .870
 1942-43  7.20  5.58  .776  6.95  .803
 1943-44  8.17  5.51  .675  7.44  .741
 1944-45  7.35  4.73  .643  7.35  .643
 1945-46  6.69  6.13  .916  6.56  .934
 1946-47  6.32  4.87  .771  6.06  .804
 1947-48  5.86  5.69  .972  5.72  .995
 1948-49  5.43  4.26  .785  5.28  .807
 1949-50  5.47  4.07  .745  5.09  .800
 1950-51  5.42  3.80  .701  5.20  .731
 1951-52  5.19  3.82  .735  4.89  .781
 1952-53  4.79  5.40  1.128  4.74  1.139
 1953-54  4.80  4.18  .869  4.64  .901
 1954-55  5.04  5.61  1.112  4.86  1.154
 1955-56  5.07  5.57  1.100  5.09  1.094
 1956-57  5.38  5.51  1.024  5.36  1.028
 1957-58  5.60  6.05  1.081  5.60  1.081
 1958-59  5.77  5.91  1.024  5.80  1.019
 1959-60  5.90  5.10  .866  5.65  .903
 1960-61  6.00  4.65  .774  5.85  .795
 1961-62  6.02  5.54  .921  5.87  .944
 1962-63  5.95  5.69  .956  5.63  1.011
 1963-64  5.55  4.79  .863  5.43  .882
 1964-65  5.75  5.06  .880  5.61  .902
 1965-66  6.08  5.30  .881  5.89  .900
 1966-67  5.96  5.25  .880  5.77  .910
 1967-68  5.58  5.34  .957  5.55  .962
 1968-69  5.96  5.56  .933  5.84  .952
 1969-70  5.81  6.00  1.033  5.77  1.040
 1970-71  6.24  5.93  .950  6.37  .931
 1971-72  6.13  6.04  .985  5.98  1.010
 1972-73  6.55  6.27  .957  6.55  .957
 1973-74  6.39  5.68  .889  6.32  .899
 1974-75  6.85  6.07  .887  6.84  .887
 1975-76  6.82  5.65  .828  6.82  .828
 1976-77  6.64  6.24  .939  6.63  .941
 1977-78  6.59  5.67  .860  6.29  .901
 1978-79  7.00  6.02  .860  6.95  .866
 1979-80  7.03  6.51  .926  7.07  .921
 1980-81  7.69  7.77  1.010  7.67  1.013
 1981-82  8.02  6.99  .872  8.05  .868
 1982-83  7.73  7.45  .964  7.67  .971
 1983-84  7.80  6.21  .796  7.76  .800
 1984-85  7.68  7.34  .956  7.65  .959
 1985-86  7.86  6.43  .818  7.81  .823
 1986-87  7.25  6.13  .845  7.17  .855
 Average  6.18  5.44  -  6.07  -

There are 51 years here. Of these 51 years, the FRGPG is lower than the RSGPG 40 times, higher 7 times, and the same 4 times. So clearly, there is a degree of distortion in KR’s numbers. The difference is generally minor, but it is still there.

What can we learn from this? Really, only that bad teams tend to play high-scoring games, usually because they allow a lot of goals. When you remove them from the mix, goals-per-game figures drop. But that’s not really the point.

The point is that when you do statistical analysis, you should strive to get it right. It is very easy to lie with statistics, as they say, so it is critical to eliminate flaws in your analysis. It’s ironic, because if KR had performed their analysis the proper way, it would have provided somewhat stronger support for their assertion that playoff scoring isn’t as low as many people think. By my analysis, playoff scoring is 10% lower; by theirs, it’s 12%. So you see, doing the numbers right can get you stronger support for your position.
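As a rough check on the 12% and 10% figures, here is a minimal Python sketch that uses only the three averages from the last row of the table. The function name scoring_drop is my own label for illustration, and dividing one average by another is an assumption about how the summary percentages were reached.

```python
# Averages from the final row of the table (1936-37 to 1986-87).
RSGPG = 6.18  # regular-season goals per game, all teams
PLGPG = 5.44  # playoff goals per game
FRGPG = 6.07  # regular-season goals per game, playoff teams only


def scoring_drop(playoff_gpg, regular_gpg):
    """Fractional drop in scoring from the regular season to the playoffs."""
    return 1.0 - playoff_gpg / regular_gpg


kr_drop = scoring_drop(PLGPG, RSGPG)    # all teams in the baseline
fair_drop = scoring_drop(PLGPG, FRGPG)  # playoff teams only in the baseline

print(f"Baseline of all teams:          {kr_drop:.1%} lower")
print(f"Baseline of playoff teams only: {fair_drop:.1%} lower")
```

The only thing that changes between the two calls is the regular-season baseline, which is the whole point of the comparison.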

Reference

Klein, J. and K.E. Reif. The Klein and Reif Hockey Compendium (revised edition). Toronto: McClelland and Stewart, 1987.

Friday 20 June 2014

Puckerings archive: Estimating Ice Time (14 Mar 2001)

What follows is a post from my old hockey analysis site puckerings.com (later hockeythink.com). It is reproduced here for posterity; bear in mind this writing is over a decade old and I may not even agree with it myself anymore. This post was originally published on March 14, 2001 and was updated on November 26, 2002.
 

Estimating Ice Time
Copyright Iain Fyffe, 2002


The NHL’s Real-Time Scoring System statistics have shed much light upon hockey statistics since their inception in 1998. One of the most interesting stats that is now finally tracked is player ice time. We now have actual, official numbers for how much time each player spends on the ice. This is fine and good for stats from the past few seasons, but what about all those seasons before 1998? This essay outlines a method for estimating ice times for all players since the 1967-68 season. Unfortunately, the data needed for this method is not available for seasons before the Great Expansion.

The method is based on the idea that the number of goals a player is on the ice for, relative to the total goals his team is involved in, should give a good indication of the proportion of total time that a player is on the ice. Of course, it's not really that simple. Individual players affect the rate at which goals are scored, both for and against. Goals are also scored at a much higher rate when a team is on the power-play. This method attempts to account for such factors as much as possible.

To test the accuracy of the method, actual ice time data from the 1998-99 Mighty Ducks, Bruins and Sabres (a sample of 93 players) will be used. The following terms will be used:

TGF - Total goals scored by a player’s team when that player is on the ice
PGF - Total power-play goals among TGF
TGA - Total goals against a player’s team when that player is on the ice
PGA - Total power-play goals among TGA

A starting point is estimating ice time based upon TGF. If a team played 5,000 minutes and scored 250 goals, then each TGF for a player would imply 20 (5,000/250) minutes of ice time. So if a player was on the ice for 100 goals (that is, his TGF stat is 100), this would imply 2,000 minutes of playing time. We can make the same simple estimate using TGA figures as well. Since there is no sound reason to prefer one estimate over the other, the average of the two is used.
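The starting-point estimate can be sketched in a few lines of Python. The function names are mine, and the goals-against figures in the example (200 team goals against, a TGA of 70) are hypothetical values chosen only to show the averaging step.

```python
def estimate_from_goals(team_minutes, team_goals, player_on_ice_goals):
    """Minutes implied by the goals a player was on the ice for."""
    minutes_per_goal = team_minutes / team_goals
    return player_on_ice_goals * minutes_per_goal


def simple_estimate(team_minutes, team_gf, team_ga, tgf, tga):
    """Average of the goals-for and goals-against estimates."""
    gf_est = estimate_from_goals(team_minutes, team_gf, tgf)
    ga_est = estimate_from_goals(team_minutes, team_ga, tga)
    return (gf_est + ga_est) / 2


# Example from the text: 5,000 team minutes, 250 team goals, TGF of 100.
gf_only = estimate_from_goals(5000, 250, 100)
print(gf_only)  # 2000.0

# Hypothetical goals-against numbers, for illustration only.
both = simple_estimate(5000, 250, 200, tgf=100, tga=70)
print(both)
```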

The above estimates ignore one very important fact. Goals are scored at a much higher rate on the power-play than at even strength. Power-play goals and power-play time must be considered. Unfortunately, the actual amount of time teams spend on the power-play and short-handed is not readily available, so estimates must be used.

Team power-play and short-handed time can be estimated using the following formulae:

PPMINT = 2 x PPO x [(PPO - PPG) / PPO]
SHMINT = 2 x TSH x [(TSH - PPA) / TSH]

Where PPMINT is estimated team power-play minutes, SHMINT is estimated team short-handed minutes, PPO is power-play opportunities, TSH is times short-handed, PPG is team power-play goals and PPA is team power-play goals against.
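The two formulae share one shape, so a single helper covers both in the Python sketch below. The helper name and the sample inputs (350 opportunities, 60 goals) are hypothetical. Note that, as written, the expression simplifies algebraically to 2 x (opportunities - goals).

```python
def special_teams_minutes(opportunities, goals):
    """Estimated team minutes in a special-teams situation.

    For power-play time pass (PPO, PPG); for short-handed time pass (TSH, PPA).
    """
    if opportunities == 0:
        return 0.0  # no opportunities means no special-teams time
    return 2 * opportunities * ((opportunities - goals) / opportunities)


# Hypothetical team: 350 power-play opportunities, 60 power-play goals.
ppmint = special_teams_minutes(350, 60)
print(round(ppmint, 1))
```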

To demonstrate the accuracy of these formulae, I selected a random sample of team situations from the 2000-01 NHL season. This season is used because ice time data for players is available broken down by situation. This allows me to calculate a very close approximation of each team's minutes in each situation. I cannot be 100 percent precise, for two reasons: the time breakdowns are not split out for traded players, and there are not always the same number of players on the ice at all times. I used an average of 4.9 players on the ice for power-plays, and 3.9 for shorthanded situations. The following table shows the team and situation, the "actual" minutes for that situation, and the estimated minutes calculated using the appropriate formula.

 Team  Situation  Actual  Estimate
 Cgy  SH  604  610
 Col  PP  566  612
 Clb  PP  650  667
 Clb  SH  586  541
 Edm  SH  640  622
 Nsh  SH  564  534
 NJ  SH  542  552
 Phi  PP  590  554
 SJ  PP  698  688
 Van  PP  690  702
 MEAN  608  608

The standard error of the estimate is 30 minutes, or approximately 5 percent of the mean. This demonstrates the high degree of accuracy given by the formulae.
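The text does not say exactly how the standard error was computed. The Python sketch below assumes the usual regression-style form, the square root of the summed squared differences divided by n - 2. That denominator is an assumption, though with the ten rows above it does come out at about 30 minutes.

```python
import math

# "Actual" and estimated minutes from the ten rows of the table above.
actual = [604, 566, 650, 586, 640, 564, 542, 590, 698, 690]
estimate = [610, 612, 667, 541, 622, 534, 552, 554, 688, 702]

# Sum of squared differences between actual and estimated minutes.
sse = sum((a - e) ** 2 for a, e in zip(actual, estimate))

# Assumed form: divide by n - 2 before taking the square root.
ste = math.sqrt(sse / (len(actual) - 2))

print(f"Standard error of the estimate: {ste:.1f} minutes")
```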

Using the above estimates, a team’s time can now be split into two types: power-play and non-power-play, for both goals for and against. By splitting TGF and TGA into power-play (PGF and PGA) and non-power-play components, the accuracy of the ice time estimate is improved. For example, say a team has 500 power-play minutes and 4,500 non-power-play minutes. This team scores 50 power-play goals, and 200 non-power-play goals. For every power-play goal a player is on the ice for, he should have 10 minutes of ice time (500/50). For every non-power-play goal, he should have 22.5 minutes of ice time (4,500/200). So if a player has 150 TGF (remember, this includes PGF) and 50 PGF, his ice time estimate would be 2,750 minutes (500 for power-play, 2,250 for non-power-play).
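The worked example above can be reproduced with a short Python sketch. The function and parameter names are mine; the numbers are the ones from the example.

```python
def split_estimate(pp_minutes, non_pp_minutes,
                   team_pp_goals, team_non_pp_goals,
                   tgf, pgf):
    """Ice time implied by goals for, split by power-play and other goals.

    tgf includes pgf, so the non-power-play count is tgf - pgf.
    """
    minutes_per_pp_goal = pp_minutes / team_pp_goals
    minutes_per_other_goal = non_pp_minutes / team_non_pp_goals
    return pgf * minutes_per_pp_goal + (tgf - pgf) * minutes_per_other_goal


# Team: 500 power-play minutes, 4,500 other minutes, 50 and 200 goals.
# Player: 150 TGF, of which 50 are PGF.
gfmin = split_estimate(500, 4500, 50, 200, tgf=150, pgf=50)
print(gfmin)  # 2750.0
```

The same function gives the goals-against estimate if it is fed short-handed minutes, power-play goals against, TGA and PGA instead.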

Introducing this factor improves the accuracy of the method. However, even when we include the adjustment for power-play time, we still have distortion caused by superior and inferior players. For instance, Wayne Gretzky and Mario Lemieux will have their ice times overestimated because they drastically raise the rate at which their teams score goals when they are on the ice. Conversely, their lower-scoring teammates will have their ice times underestimated, on the whole.

This problem cannot be completely eliminated, but its effects can be limited. By introducing a normalizing factor, we can improve the estimates of all players. This factor represents the average ice time a typical player would have. It serves to draw all players toward the average, thereby reducing the distortion caused by extreme statistics. For the test group, I used the following factors, which were determined to minimize the standard error of the estimate:

Forwards: 14 minutes per game (MPG) for first-, second-, or third-line players, 10 MPG for fourth-line players.

Defence: 22 MPG for the first pair, 18 for the second, 17 for the third.

The determination of who is a first-pair defenceman as opposed to a second-pair defenceman (for instance) can be somewhat subjective, but is usually fairly obvious based on inspection.

The ice time estimate is the average of this normalizing factor and the estimates based on goals for and goals against (considering power-play and non-power-play situations), as follows:

MIN = (GFMIN + GAMIN + NORM) / 3

Where MIN is the estimated total minutes, GFMIN is the estimate based on goals for, GAMIN is the estimate based on goals against, and NORM is the estimate based on the normalizing factor.
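A minimal Python sketch of the final step follows. It assumes that NORM is the per-game normalizing factor multiplied by the player's games played; the text gives the factors in minutes per game, so that conversion is my reading rather than something stated outright. The player in the example is hypothetical.

```python
def estimate_minutes(gfmin, gamin, norm_mpg, games_played):
    """MIN = (GFMIN + GAMIN + NORM) / 3.

    Assumption: NORM = normalizing minutes per game x games played.
    """
    norm = norm_mpg * games_played
    return (gfmin + gamin + norm) / 3


# Hypothetical top-nine forward (14 MPG factor) who played 82 games.
total = estimate_minutes(gfmin=1500, gamin=1300, norm_mpg=14, games_played=82)
print(total)  # 1316.0
```

Because NORM carries one third of the weight, an extreme goals-based estimate gets pulled toward a typical workload instead of being taken at face value.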

The following table presents the results of the test group, including the mean actual ice time (MEAN), the standard error of the estimate (STE), and the STE expressed as a percentage of the mean (%).

 Group  MEAN  STE  %
 All players  779  89  11.4
 Forwards  748  80  10.7
 Defencemen  826  92  11.1

The method presented above is a very accurate one for estimating the amount of time players spend on the ice. Having these numbers at our disposal will aid in any analysis of statistical information for players before 1997-98, when actual ice time numbers were first recorded.

Wednesday 18 June 2014

On His Own Side of the Puck - media mention

My book On His Own Side of the Puck was mentioned in the Marek vs. Wyshynski podcast on June 16. You can listen to the bit in question here. Jeff Marek is a SIHR member and seems to enjoy the book, specifically mentioning the preface, which I agree is probably the best part of the book.

The mention has led to a few people asking me whether I have an ebook edition available. I had previously offered an ebook on Blurb; however, I was not happy with how the conversion turned out there, so I no longer offer it for sale. I am currently looking at Createspace/Amazon as a means to create a Kindle version, which is also readable on an iPad with the appropriate app. I will let y'all know when this is ready to go.



Monday 2 June 2014

On the Origin of Organized Hockey

My last post was a review of the new book On the Origin of Hockey, which presents a great deal of new evidence about early ice hockey, especially as it was played in England. One of the conclusions you can draw from the book is that giving special status to the first recorded, organized hockey game in Canada (played in Montreal on March 3, 1875) does not make sense, given that there are earlier games played in England that were just as organized and just as much hockey as that one. That is, if the first Montreal game is an example of organized hockey, then there are several earlier matches in England that should be considered organized hockey as well. So I thought I'd take you through the process, to show that this conclusion is well-founded.

We start with the definition of hockey in general, which we take from the Society for International Hockey Research's (SIHR) Report of the Sub-Committee Looking into Claim that Windsor, Nova Scotia, is the Birthplace of Hockey. In 2002 this report concluded that all a game needs to be considered (ice) hockey is that it is played on an ice rink by two opposing teams of skaters who use curved sticks to try to drive a small propellant into or through opposite goals.

This definition was arrived at because the committee writing the report needed terms of reference to determine when something is hockey and when it is not. The committee correctly concluded that there is no compelling evidence that Windsor is the birthplace of hockey. The primary point presented by proponents of the Windsor claim, for example, was from a work of fiction wherein one fictional character is imagining what the youth of another fictional character might have been like. It is thoroughly unconvincing and is rightly dismissed as evidence that hockey was played at a particular time in a particular place. So in fact the definition of hockey was not really needed, due to the nature of the evidence being examined.

However, the committee did engage in a bit of digression about the first recorded hockey match played in Montreal, which was on March 3, 1875. This might have been relevant had the Windsor claim suggested that hockey began in Nova Scotia later than 1875, in which case the Montreal game would have been evidence against that. But that's not the case, so it seems the committee might have been suggesting that Montreal should be considered the birthplace of hockey, though they explicitly state that is not the intent.

The committee noted that the Montreal game seems to mark the transition of hockey from a pastime to a dedicated sporting endeavour: "The March 3 event, structured as it was, invites a distinction between formal and informal hockey..." This raises the question, of course, of what the committee saw in this match that is not present in earlier reports of hockey. They explained:

"The match appears to be unique. It is the earliest eyewitness account known, at least to this SIHR committee, of a specific game of hockey in a specific place at a specific time, and with a recorded score, between two identified teams."

So the purpose of discussing the Montreal game actually seems to have been to put it forward as the birthplace of formal, or organized, hockey rather than hockey in general. Based on newspaper reports, we know the following details about this first hockey match in Montreal: that it was played at the Victoria Skating Rink in Montreal on March 3, 1875 between two teams of nine men each (the names of which were recorded); that there was a captain for each team; that they used a flat, circular piece of wood instead of a ball; that they tried to drive said object through the opposing goal; that there were spectators; and that captain Creighton's team defeated captain Torrance's team by two goals to nil (but we do not know who scored the goals). Both teams were selected from members of the Victoria Skating Club.

If we combine the criteria for formality with the earlier criteria for hockey presented by the committee, we can derive the committee's implicit definition of formal hockey. Formal hockey would have the following characteristics:
  1. ice rink
  2. players wearing skates
  3. players using curved sticks
  4. trying to drive a small propellant
  5. opposite goals
  6. game played in a specific place at a specific time
  7. recorded score
  8. two identified, opposing teams
According to the report's own words, the 1875 Montreal game was the earliest to meet these criteria at the time the report was written, in 2002. It is therefore entirely possible that in the 12 years since then, additional references would have been found of games that match them. The great amount of digitization of old newspapers that has occurred in the past few years only increases the ease with which such references can be found.

On the Origin of Hockey produces several candidate games that may be considered "organized hockey" using these criteria. Great credit must be given to the authors Carl Giden, Patrick Houda and Jean-Patrice Martel for unearthing and presenting the reports of these games. Let's examine a few.

Moor Park, Hertfordshire - February 2, 1871
I mentioned this match in the review. We know it was hockey, we know the teams, we know the score, we even know the lineups and the goal-scorers. We know about as much about this match as we do the first Montreal game. Specific reference is made to the ice, and the fact that it is called hockey necessarily entails sticks and a propellant, since all descriptions of hockey from that time period and before involve such things. The goals are also specifically mentioned, and the two seven-man teams are listed. A brief goal-by-goal account is provided, and a 10-minute intermission is also mentioned. Unfortunately it does not specifically refer to skates, so in theory someone could object to it on that basis. In the book, the authors present a convincing argument that the skates are implicit; however, we do have to recognize that they are not specifically mentioned. Moreover, skates were also not specifically mentioned in the reports of the March 3, 1875 game in Montreal, so if the lack of a specific mention is a problem for Moor Park, then it's a problem for Montreal as well.

Elsham, Lincolnshire - January 6, 1871
On this date, two eight-man teams played to a draw after nearly 90 minutes of hockey. The two identified sides were Elsham and Brigg, and the newspaper reports make it clear that this match involved an ice rink, players wearing skates, using sticks and trying to drive an object through opposite goals. We know that the game was planned ahead of time (Colonel Astley had invited the men from Brigg to come play a match), and was played in a particular place at a particular time. We do not know the precise score, but we do know for certain that a score was kept because the match was declared a draw. All of the implicit criteria for formal hockey seem to be represented in this game.

In fact, these two teams later played a rematch, so it's clear that having a decision on the score was important to the players involved. The second match was played on February 4, 1871 and this time Elsham walked over Brigg by a score of eight to nil. Brigg surrendered 15 minutes before time was up, so we know that there was a specific time that was to be allowed for the game. The ice was said to be in excellent condition for the game.

Bluntisham-cum-Earith, Huntingdonshire - January 4, 1871
This was a big week for early organized hockey, with matches only two days apart, though over 150 kilometres in distance from each other. In this case the teams were 20-a-side, one representing Bluntisham & Earith (the "Bury Fen team") and the other from Over & Swavesey. The former side scored a victory two goals to nil. Once again all of the criteria for formal hockey as indicated by SIHR seem to be met in this match. We even know the names of the team captains.

Spetchley, Worcestershire - December 28, 1870
This is the second match mentioned in the review, and similar to the first we know the (10-man) teams, the score, the lineups and the goal-scorers. In one of the reports describing the game, it is under the heading "skating" so in this case (unlike Moor Park above), we know for certain that there were skates involved.

So the winter of 1870/71 was a busy time for early organized hockey in England. Unfortunately for the sporting fans of that country, it did not seem to really catch on, at least not the way it did in Canada some few years later. But there is an even older candidate for the earliest known game of organized ice hockey, over 13 years before this busy winter.

Swavesey, Cambridgeshire - February 5, 1857
On this date, teams from Over and Swavesey met in a match of "bandy", almost 14 years before they would band together to play against Bluntisham & Earith as noted above. At this time, of course, there is no way to differentiate between hockey and bandy when played on the ice, since the rules of each that would allow us to tell them apart had not been standardized yet. Bandy was simply a form of hockey, as the authors of the book argue in some detail. All of the criteria appear to be met in this game as well. Swavesey won the match, though the exact score was not reported. So this is not quite as good an example as some of the 1870/71 matches, but it is worth considering.

Conclusion

Not all of these games are precise matches for the apparent SIHR criteria to be considered organized hockey. But it is abundantly clear that the level of organization of the Montreal match of March 3, 1875 was not unique for its time. There were formal games of ice hockey played in England at least four years before that time, and perhaps as much as 18 years before.

What is unique about Montreal is what happened in the years after that first game, as the sport slowly grew in popularity, and interest really began to take off with the first Montreal Winter Carnival tournament in 1883. But assigning special value to the 1875 game is not supportable, given that we now have evidence of a number of very similar matches played previously in England. Canada certainly made hockey into the great winter game, but it was built upon a foundation that first arrived from England.