The numbers game: Hockey is going to the nerds
The NHL has had what some are calling its “summer of analytics,” as the league’s teams frantically harvest a small world of online amateurs for quantitative power. It started when Kyle Dubas, who had earned an analytics following as stat-crunching general manager of the OHL’s Sault Ste. Marie Greyhounds, found himself elevated to assistant GM of the Toronto Maple Leafs at the tender age of 28.
As any hockey-analytics dude could tell you, the NHL salary cap means that player payrolls are a fixed fraction of league revenues. As those revenues grow, each team has more money to spend on non-player frippery, including research and advice. Dubas was given the Leafs’ black credit card and told to build an analytics department.
Other teams moved fast in self-defence. Tyler Dellow, the Toronto litigator who spent a decade criticizing the Edmonton Oilers for various species of illogic on his now-defunct mc79hockey blog, was ushered inside the tent by his frenemy franchise. Poker professional and author Sunny Mehta was named director of analytics in New Jersey. Bay Area scientist-statistician Eric Tulsky, a rising figure in the online world of hockey analytics, was hired by . . . somebody: He won’t say who, and the media hasn’t sniffed it out yet.
Meanwhile, Dubas and the Leafs netted Cam Charron, a witty Yahoo “data journalist” who had been one of the Leafs’ many Internet tormentors. They also bagged formally trained statistician-blogger Rob Pettapiece, and Darryl Metcalf, whose Extra Skater site had been the best-ever place to get statistical summaries from NHL game sheets.
This represents a stunning private capture of ideas and approaches that floated free in the commons for a long time. (Extra Skater has dropped from view, as have a million or so words’ worth of Dellow tirades.) The quantitative gold rush will be familiar to fans of baseball, where something similar happened more than a decade ago. But why is hockey going Moneyball now?
The Maple Leafs’ difficult 2013-14 season was a public experiment that even the most conservative hockey fan could not ignore. Before Game 1, the analytics crowd jeered when the Leafs bought out truculent, opinionated Belarusian centre Mikhail Grabovski. They howled when the Leafs gave New Jersey winger David Clarkson a $37-million contract swollen with ironclad signing-bonus money. Clarkson was then 29 and was seemingly being paid for the one fluke 30-goal season on his resumé.
Anti-analytics dinosaurs in the media cheered as the Leafs got off to a hot start: By Halloween, they had 10 wins and four losses. Stick that in your pocket protector! But the nerds, along with their media tribune, the Globe and Mail’s James Mirtle, warned that the Leafs’ scoring pace was almost certainly unsustainable given their shot attempts for and against, the basis of the divisive “Corsi” concept that is to hockey analytics what on-base percentage was to baseball’s Moneyball revolution.
The movie ended exactly as the nerds predicted. The Leafs, despite good goaltending, sagged to sixth in the Atlantic division. Clarkson’s contract is regarded as one of the league’s biggest, ugliest millstones. Grabovski, paid $14 million by the Leafs to go away, established himself in Washington as an entertaining minor star. The playoffs went ahead without the Leafs: At the start of the tournament, “Fenwick close” figures (a variant of Corsi) pointed to a Kings-Rangers final, with the Western team strongly favoured.
By now, every fan has heard of Corsi—named, a little misleadingly, for former goalie Jim Corsi, who was the first (that anybody knows of) to start counting shot attempts, including missed and blocked shots. Probably, many have heard of the “Fenwick” variant of Corsi, which leaves out the blocks and is superior for some purposes. (Fenwick is Matt Fenwick, a Flames fan and engineer now based in Edmonton.) These statistics have, perhaps regrettably, become the buzzwords of hockey’s analytics revolution.
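For readers who want the arithmetic spelled out, the two counts can be sketched in a few lines of Python. This is only an illustration: the class and function names are invented here, the "share" calculation is one common way of presenting shots for and against rather than anything defined in this article, and the shot totals are made-up numbers, not figures from a real game.

```python
from dataclasses import dataclass


@dataclass
class ShotAttempts:
    """One team's shot attempts (hypothetical container, not an official format)."""
    on_goal: int   # attempts that reached the goalie (saves plus goals)
    missed: int    # attempts that missed the net
    blocked: int   # attempts blocked by an opposing skater


def corsi(attempts: ShotAttempts) -> int:
    """Corsi counts every shot attempt, including misses and blocks."""
    return attempts.on_goal + attempts.missed + attempts.blocked


def fenwick(attempts: ShotAttempts) -> int:
    """Fenwick is Corsi with the blocked attempts left out."""
    return attempts.on_goal + attempts.missed


def share(count_for: int, count_against: int) -> float:
    """Fraction of all attempts taken by 'our' team.

    Returning 0.5 when there are no attempts is an arbitrary choice
    made for this sketch.
    """
    total = count_for + count_against
    return count_for / total if total else 0.5


# Made-up totals for illustration only.
home = ShotAttempts(on_goal=30, missed=12, blocked=8)
away = ShotAttempts(on_goal=25, missed=10, blocked=15)

corsi_share = share(corsi(home), corsi(away))
fenwick_share = share(fenwick(home), fenwick(away))
print(f"Corsi share:   {corsi_share:.3f}")
print(f"Fenwick share: {fenwick_share:.3f}")
```

With these invented totals the two teams are even by Corsi, but the home side leads by Fenwick, because more of the visitors' attempts were blocked. That gap is the kind of difference that makes one variant preferable to the other for some purposes.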
Their raw predictive power is impressive, as last season’s playoffs suggest. Corsi and Fenwick correlate closely with time spent in the opposition zone, and with scoring-chance generation. They are more reliable from year to year for an individual player or team than traditional stats. Over just one season, they cannot serve as a catch-all measure of player quality, but they provide a sort of microscope that can be used to look for obvious glitches in a team’s line choices, special teams, or defensive pairings.
That kind of specific problem-solving, done in real time rather than leisurely retrospect, is what the NHL teams who hire analysts are looking for. Dellow, for one, was working on faceoff formations before his hiring, using Corsi-like stats to spot problems and follow up with game-video study. The woeful Oilers, who need any edge they can get, will now own whatever knowledge is forthcoming.
The summer of analytics was thus a setback for the public domain, but others will arise to replace the raptured nerds. Number-crunchers like Sportsnet’s Chris Boyle are still pursuing the elusive Nietzschean dream of transcending Corsi and Fenwick, which unrealistically treat all shot attempts equally, by injecting “shot quality” into the model. Elsewhere, blogger Corey Sznajder, influenced by the ideas of Eric Tulsky, is crowd-funding an ambitious project to record the type and outcome of every “zone entry” for every game in the 2013-14 season.
The industrious Sznajder announced, practically as I was writing that sentence, that he has been hired by an unnamed NHL club, but he still intends to complete and release his zone-entry database. His data could ultimately leach some of the dump-and-chase out of hockey, in much the way that baseball stat-heads gradually eliminated overuse of the bunt.
Baseball’s “sabermetricians,” with their prejudices against “smallball” and base-running nuance, are sometimes accused of having made baseball more boring. Whether this fear is germane to hockey will depend on your subjective preferences. The average hockey stat-head imagines the NHL regular-season game becoming more like the playoffs, with fewer staged fights and marquee hits. He does not like the way some teams waste fourth-line minutes on human pylons. He really wants “defensive defencemen” to prove they are paying their way by reducing shots-against.
The nerds have ascended only the first step of the ladder. Some will fail to fit into the NHL environment; some will prove to have left their best work behind them. The league is not without its own internal analytics traditions; after all, Jim Corsi was a coach with the Sabres when he started counting shots. But the day of the first true analytics GM, the boss armed with opinions and data largely formed outside the game, is a long way off.
NET FACTS
Why goalies don’t improve
The biggest challenge for hockey quants will probably always be goaltending. A goalie’s individual performance is reasonably well expressed by a single number: even-strength save percentage. But since most individuals’ save percentages are crowded between .900 and .950, they are mathematically volatile: It can take years to sort stars from duds. Some things we do know:
Starting back-to-back games—bad idea. Goalies have markedly lower save percentages on zero days’ rest, and Tulsky, Dellow, and others have shown that the effect is almost certainly due to the tired goalie, not the skaters.
Goalies seem to peak early in their careers—no later than age 24 or 25 and, possibly, much earlier. Yes, this is your correspondent’s “goalies don’t improve” bugbear. But it has some support from Tulsky and others in analytics-land. Watch for league GMs to gradually split between buyers and sellers of older-goalie labour.
Teams can get themselves back into more games by pulling goalies early. Though professional statisticians have been saying so for at least 25 years, the NHL ignored them in a case of endemic irrationality. Hall of Famer Patrick Roy, named Colorado head coach, changed that in his rookie season behind the bench. He started pulling his goalie with up to 4:46 left in the third period. His prestige brought immediate imitators.
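The volatility of save percentage noted at the top of this sidebar can be roughed out with a standard binomial approximation. The sketch below assumes, unrealistically, that every shot is an independent trial with the same chance of being stopped. The function name and the shot totals are hypothetical, chosen only for illustration.

```python
import math


def save_pct_standard_error(true_save_pct: float, shots: int) -> float:
    """Binomial standard error of an observed save percentage.

    Simplifying assumption: each shot is an independent trial with the
    same probability of being saved. Real shots vary in difficulty.
    """
    if shots <= 0:
        raise ValueError("shots must be positive")
    return math.sqrt(true_save_pct * (1.0 - true_save_pct) / shots)


# Hypothetical sample sizes, not measured workloads.
for shots in (500, 1500, 6000):
    se = save_pct_standard_error(0.920, shots)
    print(f"{shots:>5} shots: standard error about {se:.4f}")
```

Under these assumptions, a goalie with a true .920 save percentage who faces 1,500 shots carries a standard error of roughly .007. When most goalies sit within a few hundredths of each other, a gap of several points between two of them can easily be noise, and it takes about four times as many shots to cut that uncertainty in half.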