Archive | Sports wonk

KHL Statistical Power Rankings Explanation

10 Oct

I developed a statistics-based power ranking that will be a weekly feature at  The idea was to come up with a system similar to the BCS ranking for (American) College Football (but less complicated).  Here is the formula and then a part-by-part explanation.

Formula by Team

∑(goal differential per match x opponent points) = RAW

I guess I could write that more formally, but basically here is how it goes.  For each game, I determine the goal differential.  So, if a game is 3-2, then there is a goal differential of 1.  The winning team will get a 1 in the cell for that game.  The losing team will get a 0.

Next, the goal differential is multiplied by the number of points the opponent has in the standings.  Say in the scenario above that each team has 15 points in the standings.  Then the one-goal differential is multiplied by 15 and the winning team receives 15 points for that game.  The losing team has 15 multiplied by zero, so it gets no points for the loss.  The totals for all games played are added together for the RAW score.

This means a couple of things.  First, the losing team is not penalized for losing.  Second, the winning team does receive an incentive for beating a team by a larger goal margin.  However, just running up the score and not playing defense will not help a team in these rankings, because what counts is not goals scored, but goal differential.

Overtime and shootout wins are considered indirectly, because they are already reflected in the standings points used as the multiplier.  Beating the opponent with the most standings points by the biggest differential will give a team the most points for a game.  Beating a lesser opponent is less significant.

The RAW score is adjusted by dividing by the number of games played (GP), which gives the “Points Ranking”.
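As a rough illustration, the calculation above can be sketched in Python.  This is only a minimal sketch, not the actual worksheet behind the rankings: it follows the formula as written (goal differential times the opponent's standings points, summed and divided by GP), and the function name, team names, scores, and point totals are all made up for the example.

```python
from collections import defaultdict


def power_rankings(games, standings_points):
    """Return {team: RAW / GP} from (team_a, team_b, goals_a, goals_b) tuples."""
    raw = defaultdict(float)   # running RAW score per team
    played = defaultdict(int)  # games played (GP) per team

    for team_a, team_b, goals_a, goals_b in games:
        played[team_a] += 1
        played[team_b] += 1
        if goals_a == goals_b:
            continue  # assumption: every game has a winner, so nothing to award
        winner, loser = (team_a, team_b) if goals_a > goals_b else (team_b, team_a)
        # Winner earns goal differential x the opponent's standings points;
        # the loser's cell is 0, so it earns nothing for the game.
        raw[winner] += abs(goals_a - goals_b) * standings_points[loser]

    return {team: raw[team] / played[team] for team in played}


# Hypothetical data: the 3-2 game from the example, plus one more game.
points = {"Team A": 15, "Team B": 15, "Team C": 10}
games = [("Team A", "Team B", 3, 2), ("Team B", "Team C", 4, 1)]
rankings = power_rankings(games, points)
for team, score in sorted(rankings.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {score:.1f}")
```

In this made-up case Team B's three-goal win over a 10-point opponent is worth 30, but spread over two games it lands level with Team A's single one-goal win over a 15-point opponent.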

I hope this makes sense and you enjoy the KHL Statistical Power Rankings.  The first edition is here.

More on Grabovski – Do advanced stats say anything about a team scoring goals?

28 Aug

An initial disclaimer:

This piece is for discussion.  Statistical operations can be tricky and there can be a number of ways to do things.  I am not claiming to be right or wrong on anything, yet.  If you have some advice, please leave a comment.

Round 2

So, after my article on the Caps picking up Grabovski and me not thinking it was as big of a deal as others were making it, the response was brutal.  I take some credit for that by putting out an unpolished piece.  In the end, I stand by my argument that the idea Grabovski would go from a career 45-50 point scorer to a 60-70 point guy was hyperbole.

Some people argued that his Corsi and Fenwick ratings, along with the fact that Washington had a lot more offensive zone faceoffs than Toronto (which should lead to more chances), would make him an improvement over Ribeiro.  I basically argued that despite the improved advanced stats, it seemed crazy that any one person’s numbers would jump that high; thus, the Caps roster is at a net loss minus Ribeiro, plus Grabovski.

To that end, I wanted to examine this further.  Here is my idea:  under the “Grabovski hypothesis” (the “he makes his teammates better” argument), the better a team’s Corsi, Fenwick and offensive zone faceoff numbers, the more goals the team should score.  If this is true, we should be able to perform a linear regression and see how a variety of statistics affect the number of goals a team scores (goals for).  In other words, I wanted to see what happens when we regress a team’s “goals for” for a season (y-variable) on a set of variables, including those mentioned above (X-set).

Thus, I went to and grabbed team stats for all teams from the 2007-2008 season through last season.  I added all of HA’s data (see legend below) and added some dummy variables, which is common when analyzing panel data.


TOI = Time on ice
GF = Goals For
GA = Goals Against
GF60 = Goals For per 60 minutes of ice time
GA60 = Goals Against per 60 minutes of ice time
GF% = Goals For percentage = 100* GF / (GF + GA)
SF = Shots For
SA = Shots Against
SF60 = Shots For per 60 minutes of ice time
SA60 = Shots Against per 60 minutes of ice time
SF% = Shots For percentage = 100* SF / (SF + SA)
FF = Fenwick For
FA = Fenwick Against
CF = Corsi For
CA = Corsi Against
Sh% = Shooting Percentage
Sv% = Save Percentage
OZFO% = Percentage of face offs that took place in the offensive zone
DZFO% = Percentage of face offs that took place in the defensive zone

Items in red are in the data table, but were not used in the regression so there weren’t correlation issues between the x-variables.

Dummy Variables

east – Eastern Conference (0=No, 1=yes)

west – Western Conference (0=No, 1=yes)

yr** – year dummy for the year the data was taken (0 = not year **, 1 = year**) – one dummy variable for each of the six years
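For readers who want to see the mechanics, here is a minimal Python sketch of an ordinary least squares fit with a dummy variable.  It is not the R script used for this piece: the `ols` helper, the numbers, and the "true" coefficients are all made up, and only three of the x-variables are included.  One thing it does illustrate is that with an intercept in the model, one dummy from each group has to be left out as the baseline, which is likely why yr8 shows NA in the results below.

```python
from typing import List


def ols(X: List[List[float]], y: List[float]) -> List[float]:
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-9:
            raise ValueError("columns are collinear; drop one dummy as the baseline")
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Made-up team-seasons: (SF in thousands of shots, Sh%, yr9 dummy).
X = [
    [2.40, 9.0, 0], [2.50, 8.5, 1], [2.30, 10.0, 0],
    [2.20, 9.5, 1], [2.60, 9.6, 0], [2.35, 8.2, 1],
]
# Build GF from known (hypothetical) coefficients so the fit can be checked.
true_coefs = [-100.0, 80.0, 18.0, 1.5]  # intercept, SF, Sh%, yr9
y = [true_coefs[0] + sum(c * v for c, v in zip(true_coefs[1:], row)) for row in X]

coefs = ols(X, y)
for name, c in zip(["(Intercept)", "SF", "Sh%", "yr9"], coefs):
    print(f"{name:12s} {c:10.4f}")
```

Because the made-up goals-for values contain no noise, the fit simply hands back the coefficients that generated them; real data would also need standard errors and p-values, which this sketch does not compute.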


Looking from the 2007-2008 season through the 2012-2013 season, the regression results only showed statistically significant results (at the 0.05 level) for shooting percentage and shots for (see “regressions results with lockout year” below).

I thought maybe the lockout-shortened season last year might have messed with things a bit, so I removed it and ran it again.  The only statistically significant variables are again shooting percentage and shots for.  Fenwick-for and Corsi-for are statistically significant at the 0.1 level, which is usually not accepted.  Let’s say we do accept the stats at this level.  A team would gain 1.7 goals per season for every additional 1,000 Corsi-for, or 1,000 shots directed at the net, or one goal per season for every 333 additional Fenwick-for, or 333 shots directed at the net (excluding blocked shots).


If I did this correctly, then only those old-fashioned statistics of shots on goal and shooting percentage matter for how many times a team scores.  Offensive zone faceoff percentage does not matter.  Corsi and Fenwick are not statistically significant.  Even so, Grabovski and his improvement on other players would have to add 1,000 shots directed at the net to gain an additional 1.7 goals per season (or 333 shots not including blocks).

This does not say whether or not Grabovski will be better or worse than Ribeiro.  But, as it stands, Grabovski’s addition to the team based on the advanced stats does not have a statistically significant effect.  What will matter?  Whether he can get people the puck to score at a high percentage or put a lot more pucks on net, unblocked.  We know he is not an assist guy, so I think it can be deduced that he will not likely raise the shooting percentage for others (give them good chances).  Ribeiro, on the other hand, is a distributor based on his higher assist numbers throughout his career.

With the regression, as it is, I think my argument stands….the Washington Capitals roster is worse minus Ribeiro, plus Grabovski.  The boys still have to play this out on the ice….

All files and R script are available upon request.


Regression results with lockout year.

              Estimate     Std. Error   t value   Pr(>|t|)
(Intercept)   -9.013e+01   2.739e+01    -3.290    0.00123 **
SF             7.776e-02   7.542e-03    10.310    < 2e-16 ***
SA             9.466e-03   7.945e-03     1.191    0.23526
FF            -4.715e-03   8.897e-03    -0.530    0.59690
FA            -3.381e-03   7.771e-03    -0.435    0.66408
CF             2.919e-03   4.285e-03     0.681    0.49663
CA            -7.803e-04   3.323e-03    -0.235    0.81466
Sh.            1.614e+01   3.087e-01    52.269    < 2e-16 ***
Sv.           -3.527e-01   2.997e-01    -1.177    0.24093
OZFO.          7.607e-02   2.656e-01     0.286    0.77492
DZFO.         -3.457e-01   2.135e-01    -1.619    0.10732
east           1.101e+00   1.352e+00     0.815    0.41634
west           1.385e+00   1.438e+00     0.964    0.33663
yr13           6.808e-01   2.779e+00     0.245    0.80681
yr12           1.468e-01   1.141e+00     0.129    0.89776
yr11           1.178e-02   1.138e+00     0.010    0.99176
yr10           4.663e-01   9.894e-01     0.471    0.63808
yr9            1.853e-02   8.647e-01     0.021    0.98293
yr8            NA          NA            NA       NA

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Regression results without lockout season.

              Estimate     Std. Error   t value   Pr(>|t|)
(Intercept)   -1.444e+02   1.295e+01    -11.152   <2e-16 ***
SF             8.331e-02   3.073e-03     27.110   <2e-16 ***
SA            -2.951e-03   3.277e-03     -0.900   0.3696
FF            -6.418e-03   3.604e-03     -1.781   0.0773 .
FA             3.601e-03   3.134e-03      1.149   0.2525
CF             2.929e-03   1.721e-03      1.702   0.0911 .
CA            -1.713e-03   1.354e-03     -1.266   0.2078
Sh.            1.826e+01   1.418e-01     128.81   <2e-16 ***
Sv.            6.188e-02   1.359e-01      0.455   0.6497
OZFO.         -1.117e-01   1.214e-01     -0.920   0.3592
DZFO.         -6.648e-02   9.268e-02     -0.717   0.4744
east          -2.084e-01   5.782e-01     -0.360   0.7191
west          -2.127e-01   6.114e-01     -0.348   0.7285
yr13           NA          NA             NA      NA
yr12           5.828e-01   4.596e-01      1.268   0.2071
yr11           4.903e-01   4.583e-01      1.070   0.2866
yr10           5.869e-01   3.926e-01      1.495   0.1373
yr9            1.635e-01   3.373e-01      0.485   0.6286
yr8            NA          NA             NA      NA

Semifinal #IIHF Worlds predictions. How does Sweden’s win over Canada shake things up? #Bracketology #MoneyPuck

16 May

From my Bracketology blog post (here), I went three for four on the day.  I picked the first three matches, but missed the Sweden win over Canada.  The Sedin-Sedin-Danielsson line killed it (minus that bad shootout attempt by H. Sedin, yikes).  Patting myself on the back, the one prediction I missed was the last game of the day, which went all the way to a sudden-death shootout.  Moreover, besides the 50/50 split on the US-Russia match, this was the closest statistical matchup; see the previous odds and probabilities here (without any historical adjustments, injuries, etc.).  Now, I couldn’t guess the Canada-Sweden match any better than I could predict that 8-3 blowout of USA over Russia, but at least I wasn’t way off.

Here is my changed bracket, with Sweden in, but I will still take Finland in that game.  If there is a game this year I would pay to be at, it would be Sweden vs. Finland from Stockholm.  It should be a battle of goaltenders, but hopefully a low scoring affair doesn’t mean a lack of offensive action.

[Screenshot: updated bracket]

My gold medal match prediction stays the same, but will Sweden beat Switzerland?  I am going to guess Switzerland grabs the bronze medal now.  Sweden fought hard to come back against Canada, but have one main line and strong goalkeeping.  I am not sure if that will work against the Swiss–it didn’t work the first time they played.

So, here are the odds for the semifinal round and some comments.  Keep in mind, these are neutral odds based only on math formulations, not calculating in profits as a casino or bookie would.

Team      Likelihood   Moneyline (US)   Decimal Odds (EU)
Finland   61.02%       -156             1.64
Sweden    38.98%       +150             2.50
Swiss     88.45%       -733             1.14
US        11.55%       +14              8.33

The early odds from has Finland as the underdog.  This must have to do with Sweden having home-ice advantage.  I would bet on Finland for sure in this match.  It is not a big return (listed at +135), but it is the much better bet.  Sweden is most likely to lose based on the Log5 method and the moneyline reads -167 for them.  The bookies basically have swapped my neutral odds above.

As I assumed, after the US blowout of Russia, the oddsmakers at are dismissing Switzerland’s undefeated run.  I too dismiss Switzerland over the US even with the odds in their favor from my guess, but my guess is against the grain of the analysis.  Maybe not as much as the +14 moneyline on the US (the game should be closer than US-Russia), but Switzerland should not be dismissed.  Switzerland’s and Finland’s chances of winning have decreased even though they moved on, but the Swiss should be really favored to win.

The returns are bad on this game.  +105 on the Swiss and -133 for the US isn’t worth wasting your money on.  Gambling tip from a non-gambler: bet Finland…I am 75% right so far and the stats give Finland a 3 in 5 chance to win.  Good luck to all the teams!

Breakdown of @USAHockey demolition over #Russia.

16 May

Toot toot tooting my horn this afternoon.  After running some odds and probability calculations yesterday, the US and Russia teams were 50/50 to win today’s game before the start (see all the odds here).  I correctly predicted in my IIHF Worlds Bracketology blog post (here) that the USA would win if Gibson got the nod over Bishop in goal.  But, I don’t think anyone predicted an 8-3 drubbing.  I did think bringing Ovechkin on the roster this late was a mistake for the Russian side.  Was it?

Well, Ovechkin scored a goal and also had an assist.  So, two points in the first game was great on paper.  However, that goal was pretty meaningless.  The Russians were down 4-1 when he nailed the rear bar behind Gibson, and a two-goal difference was as close as they would get.  The Americans chased Bryzgalov from the net at the end of the second period.  Kovalchuk, who led the tournament in scoring, finished with zero points.  Medvedev and Radulov were in the tournament’s top ten leaders in scoring prior to this game; both finished with zero points.  Radulov finished -3, partly due to giving up a turnover on the powerplay which led to the US scoring shorthanded for goal number five in the early matchup.  If you look at things this way, Ovechkin was sort of the sole bright spot on a team whose star players didn’t show up.  I disagree with this take on things.

First off, credit to the Americans.  Bryzgalov’s GAA was low prior to the game, but he was seeing a low number of shots a game (credit to Russia’s defense).  On the other hand, his save percentage was pretty mediocre.  In the first match the US side sent only 22 shots his way, scoring on three of them.  In this match, similar results.  The big difference was 21 shots in the first TWO periods, scoring on four of those opportunities.

The next problem was making room for Ovechkin.  This caused some line shuffling to get him in the game.  Loktionov was scratched and Tereshchenko (-3) was promoted in his place to the top line with Radulov and Kovalchuk.  Anisimov was bumped to the third line to make room for Ovechkin, and Kuznetsov (-1) got on the third line in Tereshchenko’s place.  On the fourth line, Russia had three new faces from the first time they played the US.  The US roster?  Nearly identical to the first time they played.  This without a doubt caused chemistry issues, turning a game Russia controlled the first time around into a game where they were dominated.

Ice times were a major issue for Russia’s leaders because of Ovechkin.  Only Radulov saw an increase in ice time for the game, while Kovalchuk and Medvedev saw a decrease in playing time.  All three of these players saw an increase in their third period playing time today in an attempt to make a comeback when the game was still manageable.  In other words, outside of the third period, Radulov saw the same ice time, while Kovalchuk and Medvedev were on the ice for two to three fewer shifts in each of the first two periods.

The results of line member swapping and adding Ovechkin: a combined -12 for these four players and an embarrassing 8-3 loss at the hands of the Americans.

#IIHFWorlds Probability of wins for quarterfinals #MoneyPuck

16 May

I thought I’d put a little math twist on tomorrow’s match-ups and calculate the probability of each team winning.  From here, hopefully I can create some odds.  I will compare them with what you could bet against online after the analysis.

First things first, using the “Log5” method for calculating a team winning or losing (credit to Bill James in Baseball Abstract), this is what you do:

Win probability = (A - A * B) / (A + B - 2 * A * B), where A represents Team A’s winning percentage and B represents Team B’s winning percentage.
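For anyone who wants to play with the formula, here is a minimal Python sketch of it.  The function name and the sample winning percentages (75% and 60%) are made up for illustration; they are not the inputs behind the numbers in this post.

```python
def log5(a: float, b: float) -> float:
    """Probability that Team A beats Team B, given winning percentages a and b (0 to 1)."""
    denom = a + b - 2 * a * b
    if denom == 0:
        # Happens only when both teams are at 0% or both are at 100%.
        raise ValueError("Log5 is undefined for these winning percentages")
    return (a - a * b) / denom


# Hypothetical matchup: a .750 team against a .600 team.
p_a = log5(0.75, 0.60)
p_b = log5(0.60, 0.75)
print(f"Team A: {p_a:.2%}  Team B: {p_b:.2%}")
```

The two probabilities always add up to 100%, and two teams with identical winning percentages come out at 50/50.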

For tomorrow, without running a regression and seeing if prior games, past year’s seeding, strength of schedule, luck, injuries, etc., make a difference, this is what we have:

Finland 77.88%
Russia 50.00%
US 50.00%
Slovakia 22.12%
Swiss 94.78%
Canada 70.59%
Sweden 29.41%
Czech 5.22%

Basically you have the percent chance each team will win their game tomorrow.  The next step is to convert these percentages into odds and then I will convert these into a moneyline.

Finland -355
Russia +/-100
US +/-100
Slovakia +355
Swiss -1900
Canada -245
Sweden +245
Czech +1900
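A minimal Python sketch of the probability-to-moneyline step is below.  It uses the standard "no-vig" conversion, and the function name is my own.  Note that the exact conversion gives about -352 for Finland's 77.88%, so the figures listed above appear to have been rounded and will not match to the digit.

```python
def prob_to_moneyline(p: float) -> float:
    """Convert a win probability (0 < p < 1) to a fair American moneyline."""
    if not 0 < p < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    if p > 0.5:
        # Favorite: the amount you must risk to win 100.
        return -100 * p / (1 - p)
    # Underdog (or a 50/50 pick'em): the amount won on a 100 stake.
    return 100 * (1 - p) / p


for team, p in [("Finland", 0.7788), ("Slovakia", 0.2212), ("Russia", 0.50)]:
    print(f"{team}: {prob_to_moneyline(p):+.0f}")
```

A 50/50 game comes out at 100 either way, which is why it is written as +/-100 in the list.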

For the explanation of the plus/minus on the moneyline, you can follow the link here:

Moneyline odds are usually considered “American” style odds, so here are the “European” style decimal odds:

Finland 1.28
Russia 2.00
US 2.00
Slovakia 4.55
Swiss 1.05
Canada 1.41
Sweden 3.45
Czech 20.00
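The decimal figures follow directly from the moneylines.  Here is a minimal Python sketch of that conversion, using the standard formula and a function name of my own choosing:

```python
def moneyline_to_decimal(moneyline: float) -> float:
    """Convert an American moneyline to European decimal odds."""
    if moneyline == 0:
        raise ValueError("a moneyline of 0 is not valid")
    if moneyline > 0:
        # Underdog: win `moneyline` on a 100 stake, plus the stake back.
        return 1 + moneyline / 100
    # Favorite: risk `abs(moneyline)` to win 100, plus the stake back.
    return 1 + 100 / abs(moneyline)


for team, line in [("Canada", -245), ("Sweden", 245), ("Swiss", -1900), ("Czech", 1900)]:
    print(f"{team}: {moneyline_to_decimal(line):.2f}")
```

Decimal odds are simply the total returned per unit staked, so a +1900 underdog pays 20.00 and a -1900 favorite pays about 1.05.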

Now, these wouldn’t guarantee a profit, because I would need to estimate the betting spread for each team and pass that over my profit margin, which is 8% customarily if I was a bookie.  What makes this fun is one can see how betting agencies set their odds differently from these “even odds” in order to make a profit.

I took a quick look at the lines over at, where the internet says they have the lowest profit margins, meaning they should be the closest to my calculations.  It appears they take performance from past World Championships into play.  I’m not sure how much sense this makes when we see a Switzerland like this year’s.  Maybe with professional club teams, but not here.  That is why a regression analysis would be important to see which things play the biggest role in winning or losing in the playoff round of a World Championship or other country-based format.

Nevertheless, the US is the biggest underdog (based on past matches against Russia).  Never mind that they had the same record in the tournament and played a close match.  US is +300/4.00 and Russia is -400/1.25.  Seems a little ridiculous to me….but maybe this is where they clean up!
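One way to see the bookmaker's built-in margin is to turn each side of a posted line back into an implied probability and add them up; anything over 100% is the book's cushion.  Here is a minimal Python sketch using the US/Russia line quoted above (the function name is my own, and this is the textbook conversion, not necessarily how any particular book sets its lines):

```python
def implied_probability(moneyline: float) -> float:
    """Win probability implied by an American moneyline (margin included)."""
    if moneyline == 0:
        raise ValueError("a moneyline of 0 is not valid")
    if moneyline > 0:
        return 100 / (moneyline + 100)
    return abs(moneyline) / (abs(moneyline) + 100)


us = implied_probability(300)       # US listed at +300
russia = implied_probability(-400)  # Russia listed at -400
total = us + russia
print(f"US {us:.0%} + Russia {russia:.0%} = {total:.0%}")
```

For this line the two implied probabilities sum to about 105%, so roughly five points of margin are baked in on top of whatever the book actually thinks the true chances are.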

Switzerland is also an underdog against Czech Republic, when the Swiss have clearly been the better team.  They are currently listed at +180/2.80.  My -1900/+1900 clearly needed to be adjusted, but to make the Swiss the underdog seems a little crazy too.

I am dead on with my Canada and Finland odds, so it appears they have raised the probability of Sweden and Slovakia winning in order to meet their profit margin/lower potential payouts.  This was also likely adjusted because of what I believe is their faulty outlook on past games.

Ok–let’s see how this ends up!  Games start in nine hours!

#2013WJC (delayed) #Moneypuck update. Results of pulling your goalie…does it matter?

16 Feb

A quick look at the results of pulling your keeper in the World Junior Championships.  There are some potential problems with this analysis.  First, if you switch goalies because you are either leading by a lot or losing by a lot, there may not be a reason for the teams to play as hard.  Bench players may also get more time, meaning less skill on the ice, possibly less scoring and defense.  Nevertheless, it’ll be interesting to look at the results.

The first keeper pulled was in game 2: Switzerland vs. Latvia.  The Swiss were up 5 to 2 after the second period and Latvia switched in Punnenovs for Merzlikins.  Switzerland’s offensive performance declined in the 3rd period, putting only 9 shots on goal in the 3rd (17 in the 1st and 13 in the 2nd).  However, the Swiss outscored the Latvian side 2-0 in the final period.  Latvia actually played worse in the 3rd period with the new keeper.

Punnenovs got the start in the final two games and finished with a 5.02 GAA.  Merzlikins finished with a 6.23 GAA.

The U.S. switched goalies after going up big against Germany in their 8-0 win.  Though it is hard to say definitively it had an effect, Gibson lost to a much better Russian side in their next match.

Germany moved away from Subban after their 9-3 loss to Canada.  Cupper started the final three games and lost 8-0, 7-0, and 2-1.

In both the U.S. vs. Russia and U.S. vs. Canada losses, Gibson was pulled in the final minutes to give the Americans an extra skater.  Neither instance led to the equalizer.  It would be interesting to see if more offense was generated when Gibson was out of the net, even though there were no goals.

Finland scored five seconds after pulling Korpisalo in their 5-4 shootout win over Switzerland.  The goal was scored by the extra skater, Markus Granlund, but it came off a faceoff.  Scoring on a possession in the offensive zone within five seconds makes it difficult to credit the goal to having the extra skater.  Nevertheless, that was the case.

So, it appears that in a tournament setting, pulling your goalie when you are up in order to give him a rest could affect him negatively in later games.  Also, generally speaking, pulling your goalie for the extra skater more often than not does not lead to that equalizer goal.  The conventional wisdom is that the man advantage gives a team a better opportunity to score, but the extra goal rarely comes to fruition.


***This article was originally drafted in January.  Since then, there was an interesting goalie-pulling situation in the under-20 tournament for the Hungarian team.  Mark Plekszan started in goal the first game and was chased out.  Hungary lost that first game.  He was replaced in the following game, but got the start again later.  He was again chased from the net; however, he was pulled early enough in the first period that Hungary was able to come back and win that game.  The mixed result here is that pulling him in the tournament probably didn’t help his confidence.  Yet, making an early decision in a tournament to pull your keeper could be beneficial.  It seems, though, that if a team decides to make that switch, then they should stick with their decision for the rest of the tournament.  This played out in the 2013 WJC and some of the Olympic prequalifying tournaments, as the teams that switched goalies the least had the most success.

Making the decision to pull your goalkeeper. #2013WJC #Moneypuck.

27 Dec

As I was tweeting about the USA vs. Germany World Junior Championship game earlier, I incorrectly tweeted that John Gibson was the netminder for the complete game shutout today.  After going over the stat sheets, I saw that Phil Housley made the decision to put in Jon Gillies for the final period of play.  Gibson played well, saving 19 of 19 shots in 40 minutes of action.  However, how will missing the final 20 minutes of play affect Gibson in future games?

Being an NFL fan, I have seen that when teams make the playoffs with regular season games remaining, they often don’t play some of their starters in order to let the players rest or to prevent injury.  I can think of two instances, Manning with the Colts and Brady with the Patriots, where both were rested after strong regular seasons and they came up short in the playoffs.  Football is played once a week though, maybe twice, so a layoff could lead to three weeks or more without being on the field.  Here, we are talking about missing 1/3 of a game and then playing again the next day.  Maybe the turnaround will keep players fresh…maybe not….

I thought about this issue after watching the Olympic prequalification in Budapest this year.  Team Hungary and Team Holland massacred Team Lithuania and Team Croatia in their first two games.  Hungary pulled their starting keeper because of big leads in both of the games; the Netherlands pulled their keeper part way through the Croatia matchup.  The result: a 7-6 finale between the two teams where each netminder was torched by the other team.

So, I realize that there isn’t a lot of statistical evidence that the higher scoring output was because of goalies being rested.  Scores of international games tend to be a little higher than we are used to in professional leagues.  This could be because of a lack of defense or more open play, as we see during all-star games; there are many reasons why the keepers let in more goals than usual.

First though, let’s take a step back and look at why starting keepers are pulled.  In order of occurrence (guessing), I would say the following are the reasons:

  1. One team has a slight lead over the other and there is less than 2 minutes left.  Keeper pulled for the extra skater;
  2. Injury replacement;
  3. Bad play…hoping the change creates a spark or saves the keeper from further embarrassment/psychological issues (gun shy);
  4. Lead is so high, that starter is pulled to provide rest and/or prevent injury.

This list could be wrong, but I would say reason 4 is definitely the least common reason a keeper is pulled.  I think this is an odd reason that could lead to some problems.  Keepers may not be as sharp the next game because they didn’t play an entire 60 minutes and probably weren’t tested much during the limited minutes they played.  Goalies tend to be pretty superstitious…being pulled for an uncommon reason could mess up their mojo.

Now let’s predict the future instead of playing “I told you so” down the road.  Only two goalies have been pulled so far in the tournament.  Latvia pulled their keeper against Switzerland today after falling behind 5 to 2 after two periods.  This was after a loss the previous day with the same keeper playing a full game.  Saturday’s starter and his subsequent play against Sweden could be telling.  The guess should be that if the same keeper starts, he would play no worse, or else pulling him didn’t do any good.  If a different keeper starts, then the guess would be he is outplaying his counterpart.

The U.S. pulled their keeper after going up 6-0 over Germany.  Gibson’s first major test will be tomorrow.  Russia was well contested by Slovakia, pulling off an overtime victory with 10 seconds remaining.  Vasilevski saved 32 of 34 shots in nearly 65 minutes of play.  Gibson for the U.S. saved 19 shots in 40 minutes of play.  Shawn Reznik from calls Vasilevski the tourney’s “Best Goalie”.  The U.S. has the group’s only shutout in two days.

Tomorrow will be interesting…with a strong offensive and defensive outing by the U.S. and an understated showing by the Russians, if Gibson lets in a few soft goals, it will make us wonder a bit about Housley’s decision to not let Gibson see the final 7 shots.