HOCKEY-L Archives

- Hockey-L - The College Hockey Discussion List

Hockey-L@LISTS.MAINE.EDU

From: Bob Stagat <[log in to unmask]>
Date: Mon, 15 Mar 1999 12:25:47 -0800
Having been properly chastised off-line by Wayne Smith for my intemperate
behavior of last night, I would like to offer my public apology to Mr.
Tuthill for having so rudely defamed him.
 
My rudeness, however, does not alter the fact that Mr. Tuthill's
description of how the KRACH or CHODR algorithms work was absolutely wrong. He
described a Bayesian-like model, in which an assumed a-priori distribution
can significantly affect the model's predictions. Neither KRACH nor CHODR uses
any assumed a-priori distribution -- they base their rating estimates
exclusively on the results of the current season's games, with no built-in
historical biases.
 
In what follows I will attempt to give a general, not-too-technical
description of KRACH's pedigree and why I claim it has a well-established
basis in statistical theory -- one that I believe the RPI and HEAL
sorely lack. I might note that I have absolutely no connection to any of
these rating systems. I do not know the people who developed them or who so
kindly compute them for us each week. I only know what I have read in the
FAQs that the guardians of KRACH and CHODR have posted on their websites.
And, of course, I have my own background in mathematics, probability, and
statistics to guide me.
 
The KRACH is a particular implementation of a general methodology known
as logistic (or some people prefer 'logit') analysis. Logistic models are
commonly used by statisticians to describe processes that have a binary
outcome -- only two possible results. A classic example of logistic
analysis is testing how the efficacy of a drug in treating some disease
varies as a function of dosage. In this application the two possible
outcomes are 'gets sick' or 'doesn't get sick' (or, perhaps, 'dies' or
'doesn't die'). Detailed descriptions of the logistic model can be found in any
reasonable book on statistical inference.
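As a tiny illustration (the function name and numbers are mine, not taken from any of the rating systems discussed here), the logistic function is what maps a log-odds value to a probability:

```python
from math import exp

def logistic(log_odds: float) -> float:
    """Map a log-odds value to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + exp(-log_odds))

# Log-odds of 0 means even odds, i.e. probability 1/2:
print(logistic(0.0))  # 0.5
```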
 
Another type of binary process that can be modeled using a logistic model
is the outcome of some contest, in which the two possible outcomes for
any contestant are 'wins' or 'doesn't win.' KRACH is not the first to use a
logistic model for this purpose. Both the US Chess Federation and the US
Table Tennis Association use player rating algorithms whose 'probability of
upset' functions are based on a logistic model. The logistic model has
also been used to estimate probabilities in horse races (see "Racetrack
Betting: The Professors' Guide to Strategies" by Peter Asch of Rutgers and
Richard Quandt of Princeton). I'm sure that there have been many other
applications in the world of sports and competition, but these are the few with
which I am particularly familiar.
 
Details of how KRACH ratings are computed can be found at:
www.mscs.dal.ca/~butler/krachexp.htm
What follows is my overly brief summary of that process.
 
The model assumes that there is a parameter that can be associated with
each team (its rating), which is initially undetermined. The KRACH ratings
are directly related to the odds-ratio -- the ratio of two teams' ratings
is assumed to give the odds of either team winning. (Note: If you look in
statistics texts, they often use the log-odds as the adjustable parameters
in their logistic models; log-odds are simply the logarithms of the KRACH
ratings.) If I let p denote the probability of Team 1 winning (so p must
satisfy 0<p<1), then the odds of Team 1 winning are p:(1-p) and the odds
ratio is p/(1-p), which is a number between 0 and infinity. KRACH hypothesizes
that this odds ratio is equal to r1/r2, where r1 and r2 are the ratings
of the two teams involved. Straightforward algebra then shows that the
probability of Team 1 winning is given by p = r1/(r1+r2).
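That hypothesis is small enough to state as code. A minimal sketch (the function name is my own, not from the KRACH FAQ):

```python
def win_probability(r1: float, r2: float) -> float:
    """Probability that the team rated r1 beats the team rated r2,
    assuming the odds ratio p/(1-p) equals r1/r2."""
    return r1 / (r1 + r2)

# A team rated three times as highly is a 3:1 favorite:
print(win_probability(300.0, 100.0))  # 0.75
```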
 
Knowing this, and given the results of all contests played to date, we
can ask, "What would our logistic model predict is the probability that the
observed results are the ones that would actually have occurred?" For any
individual game that probability is rw/(rw+rl), where rw and rl are the
KRACH ratings of the actual winner and loser, respectively. Writing down
such an expression for each individual game, we need only multiply all these
probabilities together to get the total, overall probability for all games
played.
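In code, that product looks roughly like this (treating game outcomes as independent is the model's assumption; the data layout and names are mine):

```python
from math import prod

def season_likelihood(ratings: dict[str, float],
                      games: list[tuple[str, str]]) -> float:
    """Probability the model assigns to the observed results, where each
    game is recorded as a (winner, loser) pair. Each game contributes
    rw/(rw+rl), and the games are treated as independent, so the
    individual probabilities multiply."""
    return prod(ratings[w] / (ratings[w] + ratings[l]) for w, l in games)

ratings = {"A": 200.0, "B": 100.0}
games = [("A", "B"), ("A", "B"), ("B", "A")]
print(season_likelihood(ratings, games))  # (2/3)*(2/3)*(1/3) = 4/27
```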
 
The total probability I have just described is a Godawful function of all
the (unknown) KRACH ratings. We now determine those ratings using what is
known as a Maximum Likelihood Estimator. For the individual ratings we
select values that together maximize the overall probability for the actual
outcomes of all games played. That's the messy computational part. But it
is totally unbiased. It knows nothing about what you or I or Ken Butler or
John Whelan or Dick Tuthill or anyone else thinks the rating of any team
should be. All it knows is who beat whom this season.
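For the curious, that maximization can be sketched with the standard fixed-point iteration for this kind of model (known in the statistics literature as the Bradley-Terry maximum-likelihood update). This is my own illustration under simplifying assumptions, not Ken Butler's actual code:

```python
def fit_ratings(games: list[tuple[str, str]], iters: int = 2000) -> dict[str, float]:
    """Maximum-likelihood ratings for a KRACH-style (Bradley-Terry) model,
    found with the standard fixed-point update
        r_i <- wins_i / sum_j n_ij / (r_i + r_j),
    where n_ij is the number of games between teams i and j.
    games is a list of (winner, loser) pairs. Assumes a connected schedule
    in which every team has at least one win and one loss, so that the
    maximum exists."""
    teams = {t for game in games for t in game}
    wins = {t: 0 for t in teams}
    pair_counts = {}          # frozenset({i, j}) -> number of i-vs-j games
    for w, l in games:
        wins[w] += 1
        pair = frozenset((w, l))
        pair_counts[pair] = pair_counts.get(pair, 0) + 1
    r = {t: 1.0 for t in teams}
    for _ in range(iters):
        new_r = {}
        for i in teams:
            denom = 0.0
            for pair, n in pair_counts.items():
                if i in pair:
                    (j,) = pair - {i}
                    denom += n / (r[i] + r[j])
            new_r[i] = wins[i] / denom
        # Ratings are only defined up to a common scale factor; normalize.
        scale = sum(new_r.values()) / len(new_r)
        r = {t: v / scale for t, v in new_r.items()}
    return r

# A beat B three times, B beat A once; B beat C twice, C beat B once.
season = [("A", "B")] * 3 + [("B", "A")] + [("B", "C")] * 2 + [("C", "B")]
r = fit_ratings(season)
# At the maximum, A's rating is 3x B's and B's is 2x C's.
```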
 
So that's it -- my hopefully not-too-verbose explanation of why the KRACH
ratings have a firm statistical basis and a high degree of credibility. I
hope I haven't bored people to tears. In the meantime, I'm still curious
to learn the origins of the RPI. My working hypothesis is that it was
invented by a bunch of NCAA basketball coaches who were sitting around a bar
one night and simply thought that it seemed like a good idea at the time.
(I have a similar hypothesis about the HEAL, but substitute 'Maine high
school coaches' for 'NCAA basketball coaches.')
 
Bob "Trying to keep a civil tongue in my keyboard" Stagat
 
HOCKEY-L is for discussion of college ice hockey;  send information to
[log in to unmask], The College Hockey Information List.
