>>Again, YOU ARE MISSING THE POINT> Top 20 are all ABOUT the same quality,
> Sorry, but I can't be convinced that #20 and #1 are even close to the same
> quality.

That was not my point anyway. My point (or at least this part of it) is that upsets happen; with KRACH we can even say with what probability we expect them. #20 UAA has a KRACH of 159.4; #1 Wisconsin's KRACH is 863.4, more than five times as high. (All of this includes only games played before the national tournament began, since these are the results the committee had to work with.) So the gap between UAA and Wisconsin is about as big as between #44 Quinnipiac and UAA. In either case, we would expect an upset once every six or seven games.

For the game in question, UNH went in with a KRACH of 503.9, less than three times Niagara's 175.9. So the probability of Niagara winning was around 26%. Unlikely, but far from inconceivable. (Besides which, as people have pointed out, UNH was on something of a skid. On the same scale, their performance in their last 16 games would only earn them a 250.1, while Niagara's last 16 "criterion rating" was 211.8; so in those terms, Niagara was only slightly less likely to win than UNH.)

>> regardlesss.... look at the POINT. I will bow out at this point. The
>> POINT is not about niagara, but about the rating systems.
> I don't know what makes Tony think that I've missed the POINT. The POINT
> is that we now have one piece of empirical data that confirms that Niagara
> should have been in the top eight of any rating system.

The point that we're all trying to get through to you is that THERE ARE 931 OTHER PIECES OF DATA ALSO TO BE TAKEN INTO CONSIDERATION, in the form of a whole season's games between Division I teams, INCLUDING THE 27 (D1) GAMES PLAYED BY NIAGARA IN THE REGULAR SEASON. [Capitals to increase the chance that Terry notices my statement of the point, surrounded as it is by tangential paragraphs.]
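For anyone who wants to check the upset probabilities quoted above, here is a minimal Python sketch. It assumes only the proportional rule that a team's chance of winning is its rating divided by the sum of the two ratings, and it ignores ties. The function and variable names are my own, made up for illustration.

```python
def krach_win_probability(rating_a, rating_b):
    """Chance that team A beats team B if win odds are proportional to ratings."""
    return rating_a / (rating_a + rating_b)

# Pre-tournament KRACH ratings quoted in this message
ratings = {"Wisconsin": 863.4, "UNH": 503.9, "Niagara": 175.9, "UAA": 159.4}

# #20 UAA upsetting #1 Wisconsin
p_uaa = krach_win_probability(ratings["UAA"], ratings["Wisconsin"])

# Niagara upsetting UNH, on full-season ratings
p_niagara = krach_win_probability(ratings["Niagara"], ratings["UNH"])

# Same game, using the last-16-games figures (Niagara 211.8, UNH 250.1)
p_niagara_recent = krach_win_probability(211.8, 250.1)

print(f"UAA over Wisconsin: {p_uaa:.3f} (one upset every {1 / p_uaa:.1f} games)")
print(f"Niagara over UNH:   {p_niagara:.3f}")
print(f"Niagara over UNH, last 16 games: {p_niagara_recent:.3f}")
```

This gives roughly 0.16 for UAA over Wisconsin, 0.26 for Niagara over UNH, and 0.46 for Niagara on the last-16-games figures, in line with the numbers above.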
> If anyone is going
> to try to sell a new rating system, it should reflect actual outcomes.
> Maybe I'm just being too logical about the verification and validation
> aspects of any mathematical model.

No rating system or selection committee can predict with complete certainty the outcomes of games that haven't yet been played, but in fact the whole idea of KRACH is to find a set of ratings which model the results on which they're based as accurately as possible.

That is, as I mentioned before, the ratio of two teams' KRACH ratings gives the proportional probability of each one winning a game between them. If teams A, B, and C have ratings of 900, 300 and 100, respectively, a game between A and B is expected to be won by A 3/4 of the time and B 1/4 of the time; between B and C, the probability is 3/4 that B will win, and 1/4 that C will win; when A plays C, A should win 9 times out of 10. So if A defeats B, the probability of that outcome was 3/4. If A defeats B and then A defeats C, the probability for that sequence was 3/4 times 9/10; if A defeats B and C, and then C defeats B, the probability is 3/4 times 9/10 times 1/4, etc.

For a given set of games, we can take any set of ratings and multiply all the probabilities with which they predict the actual outcomes, to get the overall probability that exactly that set of outcomes should have occurred. If we calculate this overall probability for various sets of ratings, we find that it takes on its largest value when the ratings are those defined by KRACH. This is known as "maximum likelihood estimation". Of course, this is a maximum subject to the assumption of the model, that the odds of teams beating each other behave in a proportional way, but the point is that predictive power is built into the KRACH rating system.
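To make the A, B, C example concrete, here is a rough Python sketch of the likelihood calculation, followed by one common way of finding the ratings that maximize it (a simple fixed-point iteration for this kind of proportional model). It is a sketch under stated assumptions, not the actual KRACH code: the function names and the toy `season` are invented for illustration, ties are ignored, and the iteration only behaves if every team has at least one win and one loss.

```python
from math import prod

def likelihood(ratings, games):
    """Probability of the observed results; each game is a (winner, loser) pair."""
    return prod(ratings[w] / (ratings[w] + ratings[l]) for w, l in games)

def fit_ratings(games, iterations=2000):
    """Repeat r_i = wins_i / sum over i's games of 1 / (r_i + r_opponent)."""
    teams = sorted({team for game in games for team in game})
    wins = {t: sum(1 for w, _ in games if w == t) for t in teams}
    r = {t: 1.0 for t in teams}  # arbitrary starting point
    for _ in range(iterations):
        r = {
            t: wins[t] / sum(1.0 / (r[w] + r[l]) for w, l in games if t in (w, l))
            for t in teams
        }
    return r

# The three-game sequence from the text: A beats B, A beats C, C beats B
example = likelihood({"A": 900, "B": 300, "C": 100},
                     [("A", "B"), ("A", "C"), ("C", "B")])
print(example)  # 3/4 * 9/10 * 1/4

# Hypothetical toy season in which the results occur in exactly the
# proportions that ratings of 900, 300 and 100 would predict
season = ([("A", "B")] * 3 + [("B", "A")] +
          [("B", "C")] * 3 + [("C", "B")] +
          [("A", "C")] * 9 + [("C", "A")])

fitted = fit_ratings(season)
scale = 100 / fitted["C"]  # only ratios matter, so pin C at 100
print({t: round(v * scale, 1) for t, v in fitted.items()})
```

For this toy season the fitted ratings come out in the ratio 9 : 3 : 1, and no other set of ratings gives the season's results a higher overall probability.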
Now, part of the idea behind also including record in recent games, against TUCs, head-to-head, and vs common opponents as selection criteria is to allow for the fact that certain teams can end the season on hot streaks, play better against strong opposition, or match up well against certain other teams, and to favor those teams in selecting and seeding the tournament field.

Returning to the question of how well the rating system predicts this year's tournament games: of the eight games in the regionals, five were won by the team winning the pairwise comparison, with the three results that went against the comparisons being Niagara over UNH, Michigan over Colgate, and BC over Wisconsin. (Those were also, not surprisingly, the three games won by the lower seeds.) The Bradley-Terry modified pairwise comparisons actually did a better job of predicting the results, getting six of the eight games right (Michigan wins the modified comparison with Colgate). But this is fairly meaningless anyway, since eight games do not provide a statistically significant sample.

It would be a worthwhile project (calling Craig Powers...) to calculate PWCs in the standard and modified systems using the pre-tournament results of each season going back to 1992 (when the 11-game regional format was introduced) and see how many games were won by the team winning each kind of pairwise comparison (and perhaps also the team with the higher KRACH or the higher RPI). The sample size (once this season's tournament is over) would be 99 games; not great, but something to go on. Of course, there would be various other effects interfering, such as higher-seeded teams (whose seedings were, at least for some seasons, based on the standard PWCs) having advantages such as last line change, a day's rest, or the opportunity to play in their own region.
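To put a rough number on why eight games are too few, here is a small Python sketch that asks how often a pure coin-flipper would do at least as well as each system did in the regionals. The coin-flip benchmark (each pick right with probability 1/2, games independent) is my own assumption for illustration, as is the function name.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Chance of getting k or more picks right out of n if each is right with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p_standard = prob_at_least(5, 8)  # standard PWC: 5 of 8 regional games
p_modified = prob_at_least(6, 8)  # modified PWC: 6 of 8 regional games

print(f"Coin flips get 5 or more of 8 right with probability {p_standard:.3f}")
print(f"Coin flips get 6 or more of 8 right with probability {p_modified:.3f}")
```

Those come to 93/256 (about 0.36) and 37/256 (about 0.14), so neither result would be surprising from random guessing, which is why the larger sample is worth collecting.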
John Whelan, Cornell '91
http://www.amurgsval.org/joe/

HOCKEY-L is for discussion of college ice hockey.