
Wednesday, August 26, 2009

Ranking high: scientific proof that preseason polls matter

This post will be a bit technical, but bear with me. I have argued before (rather convincingly, I think) that preseason polls are somewhat effective at predicting the eventual national champ. This raises the question: do preseason polls just predict the final rankings, or do they actually influence them?

Those who argue that preseason and postseason polls are independent say that any correlation between the two just shows that pollsters made some good guesses about which teams will be good and which won't. Florida might not finish #1 in 2009, but I can guarantee that they'll finish in the top ten. It's also possible that the relationship is spurious: voters put Notre Dame too high and Utah too low at all times, be it pre-, post- or mid-season.

Those in the other camp, though, point to the stepwise fashion in which teams move in the polls. It is usually controversial for a team to jump another team that also won that week, and therefore those teams that start on top have an advantage over those that need to jump them. It can also be hard to get noticed if you start outside of the top 25. Consequently, preseason poll results improperly influence the final outcome.

I also think we should not underestimate the importance of the pernicious disease I call Neuheiselitus. Much like Eli Manning or Mall Cop, people can't seem to figure out that Rick Neuheisel isn't actually good at coaching football. It often takes a while for pundits to realize that some talented teams with high expectations aren't any good. On the other end of the spectrum is Applewhiteocious: just because they couldn't find a helmet that didn't cover his eyes didn't mean Major Applewhite wasn't twice the quarterback that Chris Simms could ever hope to be, and yet he had a hard time staying on the field. This is alternatively called Flutiecoccus and is now plaguing Hyundai and Canadian bacon.

Who's right? To answer that question, I used regression to estimate the importance of different factors: win/loss record, strength of schedule, national prestige, and, of course, preseason ranking. Basically, by taking into account the other factors that can influence a team's final ranking, I can isolate the unique influence of preseason polls on postseason results.

I used data from 1994 to 2008 from the AP Poll Archive. I first used regression to predict the final rankings using only the win/loss records and the strength of schedule. In the blue box, you see the R-squared is .78. This means that just these four factors account for about 78% of the variation in the final rankings, so we can predict them quite accurately. The green box shows the strength of the effects. Each win moves a team up the polls (closer to number 1) by 1.6 spots on average, and each loss moves it down 3.4. That should seem about right. A tougher schedule also moves a team up in the polls, which is no surprise.
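
To make this first step concrete, here is a minimal sketch of the same kind of regression in Python. It runs on made-up synthetic data, not the AP Poll Archive data, and the variable names (`wins`, `losses`, `sos`, `final_rank`) and the "true" effects baked into the fake data are assumptions chosen only to echo the estimates above. It is a toy illustration of ordinary least squares and R-squared, not a reproduction of the actual model.

```python
import numpy as np

# Synthetic, made-up data: NOT the AP Poll Archive data used in the post.
rng = np.random.default_rng(0)
n = 300
wins = rng.integers(6, 14, size=n).astype(float)    # 6 to 13 wins
losses = rng.integers(0, 6, size=n).astype(float)   # 0 to 5 losses
sos = rng.normal(0.0, 1.0, size=n)                  # hypothetical schedule-strength score

# Assumed "true" effects, chosen to echo the post's estimates:
# each win lowers the rank number by 1.6, each loss raises it by 3.4.
noise = rng.normal(0.0, 2.0, size=n)
final_rank = 30.0 - 1.6 * wins + 3.4 * losses - 2.0 * sos + noise

# Design matrix with an intercept column, fitted by ordinary least squares.
X = np.column_stack([np.ones(n), wins, losses, sos])
coef, *_ = np.linalg.lstsq(X, final_rank, rcond=None)

# R-squared: share of the variance in final rank explained by the model.
fitted = X @ coef
ss_res = np.sum((final_rank - fitted) ** 2)
ss_tot = np.sum((final_rank - final_rank.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

for name, value in zip(["intercept", "wins", "losses", "sos"], coef):
    print(f"{name:>9}: {value:6.2f}")
print(f"R-squared: {r_squared:.2f}")
```

A negative coefficient here means a move toward #1, since a smaller rank number is a better ranking.
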
Next, I add prestige factors: total wins for the program, national championships, and whether or not the team is in a BCS conference. Of these, only being in a BCS conference really matters (if the number below P>|t| is above .05, the factor is not significant). On average, a team in a BCS conference will finish about 5 spots higher than a team outside a BCS conference with the same record and strength of schedule. Figures.
Next, I add general measures of the team's performance. The PerfRating is based on margin of victory and the EloRating just on win/loss record (like the ratings used for the BCS computer rankings). The EloRating is not significant because it measures the same thing as the win/loss record and strength of schedule, but the PerfRating is important.
Finally, I add the preseason rankings. You will first notice that the P>|t| value is below .05, which means that preseason polls have a real influence on postseason polls. In other words, the final rankings would be different if we didn't do preseason polling. But before we get too excited, it is important to also look at the coefficient (=.0539). This means that two equal teams with the same performance and backgrounds would finish about one spot apart in the final poll if they started 20 spots apart. So, while preseason polls do inappropriately influence final rankings, the effect is not large. Being in a BCS conference, though, still bumps a team up 4 spots.
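
The arithmetic behind that preseason coefficient is simple enough to spell out. The sketch below just multiplies the reported coefficient (.0539) by a preseason gap; the function name `final_rank_gap` is made up for illustration, and it assumes the effect is linear across the whole poll, as the regression does.

```python
# Coefficient on preseason rank reported in the post.
PRE_RANK_COEF = 0.0539


def final_rank_gap(preseason_gap: float, coef: float = PRE_RANK_COEF) -> float:
    """Predicted gap in final rank between two otherwise identical teams."""
    return coef * preseason_gap


for gap in (1, 5, 10, 20):
    print(f"{gap:>2} preseason spots apart -> {final_rank_gap(gap):.2f} final spots apart")
```

A 20-spot head start in the preseason poll works out to 20 x .0539, or about 1.08 spots in the final poll.
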
One group, though, does seem to benefit more than others. The table below lists the biggest beneficiaries of preseason polling. The Pred. column is where the model says the team should have finished; these teams all finished a few spots ahead of that. They also have some other things in common: they are major programs from BCS conferences, started between 2 and 6, and finished between 9 and 18. Classic cases of Neuheiselitus.
In summary, preseason polls do influence final results in a way they are not supposed to, but not enough to really worry about. It will help you more if you are a disappointing major program that was supposed to have a shot at a national championship. And teams in BCS conferences can lose one more game than an otherwise equal non-BCS team and still finish higher in the polls. The non-BCS conspiracy theorists have been right all along.

3 Comments:

Matt said...

Interesting work, nice job. Add an interaction term between BCSConf and PreRank? That should nail that group of teams you are looking at with the large residuals.

Scott Albrecht said...

Matt, I tried it out, but I ran into a little problem. When I include the interaction term, it is positive and significant, meaning that preseason ranking is even more important for BCS conference teams, but it flips the sign for preseason rankings, so that it now has a negative coefficient. One interpretation is that preseason rankings are good for BCS teams but bad for non-BCS teams . . . or we could just recognize that with so few non-BCS teams being ranked in both the preseason and the postseason, the interaction term is giving us problems with multicollinearity. I think the latter is the more reasonable interpretation and that the model is no longer valid with the interaction term.

Matt said...

Makes sense. Was the preseason rankings coefficient still significant? Maybe it just didn't have much effect at all. I agree that hardly any non-BCS teams are ranked in both places, so it makes it tough to work with.

In any case, your takeaway seems correct -- the preseason rankings affect things in an irritating and unjust way, but not a huge way. Still, it's aggravating to see teams keep their ranking position as long as they don't lose. That's the lamest way of ranking teams but no one has the guts to shuffle things around week to week. Oh well.
