Showing posts with label predictions. Show all posts

Tuesday, September 2, 2008

My Statistical Crystal Ball

(Open your own statistical crystal ball here; Internet Explorer only)

It’s difficult to know what the future has in store based on the first weekend, but with a little statistical maneuvering, we can make some meaningful predictions.

Using data from the last couple of years and regression analysis, I’ve developed a formula to help predict how many games a team will win. With the points scored and the yardage in the game, plus adjustments for the strength of the opponent and the conference in which the team plays (teams in tough conferences will win fewer games than an equal team in a lesser conference), we can get a pretty good estimate of a team's season win total.

Above are calculations for a couple of Big East schools, and the attached Excel file (you can only view it in Internet Explorer, sorry) can be used to make the same or any other calculations. Three quick notes. First, the only adjustment for strength of schedule is conference. If you know that a particular team has an unusually difficult out-of-conference schedule, subtract a little from the expected win total, and vice versa. Second, statistical predictions have a hard time with extremes, so the values will appear hedged: even Alabama's performance can't earn it an estimated win total much above 10. Finally, this calculates the number of wins in a 12-game regular season. Conference championship and bowl games are extra.
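
For readers who want to see the general shape of a regression-based win estimate, here is a minimal sketch. It is not the formula from the spreadsheet: it fits a single predictor (point margin) by least squares, and the toy history, the `conference_penalty` parameter, and the function names are all invented for illustration.

```python
def fit_line(xs, ys):
    """Closed-form least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def expected_wins(point_margin, intercept, slope, conference_penalty=0.0, games=12):
    """Estimated regular-season wins, clamped to the length of the schedule."""
    estimate = intercept + slope * point_margin - conference_penalty
    return max(0.0, min(float(games), estimate))


# Invented toy history (not real data): point margin and regular-season wins.
margins = [-14.0, -7.0, 0.0, 7.0, 14.0]
wins = [2.0, 4.0, 6.0, 8.0, 10.0]
intercept, slope = fit_line(margins, wins)

# A team that won its opener by 7 in a tough conference (hypothetical penalty).
estimate = expected_wins(7.0, intercept, slope, conference_penalty=0.5)
```

The clamp at 12 mirrors the 12-game regular season mentioned above; a real model would add yardage and opponent strength as further predictors.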

Sunday, July 20, 2008

Big 12 South, 2007(8 P)review

I think it's important, before we get lost in the hype about certain teams this year, to look back at what these same teams did with their talent last year because, on average, more than 75% of the team is the same as last year. I'm using two measures that I developed to track performance and reputation over the course of a season. Today, I'm focusing on the Big XII South.

My big questions for the Big 12:
1) Is Tech ready for a breakout year?
2) Who will win the Red River Shootout?
3) Is Oklahoma State on the way up or down?
4) Who will win the Big 12 South?


1) Texas Tech spent most of the season down in the pack in the Big 12 South. After losing the shootout to Oklahoma State, Leach asked for more from the defense and, generally, it delivered. With Graham and Crabtree putting up ridiculous numbers, Tech became a real force and arguably one of the better teams in the country. If Leach (the best offensive mind in college football?) can change the status quo in Lubbock on the defensive side of the ball, Tech can realize some pretty high expectations.



2) OU started the 2007 season in high form behind Bradford's ridiculous start, a level it never reached again. If Mr. Melton is right and Sam is regressing toward a mean, OU could be in for a mediocre season. On the other hand, while UT finally showed some signs of life against Arizona State after a whipping by rival A&M, Colt is not getting any better and super recruiter Mack Brown has not been incredibly successful without the incredible hulk under center. For now, I'll go with the Sooners.


3) (Performance is the solid line and Reputation the dotted line). Oklahoma State? Over-Rated! Oklahoma State never achieved that high of a level last season despite touting "the best offense ever." I don't expect anything more this next season. With money and flash, they are a popular dark horse, but popular opinion last year and this year continues to rank them in the middle of the pack where they belong. Don't get lost in the hype.

4) Who will win it--dare I say Texas Tech? No, I daren't. Tech will be very successful this season and win a lot of games, but I think Leach needs at least one more year to instill an attitude of expectations that produces the consistent play necessary to win a conference half-championship. This year, like every year, the Big 12 South is OU's to lose.

Sunday, November 11, 2007

A Methodology of the Matrix

I've described aspects of the Matrix as it has evolved, but I think it's about time that I give it one coherent description for anyone interested.

The Matrix uses three ratings: a general performance rating based on margin of victory (which is used for rankings), a recent performance rating, and a win/loss rating. The general rating and win/loss rating are calculated with a progressive adjustment model derived from the Elo chess rating system. Ratings are adjusted according to the improbability that a given outcome would occur. The model simulates the season a few hundred times, allowing smaller adjustments with each round, until, through automated trial and error, it arrives at the ratings associated with the least improbability.
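
The progressive adjustment idea can be sketched roughly as follows, for the win/loss case only. This is an illustration under stated assumptions, not the Matrix itself: the normal-curve expectation, the `spread` of 10 rating points, the starting step of 4, and the 0.98 decay are all hypothetical choices.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative probability."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fit_ratings(games, rounds=300, first_step=4.0, decay=0.98, spread=10.0):
    """Replay the season `rounds` times, nudging ratings by how surprising
    each result was, with a smaller nudge in every later round.

    games: list of (winner, loser) pairs.
    """
    ratings = {}
    for winner, loser in games:
        ratings.setdefault(winner, 0.0)
        ratings.setdefault(loser, 0.0)
    step = first_step
    for _ in range(rounds):
        for winner, loser in games:
            expected = normal_cdf((ratings[winner] - ratings[loser]) / spread)
            surprise = 1.0 - expected  # improbability of the actual result
            ratings[winner] += step * surprise
            ratings[loser] -= step * surprise
        step *= decay  # allow smaller adjustments with each round
    return ratings


# Hypothetical three-team season: A is undefeated, C is winless.
ratings = fit_ratings([("A", "B"), ("B", "C"), ("A", "C")])
```

Because the step shrinks geometrically, the total adjustment any team can receive is bounded, so an undefeated team ends up with a high but finite rating.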

For both the general performance rating and win/loss rating, the model assumes that a team's performance will vary and the probability of a particular performance level will fall somewhere on the normal curve. The ratings, therefore, theoretically represent the mean. The larger the point margin, the less effect an additional point will have on ratings, so the effect of "running up the score" is minimal. When estimating the improbability of an event, the model barely differentiates an 18 point win and a 40 point win.
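
One way to picture the diminishing effect of extra points is to pass the margin through a normal curve before it touches the ratings. The sketch below assumes a hypothetical `spread` of 10 points; the actual curve used by the Matrix is not specified here.

```python
import math


def margin_credit(margin, spread=10.0):
    """Map a point margin onto a 0-to-1 scale along a normal curve.

    Each extra point is worth less than the one before it, so blowouts
    saturate near 1.0. `spread` is an assumed value for illustration.
    """
    return 0.5 * (1.0 + math.erf(margin / (spread * math.sqrt(2.0))))


gain_first_18 = margin_credit(18) - margin_credit(0)   # going from a tie to an 18-point win
gain_next_22 = margin_credit(40) - margin_credit(18)   # running it up from 18 to 40
```

Under these assumptions the first 18 points move the credit by more than 0.4, while the next 22 points add less than 0.05.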

The win/loss rating, obviously, uses only wins and losses and ignores the margin of victory. This factor actually has very little effect on the outcome of the model, but I have included it for the sake of comprehensiveness. For the most part, close games really are decided primarily by luck, so it is best that the model does not overemphasize winning the game. Because the model uses a marginal progressive adjustment method, it is able to handle undefeated teams without the problems faced by maximum likelihood estimation (MLE) approaches.

After the general and win/loss ratings have been calculated, a recent performance rating is calculated using the deviation of a team's margin of victory from the expected margin of victory. Greater weight is given to more recent games.
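
A recency-weighted average of those deviations might look like the sketch below. The exponential weighting and the `decay` value of 0.7 are assumptions; the post only says that recent games count for more.

```python
def recent_form(deviations, decay=0.7):
    """Weighted average of (actual margin - expected margin), oldest game first.

    The newest game gets weight 1, the one before it `decay`, then decay**2,
    and so on. `decay` is a hypothetical setting.
    """
    n = len(deviations)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * d for w, d in zip(weights, deviations)) / sum(weights)


heating_up = recent_form([-10.0, 0.0, 10.0])    # under-performed early, over-performed lately
cooling_off = recent_form([10.0, 0.0, -10.0])   # the reverse
```

A positive value marks a team playing above its season-long rating lately, and a negative value marks a cold team.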

The final components of the Matrix are the Navy adjustment factors. Essentially, these factors compare a team's opponent against its past opponents in terms of relative dependency on the pass and the run, and then adjust the expected outcome to match any advantages or disadvantages the team may experience in the match-up. For example, if a team plays terrible pass defense and now has to play Texas Tech, it should be expected to under-perform relative to its general, recent and win/loss performance ratings.
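
As a rough illustration of a match-up adjustment of this kind, the sketch below shifts a team's expected margin when this week's opponent leans on the pass more (or less) than the opponents the team has already faced. The function name, the inputs, and the linear form are all hypothetical.

```python
def matchup_adjustment(pass_weakness, run_weakness, opp_pass_share, typical_pass_share):
    """Points to add to a team's expected margin (negative = under-perform).

    pass_weakness / run_weakness: points the defense gives up above average
        to the pass and to the run (assumed inputs).
    opp_pass_share: share of this opponent's offense that comes from the pass.
    typical_pass_share: average pass share of the team's past opponents.
    """
    shift_toward_pass = opp_pass_share - typical_pass_share
    return -shift_toward_pass * (pass_weakness - run_weakness)


# A leaky pass defense meeting a pass-heavy offense.
adjustment = matchup_adjustment(10.0, 2.0, opp_pass_share=0.8, typical_pass_share=0.5)
```

With these invented numbers the defense is expected to finish about 2.4 points worse than its ratings alone would suggest.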

The general performance rating, win/loss rating, recent performance rating, and Navy adjustment factors are then weighted and used to estimate the margin of victory (along with an adjustment for home field advantage). Finally, I use a consistency rating (how predictable a team's performance has been) to estimate the probability of a suggested outcome (of a team winning or covering the spread).
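
The last two steps, blending the components into a margin and turning that margin into a probability, can be sketched like this. The weights are placeholders, and treating the consistency rating as the standard deviation of a normal curve around the predicted margin is an assumption made for the example.

```python
import math


def predicted_margin(general_gap, winloss_gap, recent_gap, matchup_adj, home_edge,
                     weights=(1.0, 0.1, 0.3)):
    """Weighted blend of rating gaps plus match-up and home-field adjustments.

    All gaps are (team - opponent). The weights are hypothetical.
    """
    w_general, w_winloss, w_recent = weights
    return (w_general * general_gap + w_winloss * winloss_gap
            + w_recent * recent_gap + matchup_adj + home_edge)


def cover_probability(margin, spread, consistency_sd):
    """Chance the actual margin beats `spread`, assuming results scatter
    normally around the predicted margin with the given standard deviation."""
    z = (margin - spread) / consistency_sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


margin = predicted_margin(2.0, 0.0, 0.0, 0.0, home_edge=3.0)
steady_team = cover_probability(12.0, 5.0, consistency_sd=5.0)
erratic_team = cover_probability(12.0, 5.0, consistency_sd=10.0)
```

Setting `spread` to 0 gives the probability of simply winning the game. A more consistent team (smaller standard deviation) earns a more confident probability from the same predicted margin.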

Results:

These results reflect games played before week 11, 2007.

Top 5 overall:
1. Ohio State
2. Oregon
3. West Virginia
4. LSU
5. Missouri

(Note: After OSU lost and WVU struggled against Louisville, Oregon took the top spot and Oklahoma and Kansas moved into the top 5.)

Oklahoma fans might object that Missouri is ranked higher than their own Sooners. This is a good example, though, of the model punishing Oklahoma more for the greater improbability of its loss to Colorado. Because both teams have only one loss and Missouri lost to a better team than Colorado (which just happens to be Oklahoma), Missouri is ranked higher. Oklahoma is 6th and only 2/10 of a point behind the Tigers.

Top 5 Win/Loss
1. Ohio State
2. Kansas
3. Hawaii
4. LSU
4. Oklahoma
4. Arizona State

Obviously, a win/loss rating should give extra kudos to undefeated teams. The three-way tie for 4th is a bit of an anomaly, but here the Sooners have the advantage over Missouri.

Top 5 Consistency
1. Kansas
2. Florida International
3. Utah State
4. Arizona State
5. Ohio State

Two types of teams find themselves among the most consistent: the surprisingly successful teams that just seem to win every week, and the really, really bad teams that will always play poorly against D1A competition. I thought it was interesting that Kansas has been the most consistent team this season, and it is 9-0 against the spread this year.

The five most unpredictable teams -
1. UCLA
2. Utah
3. Central Michigan
4. Iowa State
5. UNLV

Fitting.

Navy adjustment factor:

You can't produce a ranking from the adjustment factor, but we can guess which teams are going to have a tough match-up this weekend. The team most likely to get unusually lit up through the air this week was, coincidentally, Navy, which gave up almost 500 passing yards and 62 points in a winning effort against the 1-7 (now 1-8) Mean Green of North Texas.

Recent Performance:

Again, it doesn't make much sense to rank teams on their recent performances, because it is relative to their general performance, but the hottest team going into this weekend was Iowa State (relative to their performance all season). Unfortunately for Boston College, another very hot team is Clemson - and a cold team is, well, BC.

When dealing with all these factors, I think it is important to consider their relative importance. The Matrix has the power to explain about 65% of the variance of point margins for games involving D1A teams this season. About 61% is explained by the general performance rating alone and the other 4% by the other adjustment factors and ratings. The win/loss rating barely makes an appearance, and is really just included so the model can be comprehensive and "hybrid," which is such a popular term in sports rating these days.
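
"Variance explained" here is the usual R-squared statistic, and the contribution of the extra factors is the gain in R-squared when they are added. A minimal sketch, using invented margins rather than real game data:

```python
def r_squared(actual, predicted):
    """Share of the variance in `actual` accounted for by `predicted`."""
    mean = sum(actual) / len(actual)
    total = sum((a - mean) ** 2 for a in actual)
    residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - residual / total


# Invented point margins for four games (illustration only).
actual = [10.0, -3.0, 7.0, -14.0]
general_only = [7.0, 0.0, 3.0, -10.0]   # predictions from a general rating alone
full_model = [8.0, -2.0, 6.0, -12.0]    # predictions with the other factors added

added_by_other_factors = r_squared(actual, full_model) - r_squared(actual, general_only)
```

The same subtraction, applied to the season's real games, is what separates the 61% from the remaining 4%.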

The model is still somewhat fluid as I make minor adjustments to deal with problems as they arise, but these are the general principles on which it is based. I will continue to publish rankings and predictions, and I will add other stats - consistency, recent performance, match-up warnings, unexpected results, etc.

P.S. according to the Matrix, the most unlikely outcome involving two D1A teams was Notre Dame over UNLV and #2 was UNLV over Utah.

Wednesday, November 7, 2007

Week 11 Predictions

See Week 11 Predictions
See Week 11 Rankings

Last week, the Matrix was 5-0 picking winners in the spotlighted games, but 1-4 against the spread (despite being 27-21 against the spread for the week across all games).

Before getting to the games, I made a couple of minor changes to fix some glitches. The big change is the Navy factor. Basically, the formula from last week gave equal weight across all teams for their pass and run efficiencies. Navy, though, scores high in pass efficiency but doesn’t win games that way. So, I’ve added a factor to weight efficiencies based on the relative importance for that team’s offense.
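
The difference between the old and new weighting can be shown with a small sketch. The efficiency numbers and the use of play share as the measure of "relative importance" are assumptions for illustration.

```python
def equal_weight_rating(pass_eff, run_eff):
    """Last week's approach: pass and run efficiency count the same for everyone."""
    return 0.5 * (pass_eff + run_eff)


def usage_weighted_rating(pass_eff, run_eff, pass_share):
    """Weight each efficiency by how much of the offense it actually carries."""
    return pass_share * pass_eff + (1.0 - pass_share) * run_eff


# A run-first offense that looks great on its rare passes (invented numbers).
old_rating = equal_weight_rating(9.0, 5.0)
new_rating = usage_weighted_rating(9.0, 5.0, pass_share=0.15)
```

For a balanced offense (`pass_share=0.5`) the two approaches agree; for a team like the one above, the weighted version stops a handful of efficient passes from inflating the rating.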

Now the picks.

Game 1. (18) Auburn @ (10) Georgia

Auburn has lost 3 games by a total of 14 points, including losses to a former #2 (South Florida) and the current #2, but also managed to lose to Mississippi State. Georgia lost one game by 21 to an inferior Tennessee (though definitely not inferior on that night). Georgia has only 1 quality win (at Florida) and a semi-quality win (at Alabama). Auburn’s record looks similar (at Florida and at Arkansas). Now both teams seem to be hot and it should be a solid, defensive game. The SEC can be a little difficult to figure out, and the line-setters and the Matrix seem to be seeing it that way as well. But Georgia needs to cover and then some at home to convince me they deserve to be 10th with two losses.

The Matrix: Georgia by ½ a point

Game 2. (17) USC @ Cal

A month ago this was set to be the big showdown in the PAC 10. Well, it turns out the big showdown already took place last week in Eugene, but this game will still throw a lot of talent on the field. Surprisingly, the two teams are only 36th and 39th in the nation in scoring despite having tons of talent. Cal’s offensive production has fallen every game since scoring 45 against a relatively tough Arizona defense. USC, on the other hand, played Oregon tough two weeks ago, clobbered Notre Dame’s JV team, and beat Oregon State handily last week. Unless Cal turns it around, USC should win in style but, at home, Cal has the talent to compete.

The Matrix: USC by ½ a point

Game 3. Illinois @ (1) Ohio State

My gut tells me that Ohio State better watch out for this game, but I can’t find any statistic to support that inclination. Illinois has lost some steam recently since its big win over Wisconsin. We can be confident that Juice won’t get much done in the air, but, if he can stretch the field a little, he just might be able to pound out some points on the ground – against the nation’s second best run defense. Unfortunately for the Fighting Illini, Ohio State will have a much easier time scoring points. If Wells gets going, Ohio State could cover easily.

The Matrix: Ohio State by 20

Game 4. (4) Kansas @ Oklahoma State

Here’s why this game is important – Kansas is still undefeated. If Kansas wins out, it would have a portfolio that includes wins on the road and at a neutral field against top ten teams. That being the case, I think you have to put them in the national championship game, even if that means hopping them over one-loss Oregon and LSU. But first it must beat the Cowboys. Kansas proved last week it could score points, and it will need to put up 35+ on Saturday if it wants to win.

The Matrix: Kansas by 12

Game 5. Texas Tech @ (14) Texas

Texas Tech is 1st in passing yards and 118th in rushing yards, almost dead last in the nation. This isn’t new turf for Tech, but it's still fun to watch. I still hold that Texas is over-rated this season, and I think Tech will show the world I’m right on Saturday. To come to the point, Harrell is a lot better than McCoy and the outcome will reflect that. The Matrix actually rates Tech higher than Texas, but gives the Longhorns a 3 point advantage at home.

The Matrix: Texas by 3

Pick of the Week:

Somehow, Iowa State has ended up on the other end of the Matrix’s pick of the week again. Colorado has a 74% chance of covering a five point line at Iowa State. We’ll see.


Sunday, November 4, 2007

Week 10 Recap and the Field Goal Calculator

With one game left this weekend, I thought it was about time to report on the initial results from the Matrix's first weekend. All in all, things have gone well. The Matrix correctly picked the winner in 38 of 53 games and was 27-21 against the line.

And it could have been better. If you look back at the picks (table) from last week, you might notice that Oregon is listed as the road team against Arizona State. Unless Arizona State has decided to start playing its home games thousands of miles from home, this is a typo. The Matrix gives teams a bonus if they are playing at home, and, in this case, gave Oregon's "at-home" credit to Arizona State. This mistake shifted the prediction by about nine points. In other words, if the Matrix had been fed accurate data, it would have been correct.

Then there are Rice and Texas. After I ragged on these two teams last week, they put up a combined 52 points in the 4th quarter to win and cover the spread. Cincinnati put up 31 in the 1st quarter to do the same. And then the Demon Deacons felt the Cavalier curse and missed what would have been the winning field goal in the last seconds.

But I found a more worrying issue when I looked more closely at the results from the Matrix. Against the line, the Matrix was 0-2 when it gave one team an 80% chance or greater of covering. One, which I mentioned before, was Rice and their 20 point comeback. The other was Iowa State, who seems to actually be a much better team these days. The Matrix was most accurate when it gave one team only a small advantage.
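
A check like this amounts to grouping picks by the stated probability and counting how often each group actually covered. The sketch below uses invented picks shaped like the 0-2 result described above; the bucket edges are arbitrary.

```python
def calibration_table(picks, edges=(0.5, 0.6, 0.7, 0.8)):
    """Tally (hits, total) for each confidence bucket.

    picks: (stated probability for the picked side, whether it covered).
    Probabilities are for the side the model picked, so they start at 0.5.
    """
    table = {low: [0, 0] for low in edges}
    for probability, covered in picks:
        low = max(edge for edge in edges if edge <= probability)
        table[low][1] += 1
        if covered:
            table[low][0] += 1
    return {low: (hits, total) for low, (hits, total) in table.items()}


# Invented picks (not the actual week 10 slate).
picks = [(0.85, False), (0.82, False), (0.55, True), (0.58, True), (0.65, False)]
table = calibration_table(picks)
```

Two misses in the top bucket is far too small a sample to settle anything, but tracking this table week after week would show whether the high-confidence picks are systematically overconfident.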

The game of the week, in my opinion, lived up to its billing (that I gave it). Navy won in triple overtime on a failed two point conversion attempt. Historic. I'm a bit biased towards the midshipmen - I love to watch their offense - but how could anyone who isn't a Catholic not jump on the Navy bandwagon after that game?

The Field Goal Calculator

The field goal calculator is my initial attempt to create an adjustable system that can estimate the number of points that a team will get on average if they kick a field goal or go for it on 4th down. The calculations come from an earlier blog and are not, by any means, perfect, but I think it is a good starting point.

Here, I have provided an Excel spreadsheet so users can play with it themselves. The spreadsheet has 4 entries and a graph. You can adjust it for the leg strength of the kicker (average = 0), the accuracy of the kicker (average = 0), the average yards per play for that team, and the number of yards the team needs to get a first down. The graph shows how many points that team could expect to get if they went for it or if they kicked the field goal from various points on the field.
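
For readers without Excel, the comparison can be sketched in a few lines. None of the formulas below come from the spreadsheet: the logistic make-probability curve, the exponential conversion model, and the linear value of a fresh set of downs are all stand-ins chosen to show how the four inputs interact. The sketch also ignores the field position handed to the opponent after a miss or a failed conversion.

```python
import math


def make_probability(kick_distance, leg=0.0, accuracy=0.0):
    """Assumed logistic curve: an average kicker is 50/50 from 50 yards.
    `leg` shifts the range in yards; `accuracy` shifts the whole curve."""
    midpoint = 50.0 + leg
    return 1.0 / (1.0 + math.exp((kick_distance - midpoint) / 6.0 - accuracy))


def convert_probability(yards_to_go, yards_per_play):
    """Assumed model: gains are exponential with mean `yards_per_play`."""
    return math.exp(-yards_to_go / yards_per_play)


def drive_value(yards_from_goal):
    """Assumed expected points from a first down at this spot."""
    return max(0.0, 6.0 - 0.06 * yards_from_goal)


def fourth_down_options(yards_from_goal, yards_to_go, yards_per_play,
                        leg=0.0, accuracy=0.0):
    """Expected points for kicking versus going for it."""
    # Kick distance = line of scrimmage + 10 (end zone) + 7 (snap and hold).
    kick = 3.0 * make_probability(yards_from_goal + 17.0, leg, accuracy)
    spot_after = max(0.0, yards_from_goal - yards_to_go)
    go = convert_probability(yards_to_go, yards_per_play) * drive_value(spot_after)
    return {"kick": kick, "go": go}


short_yardage = fourth_down_options(5.0, 1.0, yards_per_play=6.0)
long_yardage = fourth_down_options(30.0, 10.0, yards_per_play=5.0)
```

Under these assumptions, going for it comes out ahead on 4th and 1 at the 5-yard line, while kicking comes out ahead on 4th and 10 at the 30. Calling the function across a range of `yards_from_goal` values reproduces the kind of curve the graph shows.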

Download Field Goal Calculator.xls