Using Big Data to Win Your March Madness Pool

Big data is used to forecast and interpret situations of many types. Could it even help you win your March Madness pool?
(Page 3 of 3)

Teams can and do get hot. But hot teams cool off, and in any statistical sample, there is a tendency to regress to the mean.

"Favor teams that won their regular season conference championship; they win more consistently in the Big Dance. Obviously, it's not the only factor to consider, but we've found it to be statistically significant even in the presence of other stuff. On the flip side, teams that got their league's automatic bid by winning their conference tournament don't win more frequently, so don't worry about that," said Jay Coleman, Professor of Operations Management & Quantitative Methods at the University of North Florida’s Coggin College of Business.

"There's a satisfying and unsurprising statistics/analytics story there: the bigger sample we get from the regular season is more predictive/representative of teams’ strengths than the relatively tiny sample of games played in conference tournaments. No duh . . . right?" he added.
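Coleman's sample-size point can be made concrete with a rough sketch. It assumes each game is an independent coin flip with a fixed win probability, which real basketball games are not, and the 70 percent win rate, 30-game season and 3-game tournament are made-up numbers chosen only for illustration.

```python
import math

def win_rate_standard_error(true_win_prob: float, games: int) -> float:
    """Standard error of an observed win rate over `games` independent games."""
    return math.sqrt(true_win_prob * (1.0 - true_win_prob) / games)

# Hypothetical numbers (not from the article): a team that truly wins 70%
# of its games, a 30-game regular season, a 3-game conference tournament.
regular_season_se = win_rate_standard_error(0.70, 30)
tournament_se = win_rate_standard_error(0.70, 3)

print(f"regular season:        +/- {regular_season_se:.3f}")
print(f"conference tournament: +/- {tournament_se:.3f}")
```

Under these assumptions, the uncertainty around the observed win rate shrinks with the square root of the number of games, so the short tournament sample is roughly three times noisier than the full season.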

I'm not sure it's a no-duh observation. The human mind isn't particularly good at calculating probabilities, so even obvious statistical insights can run counter to evidence-free conventional wisdom (i.e., go with who's hot now).

5. Don't Try to Reinvent the Wheel (Trust Nate Silver)

Before you go trying to build a perfect bracket-picking model, why not look at the other models out there, study them for any obvious deficiencies and then simply improve on those?

Nate Silver doesn't have the reputation for picking March Madness winners that he does for zeroing in on Electoral College votes. Nevertheless, his models are proven to work, and whenever they're based on weak data, Silver will let you know.

The FiveThirtyEight forecast builds on existing computer models (Sagarin, Pomeroy, etc.) and adds in other factors, such as injury and geography.

This year, judging from FiveThirtyEight projections (and a range of similar forecasting tools), the number one seeds are weaker than in previous years. Moreover, when you consider statistical factors that usually come down to luck more than anything else, such as won-loss record in close games, Florida stands out as the value pick in this tournament. (Well, unless you live in Gainesville.)

However, statistically speaking, there is better than a 30 percent chance that the one, two or three seed from the Midwest region (Louisville, Duke or Michigan State) will win the National Championship.

With any statistical forecasting tool, it's important to remember that it measures a slice in time. If a key factor has changed, such as a serious injury to a top player, you're no longer comparing apples to apples, even with a large data sample.

"For instance, last year Syracuse was doing really well and most of the models said they were 1 or 2. However, an important player was injured, and it was fairly clear they weren't going to go that far. In such cases, I'd rather run the model and then override its results," said Davidson College Math Professor Tim Chartier.

Chartier and his students have developed forecasting software that in past years helped them place in the 97th percentile of the 4.6 million brackets submitted to ESPN.

Chartier’s new free course, March MATHness, on Udemy, an online learning marketplace, shows how to use three popular sports ranking methods (two of which are used by the Bowl Championship Series) to create your own mathematically produced brackets for March Madness and pick which teams will prevail in the NCAA Finals.

Another of Chartier's key pieces of advice is to not rate all games as equal. If you are assigning weight to a victory (or loss), some victories should count more than others. You should, for instance, be suspicious of early season wins. Those could have been compiled by a team that looks a lot different from the one on the floor today.

"If a team is beating big teams at this point in the season and you are treating them as, say, two wins, then, they get elevated in the ratings. This is how we find teams that can otherwise be overlooked," Chartier said.
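The idea of counting late-season games more heavily can be sketched as a weighted winning percentage. This is only an illustration, not Chartier's actual software: the linear weighting scheme, the function name `weighted_win_pct`, the weight range of 0.5 to 2.0 and the sample schedules are all assumptions.

```python
from typing import List, Tuple

def weighted_win_pct(results: List[Tuple[int, bool]], season_days: int,
                     min_weight: float = 0.5, max_weight: float = 2.0) -> float:
    """Winning percentage where a game's weight grows linearly through the season.

    results: (day_of_season, won) pairs, with day 0 as opening day.
    A game on the last day counts as `max_weight` games, one on opening
    day as `min_weight` games (both values are illustrative assumptions).
    """
    total_weight = 0.0
    won_weight = 0.0
    for day, won in results:
        weight = min_weight + (max_weight - min_weight) * day / season_days
        total_weight += weight
        if won:
            won_weight += weight
    return won_weight / total_weight if total_weight else 0.0

# Two hypothetical teams, both 2-2, over a 120-day season.
early_winner = [(10, True), (30, True), (90, False), (110, False)]
late_winner = [(10, False), (30, False), (90, True), (110, True)]

print(f"early winner: {weighted_win_pct(early_winner, 120):.2f}")
print(f"late winner:  {weighted_win_pct(late_winner, 120):.2f}")
```

Both teams have identical unweighted records, but the team that is winning now rates higher once recent games count for more. A fuller ranking method would also account for the strength of each opponent.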

Sometimes his software is used to weight heavily the kinds of factors that can sink a bracket, and sometimes that gamble pays off. "One year I had a student reward the ability to have winning streaks," he said. "She had the only bracket that we produced with math that recognized that Baylor was going to be in the Final Four."
