
The Ultimate Guide to Filling Out Your Bracket for March Madness

This post is the first time I’m publicly revealing the in-depth method I use to fill out my NCAA basketball tournament bracket. Below, I share all my own resources for you to dominate your office pool, including the bracket I’ll be using this year. Let’s get right into it.

  • The pre-filled bracket I’ll be using this year, optimized perfectly for the best chance of winning your pool (“Optimal Bracket” sheet)
  • A blank bracket for you to copy and fill in that dynamically 1) predicts the number of points your bracket will earn, 2) calculates how similar your bracket is to the crowd, 3) gauges your bracket’s risk/reward level, and 4) compares your bracket to optimal strategy (“Your Bracket” sheet)
  • Complete rankings for each team based on win probability, crowd distribution, value differential, and optimal strategy (“Rankings” sheet)
  • All the nitty gritty formulas and data behind my bracket
  • Settings allowing you to adapt all of the above to your pool’s point structure (“Point Structure” sheet)
The “Optimal Bracket” I’ll be using as my own this year

The bottom line? Pick Villanova to win it all against Virginia, with Gonzaga and Duke rounding out the Final Four, and Cincinnati, UNC, Purdue, and Kansas filling out the Elite Eight. The math this year produces a fairly chalky bracket, with Gonzaga being the only team in the Elite 8 that is not seeded 1 or 2. If you’re in a larger pool, consider picking Cincy over Virginia or Michigan State over Duke. Consider dark horses like West Virginia, Kentucky, Houston, or Seton Hall, and potential Cinderella teams Butler or Texas.

Read on to learn why you should pick those eight teams to go far. For instructions on how to use the spreadsheet to build your own bracket, jump to the “How To Build Your Bracket” section toward the end of this post.

Behind the Madness

The NCAA March Madness basketball tournament is the most thrilling sporting event in the world. The Super Bowl, the Olympics, and the World Cup get more attention, but nowhere can you find such a large tournament filled with buzzer beaters and Cinderella stories.

Exciting and accessible, with a fair amount of randomness, March Madness is the perfect candidate for an office pool. I’ve written previously about my affection for office pools, such as when I used 200 experts and Reddit’s comment ranking algorithm to win my office NFL pick’em pool or when I outsmarted a FiveThirtyEight NFL predictions algorithm. Now, for the first time publicly, I’m revealing the method I’ve used to stay competitive in my March Madness office pool.

From rooting for the home team to betting like a hedge fund manager

As a teenager growing up in North Carolina, the strategy for my early brackets was to pick teams from North Carolina to win it all. This wasn’t a horrible strategy, given the historical success teams from North Carolina have had in the tournament.

In college, I combed through each expert pick on CBSSports.com, crowdsourcing their “wisdom” to my bracket. This predictably led to very chalky brackets filled with top seeded teams. This strategy happened to work quite well in 2008, when all #1 seeds reached the Final Four, but was quite boring in practice.

A few years ago, I fell in love with an idea espoused by Chris Wilson in this Slate article that advocates picking winners like a hedge fund manager:

You’re […] very unlikely to win if you [choose the crowd favorites]. Even if you get the last few games right for the big points, a lot of other people will, too. At least one of them will probably be luckier than you. […]
Don’t think about guessing the most games correctly. Instead, think about finding “bargains” in the bracket where collective wisdom runs askance of more objective measurements. Exploiting games where your fellow bracketologists are likely to guess wrong — even if the odds of that happening are still against you — will give you the best shot at jetting ahead of the pack. An NCAA bracket, then, is more like a long-shot stock than a game; the odds of winning may be low, but the big pot makes the gamble worth it — if you know how to maximize your investment.
The “contrarian” strategy I’m suggesting here isn’t new; correctly choosing upsets has always given pool jockeys a major boost. What’s changed in the past few years is our ability to value the risk and rewards of a given bet and to decide whether it’s worth it. This bracket-picking strategy isn’t so different from the way Wall Street became obsessed with modeling risk, as Wired has chronicled. The key is having access to two data sets: the wisdom-of-the-crowds data from the national bracket and a table of more objective stats. By comparing the two, you’ll be able to assess whether you’re getting bang for your buck when you throw your lot in with an underdog team.

For his wisdom-of-the-crowds data, Chris uses ESPN’s “who picked whom” feature. This fantastic resource shows what percentage of ESPN users picked each team to reach each round of the tournament.

Chris compares these data with the statistical probability that each team will win the tournament, as measured by Ken Pomeroy, a widely respected statistical evaluator of the strength of each team.

Last year, Chris recommended picking Gonzaga, who, according to ESPN, were the most undervalued team. They were picked to win the tournament by 6.9 percent of ESPN users. Ken Pomeroy gave them a 20.5 percent chance of winning the tournament, meaning they were undervalued by the crowd by a whopping 13.6 percentage points.

For the past few years, I’ve applied this strategy to every game of the tournament, not just for picking the winner. I calculated the difference between A) the probability that a team would make it to a particular round and B) the percentage of ESPN users who picked the team to make it to that round. I then added this difference in value to the team’s win probability and subjected the two components to multipliers that gave double weight to how much a team was over- or under-valued.
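As a rough sketch of that earlier scoring rule, here is how it might look in Python. The exact multipliers aren't stated above, so the weights of 1 on win probability and 2 on the value differential are assumptions chosen only to give the differential double weight. The function names are hypothetical, and the example numbers are the Gonzaga figures from Chris's recommendation.

```python
def value_differential(win_prob, crowd_share):
    """Positive when the crowd under-values a team, negative when it over-values it."""
    return win_prob - crowd_share


def weighted_score(win_prob, crowd_share, prob_weight=1.0, diff_weight=2.0):
    # prob_weight and diff_weight are assumed values: the differential
    # simply counts twice as much as the raw win probability.
    diff = value_differential(win_prob, crowd_share)
    return prob_weight * win_prob + diff_weight * diff


# Gonzaga: 20.5 percent to win per Ken Pomeroy, picked by 6.9 percent of ESPN users.
diff = value_differential(0.205, 0.069)
score = weighted_score(0.205, 0.069)
print(f"differential: {diff:.3f}, weighted score: {score:.3f}")
```

Ranking teams by a score like this rewards teams that are both likely to advance and under-picked by the crowd.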

This strategy had worked pretty well. In both 2016 and 2017 I correctly picked a team that made it to the championship game (Villanova in 2016 and Gonzaga in 2017). But, this year, I decided to delve deeper into the madness and build a system around optimal strategy.

Enter Optimal Strategy

Earlier this winter, I stumbled across an old Medium post by Robby Greer appropriately titled Optimizing Your 2016 March Madness Bracket. In it, he makes a point similar to the one Chris made above. Picking the same as everyone else and picking differently from everyone else are both poor strategies for winning a bracket pool; the optimal strategy is somewhere in the middle.

He describes the exact function for finding that optimal strategy:

The contribution of a pick towards your chance of winning is a function of the odds that pick is correct and the odds that the rest of the pool isn’t making the same pick.
The value of a bracket is the sum of the values of all its picks, and the bracket that maximizes your chance of winning is the one with the highest value.
[…]
The key to understanding this model is knowing that it does not predict the most likely winners, but rather, the most valuable winners.

I quickly realized that Robby’s insights could supercharge the simplistic model I had been using. So what next? Well, build a spreadsheet, of course.

Into the Madness: Data, Formulas, and Brackets

To create the optimal strategy in spreadsheet form, I first needed data. Like Robby (but unlike Chris), I collected my objective probability data from FiveThirtyEight.

Round-by-round probabilities from FiveThirtyEight (exact data used found in .csv download at bottom of page)

They use 6 different computer rankings (including Ken Pomeroy’s, which Chris uses) as well as 2 human-generated rankings. Helpfully, FiveThirtyEight’s predictions provide the probability that each team reaches each round of the tournament. I easily exported these data into the “FiveThirtyEight Data” sheet.

The second data set I needed was the crowd data. This would tell me which teams were most likely to show up on competing brackets. Unlike Robby, who compiled data from CBS using a python script, I just copied and pasted ESPN’s “who picked whom” feature into the “ESPN Data” sheet (I made minor formatting adjustments to support future vlookups).

To finish setup, I created a “Point Structure” sheet to calculate expected point values of each bracket. I use ESPN’s Tournament Challenge 10–20–40–80–160–320 style scoring as a default, which mirrors most pools that use the derivative 1–2–4–8–16–32 style scoring. This can be changed to any point structure that allocates a set number of points per game by round. It does not (yet) support upset bonuses or other more complex variations.
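To make the point structure concrete, here is a small Python sketch that totals the points available under a per-round scoring scheme. The constant and function names are my own for illustration; the only inputs are the points per round and the number of games in each round of a 64-team bracket.

```python
# Games played in each of the six rounds of a 64-team bracket.
GAMES_PER_ROUND = [32, 16, 8, 4, 2, 1]


def total_possible_points(points_per_round):
    """Sum of (games in round) x (points per correct pick in that round)."""
    return sum(games * pts for games, pts in zip(GAMES_PER_ROUND, points_per_round))


ESPN_POINTS = [10, 20, 40, 80, 160, 320]
SIMPLE_POINTS = [1, 2, 4, 8, 16, 32]

print(total_possible_points(ESPN_POINTS))    # 1920
print(total_possible_points(SIMPLE_POINTS))  # 192
```

Because the points double while the number of games halves, every round is worth the same total under these two schemes.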

To build a sheet that could serve for my brackets, I had to get creative. Luckily, I didn’t have to do much of this myself. I found this fantastic downloadable bracket template from PLEXKITS that fit my need perfectly. The template is quite well-designed and supports data validation, allowing you to click and choose a team from a dropdown for each game, much like an online bracket selector. Find this blank bracket on the “Your Bracket” sheet.

Creating My Optimal Bracket

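With my blank bracket ready, I needed to fill it out according to optimal strategy:

added value = (1 - share of the crowd picking the team) * probability the team wins that round * points for that round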

In layman’s terms, the added value of a particular pick is calculated by identifying the percentage of the crowd who did not pick that team, multiplying it by its probability of progressing past that round, and then multiplying it by the points available for that round.

For example, let’s imagine a first round matchup between a #1 seed and a #16 seed. FiveThirtyEight tells us that the #1 seed has a 97 percent chance of winning. Assume that 95 percent of the brackets on ESPN have the #1 seed winning the game. This means that the crowd is slightly undervaluing the #1 seed, with 5 percent hoping for an overdue upset. We multiply that 5 percent by the 97 percent chance that they will be wrong, equaling 4.85 percent. Finally, we multiply that 4.85 percent by the number of points available for this game, 10 in our case, giving us 0.485 points of value. This means that by picking the #1 seed here, we can, on average, expect to make 0.485 points off that pick relative to the crowd.

0.485 = (1 - 0.95) * 0.97 * 10

0.485 points isn’t much, especially when you consider the total possible 1,920 points. But, when all of this extra value gets added up across all 63 games, it can give you the edge you need to pull out a win.

Consider a starker example, like Gonzaga from 2017. FiveThirtyEight gave Gonzaga a 13.85 percent chance of winning the entire tournament, but only 7.30 percent of brackets on ESPN picked Gonzaga to win it all. That difference of 6.55 percentage points meant a lot of value left on the table for me to pick up. Multiplying the 92.7 percent of brackets that didn’t choose Gonzaga by its 13.85 percent win probability gave 12.84 percentage points of value. When multiplied by the championship game points of 320, that gives us 41.08 points of expected value we would get by picking Gonzaga relative to the crowd.

41.08 = (1 - 0.073) * 0.1385 * 320
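Both worked examples follow the same three-factor calculation, which can be sketched in a few lines of Python. The function name and argument names are hypothetical; the numbers are the ones from the two examples above.

```python
def added_value(crowd_share, win_prob, points):
    """Expected points gained over the crowd by making this pick.

    crowd_share: fraction of brackets that made the same pick
    win_prob:    probability the pick is correct
    points:      points awarded for a correct pick in that round
    """
    return (1 - crowd_share) * win_prob * points


first_round = added_value(0.95, 0.97, 10)         # #1 seed over #16 seed
gonzaga_title = added_value(0.073, 0.1385, 320)   # Gonzaga to win it all, 2017

print(round(first_round, 3))    # 0.485
print(round(gonzaga_title, 2))  # 41.08
```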

And that’s just the championship game. Adding up the hidden value across all of Gonzaga’s games, if they won them all, I could expect 106.87 points more than the average bracket, before subtracting other hidden value not chosen in favor of Gonzaga.

=(1-VLOOKUP($A2,'ESPN Data'!$A:$M,13,FALSE))*VLOOKUP($A2,'FiveThirtyEight Data'!$A:$K,11,FALSE)*'Point Structure'!$B$7
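For readers who think better in code than in spreadsheet formulas, here is a rough Python analogue of that lookup, extended to sum a team's added value across all six rounds. The dictionaries stand in for the “ESPN Data” and “FiveThirtyEight Data” sheets, and the figures for “Team A” are made up purely for illustration; they are not real tournament data.

```python
POINTS = [10, 20, 40, 80, 160, 320]  # default ESPN-style points per round

# Hypothetical per-round figures for one made-up team (not real data).
# crowd[team][r]: share of brackets picking the team to win its round r+1 game
# model[team][r]: probability the team wins its round r+1 game
crowd = {"Team A": [0.95, 0.80, 0.60, 0.40, 0.25, 0.15]}
model = {"Team A": [0.97, 0.85, 0.65, 0.45, 0.30, 0.20]}


def added_value_by_round(team):
    """One value per round, mirroring the spreadsheet formula above."""
    return [
        (1 - c) * p * pts
        for c, p, pts in zip(crowd[team], model[team], POINTS)
    ]


def total_added_value(team):
    """Added value if you pick the team to win every game it plays."""
    return sum(added_value_by_round(team))


per_round = added_value_by_round("Team A")
total = total_added_value("Team A")
print([round(v, 3) for v in per_round], round(total, 3))
```

Note how the later rounds dominate the total, which is why the ranking is driven mostly by deep-run value.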

We can see from this sheet that the highest-value picks to win the tournament this year from each of the four regions, as calculated from optimal strategy, are Villanova (East; 105.48 points of expected added value), Virginia (South; 90.53 points), Duke (Midwest; 78.99 points), and Gonzaga (West; 60.34 points).

I used this sheet to fill out my Optimal Bracket, starting by selecting Villanova to win all their games, then Virginia, and down the list until the entire bracket had been filled out. It’s very important to fill in a bracket “backwards” by starting with the expected champion. The Final Four and National Championship games, under standard scoring, typically account for 65 percent of your points if you choose them correctly. More often than not, since you’re unlikely to get all of them right, even one or two correct picks will still account for 50 percent or more of your final points.
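Here is a minimal Python sketch of that top-down fill. Working down the ranking and letting each team win until it meets a higher-ranked team gives the same result as advancing the higher-valued team in every game (assuming no ties), which is what the code does. The function name is hypothetical, the values are the four totals quoted above, and the semifinal pairing is an assumption made only for this example.

```python
def fill_bracket(slots, value):
    """Advance the higher-valued team in each game.

    slots: teams in bracket order, so slots[0] plays slots[1], and so on
    value: total expected added value per team
    Returns one list of winners per round, ending with the champion.
    """
    rounds = []
    current = list(slots)
    while len(current) > 1:
        current = [
            max(current[i], current[i + 1], key=lambda team: value[team])
            for i in range(0, len(current), 2)
        ]
        rounds.append(current)
    return rounds


value = {"Villanova": 105.48, "Virginia": 90.53, "Duke": 78.99, "Gonzaga": 60.34}
# Assumed semifinal pairing, for illustration only.
final_four = ["Villanova", "Duke", "Virginia", "Gonzaga"]

rounds = fill_bracket(final_four, value)
print(rounds)
```

The same function works on a full field as long as the number of slots is a power of two and every team has a value.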

The Optimal Bracket is what I’ll be using as my own this year. The math produces a fairly chalky bracket, with Gonzaga being the only team in the Elite 8 that is not seeded 1 or 2. If you’re in a larger pool, consider having Cincy beat Virginia or Michigan State defeat Duke. Consider dark horses like West Virginia, Kentucky, Houston, or Seton Hall, and potential Cinderella teams Butler or Texas.

Note that I filled out the Optimal Bracket using the overall ranking of the total expected added value for each team (column H). A more accurate method might be to go round by round and fill in the bracket only accounting for points by teams in rounds previous to that round (columns J-N). I experimented with this but found the resulting bracket to be quite high risk, with more upsets than I was willing to stomach (#16 Penn beats #1 Kansas?!) to use as my final bracket. Find this “optimal round by round” bracket on ESPN.

Projecting Point Totals and Added Value

Once I had my bracket filled out, I created two more sheets, one to calculate the total number of points I could expect this bracket to earn and another to calculate the added value expected versus the crowd for each game. This helped me evaluate which teams I could expect to account for most of my points and set the groundwork to compare my picks across different models.

I slightly modified the bracket design to include a summary box in the top center of the bracket. In the center of the summary box, I pulled in the projected total points from the first sheet above.

This year, I can expect the Optimal Bracket to earn ~891 points. If that holds and this year is anything similar to last year, that would place the Optimal Bracket in about the 80th percentile of all brackets on ESPN, in the top 5 million contestants.

That might not sound very good, but it could vary widely depending on how well Villanova does, or if they falter, how popular the winning team is with the crowd. Some years, a very popular team wins, so even though they were overrated by the crowd relative to their probability of winning, they still pay off for the crowd. Teams with large fanbases and a history of NCAA tournament success like UNC, Michigan State, Kentucky, and Duke are overvalued, but perform well (this year Duke is the only one of those four that is undervalued, and only barely).

Comparing to Other Brackets

I felt it important to be able to compare the Optimal Bracket to other brackets to assess performance. As such, I manually created three other pre-filled brackets: ESPN’s People’s Bracket, FiveThirtyEight’s Bracket, and a Chalk Bracket.

Here’s how each of those brackets is expected to perform this year:

  • ESPN’s People’s Bracket: 872 expected points
  • FiveThirtyEight’s Bracket: 893 expected points
  • Chalk Bracket: 867 expected points
  • (My) Optimal Bracket: 891 expected points

Remember that the Optimal Bracket is built to be the best combination of high value picks that offer the best chance to be different than other brackets, but without sacrificing too much on pick quality. On average, FiveThirtyEight’s Bracket will always be projected to earn the most points, but it will often be too similar to other brackets in your pool to give you the winning edge. This year, FiveThirtyEight’s bracket is 91 percent similar to the crowd bracket, but the Optimal Bracket is only 69 percent similar to the crowd. It’s significantly different without sacrificing many projected points.

How to Build Your Bracket

To build your bracket, you’ll want to first make a copy of my comprehensive spreadsheet that I’ve been referencing throughout the post. Either navigate to File>Make a copy or click this link to automatically be prompted to create a copy in your Google Drive (Google account required). The spreadsheet can also be downloaded as an Excel spreadsheet (File>Download as>Microsoft Excel (.xlsx)), although I can’t guarantee that all the links and formatting will work.

Once you have a version that you can edit, you’ll need to decide how you want to fill out your bracket. If you already have teams in mind, or are simply ready to jump right in, navigate to the “Your Bracket” sheet. Make your selections by selecting a team from the dropdown in each cell for each game.

If you want to use data to determine your picks — assuming you’re not just copying one of the four pre-filled brackets (Optimal Bracket, FiveThirtyEight’s Bracket, Chalk Bracket, ESPN’s People’s Bracket) — use those brackets as reference or study the “Rankings” sheet for a summary of each team’s championship win probability, championship win crowd distribution, value differential (the difference between the win probability and the crowd), and optimal strategy. The value differential column will give you a sense of who’s under- and over-valued by the crowd. This sheet also populates your picks to let you compare them to the other brackets, as well as identify how many points you’re looking to earn from each team.

When making picks on Your Bracket, I highly recommend that you start with the team you think will win it all and work backwards in a sort of inside-out fashion. This will keep you focused on the teams that you’ll be relying on for most of your points (your Final Four teams can often make up 50 percent or more of your final point total).

If you don’t find that intuitive, just make sure to start with the First Four games at the bottom of the sheet and then methodically work your way through the bracket. The red text in the middle updates as you enter your picks to show you how many picks you’ve made. Be careful to ensure that you don’t have any validation errors, which are identified by a red triangle in the top right of a cell. Unlike online bracket sites, changing a winner in an early round on this spreadsheet won’t automatically clear out that winner from subsequent rounds, and the projections may not be accurate until you fix these errors.
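If you keep your picks outside the spreadsheet, the same consistency check is easy to sketch in Python: every winner in a later round must be one of the two winners feeding into that game. The function name and the list-of-rounds layout are assumptions for illustration, not something the spreadsheet exposes.

```python
def find_inconsistent_picks(rounds):
    """Return (round_index, game_index, team) for every invalid pick.

    rounds[0] holds the first-round winners in bracket order; each later
    list holds the winners of the next round and is half as long.
    """
    problems = []
    for r in range(1, len(rounds)):
        previous, current = rounds[r - 1], rounds[r]
        for game, winner in enumerate(current):
            feeders = previous[2 * game : 2 * game + 2]
            if winner not in feeders:
                problems.append((r, game, winner))
    return problems


picks = [["A", "D", "E", "H"], ["A", "E"], ["A"]]
print(find_inconsistent_picks(picks))  # no problems

# Changing an early winner without updating later rounds leaves a stale pick.
stale = [["B", "D", "E", "H"], ["A", "E"], ["A"]]
print(find_inconsistent_picks(stale))
```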

I’d also advise you not to copy and paste cells, and instead, to use the picker for each game. Not using the picker may lead to errors.

Projected Total Points

The summary box in the top center of your bracket will show you the projected total points based on FiveThirtyEight probabilities, multiplied by the points allocated for each game of each round.
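As a sketch of that projection, assuming you have each team's probability of winning its game in each round, the total is just probability times points summed over every pick. The function name, data layout, and toy four-team numbers below are hypothetical.

```python
def projected_points(picks, win_prob, points_per_round):
    """Expected total points for a filled-out bracket.

    picks[r]:         teams picked to win their game in round r
    win_prob[team][r]: probability the team wins its game in round r
    """
    return sum(
        win_prob[team][r] * points_per_round[r]
        for r, winners in enumerate(picks)
        for team in winners
    )


# Toy four-team bracket with made-up probabilities (not real data).
win_prob = {"A": [0.9, 0.6], "C": [0.7, 0.3]}
picks = [["A", "C"], ["A"]]
total = projected_points(picks, win_prob, [10, 20])
print(total)
```

Here the total is 0.9 * 10 + 0.7 * 10 + 0.6 * 20, so a deep run by one strong team contributes most of the projection.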

The points system defaults to the ESPN points system, which allots a set number of points for each correct pick in each round: 10–20–40–80–160–320. Points can be changed on the “Point Structure” sheet. Note that it does not support upset bonuses at this time.

Similarity to Crowd

The “Similarity to Crowd” metric indicates how similar your bracket is to ESPN’s People’s Bracket by comparing the number of points you hope to earn from the same team for each game. Note that this does not simply compare how many raw picks are the same. You’ll notice that differences in later rounds will dramatically drop this similarity calculation, since later rounds count more in the default points system. The higher this number is, the further to the right you are on Robby’s bell curve diagram referenced earlier. Anything above ~70 percent starts becoming pretty chalky, or at least herd-y.
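The spreadsheet's exact formula isn't spelled out here, so treat the following Python as one plausible reading rather than the actual calculation: count the expected points from picks you share with the reference bracket, then divide by your bracket's total expected points. All names and the toy numbers are hypothetical.

```python
def expected_points(team, r, win_prob, points_per_round):
    return win_prob[team][r] * points_per_round[r]


def similarity(yours, reference, win_prob, points_per_round):
    """Share of your expected points that comes from picks matching the reference.

    yours[r][g] and reference[r][g] are the picks for game g of round r.
    """
    shared = total = 0.0
    for r, (mine, theirs) in enumerate(zip(yours, reference)):
        for my_pick, their_pick in zip(mine, theirs):
            pts = expected_points(my_pick, r, win_prob, points_per_round)
            total += pts
            if my_pick == their_pick:
                shared += pts
    return shared / total if total else 0.0


# Toy four-team bracket with made-up probabilities (not real data).
win_prob = {"A": [0.9, 0.6], "C": [0.7, 0.3], "D": [0.3, 0.1]}
yours = [["A", "C"], ["A"]]
crowd = [["A", "D"], ["A"]]
score = similarity(yours, crowd, win_prob, [10, 20])
print(f"{score:.0%}")
```

Weighting by expected points rather than raw matches is what makes a disagreement about the champion cost far more than a disagreement about a first-round game.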

Risk/Reward Level

The Risk/Reward Level metric indicates how far off you are from the FiveThirtyEight Bracket’s projected point total, suggesting how many underdogs you’re picking. If you’re off by more than 5 percent of the total possible points (>96 points in the default 1,920 point system or >9.6 points in a 1–2–4–8–16–32 system), I’ve classified that as a “High” risk/reward rating. If you’re only slightly different from FiveThirtyEight (between 2.5 percent and 5 percent), you’ll receive a “Moderate” risk/reward rating. If you’re <2.5 percent off FiveThirtyEight’s projected point total, you’ll receive a “Low” risk/reward rating.
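Those thresholds translate directly into a small Python function. The function name is hypothetical, and two details are assumptions: I measure the gap as an absolute difference, and I treat a gap of exactly 2.5 percent as “Moderate”.

```python
def risk_reward_level(projected, baseline, total_possible=1920):
    """Classify a bracket by its gap from the baseline projection.

    projected:      your bracket's projected points
    baseline:       the FiveThirtyEight Bracket's projected points
    total_possible: maximum points under the pool's point structure
    """
    gap = abs(baseline - projected) / total_possible
    if gap > 0.05:
        return "High"
    if gap >= 0.025:  # assumption: the boundary itself counts as Moderate
        return "Moderate"
    return "Low"


print(risk_reward_level(891, 893))          # Optimal Bracket vs FiveThirtyEight's
print(risk_reward_level(833, 893))          # 60 points off
print(risk_reward_level(793, 893))          # 100 points off
print(risk_reward_level(79.3, 89.3, 192))   # 1-2-4-8-16-32 scoring
```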

As a rule of thumb, if you’re trying to win a pool with fewer than 15 people, a good strategy would be to build a bracket with low risk and reward. 15–50 people requires a moderate level of risk/reward to have a good chance at winning. Any pool with over 50 people is a crapshoot, so your only chance of winning is to build a bracket with a high level of risk and reward and hope for some upsets to swing in your favor.

Similarity to Optimal Strategy

The “Similarity to Optimal Strategy” metric indicates how similar your bracket is to the Optimal Bracket (that I’ll be using) by comparing the number of points you hope to earn from the same team for each game. Like the aforementioned “Similarity to Crowd” metric, this does not simply compare raw number of matching picks but rather weights the matches based on expected points.

The lower this percentage, the more value you’re leaving on the table that the crowd is offering you. I recommend your bracket be at least 60 percent similar to the Optimal Bracket (unless you’re in my pool, of course!).

Other Tips & Tricks

If your pool consists of people who you know to vary from the national average bracket, consider playing that to your advantage. For example, if your pool is geographically similar and your area has a local team who’s in the tournament, consider betting against the local team to reap the rewards should they falter early — most of your competition will fall by the wayside with them.

Similarly, consider avoiding competitors’ alma maters.

However, if your alma mater is in the tournament, always give them the benefit of the doubt, especially if they’re not a perennial contender. You don’t want to be the University of South Carolina alum in 2017, the Syracuse student in 2016, or the VCU staffer in 2011 who doubted your team when they shocked the country and made it to the Final Four.
