
A Data Scientist’s Guide to Fermi Estimation

How do you balance the tradeoff between a fast answer and a correct one when making decisions? Will a cheap, approximate answer suffice, or do you need to spend the time and money to get as close as possible to an exact answer? These are the two extremes of data-driven decision making. Today, we will talk about what Fermi estimation is and how it can help you reach faster, cheaper answers for more effective decision making.

Named after Enrico Fermi’s method of finding passable approximations in order to move on in a calculation, Fermi estimation is a widely applicable and potentially powerful tool for quantitative reasoning — particularly for physics and engineering.

Fermi estimation, at its best, trades a small amount of accuracy for a large saving in time. Scientists will often look for a Fermi estimate before moving on to more time-consuming and expensive methods of collecting data.

One famous example of a Fermi problem was when Enrico Fermi and a team of scientists were tasked with estimating the yield of the Trinity bomb. The rest of the team took a few days to precisely measure about 20 kilotons, while Fermi, using just a few assumptions and back-of-the-napkin math, estimated 10 kilotons only minutes after the explosion. Being off by 50% may seem like a lot, but when we consider how large these numbers are, estimates within an order of magnitude can be very useful. He gave the team a usefully close answer in a comparatively short amount of time.

How does it work?

As with the law of large numbers, individual deviations of measurements in a sample distribution tend to cancel out. As the number of measurements increases, the true mean is revealed to a greater degree of confidence. In Fermi estimation, this rule holds true as well — along a chain of reasoning, overestimates and underestimates of individual assumptions tend to balance each other out.

In greater detail, multiplying terms corresponds to adding the logarithms of the individual terms. For a Fermi calculation n steps long with a uniform standard deviation of σ on the logarithmic scale, the overall estimate will have a standard deviation of σ√n on that same scale.

For example, in a 9-step Fermi estimation where each step is underestimated or overestimated by a factor of 2 (a uniform σ of one factor of 2 on the log scale), the standard deviation of the final answer is √9 = 3 factors of 2, i.e. a factor of 2³ = 8.

It really is that simple. This means you can expect the final estimate to fall between 1/8 of and 8 times the true value. That is within one order of magnitude, much better than the worst case of 2⁹ = 512 (about 2.71 orders of magnitude). The shorter the chain and the more accurate the individual estimates, the more accurate the final result.
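To make the arithmetic concrete, here is a minimal Monte Carlo sketch of the argument above. It assumes each step's error is normally distributed on the log₂ scale with a standard deviation of 1 (a typical over- or underestimate of a factor of 2); the normal distribution is an illustrative assumption, not something the argument requires.

```python
import numpy as np

rng = np.random.default_rng(42)

# A hypothetical 9-step Fermi chain: each step's error is drawn on the
# log2 scale with a standard deviation of 1, i.e. a typical over- or
# underestimate of a factor of 2 (normality is an assumption of this sketch).
n_steps = 9
n_trials = 100_000
step_errors = rng.normal(loc=0.0, scale=1.0, size=(n_trials, n_steps))

# Multiplying the step estimates corresponds to summing their log errors.
total_log2_error = step_errors.sum(axis=1)

print(f"std of final error (factors of 2): {total_log2_error.std():.2f} "
      f"(theory: sqrt(9) = 3, i.e. a factor of 8)")
print(f"fraction of trials within a factor of 8: "
      f"{np.mean(np.abs(total_log2_error) <= 3):.1%}")
print(f"fraction within one order of magnitude: "
      f"{np.mean(np.abs(total_log2_error) <= np.log2(10)):.1%}")
```

Under that assumption, the measured spread of the final error sits at about 3 factors of 2, and roughly two thirds of the simulated chains land within a factor of 8 of the truth, which is just the one-standard-deviation reading of σ√n.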

So, how does this relate to data science?

How to apply Fermi Estimation to Data Science

For a data scientist, estimation can be extremely important and exceedingly difficult: unknown unknowns can pop up at any time, and new data can invalidate previous assumptions. Furthermore, testing in production can be very different from an isolated model-building environment.

How can we use Fermi estimation to better predict how long each step of our project will take? Consider the typical data science project flow: