It was 1854, and London was engulfed by Cholera. Doctors and politicians believed that epidemics like Cholera and plague were caused by miasma: air polluted by rotting organic matter. Germ theory had not yet been accepted, and people thought that these epidemics could not be stopped.
But John Snow knew something. He was a physician who started doing a simple exercise: marking on a map wherever a Cholera death occurred. When he found a cluster of 61 deaths in the Soho region, he investigated a bit. And lo! He found that all of them had drunk water from the same pump on Broad Street. There had been hardly any deaths in the parts of Soho where people drank water from other pumps.
Thankfully the guardians of the local parish heard Snow out. And although they were not fully convinced, they removed the handle of the pump. And Cholera evaporated from that part of town.
A map and a little data analysis helped create the whole new field of epidemiology. John Snow’s work ended up saving thousands of lives.
“Data is the new oil.” – Clive Humby (Mathematician at Tesco)
But just like unrefined oil doesn’t hold much use, unprocessed data is meaningless. How do you make sense of data? And use it intelligently?
There is a whole field of data scientists and statisticians who massage the data and generate intelligence out of it. A lot of their strategies need math that is beyond most of us. But not all of us have to do what they do. We can channel Pareto and 80/20 the data analysis problem. What’s the minimum we should do to get great insights from data?
W. Edwards Deming revolutionizes Japan with charts
“Made in Japan” holds a premium today. We know that products coming out of Japan are usually of much higher quality. But it wasn’t always so. Right after World War 2, Japan was known to sell products of subpar quality.
So what happened? W. Edwards Deming went to Japan. He arrived as a statistical consultant for the Allied powers, but he ended up changing the way Japanese factories made their products.
And he did it all with a chart. Deming had learnt the idea of control charts from Walter Shewhart of Bell Labs. And he made it popular all over Japan.
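What’s a control chart? It’s a plot of a measurement over time, with a center line and upper and lower limits, that helps a factory tell routine variation from a real quality problem. Using one takes three steps: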
- Chart the data. Gather all the data, and plot it in time order.
- Calculate the mean. Take the average of the recent data points. Then draw the lower and upper limits, lines beyond which the numbers would be unlikely (traditionally three standard deviations either side of the mean).
- Investigate outliers. These are the anomalies in the data. Whenever a number goes beyond the limits, investigate why.
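The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not Deming's exact method: it assumes limits at three sample standard deviations from the mean (classic Shewhart charts estimate the spread a little differently), and the function names and widget weights are made up for the example.

```python
from statistics import mean, stdev

def control_limits(baseline, k=3.0):
    """Return (center, lower, upper) computed from baseline measurements."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation (a simplification)
    return center, center - k * spread, center + k * spread

def find_outliers(points, lower, upper):
    """Return (index, value) pairs that fall outside the limits."""
    return [(i, x) for i, x in enumerate(points) if x < lower or x > upper]

# Hypothetical widget weights in grams, invented for illustration.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
center, lower, upper = control_limits(baseline)

# New measurements are checked against the limits from the baseline.
new_points = [10.0, 10.1, 11.5, 9.9]
flagged = find_outliers(new_points, lower, upper)
print(flagged)  # prints [(2, 11.5)]
```

The limits come from a baseline period and are then applied to new points. Computing them from data that already contains the anomaly would widen the limits and could hide it.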
With this simple process, Japanese factories kept improving their products, simply by focusing on outliers and fixing their causes.
Deming’s control charts contributed massively to Japan’s industrial rebirth after 1950. Companies like Toyota adopted his ideas and became world class.
Counter-intuitive statistics
Perhaps the most cited story in statistics, retold in almost every stats book, is that of Abraham Wald.
Wald was at Columbia University during World War 2, part of a team tasked with helping protect planes from enemy fire. Trade-offs had to be made, because a plane armored all over would be too heavy and slow. So the team had to recommend which parts should be fortified.
They started out the way physician John Snow had nearly a century earlier when plotting Cholera: they went through all the bullet holes on planes that came back from bombing campaigns, and plotted those holes on a diagram of the plane.
The idea was simple: strengthen the parts which had more bullet holes.
But Abraham Wald stopped them. And asked them to do the opposite. Strengthen the parts that had zero bullet holes.
Wald realized that they were only looking at the planes that came back. They weren’t looking at the planes that were shot down and never made it back. The parts with no holes were exactly where a hit brought a plane down.
Following Wald’s reasoning, the armor went on the areas where the returning planes showed the fewest holes, such as the engine compartments. The story goes that this saved many aircrews during the war!
Wald counter-intuitively realized what Deming’s control charts would have shown right away. Zero bullet holes are an outlier.
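Wald's point can be checked with a toy simulation. The sketch below is purely illustrative: the section names, the chance that a hit on each section downs the plane, and the number of hits per plane are all made-up assumptions, not wartime data. Hits land evenly across sections, yet the holes counted on the planes that return look very different.

```python
import random
from collections import Counter

# Hypothetical chance that a single hit on each section downs the plane.
LOSS_CHANCE = {"engine": 0.8, "fuselage": 0.1, "wings": 0.1, "tail": 0.1}

def simulate(planes=10_000, hits_per_plane=3, seed=0):
    """Return (all_hits, seen_hits): holes on every plane vs. on returning planes."""
    rng = random.Random(seed)
    sections = list(LOSS_CHANCE)
    all_hits = Counter()
    seen_hits = Counter()
    for _ in range(planes):
        # Each hit lands on a section chosen uniformly at random.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        all_hits.update(hits)
        # The plane returns only if it survives every one of its hits.
        survived = all(rng.random() > LOSS_CHANCE[s] for s in hits)
        if survived:
            seen_hits.update(hits)
    return all_hits, seen_hits

all_hits, seen_hits = simulate()
for section in LOSS_CHANCE:
    print(section, "actual:", all_hits[section], "seen on returns:", seen_hits[section])
```

Under these assumptions the engine takes about as many hits as any other section, but very few engine holes show up on returning planes, because most planes hit there never come back.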
Action Summary:
- Investigate the outliers. That’s the key to rapidly improving any process.
- With today’s tools, everyone is drowning in data but not getting any intelligence out of it. You don’t have to do anything fancy. Take the mean and the limits of the last x data points, and then use those values to spot future outliers.
- Recalculate the mean and limits regularly, and you have continuous improvement.
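One way to put this summary into practice is a rolling check: for each new number, compute the mean and limits from the previous x points, flag the number if it falls outside, then slide the window forward. The sketch below is one possible reading of that idea, assuming a window of 5 and limits at three sample standard deviations. The function name and the signup figures are hypothetical.

```python
from statistics import mean, stdev

def rolling_outliers(series, window=5, k=3.0):
    """Flag points that fall outside limits built from the previous `window` points."""
    flagged = []
    for i in range(window, len(series)):
        recent = series[i - window:i]   # the last `window` points before this one
        center = mean(recent)           # recalculated at every step
        spread = stdev(recent)
        if abs(series[i] - center) > k * spread:
            flagged.append((i, series[i]))
    return flagged

# Hypothetical daily signups, invented for illustration.
daily_signups = [50, 52, 49, 51, 50, 53, 48, 90, 51, 50]
print(rolling_outliers(daily_signups))  # prints [(7, 90)]
```

Note that once an outlier enters the window it inflates the limits for the next few points. A stricter version might leave flagged points out of later windows.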