Missing data: tell the whole story with Qlik

Missing data? Survive Survivorship Bias with Qlik

How come some airplanes don’t return from the battlefield? Are the success stories of Bill Gates, Jeff Bezos and Mark Zuckerberg the best learning experiences? And how could people in 1987 think that cats were more likely to survive if they fell from a higher floor? All these questions have one thing in common: they involve “survivorship bias”.


If you work a lot with data, this might be a familiar term. Survivorship bias is the error of drawing conclusions only from the results (the “survivors”) that made it through a process, while overlooking those that did not. Incomplete datasets, a lack of context or an incorrect interpretation of the data are often at the root of this misconception. If you understand why survivorship bias occurs and recognize its effect, you will understand your data better and make your analyses more reliable and valid. Recent history offers numerous examples of this phenomenon; it has affected scientists, entrepreneurs and researchers, among others.


In the book “The Black Swan: The Impact of the Highly Improbable”, Nassim Taleb writes: “The cemetery of failed restaurants is very silent.” If you focus only on successes and never look into the failures, you miss the full scope of your data and never really understand how your processes actually function.

Success stories of entrepreneurs are often used as examples of how things should be done, but alongside those few success stories there is a multitude of entrepreneurs who didn’t make it. Bill Gates (Microsoft), Jeff Bezos (Amazon) and Mark Zuckerberg (Facebook) are indeed successful in their businesses, but they have only one side of the story to tell: how they achieved their success. Many others who may have taken the exact same steps, had the same talent and shown the same ambition failed to make it, and their story is perhaps even more interesting. They can tell you what happened and what caused them to fail. These stories often contain lessons from which we can deduce why things go wrong. Focusing only on the “survivors” stops you from getting the overall view and finding the flaws in your processes.

“The cemetery of failed restaurants is very silent.” – Nassim Taleb


Another example of missing the big picture arose in 1987: a group of scientists investigated the likelihood that cats would survive a fall from a given floor. The researchers based their conclusions on data obtained from veterinary clinics. These data were remarkable: the researchers noted that the higher the fall, the greater the cats’ chance of survival. In fact, 100% of the cats that had fallen from the sixth floor or higher survived their fall. According to the researchers, this was possible because the cats reached their maximum fall speed during such a fall, relaxed and then prepared for landing, resulting in a better chance of survival.

The newspaper column The Straight Dope challenged this theory 10 years later. This is a clear case of survivorship bias: the researchers only had data on cats that had actually been treated at veterinary clinics. Because their data contained no cats that had died after falling from the higher floors, the researchers assumed that cats survived such falls. The reality was of course the opposite: cats that died immediately from their fall were never taken to a veterinary clinic. As a result, they were never registered and never became part of the dataset.
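To make the effect concrete, here is a minimal Python sketch of the clinic problem. The records are invented for illustration and are not the 1987 study's data; the `Fall` type, the `seen_by_vet` flag and the sixth-floor cut-off in the filter are assumptions chosen to mirror the story above.

```python
from typing import NamedTuple


class Fall(NamedTuple):
    floor: int
    survived: bool
    seen_by_vet: bool  # False: the cat never reached a clinic


# Invented records for illustration only; not the 1987 study's data.
FALLS = [
    Fall(2, True, True),
    Fall(2, True, True),
    Fall(3, True, True),
    Fall(3, False, True),   # died at the clinic, so still recorded
    Fall(4, True, True),
    Fall(4, False, True),
    Fall(7, True, True),
    Fall(7, False, False),  # died on impact, never recorded
    Fall(8, True, True),
    Fall(8, False, False),
    Fall(9, True, True),
    Fall(9, False, False),
]


def survival_rate(records):
    """Share of the given falls that the cat survived."""
    records = list(records)
    return sum(r.survived for r in records) / len(records)


# Falls from the sixth floor or higher: everything vs. what the clinic saw.
high_all = [f for f in FALLS if f.floor >= 6]
high_clinic = [f for f in high_all if f.seen_by_vet]

print(f"Clinic records only: {survival_rate(high_clinic):.0%}")
print(f"All falls:           {survival_rate(high_all):.0%}")
```

With these made-up numbers the clinic records suggest that every high fall is survived, while the full set of falls shows that only half are. The filter on `seen_by_vet` is the whole bias: nothing else in the calculation changes.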


It is 1943: large parts of Europe are occupied by German troops. The Allies are trying to break through the enemy’s defenses with bombers, but with little success: many planes are shot down and lost. The Center for Naval Analyses starts looking for a way to reinforce the bombers. To ensure that the aircraft can still take off, the entire machine can’t be covered with an extra layer of armor: it’s necessary to choose which parts should be reinforced. While the experts from the Center for Naval Analyses record where the returning planes are hit most often, the Statistical Research Group (SRG) of Columbia University is called in.

It’s Abraham Wald, who fled to the U.S. in 1938 ahead of the German advance, who comes up with an unexpected conclusion: reinforce the planes where the returning machines aren’t hit. Wald reasons that the returning planes were hit in non-fatal spots: they could return despite the damage. The planes hit in other places apparently didn’t make it back, and that’s why, according to Wald, it’s better to apply armor to those parts of the plane. The advice is followed and, thanks to Wald’s statistical approach to the problem, the Allies gain ground.
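Wald’s reasoning can be illustrated with a toy calculation in Python. This is not Wald’s actual estimation method, and the sorties below are invented: the section names, the counts and the simplification of one hit per plane are all assumptions made for the sake of the example.

```python
from collections import Counter

# Invented sorties for illustration: (section hit, plane returned?).
SORTIES = (
    [("fuselage", True)] * 4 + [("fuselage", False)] * 1
    + [("wings", True)] * 3 + [("wings", False)] * 1
    + [("engine", True)] * 1 + [("engine", False)] * 4
)

# What the analysts on the ground could count: hits on returning planes.
hits_on_returned = Counter(section for section, returned in SORTIES if returned)

# What they could not observe directly: the return rate per section hit.
hits_total = Counter(section for section, _ in SORTIES)
return_rate = {s: hits_on_returned[s] / hits_total[s] for s in hits_total}

least_hit = min(hits_on_returned, key=hits_on_returned.get)
most_fatal = min(return_rate, key=return_rate.get)

print("Hits seen on returning planes:", dict(hits_on_returned))
print("Return rate per section hit:  ", return_rate)
print("Least-hit section among survivors:", least_hit)
print("Section with the lowest return rate:", most_fatal)
```

In this toy data the section that shows the fewest holes on returning planes is also the one with the lowest return rate. Counting holes on survivors alone would point the armor at the fuselage, while the complete picture points it at the engine.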

“The extra armor belonged not on the part of the plane that could survive a lot of bullets, but to the part of the plane that couldn’t.” – Abraham Wald


The cognitive engine of Qlik helps you prevent survivorship bias. In the accompanying image, every Hole Location value is selected (green) except “No Holes” (light gray). Qlik clearly shows which values in Plane and Status are still possible (white) and which are excluded (dark gray). With this selection, all airplanes with the status “Shot Down” fall outside the dataset. In other words: airplanes with the damage shown did return, so this damage proved not to be fatal. Qlik ensures that you don’t miss any data: the different colors make it very clear what is and what isn’t part of the selected dataset. This way you won’t overlook anything during your analysis!
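The idea behind the white and dark gray values can be mimicked in a few lines of plain Python. This is only an analogy and does not use any Qlik API: the rows, the `split_values` helper and the assumption that planes that were shot down carry “No Holes” (because they were never inspected) are all hypothetical.

```python
# Hypothetical rows (plane, hole_location, status); not the article's data.
ROWS = [
    ("P1", "Wing", "Returned"),
    ("P2", "Fuselage", "Returned"),
    ("P3", "Tail", "Returned"),
    ("P4", "No Holes", "Returned"),
    ("P5", "No Holes", "Shot Down"),  # never inspected, so no holes recorded
    ("P6", "No Holes", "Shot Down"),
]


def split_values(rows, selected_locations):
    """Return (possible, excluded) Status values for a Hole Location selection."""
    all_statuses = {status for _, _, status in rows}
    possible = {status for _, loc, status in rows if loc in selected_locations}
    return possible, all_statuses - possible


# Select every Hole Location except "No Holes", as in the described image.
selection = {loc for _, loc, _ in ROWS} - {"No Holes"}
possible, excluded = split_values(ROWS, selection)

print("Still possible:", sorted(possible))
print("Excluded:      ", sorted(excluded))
```

Reporting the excluded values next to the possible ones is the point of the exercise: the “Shot Down” planes do not silently disappear from the analysis, they are shown as the part of the data the current selection leaves out.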

Writer: Ronan Berendsen – BI Consultant Climber

Mangel, M., & Samaniego, F. J. (1984). Abraham Wald’s work on aircraft survivability.
Wald, A. (1980). A Reprint of “A Method of Estimating Plane Vulnerability Based on Damage of Survivors” (No. CRC-432).

Published 2020-02-12
