Identifying the Cognitive Biases Prevalent in Data Science

by June 2, 2020


Most technology enthusiasts know how data science techniques have made it possible to integrate different kinds of data into a deeper understanding of the world and how its elements function. From surfacing insights for better information-based decisions to tackling questions about the future, from predicting less congested traffic routes to synchronizing our online food deliveries, data science is everywhere. However, this wide range of applications does not spare it from bias. A bias here is anything that distorts our ability to draw conclusions impartially and objectively, and it can be either intentional or accidental.

Because these biases can affect our judgment, it is important to be able to identify them. Algorithms cannot make their own decisions in these cases (at least as of now), so the responsibility falls on us. Cognitive biases in particular lead us to make poor decisions through subconscious, irrational patterns of thinking. Since these biases are a product of our evolution, we tend to make systematic errors of judgment in situations that demand fast decisions but are constrained by the information-processing capacity of the human brain. Let us look at some of them in detail.

Escalation of Commitment: Also called the Sunk Cost Fallacy. It refers to the human tendency to keep investing resources in a project because of what has already been spent on it, even when there are no returns whatsoever or the costs outweigh the benefits.

Hot Hand Fallacy: This happens when experts keep using a particular type of model because it gave the best outcomes in the past, without experimenting with the other models available. A past winning streak is taken as a guarantee of future success.
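One way to guard against this is to re-test the incumbent model against alternatives on held-out data. The sketch below is only an illustration under stated assumptions: the data is synthetic (a curved relationship, y = x^2 plus noise), and the helper names `make_data`, `fit_linear`, `fit_nearest_neighbor` and `mse` are hypothetical, not from any library.

```python
import random


def make_data(n, rng):
    """Synthetic data (an assumption for illustration): y = x^2 + noise."""
    xs = [rng.uniform(-3, 3) for _ in range(n)]
    ys = [x * x + rng.gauss(0, 0.5) for x in xs]
    return xs, ys


def fit_linear(xs, ys):
    """Ordinary least-squares line, standing in for the 'proven' model."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_nearest_neighbor(xs, ys):
    """1-nearest-neighbor predictor, standing in for an untried alternative."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)
train_x, train_y = make_data(200, rng)
test_x, test_y = make_data(200, rng)  # held-out data, never used for fitting

candidates = {
    "linear (incumbent)": fit_linear(train_x, train_y),
    "nearest neighbor": fit_nearest_neighbor(train_x, train_y),
}
scores = {name: mse(model, test_x, test_y) for name, model in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: held-out MSE = {score:.2f}")
print("best on this data:", best)
```

On this particular data the straight line cannot follow the curve, so the incumbent loses. The point is not that one model is better in general, only that the comparison has to be rerun whenever the data changes.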

Bandwagon Effect: This bias occurs while building a model, when we give in to the impulse to choose a particular model or methodology just because we see or hear that others have already adopted it. We don’t even bother to carry out our own evaluation. We rush towards the popular terms and algorithms without understanding their constraints and associated costs.

Survivorship Bias: This arises when data scientists look only at the data that is available, or at the cases that passed some selection criterion, without analyzing the larger picture, including the cases that dropped out. The result is an incomplete data set, on which one can neither base a reliable, actionable decision nor draw insights that account for all scenarios.
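A small simulation can make the effect concrete. This is a minimal sketch with invented numbers: it assumes 1,000 hypothetical funds whose yearly returns are drawn from a normal distribution, and assumes that any fund returning worse than -5% closes and disappears from the data set.

```python
import random

rng = random.Random(42)

# Hypothetical yearly returns for 1,000 funds (an assumption for illustration).
all_funds = [rng.gauss(0.0, 0.10) for _ in range(1000)]

# Assumed selection rule: funds that lost more than 5% closed down,
# so they never show up in the data we get to analyze.
survivors = [r for r in all_funds if r > -0.05]

mean_all = sum(all_funds) / len(all_funds)
mean_survivors = sum(survivors) / len(survivors)

print(f"funds in full population: {len(all_funds)}")
print(f"funds still visible:      {len(survivors)}")
print(f"mean return, all funds:   {mean_all:+.3f}")
print(f"mean return, survivors:   {mean_survivors:+.3f}")
```

Because the worst performers are filtered out before the analysis starts, the average computed from the survivors is higher than the true average of the whole population.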

Blind Spot Bias: Here we fail to see the impact of biases on our own judgment, while readily finding fault when others show biased thinking.

Anchoring: This occurs when we give too much weight to the information that is discovered or provided first, and base later decisions on that first observation. It is one of the most common biases, and the e-commerce industry often exploits it to its advantage.

False Causality: Also referred to as the clustering illusion. This happens because, as humans, we are trained to look for patterns in abstract or random events, even when none exist. We tend to relate data points in bizarre ways just to arrive at some conclusion, even if it does not sound sensible.
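The following sketch illustrates how easily such patterns appear by chance. It assumes 50 short series of pure random noise (all values are generated, none are real measurements) and searches every pair for the strongest correlation. The `pearson` helper is a hand-written implementation of the standard Pearson correlation coefficient.

```python
import itertools
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(7)

# 50 independent series of 10 random values each: no real relationship exists.
series = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(50)]

# Search all 1,225 pairs for the one that looks most "related".
strongest = max(
    abs(pearson(a, b)) for a, b in itertools.combinations(series, 2)
)

print(f"strongest |correlation| among unrelated series: {strongest:.2f}")
```

With this many pairs and so few observations per series, some pair will look strongly correlated even though every series is independent noise. Searching a large data set for any relationship at all will almost always find one.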

Observation Bias: Closely related to confirmation bias, it is the tendency to seek out only the patterns or insights that we are looking for or prefer. As a result, we cherry-pick the places where good results are expected, or where it is most convenient to observe. In the process, we ignore the possibility of discovering previously unknown information or data sets that could hold value for future applications.

Availability Bias: This is a cognitive shortcut that leads to over-reliance on the events or data that come to mind immediately or are readily available to us, without exploring alternative options or data pools that could improve the performance of our data model.

The Knowledge Gap: Although this bias is talked about less often, most of us have faced it at some point. It occurs when we assume that others share our level of knowledge and comprehension, or when we underestimate ourselves on meeting a know-it-all. It hampers the team’s understanding when a new technique or model is introduced to the organization, and results in more confusion.

Groupthink: This takes place when a group of people ends up agreeing on an impractical decision just to maintain harmony within the group. The desire for consensus ends up outweighing the rational decision-making process.

The Golden Hammer: This is a bias where users opt for familiar tools even when those tools are suboptimal for the task. It occurs when users are not keen on testing the other available options, whether out of ignorance or over-reliance on what they already know.

In an age when data is an organization’s biggest asset, we tend to over-analyze everything in an attempt to make sense of it. New myths and misconceptions will always emerge that make us prone to illogical actions and biases when interpreting data. So we have to take the necessary steps to recognize and counter them if we want future technologies to be impartial and to propel us to new heights.