Application of Bayes’ Theorem to Big Data

Aug 16, 2019

Bayes’ Theorem, in its basic form, describes an intuitive process that we use every day: we start with an initial belief and, when we get new information, we revise it into an updated belief. Put that simply, the theorem does not seem profound. However, the concept is used in machine learning, artificial intelligence (including neural networks), genetics, bioinformatics, quantum mechanics, finance, and many other fields that work with big data. The theorem dates back to the European Age of Enlightenment, when Thomas Bayes, a minister and scientific enthusiast, developed it.

Histogram depicting a normal distribution

Discovery

Thomas Bayes was a member of the Royal Society of London and was primarily known for publishing a scientific paper defending Newton’s work on calculus. Bayes had studied theology at the University of Edinburgh and became interested in inverse probability, that is, the probability distribution of an unobserved variable given what has been observed. Probability itself, as a loose concept, had existed as early as the 8th century CE (1), but it developed into its own branch of mathematics when Abraham de Moivre formalized the rules of cause and effect in his Doctrine of Chances (1718). De Moivre posed the problem of calculating the chance of pulling four aces from a deck of cards. Bayes, in turn, considered the reverse: if four aces were pulled, what was the chance that the deck had been stacked in favor of that outcome?

He devised an experiment that was just as psychological as it was mathematical. The experiment began with one subject (subject A) sitting with their back to a table. A second subject (subject B) would toss a ball and mark where it landed on the table. Subject A would then toss a ball backwards multiple times in an attempt to have it land on the same spot. After each toss, subject B would report how close the ball had come to the original spot. By knowing where each ball had landed relative to the initial ball, subject A could update their prior belief about where the initial ball had landed after every new throw.

Bayes concluded from this experiment that, even with a great deal of information, subject A could never pinpoint exactly where the initial ball had landed, but could narrow the estimate down to an increasingly small range. Of course, there are many variables not considered in this experiment, but the concept holds true for a normal distribution.
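The updating process in the ball experiment can be sketched in a few lines of Python. This is a simplified illustration and not Bayes’ own procedure: it assumes a one-dimensional table of length 1, an initial ball that is equally likely to be anywhere on it, and feedback that only says whether each new ball landed to the left of the original. Under those assumptions the updated belief after each toss is a Beta distribution. The function names are hypothetical.

```python
import random


def posterior_mean(left, total):
    """Mean of the Beta(left + 1, total - left + 1) belief about the position."""
    return (left + 1) / (total + 2)


def posterior_std(left, total):
    """Standard deviation of the same Beta belief (the remaining uncertainty)."""
    a, b = left + 1, total - left + 1
    return ((a * b) / ((a + b) ** 2 * (a + b + 1))) ** 0.5


def simulate(num_tosses, seed=0):
    """Toss balls at random and update the belief after every report."""
    rng = random.Random(seed)
    target = rng.random()  # hidden position of the initial ball
    left = 0               # number of tosses reported as landing left of it
    history = []
    for n in range(1, num_tosses + 1):
        # A uniformly thrown ball lands left of the target with probability `target`.
        if rng.random() < target:
            left += 1
        history.append((n, posterior_mean(left, n), posterior_std(left, n)))
    return target, history


if __name__ == "__main__":
    target, history = simulate(1000)
    print(f"hidden position: {target:.3f}")
    for n, mean, std in history:
        if n in (1, 10, 100, 1000):
            print(f"after {n:4d} tosses: estimate {mean:.3f} +/- {std:.3f}")
```

The estimate never becomes exact, but the uncertainty shrinks as more tosses are reported, which matches the conclusion drawn from the experiment.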

Thomas Bayes never published his findings during his lifetime, but he kept detailed notes on this and similar experiments, which his friend Richard Price published after his death. The theorem is named after Bayes because he was the first to posit the idea. The famous formula (fig. 1) that we know today was not formalized by Bayes himself, but by Pierre-Simon Laplace, 20 years after Bayes’s death. Laplace was attempting to understand the uncertainties of the universe in order to explain topics such as the ratio of male to female births, celestial observations, and prison sentencing. He expanded on Bayes’s work, as well as the views of others, and in 1812 issued his Analytical Theory of Probabilities (Théorie analytique des probabilités), which remained the most influential book on mathematical probability until the end of the 19th century. In a show of how useful Laplace’s work was, Alexis Bouvard used Bayes’ formula to estimate the masses of Jupiter and Saturn to within 1% of the values given by modern models.
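For reference, the theorem in its standard modern form is usually written as below. The symbols are a common textbook convention rather than a transcription of the original figure: A stands for the belief or hypothesis, B for the new information, P(A) for the prior belief, and P(A | B) for the updated belief.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```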

Simple Application

Today, Bayes’ Theorem is applied in everyday situations. Let’s say someone wakes up with a runny nose and a headache. Suppose that 85% of people who have the flu exhibit these symptoms. This may lead to the idea that the probability of having the flu is 85%. However, additional information states that only 3% of the population will get the flu during a given year. Also, these symptoms appear in 15% of people overall, often because the person is simply having an allergic reaction. The probability of having the flu given these symptoms may be calculated using Bayes’ Theorem (fig. 2):

P(flu | symptoms) = P(symptoms | flu) × P(flu) / P(symptoms)

The variables may now be filled in with the data. The probability of having these symptoms given the flu, P(symptoms | flu), is 85% (0.85). The probability that a person will get the flu, P(flu), is 3% (0.03). The overall probability of having these symptoms, with or without the flu, P(symptoms), is 15% (0.15). The completed formula is now:

P(flu | symptoms) = (0.85 × 0.03) / 0.15 = 0.17

The resulting chance that the person has the flu given the symptoms is 17%. While this is a basic example, it shows how the belief about having the flu can change with new information.
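The arithmetic in this example can be checked with a short script. This is a minimal sketch: the function name `bayes_posterior` is hypothetical, and the three inputs are the figures from the flu example, with P(symptoms) taken to be the overall probability of showing the symptoms.

```python
def bayes_posterior(likelihood, prior, evidence):
    """Return P(H | E) = P(E | H) * P(H) / P(E)."""
    if not 0 < evidence <= 1:
        raise ValueError("evidence must be a probability greater than 0")
    return likelihood * prior / evidence


p_symptoms_given_flu = 0.85  # P(symptoms | flu)
p_flu = 0.03                 # P(flu), the prior belief
p_symptoms = 0.15            # P(symptoms), from any cause

p_flu_given_symptoms = bayes_posterior(p_symptoms_given_flu, p_flu, p_symptoms)
print(f"P(flu | symptoms) = {p_flu_given_symptoms:.0%}")  # prints 17%
```

Changing any one input shows how strongly the updated belief depends on the prior: with a rarer illness, the same symptoms point to it far less often.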

Bayes’ Theorem in Industry

Big data applications of Bayes’ Theorem can be seen in many industries, from biotech to engineering to robotics. Google uses it in its self-driving cars (2) to update the car’s decision-making processes. Given enough data, the cars can predict when a person may cross the road, when another car will stop suddenly, or when a collision is about to occur. Geneticists use Bayes’ Theorem to determine the cause of mutations in genetic sequences given environmental variables. By using Bayes’ Theorem and enough data, many complex predictions can be made within a limited range, much like estimating where the ball landed in Bayes’ original experiment.

Links and Citations:

(1) The American Statistician, Vol. 65, Issue 4 (2011). https://www.tandfonline.com/doi/abs/10.1198/tas.2011.10191
(2) Why Bayes Rules. https://www.scientificamerican.com/article/why-bayes-rules/

Written by Macromoltek, Inc.