Bayesian Theory Basics

In this series I will often refer to Bayesian theory, but some readers might not be familiar with the concept. In essence, this theory shows how to make a decision given a measurement of some type, the false negative rate and the false positive rate of the measurement, and the a priori distribution of the parameter being measured. So, for instance, if you have a test for cancer, and it comes back positive, to estimate the probability that you really have cancer you need to know the sensitivity of the test (cancer found when cancer is present), the false alarm rate (cancer reported when none is there), and what fraction of the population has cancer. Attempting to estimate the probability of having cancer without knowing all these parameters will lead to an incorrect conclusion.
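
To make the three ingredients concrete, here is a minimal sketch of the calculation in Python. The function name and the example numbers (1% prevalence, 80% sensitivity, 10% false alarm rate) are my own assumptions for illustration, not figures from any real test.

```python
def posterior_probability(prior, sensitivity, false_positive_rate):
    """Return P(disease | positive test) using Bayes' theorem.

    prior               -- fraction of the population with the disease
    sensitivity         -- P(positive test | disease present)
    false_positive_rate -- P(positive test | disease absent)
    """
    true_positives = sensitivity * prior
    false_positives = false_positive_rate * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Hypothetical numbers, chosen only for illustration.
p = posterior_probability(prior=0.01, sensitivity=0.80, false_positive_rate=0.10)
print(f"Probability of disease given a positive test: {p:.1%}")
```

With these assumed numbers the answer comes out to roughly 7.5%, far below what most people guess from "the test is 80% sensitive." Leave out any one of the three inputs and the function cannot be evaluated at all.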

My tutorial gives an eye-opening example of professional mammographers who overestimated the probability of a woman having breast cancer by a factor of 10! (Not 10%. Their estimates were ten times too high.)

Another common abuse of statistics is denial of employment through drug screening. This is discussed in detail in the tutorial also, and I might abstract some of that material in later articles. However, today I thought that we could help people get used to the concept by doing some examples.

Few people like to go through the numbers unaided, so I looked for online help, and found a neat site that covers a conference held in 2004. By navigating through the site, or clicking here, you can find an operable calculator that computes the probability that a given hypothesis is correct given a measurement. Forgive me for not copying it here, but that would likely go beyond fair use, and I try to respect copyrights. So go and try it on their site.

The calculator itself is simple and anyone could program up a similar thing, but the explanation that goes with it is a good introduction. As with any new discipline, one of the most difficult stumbling blocks for newcomers is the nomenclature. In common language, we often use the words “probability” and “odds” interchangeably, but the decimal form of the probability of an event is a different number than the decimal form of the odds of the same event. Both numbers can convey the correct information if they are understood correctly. Similarly, terms like “a posteriori probability” (the probability of having a disease after you know the results of a test) can confuse. To make matters worse, many disciplines use different nomenclature for the same thing. And, as the example of “probability” versus “odds” shows, different disciplines can present the same data clothed in different but equivalent parameters.
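
The relationship between probability and odds fits in two lines of arithmetic. The following sketch (function names are my own) converts in both directions, using the usual definition of odds as the probability of the event divided by the probability of its absence.

```python
def probability_to_odds(p):
    """Convert a probability (0 <= p < 1) to odds in favor."""
    return p / (1.0 - p)


def odds_to_probability(odds):
    """Convert odds in favor (odds >= 0) back to a probability."""
    return odds / (1.0 + odds)


# A probability of 0.75 is the same information as odds of 3 (that is, 3 to 1).
odds = probability_to_odds(0.75)
print(odds)                       # 3.0
print(odds_to_probability(odds))  # 0.75
```

So an event with a probability of 0.75 has odds of 3.0: same information, different number.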

A common source of error in making decisions is failure to consider the a priori distributions. For example, the AIDS test is extremely specific. Unlike many clinical tests, it has so few false positives that most physicians will flatly state that if it comes back positive, you have it. That is wrong. Your probability of having the disease given a positive result depends on the frequency of occurrence in your identifiable population group. Imagine a tribe on a desert island that has had no contact with the outside world in one hundred years, so they have not been exposed to AIDS. Now imagine a villager in Africa where over half of the occupants are infected. If each has a positive test, the probability that the islander has AIDS is insignificant compared to the probability that the African villager has it. I would strongly suspect that the islander's test had been contaminated and would at the least order a repeat before starting treatment.
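
The islander and the villager can be compared with a few lines of Python. This is only a sketch under stated assumptions: the test characteristics (99.9% sensitivity, 0.1% false positive rate) and the islander's prior of one in a million are numbers I made up for illustration; only the villager's prior of about one half comes from the scenario above.

```python
def posterior_probability(prior, sensitivity, false_positive_rate):
    """Return P(disease | positive test) using Bayes' theorem."""
    true_positives = sensitivity * prior
    false_positives = false_positive_rate * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Assumed (hypothetical) test characteristics.
SENSITIVITY = 0.999
FALSE_POSITIVE_RATE = 0.001

# Assumed priors: the islander's is a made-up "almost never" value,
# the villager's reflects "over half of the occupants are infected".
islander = posterior_probability(1e-6, SENSITIVITY, FALSE_POSITIVE_RATE)
villager = posterior_probability(0.5, SENSITIVITY, FALSE_POSITIVE_RATE)

print(f"Islander: {islander:.4%}")
print(f"Villager: {villager:.4%}")
```

Under these assumptions the same positive result means roughly a 0.1% chance of disease for the islander and roughly a 99.9% chance for the villager. The test did not change; only the prior did.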

Some readers have complained that the common examples of how to use Bayesian decision theory might look good in theory, but that’s not how the real world works. This is a fair comment because the introductory examples are usually highly simplified, but the underlying formalization can be used to make decisions such as which treatment to use for a medical condition and whether to fire the ICBMs.

Let’s agree that the average Joe walking down a street to get to his office will not be mentally applying Bayesian theory to decisions such as whether to jaywalk or go to the corner and wait for the light. But if you wanted to build a robot to navigate autonomously to the same office, this type of decision-making process must be programmed in. That means we have to tear apart every decision into its smallest parts and apply some algorithms to build up a series of decisions that will eventually lead to the goal.
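
As a rough illustration of what "programmed in" might look like, here is a toy sketch of a single jaywalk-or-wait decision made by minimizing expected cost. Everything in it is hypothetical: the action names, the cost values, and the idea that the robot already has an estimate of the probability that a car is coming. A real robot would need far more than this.

```python
def expected_cost(p_car, cost_if_car, cost_if_clear):
    """Average cost of an action, weighted by the chance a car is coming."""
    return p_car * cost_if_car + (1.0 - p_car) * cost_if_clear


def choose_action(p_car, costs):
    """Pick the action with the lowest expected cost.

    costs maps an action name to (cost_if_car, cost_if_clear).
    """
    return min(costs, key=lambda action: expected_cost(p_car, *costs[action]))


# Hypothetical costs in arbitrary units: jaywalking is free unless a car
# shows up; walking to the corner always costs a little time.
COSTS = {
    "jaywalk": (1000.0, 0.0),
    "wait_at_corner": (2.0, 2.0),
}

print(choose_action(0.0001, COSTS))  # quiet street
print(choose_action(0.05, COSTS))    # busy street
```

With these made-up costs the sketch jaywalks on the quiet street and waits at the corner on the busy one. The probability fed into it is exactly the kind of number that Bayesian updating on sensor measurements is meant to supply.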

In the meantime, even the average Joe and Jane should understand enough of the theory to evaluate their own health status. Undergoing unnecessary chemotherapy is no fun, and not undergoing necessary chemo is even less fun.

For those who wish to delve further into decision theory without wading through a lot of equations, I have posted a tutorial on elementary decision theory. It shows examples of faulty physicians’ diagnoses (important for those considering surgery) and how to evaluate anti-terrorist activities (important for everyone). That tutorial can be found here.
