# How to use P-values in biology

After completing an experiment, we usually perform statistical tests to determine whether our results are “significant.” P-values are commonly used to report statistical significance in scientific papers, but biologists have been criticized in recent years for misunderstanding and misusing this statistic. A recent paper in PLOS Biology surveyed the scientific literature and found widespread evidence of “p-hacking”: the manipulation of experimental parameters, such as adjusting the sample size or removing outlier data points, for the sole purpose of obtaining statistically significant p-values. Below, I’ll explain how to use p-values and why they matter in biological research.

### Probability theory: Frequentism vs. Bayesianism

First, it’s important to note that there are several different interpretations of the concept of “probability,” perhaps the two most notable belonging to the Bayesian and Frequentist schools of statistics. According to the Bayesian approach (named after the 18th-century mathematician Thomas Bayes), probability is best thought of as a degree of belief in a particular outcome, based on our prior knowledge of the situation combined with newly acquired data. To give a commonplace example: when searching for a lost set of keys in your home, you will want to estimate the probability that they are in a given location, most likely by remembering previous occasions when the keys were lost and where they were recovered. This “prior” knowledge will factor heavily into your probability estimate. You can then use new data to update this estimate, for instance if you know with certainty that the keys are not in your pants pocket. The Bayesian interpretation of probability aligns more closely with our common, everyday usage of the term.
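The key-hunting example can be sketched in a few lines of Python. This is only an illustration: the locations and prior probabilities are made-up numbers, and `update_after_empty_search` is a hypothetical helper name. It assumes the search is perfectly reliable, so a location that has been checked and found empty gets probability zero and the remaining beliefs are rescaled to sum to 1.

```python
def update_after_empty_search(prior, searched):
    """Return updated beliefs after `searched` is checked and found empty.

    Assumes a perfectly reliable search: if the keys were there,
    we would have found them.
    """
    # Keep only the locations that are still possible.
    remaining = {place: p for place, p in prior.items() if place != searched}
    total = sum(remaining.values())
    # Rescale so the remaining probabilities sum to 1 again.
    posterior = {place: p / total for place, p in remaining.items()}
    posterior[searched] = 0.0
    return posterior


# Made-up prior beliefs, e.g. based on where the keys turned up before.
prior = {"pants pocket": 0.5, "kitchen counter": 0.3, "coat": 0.2}
posterior = update_after_empty_search(prior, "pants pocket")

for place, p in posterior.items():
    print(f"{place}: {p:.2f}")
```

Ruling out the pocket does not just delete that option; it shifts belief toward the other locations in proportion to how plausible they already were.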

However, the understanding of probability that dominates in the biological sciences is known as Frequentism; most p-values in biological research are computed using this school’s methods. According to frequentist statistics, the probability of a given event is the frequency with which it occurs over many repeated trials. To give a simple example: if a coin is flipped 100 times and lands “heads” on 58 flips, the estimated probability of the coin’s landing heads is 0.58. If the coin is fair, the observed frequency of heads will approach the “true” probability of 0.5 as the number of coin flips approaches infinity. Frequentism is based on the notion that repeated randomized trials, or experiments, will in the long run approximate the true probability of an event.
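A quick simulation makes the long-run idea concrete. The sketch below assumes a fair coin (`p_heads=0.5`) and uses Python's pseudo-random number generator as a stand-in for real flips; `observed_heads_frequency` is a name chosen for this illustration.

```python
import random


def observed_heads_frequency(n_flips, p_heads=0.5, rng=None):
    """Simulate n_flips coin flips and return the fraction that land heads."""
    rng = rng or random.Random()
    # rng.random() is uniform on [0, 1), so it falls below p_heads
    # with probability p_heads.
    heads = sum(rng.random() < p_heads for _ in range(n_flips))
    return heads / n_flips


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (100, 10_000, 100_000):
    print(f"{n:>7} flips: {observed_heads_frequency(n, rng=rng):.4f}")
```

With only 100 flips, a result like 0.58 is not unusual; as the number of flips grows, the observed frequency tends to settle closer to 0.5.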

### So what is a p-value?

When running an experiment, a biologist may want to know the probability of her hypothesis being true, given the experimental data she observes. However, a p-value calculated using a standard t-test tells her something closer to the reverse: the probability of observing data like hers, given that the null hypothesis is true. A common experimental “null hypothesis” is a statement of no relationship between the variables under observation (e.g. the two groups being compared have the same mean). The p-value is therefore the probability of observing the experimental data, or data even more extreme, assuming that this null hypothesis is correct. The lower the p-value, the stronger the case for rejecting the null hypothesis.
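To make the definition concrete, here is a minimal sketch that computes a p-value by hand for the earlier coin example (58 heads in 100 flips). It uses an exact binomial calculation rather than the t-test mentioned above, because the arithmetic is easier to follow. The null hypothesis is that the coin is fair, and “at least as extreme” is taken to mean at least as far from 50 heads in either direction (a two-sided test). The function name is chosen for this illustration.

```python
from math import comb


def two_sided_p_value_fair_coin(n_flips, n_heads):
    """P(a result at least as far from n_flips/2 as observed | coin is fair)."""
    observed_distance = abs(n_heads - n_flips / 2)
    # Under the null, every sequence of flips is equally likely, so we
    # count the sequences whose head count is at least as extreme.
    extreme_sequences = sum(
        comb(n_flips, k)
        for k in range(n_flips + 1)
        if abs(k - n_flips / 2) >= observed_distance
    )
    return extreme_sequences / 2 ** n_flips


p_value = two_sided_p_value_fair_coin(100, 58)
print(f"p = {p_value:.3f}")
```

Note what the number does and does not say: it is the probability of seeing 58 or more (or 42 or fewer) heads if the coin is fair, not the probability that the coin is fair.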