Statistical Significance: P-Value and Confidence Interval

Olabode James
4 min read · Sep 25, 2022


P-values (“probability values”) are one way to test whether the result of an experiment is statistically significant. In essence, we want to test whether the result of an experiment on a subset of the population (herein called a sample) is strong enough for us to draw a conclusion about a hypothesis.

Statistical significance is a way to tell whether your test results are solid. Because statistics isn’t an exact science, we can think of statistical tests as very finely tuned guesswork. And since they are guesswork, we need to know how close our hypothesis is to the real-life occurrence. That’s where significance comes in.

[Figure: a visual explanation of p-values and the statistical significance of a hypothesis, using a theoretical experiment for a pizza business in which customers eat an average of 4 slices of pizza. Image source: Datasciencecentral.com]

In the end, statistical significance measures whether your research findings are meaningful. More specifically, it is whether your sample statistic closely matches the value you would expect to find in the entire population, even though you only have access to a subset.

Steps for performing a test of statistical significance:

  1. Decide on an alpha level, also called the level of significance. The alpha level is the error rate you are willing to work with (usually 5% or less). The lower the value, the higher the standard for the test, and the smaller the margin of allowable error in rejecting the hypothesis.
  2. Conduct your research, for example a poll, or an experiment with random sampling.
  3. Calculate summary statistics on the random sample, such as the five-number summary. A summary statistic is just a piece of information about your sample, like a mean, mode, or median.
  4. Determine the p-value and use it to decide whether the result of the experiment is statistically significant, as sketched in the code below.
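
To make these four steps concrete, here is a minimal Python sketch using the pizza scenario from the figure above. The sample values are invented for illustration, and the one-sample t-test is just one reasonable choice of test:

```python
# A minimal sketch of steps 1-4 for the pizza example (H0: customers eat
# an average of 4 slices). The sample data below is invented.
import numpy as np
from scipy import stats

alpha = 0.05                                   # step 1: level of significance

sample = np.array([3, 5, 4, 6, 5, 4, 5, 6,     # step 2: hypothetical random
                   4, 5, 6, 5, 4, 5, 5, 6])    # sample of slices eaten

# step 3: summary statistics on the sample
print(f"mean={sample.mean():.2f}, median={np.median(sample):.2f}, "
      f"std={sample.std(ddof=1):.2f}")

# step 4: p-value from a one-sample t-test of H0: population mean = 4
t_stat, p_value = stats.ttest_1samp(sample, popmean=4)
print(f"t={t_stat:.2f}, p={p_value:.4f}")
print("statistically significant" if p_value < alpha
      else "not statistically significant")
```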

Practical Significance and Statistical Significance — Drawing the Line for Effective Deductions

When the sample statistic falls outside our 95% confidence interval (equivalently, when the p-value is less than the level of significance, the alpha value), we can reject the Null Hypothesis and call the result statistically significant.
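
The confidence-interval view and the p-value view lead to the same decision. A sketch, reusing the invented pizza sample from the previous snippet:

```python
# A sketch of the confidence-interval side of the decision rule, reusing
# the invented pizza sample from the previous snippet.
import numpy as np
from scipy import stats

sample = np.array([3, 5, 4, 6, 5, 4, 5, 6,
                   4, 5, 6, 5, 4, 5, 5, 6])

# 95% confidence interval for the population mean (t-distribution)
low, high = stats.t.interval(0.95, df=len(sample) - 1,
                             loc=sample.mean(), scale=stats.sem(sample))
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")

# The hypothesized value (4 slices) lying outside the interval is
# equivalent to the two-sided p-value falling below alpha = 0.05.
if not (low <= 4 <= high):
    print("4 is outside the interval -> reject the Null Hypothesis")
else:
    print("4 is inside the interval -> fail to reject the Null Hypothesis")
```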

What does “statistically significant” mean? What does it tell us? At first, many people think that a statistically significant result at a 95% confidence level tells them there is a 95% chance they’re correct and a 5% chance they’re incorrect. But, unfortunately, that’s not what it means. Its meaning is much more limited. It only tells us that, if the Null Hypothesis were true, there would be only a 5% chance of seeing a result at least this extreme. In other words, the 5% is the rate at which we would wrongly reject a true Null Hypothesis, not the probability that our conclusion is wrong. It’s on this basis that we reject the Null Hypothesis.
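
One way to see this limited meaning is to simulate a world where the Null Hypothesis is exactly true: about 5% of experiments will still come out “significant”. A sketch, with arbitrary assumed parameters:

```python
# When H0 is exactly true, roughly alpha (5%) of experiments still reject
# it. The distribution, sample size, and seed here are arbitrary choices.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n_experiments, n = 0.05, 10_000, 30
true_mean = 4                                   # H0 is exactly true here

rejections = 0
for _ in range(n_experiments):
    sample = rng.normal(loc=true_mean, scale=1.5, size=n)
    _, p = stats.ttest_1samp(sample, popmean=true_mean)
    rejections += p < alpha

# Prints a rate close to 0.05: the Type I error rate, not the probability
# that any particular conclusion is correct.
print(f"rejection rate under a true H0: {rejections / n_experiments:.3f}")
```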

The meaning of statistical significance is limited in another way as well. Many people think, at first, that statistical significance tells them that the results must have meaningful real-world implications and that the results are practically significant. But, unfortunately, that’s not what it means either. The term significant is qualified with the term statistically. It doesn’t mean generally significant or practically significant or meaningfully significant.

To illustrate this point further: statistical significance is important when assessing the results of a statistical analysis, but you also need to look at the actual values involved and decide whether they are practically significant, with meaningful real-world implications.

And while we do want sample sizes large enough to avoid undue risk of Type II Error, we also have to be wary of sample sizes so large that negligible results attain statistical significance. Here’s an example: a study found that a certain dietary supplement lowered the risk of getting a certain minor ailment from 2 in 1000 (0.2%) down to 1 in 1000 (0.1%). The sample size of the study was 30,000, so the difference between 0.2% and 0.1% is statistically significant (at 95% confidence). That gives a relative risk reduction of 50% ((0.2% − 0.1%) / 0.2%) but an absolute risk reduction of only 0.1% (0.2% − 0.1%). Advertisements for the supplement highlighted the fact that the supplement’s positive effect was statistically significant and that it reduced the risk of getting the ailment by 50%, but they did not mention that the absolute risk reduction was only 0.1%. Many people would find that misleading, and many would consider an absolute risk difference of 0.1% to be negligible and practically insignificant.
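
The arithmetic in this example is easy to verify. A sketch, assuming for illustration that the 30,000 participants split evenly into a control and a treatment group:

```python
# A sketch of the supplement example's arithmetic, assuming the 30,000
# participants split evenly into two groups of 15,000 each.
import math
from scipy.stats import norm

n_per_group = 15_000
p_control, p_treated = 0.002, 0.001            # 2 in 1000 vs 1 in 1000

relative_reduction = (p_control - p_treated) / p_control
absolute_reduction = p_control - p_treated
print(f"relative risk reduction: {relative_reduction:.0%}")   # 50%
print(f"absolute risk reduction: {absolute_reduction:.1%}")   # 0.1%

# Two-proportion z-test: is 0.1% vs 0.2% significant at this sample size?
pooled = (p_control + p_treated) / 2
se = math.sqrt(2 * pooled * (1 - pooled) / n_per_group)
z = absolute_reduction / se
p_value = 2 * norm.sf(abs(z))                  # two-sided p-value
print(f"z={z:.2f}, p={p_value:.4f}")           # p < 0.05: significant,
                                               # yet the effect is tiny
```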

In conclusion: you want to know both the relative and the absolute difference in order to properly assess practical significance before embarking on an effort to communicate such a marketing message.

Check the article on how statistical significance was used to misrepresent practical significance in the Vioxx scandal, where the American FDA approved a drug that should otherwise have been rejected.

