Hypothesis testing is used in the Six Sigma Analyze Phase to screen potential causes. A hypothesis test calculates the probability, p, of observing a difference at least as large as the one seen between two or more data samples if random chance alone were at work, that is, if the samples all came from the same underlying population (the null hypothesis). So hypothesis testing answers the question, *if these data samples actually came from the same underlying population, how likely is a difference this large?* If this probability, known as the p-value, is small (typically below 0.05), then we conclude that the samples likely came from different underlying populations. For example, a p-value of 0.02 indicates that, if the samples came from the same underlying population, a difference this large would occur only about 2% of the time. Note that this is not the same as saying there is a 2% chance that the samples came from the same population.
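One way to make the p-value concrete is a permutation test, sketched below in Python using only the standard library. It is an illustration, not necessarily the test a statistics package would run by default. The sketch pools the two samples, reshuffles them many times, and counts how often chance alone produces a difference in means at least as large as the observed one. The function name `permutation_p_value` and the cycle-time figures are hypothetical, made up for this example.

```python
import random
import statistics


def permutation_p_value(sample_a, sample_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in means by permutation."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(statistics.mean(sample_a) - statistics.mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are arbitrary,
        # so reshuffle them and recompute the difference in means.
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical cycle times (minutes) before and after a process change.
before = [12.1, 11.8, 12.6, 12.3, 11.9, 12.4, 12.2, 12.0]
after = [11.5, 11.7, 11.4, 11.9, 11.6, 11.3, 11.8, 11.6]

p = permutation_p_value(before, after)
print(f"Estimated p-value: {p:.4f}")
```

A small result here means that random relabeling rarely reproduces a gap as large as the observed one, which is the evidence a hypothesis test uses to reject the "same population" explanation.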

Here are a few situations where hypothesis testing assists us in the problem solving process:

• Evaluating a proposed process improvement to see if its effect is statistically significant, or if the same improvement could have occurred by random chance.

• Evaluating several process factors (process inputs, or x’s) in a designed experiment to understand which factors are significant to a given output, and which are not.

• Understanding the likelihood that a data sample comes from a population that follows a given probability distribution (e.g. normal, exponential, uniform, etc.).

An individual untrained in basic statistical knowledge might naturally question the need for a hypothesis test: “Why can’t we simply compare the average values of a given CTQ, before and after a process change, to determine if the change we made actually made a difference?” The answer is that the supposed improvement we observe might have nothing to do with the change we made to the process, and might have everything to do with chance variation. In other words, the two data sets might actually have come from the same underlying population.
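The role of chance variation can be illustrated with a short simulation. The Python sketch below repeatedly draws a "before" and an "after" sample from the same population, so no process change exists, and records the difference between the two averages. The population parameters (mean 50, standard deviation 5), the sample size of 10, and the threshold of 2 units are all assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable


def draw_sample(n):
    # Every sample comes from the SAME hypothetical population:
    # mean 50, standard deviation 5. No real change is ever applied.
    return [rng.gauss(50, 5) for _ in range(n)]


differences = []
for _ in range(1000):
    before = draw_sample(10)
    after = draw_sample(10)
    differences.append(statistics.mean(after) - statistics.mean(before))

# How large can the apparent "improvement" get by chance alone?
largest = max(abs(d) for d in differences)
share_over_2 = sum(abs(d) > 2 for d in differences) / len(differences)

print(f"Largest chance difference in averages: {largest:.2f}")
print(f"Share of trials with a difference above 2 units: {share_over_2:.1%}")
```

Even though nothing changed between the "before" and "after" draws, a sizable share of trials shows a difference in averages of more than 2 units. This is why a comparison of averages needs a hypothesis test behind it.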

Statistical Significance Vs. Practical Significance

There are many situations where a process change has a *statistically significant* effect on a CTQ, but an insignificant effect in *real world* terms. For example, an individual working to improve his or her vehicle’s fuel economy might run a hypothesis test comparing fuel economy at driving speeds of 60 mph and 70 mph on the highway. The result might show that driving at the lower speed has a statistically significant effect on the CTQ, which in this case is miles-per-gallon fuel economy. However, the actual improvement in fuel economy might only be 0.5 miles per gallon, which might be deemed not worth the extra time it will take to get to work each day.
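The fuel economy example can be sketched in Python as follows. The readings are simulated, not measured: the sample size of 2000 trips per speed, the true means of 30.0 and 30.5 mpg, the standard deviation of 2.0 mpg, and the 1.0 mpg "worthwhile gain" threshold are all hypothetical. Because the samples are large, the sketch uses a normal (z) approximation for the p-value instead of a t-test.

```python
import math
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Simulated (hypothetical) miles-per-gallon readings at each speed.
mpg_70 = [rng.gauss(30.0, 2.0) for _ in range(2000)]
mpg_60 = [rng.gauss(30.5, 2.0) for _ in range(2000)]

# Observed improvement and its standard error.
diff = statistics.mean(mpg_60) - statistics.mean(mpg_70)
se = math.sqrt(
    statistics.variance(mpg_60) / len(mpg_60)
    + statistics.variance(mpg_70) / len(mpg_70)
)

# Two-sided p-value from a large-sample z approximation.
z = diff / se
p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))

# Hypothetical business threshold: gains under 1.0 mpg are not worth the time.
min_worthwhile_gain = 1.0
statistically_significant = p_value < 0.05
practically_significant = diff >= min_worthwhile_gain

print(f"Observed gain: {diff:.2f} mpg, p-value: {p_value:.2g}")
print(f"Statistically significant: {statistically_significant}")
print(f"Practically significant: {practically_significant}")
```

With enough data, even a small difference produces a very small p-value. So the size of the effect has to be judged separately against what matters in real-world terms.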