Selecting the Right Test

In the realm of Lean Six Sigma, hypothesis testing plays a crucial role in identifying and validating factors that can lead to improvements in process performance and quality. When it comes to data that follows a normal distribution, selecting the right hypothesis test is essential for accurate analysis and decision-making. This article aims to guide you through the process of choosing the appropriate statistical test for hypothesis testing with normal data, within the context of Lean Six Sigma projects.


Understanding Hypothesis Testing

Hypothesis testing is a statistical method used to make decisions about a population based on sample data. In Lean Six Sigma projects, it helps in determining whether a process improvement or change has significantly impacted the process output. The process begins with the formulation of two hypotheses: the null hypothesis (H0), which states that there is no effect or difference, and the alternative hypothesis (H1), which states that there is an effect or difference.


The Importance of Normal Data

Normality is a common assumption in many statistical tests. The Central Limit Theorem helps explain why: for a population with finite variance, the distribution of the sample mean approaches a normal distribution as the sample size grows, whatever the shape of the population itself. When the normality assumption holds, parametric tests can be used, and these are generally more powerful than their non-parametric counterparts.
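The Central Limit Theorem can be seen in a small simulation. The sketch below is illustrative only: it assumes an exponential population with mean 1 (a clearly skewed distribution), and the helper names `sample_means` and `skewness` are made up for this example. As the sample size n grows, the skewness of the sample means should shrink toward 0, the value for a symmetric distribution.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

def sample_means(n, repeats=2000):
    """Means of `repeats` samples of size n from an exponential population (mean 1)."""
    return [statistics.fmean(random.expovariate(1.0) for _ in range(n))
            for _ in range(repeats)]

def skewness(values):
    """Population skewness: 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

for n in (1, 5, 30):
    means = sample_means(n)
    print(f"n={n:>2}  mean of means={statistics.fmean(means):.3f}  "
          f"skewness={skewness(means):.2f}")
```

The mean of the sample means stays close to the population mean for every n; only the shape changes.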


Steps in Selecting the Right Test

1. Define the Objective

The first step in selecting the right test is to clearly define the objective of the hypothesis testing. Are you trying to compare means or proportions? Are you looking at the relationship between two variables? The objective will guide the selection of the test.


2. Verify Assumptions

Before choosing the test, verify that your data meets the assumptions required for parametric tests. The main assumptions include normality, homogeneity of variance, and independence of observations. Tools like the Shapiro-Wilk test for normality and Levene's test for equality of variances can help verify these assumptions.
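As an illustration of the equal-variance check, the sketch below computes Levene's W statistic by hand, using absolute deviations from each group mean. It is a minimal sketch with made-up data and a hypothetical function name (`levene_w`). It returns only the statistic, because the p-value needs the F distribution, which statistical software normally supplies. Shapiro-Wilk is not reproduced here because its coefficients are impractical to compute by hand.

```python
import statistics

def levene_w(*groups):
    """Levene's W statistic based on absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # z[i][j] = |x_ij - mean of group i|
    z = [[abs(x - statistics.fmean(g)) for x in g] for g in groups]
    z_means = [statistics.fmean(zi) for zi in z]
    z_grand = statistics.fmean(v for zi in z for v in zi)
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within

line_a = [1, 2, 3, 4, 5]       # made-up measurements, small spread
line_b = [-10, 0, 3, 6, 16]    # made-up measurements, large spread
print(f"Levene W = {levene_w(line_a, line_b):.3f}")
```

W is compared with the F distribution on k - 1 and N - k degrees of freedom (1 and 8 for this data); a large W suggests unequal variances.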


3. Consider the Data Type and Design

The type of data (continuous or discrete) and the design of the study (independent or paired samples) play a significant role in test selection. For continuous data from two independent samples, the two-sample t-test is a common choice. For paired or matched samples, such as before-and-after measurements on the same units, the paired t-test is appropriate.
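The effect of the study design can be shown by analysing the same numbers both ways. The sketch below is illustrative only: the data are made up, the function names `pooled_t` and `paired_t` are hypothetical, and the two-sample version assumes equal variances (pooled estimate).

```python
import math
import statistics

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance, plus its degrees of freedom."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.fmean(a) - statistics.fmean(b)) / se, na + nb - 2

def paired_t(before, after):
    """Paired t statistic on the within-pair differences, plus its degrees of freedom."""
    diffs = [y - x for x, y in zip(before, after)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.fmean(diffs) / se, len(diffs) - 1

before = [10, 12, 9, 11, 13]    # made-up measurements
after = [12, 14, 10, 13, 14]

t_ind, df_ind = pooled_t(after, before)
t_pair, df_pair = paired_t(before, after)
print(f"independent design: t = {t_ind:.3f}, df = {df_ind}")
print(f"paired design:      t = {t_pair:.3f}, df = {df_pair}")
```

With a two-sided significance level of 0.05, the critical values from a t-table are about 2.306 for 8 degrees of freedom and 2.776 for 4. The paired analysis clears its threshold while the independent analysis does not, because pairing removes the unit-to-unit variation. The design must be chosen from how the data were collected, not from which result looks better.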


4. Decide Between a One-Tailed and a Two-Tailed Test

Determine whether to use a one-tailed or two-tailed test based on the research hypothesis. A two-tailed test is used when we are interested in detecting any significant difference, regardless of direction. A one-tailed test is used when the direction of the difference is specified in the hypothesis.
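The difference between the two choices shows up directly in the p-value. The sketch below assumes, for simplicity, a test statistic that follows the standard normal distribution (a z statistic); the function name `z_p_values` is made up for this example.

```python
from statistics import NormalDist

def z_p_values(z):
    """p-values for a standard normal test statistic under each alternative."""
    std_normal = NormalDist()
    upper = 1 - std_normal.cdf(z)                  # H1: value is greater
    lower = std_normal.cdf(z)                      # H1: value is smaller
    two_tailed = 2 * (1 - std_normal.cdf(abs(z)))  # H1: value is different
    return {"two-tailed": two_tailed, "upper-tailed": upper, "lower-tailed": lower}

for name, p in z_p_values(1.8).items():
    print(f"{name:>12}: p = {p:.4f}")
```

For z = 1.8 the upper-tailed p-value is about 0.036 and the two-tailed p-value is twice that, so the same data can fall on either side of a 0.05 threshold depending on the choice. For that reason the direction should be fixed before the data are examined.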


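If the assumption checks in step 2 fail, the tests below are the usual alternatives. Most of them do not require normally distributed data.

Mann-Whitney U Test (also known as the Wilcoxon Rank-Sum Test)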

Purpose

  • Compares the distributions of two independent samples to determine if they come from the same distribution.

Assumptions

  • Independent samples, ordinal or continuous data, and the samples do not necessarily follow a normal distribution.

Use Case

  • Comparing customer satisfaction scores from two different stores where the data are not normally distributed.
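A minimal sketch of the calculation follows, assuming made-up scores and hypothetical helper names. The p-value uses the normal approximation without tie or continuity corrections, which is rough for samples this small; statistical software typically offers an exact version.

```python
import math
from statistics import NormalDist

def average_ranks(values):
    """Ranks from 1; tied values share the mean of the positions they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def mann_whitney_u(a, b):
    """Smaller U statistic and a two-sided p-value (normal approximation)."""
    na, nb = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u_a = sum(ranks[:na]) - na * (na + 1) / 2
    u = min(u_a, na * nb - u_a)
    sd_u = math.sqrt(na * nb * (na + nb + 1) / 12)  # no tie correction
    z = (u - na * nb / 2) / sd_u                    # z <= 0 because u is the smaller U
    return u, 2 * NormalDist().cdf(z)

store_a = [62, 70, 55, 81, 66]   # made-up satisfaction scores
store_b = [74, 88, 79, 91, 85]
u, p = mann_whitney_u(store_a, store_b)
print(f"U = {u}, approximate two-sided p = {p:.4f}")
```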

Wilcoxon Signed-Rank Test

Purpose

  • Tests whether the median of the differences between paired observations is zero.

Assumptions

  • Paired samples, ordinal or continuous data that are not normally distributed, and the differences between pairs can be ranked.

Use Case

  • Measuring the impact of a new HR policy on employee productivity before and after its implementation, when the differences in scores are not normally distributed.
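The calculation can be sketched as follows, with made-up productivity scores and hypothetical function names. Zero differences are dropped, and the p-value comes from the normal approximation without corrections, so it is only indicative for a sample this small.

```python
import math
from statistics import NormalDist

def average_ranks(values):
    """Ranks from 1; tied values share the mean of the positions they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def wilcoxon_signed_rank(before, after):
    """Smaller signed-rank sum and a two-sided p-value (normal approximation)."""
    diffs = [y - x for x, y in zip(before, after) if y != x]  # drop zero differences
    n = len(diffs)
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w = min(w_plus, n * (n + 1) / 2 - w_plus)
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - n * (n + 1) / 4) / sd_w   # z <= 0 because w is the smaller sum
    return w, 2 * NormalDist().cdf(z)

before = [50, 62, 47, 55, 60, 58]   # made-up productivity scores
after = [53, 61, 52, 59, 66, 60]
w, p = wilcoxon_signed_rank(before, after)
print(f"W = {w}, approximate two-sided p = {p:.4f}")
```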

Kruskal-Wallis H Test

Purpose

  • Extends the Mann-Whitney U test to compare the distributions of three or more independent samples.

Assumptions

  • Independent samples, ordinal or continuous data, and the data do not follow a normal distribution.

Use Case

  • Evaluating the effectiveness of different training programs (more than two) on employee performance where data are skewed.
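A minimal sketch of the H statistic follows, using made-up scores for three training programs and hypothetical function names. No tie correction is applied. The p-value helper relies on the fact that, for exactly three groups, H is compared with a chi-square distribution on 2 degrees of freedom, whose upper tail is exp(-H/2); other group counts would need a general chi-square routine.

```python
import math

def average_ranks(values):
    """Ranks from 1; tied values share the mean of the positions they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (no tie correction)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

def p_value_three_groups(h):
    """Chi-square upper tail for 2 degrees of freedom (three groups only)."""
    return math.exp(-h / 2)

program_a = [72, 65, 70]   # made-up performance scores
program_b = [80, 78, 85]
program_c = [90, 88, 95]
h = kruskal_wallis_h(program_a, program_b, program_c)
print(f"H = {h:.2f}, approximate p = {p_value_three_groups(h):.4f}")
```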


Friedman Test

Purpose

  • Non-parametric alternative to the repeated measures ANOVA, comparing the rankings of three or more paired groups.

Assumptions

  • Paired or matched observations, ordinal or continuous data, and the observations do not follow a normal distribution.

Use Case

  • Assessing the performance of a group of employees over three different time periods (e.g., before training, immediately after, and six months later) with non-normal data.
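The statistic can be sketched as below, with made-up scores for four employees over three time periods and hypothetical function names. Each row is ranked separately, and the column rank sums feed the statistic. With exactly three conditions the reference distribution is chi-square on 2 degrees of freedom, so the upper tail is exp(-Q/2); this approximation is rough for so few subjects.

```python
import math

def average_ranks(values):
    """Ranks from 1; tied values share the mean of the positions they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def friedman_statistic(rows):
    """Friedman statistic; each row holds one subject's values across k conditions."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)

# Made-up scores: before training, immediately after, six months later
scores = [
    [62, 70, 75],
    [58, 66, 64],
    [71, 73, 80],
    [65, 72, 78],
]
q = friedman_statistic(scores)
print(f"Friedman statistic = {q:.2f}, approximate p = {math.exp(-q / 2):.4f}")
```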

Spearman's Rank Correlation

Purpose

  • Measures the strength and direction of the association between two ranked variables.

Assumptions

  • Ordinal or continuous data that are not normally distributed, and the relationship between variables is monotonic.

Use Case

  • Investigating the relationship between the rank of employees' satisfaction and their productivity levels where data are not linearly related or normally distributed.
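Spearman's coefficient is the Pearson correlation computed on ranks. The sketch below uses made-up data and hypothetical function names, and it returns only the coefficient, not a p-value.

```python
import math
from statistics import fmean

def average_ranks(values):
    """Ranks from 1; tied values share the mean of the positions they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = fmean(rx), fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

satisfaction = [1, 2, 3, 4, 5]    # made-up survey scores
productivity = [1, 4, 9, 16, 25]  # rises with satisfaction, but not linearly
print(f"rho = {spearman_rho(satisfaction, productivity):.2f}")
```

Because only the ordering matters, a perfectly monotonic but curved relationship still gives a coefficient of 1.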

Chi-Square Test of Independence

Purpose

  • Assesses whether there is a significant association between two categorical variables.

Assumptions

  • Observations are independent, and data are categorical. It does not assume normality but requires an adequate sample size for expected frequencies in the contingency table.

Use Case

  • Determining if there’s an association between department types and the presence of a specific type of defect in products.
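A minimal sketch of the calculation follows, assuming a made-up table of three departments by defect status and hypothetical function names. No continuity correction is applied. The p-value helper covers only 1 or 2 degrees of freedom, where the chi-square upper tail has a simple closed form; larger tables would need a general chi-square routine.

```python
import math
from statistics import NormalDist

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat, (len(table) - 1) * (len(table[0]) - 1)

def chi_square_p(stat, df):
    """Upper-tail probability for df = 1 or df = 2 only."""
    if df == 1:
        return 2 * (1 - NormalDist().cdf(math.sqrt(stat)))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")

# Made-up counts: rows are departments, columns are [defect, no defect]
counts = [[10, 90], [20, 80], [30, 70]]
stat, df = chi_square_independence(counts)
print(f"chi-square = {stat:.2f}, df = {df}, p = {chi_square_p(stat, df):.4f}")
```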


Fisher's Exact Test

Purpose

  • Used to examine the significance of the association between two kinds of categorical data in a 2x2 contingency table, especially useful when sample sizes are small.

Assumptions

  • Independent observations and categorical data. Suitable for small sample sizes where the Chi-Square test assumptions may not hold.

Use Case

  • Comparing the occurrence of a rare event (e.g., a specific type of defect) between two small production batches.
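For a 2x2 table the exact p-value can be computed directly from the hypergeometric distribution. The sketch below uses a made-up table and a hypothetical function name. It takes the common two-sided definition: the sum of the probabilities of all tables (with the same margins) that are no more likely than the observed one.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Small tolerance so tables exactly as likely as the observed one are included
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Made-up counts: rows are batches, columns are [defect-free, defective]
p = fisher_exact_two_sided(8, 2, 1, 5)
print(f"two-sided exact p = {p:.4f}")
```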


Kolmogorov-Smirnov Test

Purpose

  • Used to determine if two samples come from the same distribution. It can also be used to compare a sample with a reference probability distribution (one-sample K-S test).

Assumptions

  • Two samples are independent. It does not assume normality of the data.

Use Case

  • Comparing the distribution of process output times before and after a change when the distribution's shape is of interest, not just central tendency.
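The two-sample statistic D is the largest vertical gap between the two empirical cumulative distribution functions. The sketch below computes D only, with made-up times and a hypothetical function name; turning D into a p-value is normally left to statistical software.

```python
from bisect import bisect_right

def ks_two_sample_d(a, b):
    """Largest absolute difference between the empirical CDFs of a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)  # share of a that is <= x
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)  # share of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d

times_before = [1, 2, 3, 4]   # made-up process output times
times_after = [3, 4, 5, 6]
print(f"D = {ks_two_sample_d(times_before, times_after):.2f}")
```

D ranges from 0 (identical empirical distributions) to 1 (no overlap at all).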


Mood's Median Test

Purpose

  • Non-parametric test that assesses whether two or more groups come from populations with the same median.

Assumptions

  • Independent samples from two or more groups. The test does not assume normality and is less sensitive to outliers compared to mean-based tests.

Use Case

  • Determining if the median processing time differs across several production lines when the data are non-normally distributed.
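Mood's median test reduces to a chi-square test on counts above and not above the grand median. The sketch below is illustrative only, with made-up processing times and a hypothetical function name. Values equal to the grand median are counted as "not above", which is one common convention. With three groups the statistic is compared with chi-square on 2 degrees of freedom, whose upper tail is exp(-statistic/2); the expected counts here are too small for that approximation to be trusted in practice.

```python
import math
import statistics

def moods_median_statistic(*groups):
    """Grand median and chi-square statistic for the above / not-above table."""
    grand_median = statistics.median(v for g in groups for v in g)
    above = [sum(1 for v in g if v > grand_median) for g in groups]
    not_above = [len(g) - a for g, a in zip(groups, above)]
    total = sum(len(g) for g in groups)
    stat = 0.0
    for counts in (above, not_above):
        row_total = sum(counts)
        for g, observed in zip(groups, counts):
            expected = row_total * len(g) / total
            stat += (observed - expected) ** 2 / expected
    return grand_median, stat

line_1 = [1, 2, 3, 4]      # made-up processing times
line_2 = [5, 6, 7, 8]
line_3 = [9, 10, 11, 12]
median, stat = moods_median_statistic(line_1, line_2, line_3)
print(f"grand median = {median}, chi-square = {stat:.2f}, "
      f"approximate p = {math.exp(-stat / 2):.4f}")
```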


Cochran's Q Test

(Not required for the Black Belt)

Purpose

  • A non-parametric test used to determine if there are differences between three or more matched sets of frequencies or proportions.

Assumptions

  • The observations are in matched pairs or sets, and the data are binary (e.g., pass/fail, yes/no).

Use Case

  • Evaluating the consistency of binary outcomes (e.g., pass/fail) across multiple tests or conditions for the same subjects.
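A minimal sketch of the Q statistic follows, with made-up pass/fail results and a hypothetical function name. Each row is one subject and each column one condition. With three conditions, Q is compared with chi-square on 2 degrees of freedom, whose upper tail is exp(-Q/2).

```python
import math

def cochrans_q(rows):
    """Cochran's Q for binary data; rows are subjects, columns are conditions."""
    k = len(rows[0])
    col_totals = [sum(col) for col in zip(*rows)]
    row_totals = [sum(row) for row in rows]
    grand_total = sum(row_totals)
    numerator = (k - 1) * (k * sum(c ** 2 for c in col_totals) - grand_total ** 2)
    denominator = k * grand_total - sum(r ** 2 for r in row_totals)
    return numerator / denominator

# Made-up results: 1 = pass, 0 = fail, for six subjects under three conditions
results = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
    [1, 0, 0],
]
q = cochrans_q(results)
print(f"Q = {q:.2f}, approximate p = {math.exp(-q / 2):.4f}")
```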


Run Test (or Wald-Wolfowitz Runs Test)

Purpose

  • Used to examine the randomness of a data sequence. The test checks whether the occurrence of elements in the sequence is random.

Assumptions

  • The observations form an ordered sequence that can be split into two categories (for example, defect/no defect, or above/below the median). The test does not assume normality; it looks for runs that would be unlikely in a random sequence.

Use Case

  • Analyzing sequences of defect occurrences in a production process to determine if defects occur randomly over time.
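The runs test can be sketched as follows for a two-category sequence. The data and the function name are made up, and the p-value uses the large-sample normal approximation, which is only indicative for a sequence this short.

```python
import math
from statistics import NormalDist

def runs_test(sequence):
    """Number of runs, z score, and two-sided p-value for a two-category sequence."""
    categories = sorted(set(sequence))
    if len(categories) != 2:
        raise ValueError("sequence must contain exactly two categories")
    n1 = sum(1 for s in sequence if s == categories[0])
    n2 = len(sequence) - n1
    # A new run starts wherever a value differs from the previous one
    runs = 1 + sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev != cur)
    mean = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    z = (runs - mean) / math.sqrt(variance)
    return runs, z, 2 * (1 - NormalDist().cdf(abs(z)))

# Made-up inspection results in production order: D = defect, G = good
runs, z, p = runs_test("DDDDDGGGGG")
print(f"runs = {runs}, z = {z:.2f}, approximate p = {p:.4f}")
```

Too few runs (clustering, as here) and too many runs (strict alternation) are both evidence against randomness.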


Durbin-Watson Test

(Not required for the Black Belt)

Purpose

  • Tests for the presence of autocorrelation (a relationship between values separated from each other by a given time lag) in the residuals from a statistical regression analysis.

Assumptions

  • The test is used on residuals from a regression analysis, and it assumes linear relationships among variables. It does not directly test normality but rather the independence of errors.

Use Case

  • Assessing if consecutive errors in a process performance metric, predicted through regression, are correlated over time.
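The statistic itself is simple to compute from a list of residuals in time order. The sketch below uses made-up residuals and a hypothetical function name; it does not fit the regression or look up critical values.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum((cur - prev) ** 2 for prev, cur in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

alternating = [1, -1, 1, -1]   # made-up residuals that flip sign every step
drifting = [1, 1, 1, 1]        # made-up residuals that never change
print(f"alternating residuals: DW = {durbin_watson(alternating):.1f}")
print(f"constant residuals:    DW = {durbin_watson(drifting):.1f}")
```

The statistic lies between 0 and 4. Values near 2 indicate little first-order autocorrelation, values toward 0 indicate positive autocorrelation, and values toward 4 indicate negative autocorrelation.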


Which Test Should You Use? A Flowchart

----

Please understand that the chart below is essential knowledge for any Black Belt.

It is imperative to select the appropriate test for your data.

Attempting to pass your exam without this understanding is a guaranteed path to failure.

----

Source: https://onishlab.colostate.edu/summer-statistics-workshop-2019/which_test_flowchart/

Please note that McNemar's test is not part of the Black Belt Body of Knowledge.

An even more complete chart is available here (not required knowledge for the Black Belt Body of Knowledge): https://statsandr.com/blog/what-statistical-test-should-i-do/
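The selection logic described in this article can also be sketched as a small lookup function. This is a rough, partial illustration only: the function name, argument names, and goal labels are made up, it covers only tests named in this article, and it is not a transcription of the linked flowcharts.

```python
def suggest_test(goal, groups=2, paired=False, normal=True):
    """Return the name of a test described in this article, or None if none fits."""
    if goal == "compare_location":
        if groups == 2:
            if normal:
                return "paired t-test" if paired else "two-sample t-test"
            return "Wilcoxon signed-rank test" if paired else "Mann-Whitney U test"
        if groups >= 3 and not normal:
            return "Friedman test" if paired else "Kruskal-Wallis H test"
        return None  # e.g. three or more normal groups: not covered in this article
    lookup = {
        "monotonic_association": "Spearman's rank correlation",
        "categorical_association": "Chi-square test of independence",
        "compare_distribution_shape": "Kolmogorov-Smirnov test",
        "randomness_of_sequence": "Runs test",
        "residual_autocorrelation": "Durbin-Watson test",
    }
    return lookup.get(goal)

print(suggest_test("compare_location", paired=True))
print(suggest_test("compare_location", groups=3, normal=False))
print(suggest_test("categorical_association"))
```

A helper like this is a memory aid, not a replacement for checking each test's assumptions against the data.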


Conclusion

Selecting the right statistical test for hypothesis testing with normal data is a critical step in the Lean Six Sigma methodology. It ensures that the conclusions drawn from the data are valid and reliable. By clearly defining the objective, verifying assumptions, considering the data type and design, and deciding on the nature of the test (one-tailed vs. two-tailed), practitioners can make informed decisions about which statistical test to use. This careful selection process supports the overall goal of Lean Six Sigma projects: to improve process performance and quality based on data-driven decisions.
