Kruskal-Wallis Test

The Kruskal-Wallis test is a non-parametric statistical test used as an alternative to the one-way ANOVA when the assumptions necessary for ANOVA are not met. Within the Lean Six Sigma body of knowledge it belongs to hypothesis testing with non-normal data, under the sub-topic of non-parametric tests. The test compares two or more independent samples to determine whether at least one sample median differs from the others, without assuming that the data are normally distributed.

Understanding the Kruskal-Wallis Test

The Kruskal-Wallis test is based on the ranks of the data rather than on the data points themselves. Working with ranks makes it robust to non-normal distributions and suitable for ordinal data or for data that do not meet the assumption of homogeneity of variances. In the context of Lean Six Sigma, it is particularly useful for analyzing the effects of various factors on a process when the data do not follow a normal distribution or when the sample sizes are small.

How the Kruskal-Wallis Test Works

  1. Ranking the Data: Initially, all data points from the combined samples are ranked together. The lowest value gets the rank of 1, the next lowest the rank of 2, and so on. In case of ties (identical values), average ranks are assigned.


  2. Calculating Test Statistic (H): The Kruskal-Wallis test statistic, denoted as H, is calculated based on the ranks. The formula for H takes into account the sum of ranks for each group, the overall number of observations, and the number of observations within each group. The H statistic is sensitive to differences in the distribution of ranks among the groups.

  3. Determining Significance: After calculating the H statistic, it is compared against a critical value from the Chi-Square distribution with (k-1) degrees of freedom, where k is the number of groups. If the calculated H is greater than the critical value, the null hypothesis that all group medians are equal is rejected.
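The three steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names and the sample data are hypothetical, and the H statistic is computed without a tie correction (many statistical packages additionally divide H by a tie-correction factor).

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend 'end' over the run of values tied with the one at 'pos'.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # average of the 1-based positions pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = avg
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H for a list of groups (no tie correction applied)."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Hypothetical cycle times (minutes) for three process lines; not from the article.
groups = [[12, 15, 15], [18, 20], [9, 11]]
h = kruskal_wallis_h(groups)
critical_value = 5.991  # chi-square table value for k - 1 = 2 degrees of freedom, alpha = 0.05
reject_null = h > critical_value
print(round(h, 3), reject_null)  # 5.357 False
```

In this made-up example the two tied values of 15 each receive rank 4.5, and H stays below the critical value, so the null hypothesis is not rejected.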


Applications in Lean Six Sigma

In Lean Six Sigma projects, the Kruskal-Wallis test can be particularly helpful in the Analyze phase of the DMAIC (Define, Measure, Analyze, Improve, Control) methodology. It can be used to compare a process outcome across different groups or categories when the outcome data are non-normal or ordinal. This can inform decision-making about potential areas for process improvement.


Advantages and Limitations


  • Advantages:

    • Does not require the assumption of normal distribution.

    • Can handle data with outliers or non-homogeneous variances.

    • Suitable for both ordinal data and continuous data that is not normally distributed.


  • Limitations:

    • Less powerful than ANOVA when the assumption of normality is met.

    • Only indicates that at least one group is different but does not identify which groups are different.
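One common way to work around the second limitation is to follow a significant result with pairwise comparisons at a Bonferroni-adjusted significance level. The sketch below is an assumption-laden illustration rather than a prescribed method: it reuses the H statistic on each pair of groups (with 1 degree of freedom), uses the chi-square approximation, which is rough for small samples, and runs on hypothetical data. Dunn's test is another frequently used follow-up.

```python
import math
from itertools import combinations


def h_two_groups(a, b):
    """Kruskal-Wallis H for two groups (lists), using average ranks for ties."""
    pooled = sorted(a + b)
    rank_of = {}
    for v in set(pooled):
        first = pooled.index(v) + 1           # 1-based position of the first occurrence
        count = pooled.count(v)
        rank_of[v] = first + (count - 1) / 2  # average rank across the tied run
    n = len(pooled)
    total = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in (a, b))
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


def pairwise_bonferroni(groups, alpha=0.05):
    """Compare every pair of named groups at alpha divided by the number of pairs."""
    pairs = list(combinations(groups, 2))
    adjusted_alpha = alpha / len(pairs)
    results = {}
    for name_a, name_b in pairs:
        h = h_two_groups(groups[name_a], groups[name_b])
        # Upper tail of the chi-square distribution with 1 degree of freedom.
        p = math.erfc(math.sqrt(max(h, 0.0) / 2))
        results[(name_a, name_b)] = (h, p, p < adjusted_alpha)
    return results


# Hypothetical measurements for three production lines; not from the article.
groups = {
    "line_1": [12, 14, 15, 17, 19],
    "line_2": [22, 24, 25, 27, 30],
    "line_3": [13, 16, 18, 20, 21],
}
results = pairwise_bonferroni(groups)
for pair, (h, p, significant) in results.items():
    print(pair, round(h, 3), round(p, 4), significant)
```

With this made-up data, line_2 is flagged as different from both other lines, while line_1 and line_3 are not distinguished from each other.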


Conclusion

The Kruskal-Wallis test is a valuable tool in the Lean Six Sigma toolkit for hypothesis testing, especially when dealing with non-normal or ordinal data. It offers a non-parametric alternative to the one-way ANOVA, allowing practitioners to make informed decisions about process improvements even in the face of data that violate the assumptions required for parametric tests. Understanding and applying the Kruskal-Wallis test can enhance the analytical capabilities of Lean Six Sigma projects, leading to more effective identification and implementation of process improvements.


Kruskal-Wallis Scenario

A retail company wants to understand if there's a difference in sales performance across three of its stores in different locations over a particular month. Due to various factors, including seasonal variation and local events, the sales data does not follow a normal distribution. The company decides to use the Kruskal-Wallis test to analyze the sales data.


Data:

  • Store A: $1200, $1300, $1250

  • Store B: $1400, $1350, $1500

  • Store C: $1100, $1150, $1050


Steps for Kruskal-Wallis Test:

1. Rank all data points:

Combine all sales figures and rank them from lowest to highest:

  • $1050 (C) - Rank 1

  • $1100 (C) - Rank 2

  • $1150 (C) - Rank 3

  • $1200 (A) - Rank 4

  • $1250 (A) - Rank 5

  • $1300 (A) - Rank 6

  • $1350 (B) - Rank 7

  • $1400 (B) - Rank 8

  • $1500 (B) - Rank 9


2. Sum of ranks for each group:

  • Store A: 4 + 5 + 6 = 15

  • Store B: 7 + 8 + 9 = 24

  • Store C: 1 + 2 + 3 = 6
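Steps 1 and 2 can be reproduced with a few lines of Python. This sketch relies on the fact that the store data contain no ties, so each rank is simply the position in the sorted list; the variable names are illustrative.

```python
# Sales figures from the scenario, keyed by store.
sales = {
    "A": [1200, 1300, 1250],
    "B": [1400, 1350, 1500],
    "C": [1100, 1150, 1050],
}

# Pool all observations as (value, store) pairs and sort by value.
pooled = sorted((value, store) for store, values in sales.items() for value in values)

# No ties here, so rank = 1-based position in the sorted list.
rank_sums = {store: 0 for store in sales}
for rank, (value, store) in enumerate(pooled, start=1):
    rank_sums[store] += rank

print(rank_sums)  # {'A': 15, 'B': 24, 'C': 6}
```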


3. Calculate the test statistic (H):

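The formula for H is:

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$$

Where:

  • k is the number of groups (3 in this case),

  • N is the total number of observations (9 in this case),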

  • ni is the number of observations in group i,

  • Ri is the sum of ranks for group i.


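Let's calculate H:

$$H = \frac{12}{9 \times 10} \left( \frac{15^2}{3} + \frac{24^2}{3} + \frac{6^2}{3} \right) - 3 \times 10 = \frac{12}{90} \times 279 - 30 = 37.2 - 30 = 7.2$$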
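The value of H can be checked numerically from the rank sums in step 2. The snippet below is a small sketch with illustrative variable names.

```python
# Rank sums and group sizes from step 2 of the scenario.
rank_sums = {"A": 15, "B": 24, "C": 6}
group_sizes = {"A": 3, "B": 3, "C": 3}

n_total = sum(group_sizes.values())  # N = 9

# Sum of R_i^2 / n_i over the three stores: 75 + 192 + 12 = 279
weighted = sum(rank_sums[s] ** 2 / group_sizes[s] for s in rank_sums)

h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
print(round(h, 4))  # 7.2
```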


4. Determine Significance:

Assuming a significance level (α) of 0.05 and using the Chi-Square distribution with 2 degrees of freedom (3 groups minus 1), we look up the critical value for χ²(2, 0.05). The critical value from a Chi-Square table is approximately 5.991.
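For the special case of 2 degrees of freedom, the Chi-Square upper-tail probability has the closed form exp(-x/2), so both the critical value and an approximate p-value can be checked without a table. The sketch below relies on that special case only; for other degrees of freedom a table or a statistics library would be needed.

```python
import math

h = 7.2       # test statistic from step 3
alpha = 0.05  # significance level

# Chi-square with 2 degrees of freedom: P(X > x) = exp(-x / 2).
critical_value = -2 * math.log(alpha)  # solves exp(-x / 2) = alpha
p_value = math.exp(-h / 2)

print(round(critical_value, 3))  # 5.991
print(round(p_value, 4))         # 0.0273
print(h > critical_value)        # True
```

The approximate p-value of about 0.027 is below 0.05, which agrees with comparing H against the critical value.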



5. Conclusion:

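Since the calculated H value of 7.2 is greater than the critical χ² value of 5.991, we reject the null hypothesis. This indicates that there is a statistically significant difference in the sales performance among the three stores. Because each store contributes only three observations, the Chi-Square approximation is rough here; for samples this small, exact Kruskal-Wallis tables are preferable.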


Summary:

Through the Kruskal-Wallis test, the retail company found that not all stores performed equally in terms of sales, suggesting the need for a deeper dive into the factors influencing these differences, such as location-specific promotions, local events, or customer preferences. This step-by-step example showcases how the Kruskal-Wallis test can be applied to real-world, non-normally distributed data to guide business decisions.
