When Transformation Isn't the Answer
By the end of this module, you will:
Parametric tests (t-test, ANOVA) make assumptions:
- Data are approximately normally distributed
- Variances are roughly equal across groups (homogeneity of variance)
- Data are measured on an interval or ratio scale
Non-parametric tests are "assumption-free":
- No normality assumption, no equal-variance assumption
- They still assume independent observations (the quotes are there for a reason: "assumption-free" is an overstatement)
The Trade-off:
Non-parametric tests are ~95% as powerful as parametric tests when data ARE normal, but can be MORE powerful when data are skewed or have outliers!
| Research Question | Parametric Test | Non-Parametric Alternative |
|---|---|---|
| Compare 2 independent groups | Independent t-test | Mann-Whitney U test (also called Wilcoxon rank-sum) |
| Compare 2 paired/matched samples | Paired t-test | Wilcoxon signed-rank test |
| Compare 3+ independent groups | One-way ANOVA | Kruskal-Wallis test |
| Compare 3+ related groups | Repeated-measures ANOVA | Friedman test |
| Correlation between 2 variables | Pearson's r | Spearman's rho (ρ) |
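To illustrate the last row of the table, here is a minimal sketch (with invented data) of how Pearson's r and Spearman's ρ diverge when a relationship is monotonic but not linear:

```python
import numpy as np
from scipy import stats

# A monotonic but non-linear relationship: y grows exponentially with x.
x = np.arange(1, 11)
y = np.exp(x / 2.0)

# Pearson measures *linear* association; Spearman correlates the *ranks*,
# so any perfectly monotonic relationship yields rho = 1.0 exactly.
pearson_r, _ = stats.pearsonr(x, y)
spearman_rho, _ = stats.spearmanr(x, y)

print(f"Pearson r    = {pearson_r:.3f}")    # less than 1: not linear
print(f"Spearman rho = {spearman_rho:.3f}")  # 1.000: perfectly monotonic
```

Because Spearman works on ranks, it answers "does y consistently increase with x?" rather than "does y increase with x at a constant rate?"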
Non-parametric tests convert your data to ranks before analysis:
Example:
Original data: 50, 80, 85, 200, 220
Ranks: 1, 2, 3, 4, 5
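You can verify the rank conversion with scipy's `rankdata` (a sketch; the second dataset is a hypothetical variant showing that an outlier's magnitude is irrelevant once ranked):

```python
from scipy.stats import rankdata

# The example values from the text.
data = [50, 80, 85, 200, 220]
ranks = rankdata(data)
print(ranks)  # [1. 2. 3. 4. 5.]

# A hypothetical extreme version: 220 becomes 22,000,
# but its rank is still just 5. The ranks are identical.
extreme = [50, 80, 85, 200, 22000]
print(rankdata(extreme))  # [1. 2. 3. 4. 5.]
```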
Why this matters: after ranking, the extreme value (220) no longer dominates the analysis. It is simply rank 5, one step above rank 4.
What are you testing?
Non-parametric tests ask: "Do the distributions differ?" rather than "Do the means differ?"
Let's generate some data and run BOTH tests to see how they compare!
When assumptions are met, both tests should agree.
When data violate assumptions, non-parametric tests often perform better.
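One way to set up that side-by-side comparison (a sketch; the seed, sample sizes, and distributions are arbitrary choices, not prescribed by the exercise):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Scenario 1: normal data with a modest mean shift.
a_norm = rng.normal(loc=50, scale=10, size=30)
b_norm = rng.normal(loc=56, scale=10, size=30)

# Scenario 2: right-skewed (lognormal) data with a similar shift.
a_skew = rng.lognormal(mean=3.0, sigma=0.8, size=30)
b_skew = rng.lognormal(mean=3.4, sigma=0.8, size=30)

for label, a, b in [("normal", a_norm, b_norm), ("skewed", a_skew, b_skew)]:
    t_stat, t_p = stats.ttest_ind(a, b)
    u_stat, u_p = stats.mannwhitneyu(a, b, alternative="two-sided")
    print(f"{label:7s}: t-test p = {t_p:.4f}, Mann-Whitney p = {u_p:.4f}")
```

With normal data the two p-values tend to be close; with skewed data they can diverge, which is exactly what the questions below ask you to examine.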
Question 1: Compare the results from Scenario 1 (normal data). Did both tests reach the same conclusion? Were the p-values similar?
Question 2: Now look at Scenario 2 (skewed data). How did the results differ? Which test seems more appropriate for these data?
Question 3: Based on these demonstrations, when would you choose Mann-Whitney over a t-test?
1. Data are clearly non-normal AND small sample
n < 30 and severe skewness or outliers → Use non-parametric
2. Data are ordinal
Likert scales (1-5 ratings), rankings → Non-parametric is appropriate
3. Transformation doesn't work or is too complex
Tried log, sqrt, etc., but still not normal → Use non-parametric
4. You want robustness
Worried about outliers influencing results → Non-parametric is more robust
5. Your research question is about distributions, not just means
"Do groups differ?" is broader than "Do means differ?" → Non-parametric
1. You have large samples (n > 50-100) with mild violations
Parametric tests are robust; transformation may be easier
2. Your field expects parametric tests
Consider using parametric + reporting assumption checks + sensitivity analysis
3. You need specific comparisons (post-hocs)
Parametric post-hoc tests are more developed than non-parametric equivalents
4. Interpretation is important
Means are easier to interpret than "sum of ranks"
Step 1: Check normality
Step 2: Decide based on results + sample size
Step 3: Report clearly
Always report which test you used and why!
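The three steps can be wrapped in a small helper function (a sketch; the n < 30 cutoff and the α = .05 threshold follow the rules of thumb above, and the example data are made up):

```python
from scipy import stats

def choose_test(group_a, group_b, alpha=0.05, small_n=30):
    """Step 1: check normality. Step 2: decide. Step 3: return a reportable label."""
    # Step 1: Shapiro-Wilk on each group.
    _, p_a = stats.shapiro(group_a)
    _, p_b = stats.shapiro(group_b)
    non_normal = (p_a < alpha) or (p_b < alpha)
    small = min(len(group_a), len(group_b)) < small_n

    # Step 2: non-normal AND small sample -> non-parametric.
    if non_normal and small:
        stat, p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
        return ("Mann-Whitney U (non-normal, small n)", stat, p)
    stat, p = stats.ttest_ind(group_a, group_b)
    return ("Independent t-test", stat, p)

# Step 3: the returned label documents which test was used and why.
name, stat, p = choose_test([50, 80, 85, 200, 220, 61, 73],
                            [40, 45, 52, 48, 55, 300, 62])
print(name, round(stat, 2), round(p, 4))
```

In a real analysis you would also inspect Q-Q plots rather than rely on a single automated rule, as Module 3's large-sample paradox warns.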
Scenario A: You're comparing pain ratings (1-10 scale) between treatment and control groups (n=25 each). Data are skewed right. What do you do?
Scenario B: You have reaction time data (n=80 per group). Right-skewed. Shapiro-Wilk p = 0.001. Q-Q plot shows moderate deviation. What do you do?
Scenario C: Survey with satisfaction ratings (Very Dissatisfied to Very Satisfied, n=200). Compare 3 departments. What test?
Scenario D: Income data from 40 people, comparing two cities. Heavily right-skewed with outliers (a few millionaires). What do you do?
Bad example:
"We used a Mann-Whitney test. p = 0.03."
Good example:
"Data were not normally distributed (Shapiro-Wilk: W = 0.89, p = 0.003) and transformation did not improve normality. We therefore used the Mann-Whitney U test to compare groups. The treatment group (Mdn = 85) scored significantly higher than the control group (Mdn = 72), U = 245, p = .031, r = .34."
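scipy does not return the effect size r directly. A common approach (sketched here with invented data) is to convert U to a z-score via the normal approximation and take r = |z| / √N:

```python
import numpy as np
from scipy import stats

# Hypothetical scores: every treatment value exceeds every control value.
treatment = np.array([85, 90, 78, 92, 88, 95, 81, 84, 89, 91])
control   = np.array([72, 70, 75, 68, 74, 71, 77, 69, 73, 76])

u_stat, p = stats.mannwhitneyu(treatment, control, alternative="two-sided")
n1, n2 = len(treatment), len(control)

# Normal approximation of U: mean and SD under the null, then z and r.
mu_u = n1 * n2 / 2
sigma_u = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (u_stat - mu_u) / sigma_u
r = abs(z) / np.sqrt(n1 + n2)

print(f"Mdn(treatment) = {np.median(treatment)}, Mdn(control) = {np.median(control)}")
print(f"U = {u_stat}, p = {p:.4f}, r = {r:.2f}")
```

This gives you every number the good example reports: both medians, U, p, and r.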
Key elements to include:
✅ Non-parametric tests are not "inferior"
They're just different! ~95% power when assumptions hold, MORE power when violated.
✅ They test distributions, not means
The question changes from "Are means different?" to "Are distributions different?"
✅ Use ranks, not raw values
This makes them robust to outliers and skewness.
✅ Main alternatives:
✅ When to use them:
✅ Always report why you chose non-parametric
Document your assumption checking and decision-making process!
You've now completed the entire normality testing curriculum!
Module 1: Why normality matters (Type I error inflation)
Module 2: Visual detection (histograms, Q-Q plots)
Module 3: Statistical tests (Shapiro-Wilk, large sample paradox)
Module 4: Transformations (log, sqrt, when to transform)
Module 5: Non-parametric alternatives (Mann-Whitney, Kruskal-Wallis)
You now have a complete toolkit for handling normality in your research!
Submission filename: module5_lastname1_lastname2.pdf

🎉 You've completed all 5 modules! 🎉
You're now equipped to handle normality testing like a pro!
Next: Apply these skills to your own research data!