Module 1: Why Does Normality Matter?

Discovery Lab - Statistical Assumptions

☁️ Working Guidelines

⏱️ Estimated time: 45-50 minutes
👥 Work with your partner—discuss each question!
💾 Your answers are saved automatically in your browser
📄 When finished, use Print (Ctrl+P/Cmd+P) and "Save as PDF" to submit


🎯 Learning Objectives

By the end of this module, you will:

Understand why checking normality assumptions matters for statistical inference
Discover how violations affect Type I error rates
Recognize the consequences of ignoring assumptions
Apply this knowledge to real research scenarios

🔍 The Puzzle

Here's a mystery:

Two researchers analyze the SAME research question using α = .05 (meaning they accept a 5% chance of a false positive when there is no real effect).

Researcher A gets false positives 5% of the time ✓
Researcher B gets false positives 12% of the time ⚠️

Both used t-tests. Both set α = .05. Why are the results so different?

Your job: Investigate what's happening and discover WHY assumptions matter.

Part 1: Run the Simulation

Let's see this problem in action. Click the buttons below to run the simulations.

Researcher A: Normal Data

Researcher B: Skewed Data
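If you'd like to reproduce the two simulations yourself, here is a minimal sketch. The module does not state its simulation settings, so everything specific here is an assumption: a one-sample t-test, 10 observations per study, 20,000 simulated studies, and both distributions set to a true mean of 1. The names `false_positive_rate`, `N_PER_STUDY`, and `T_CRIT_DF9` are made up for this example. In every simulated study the null hypothesis is TRUE, so each rejection is a false positive.

```python
import numpy as np

N_PER_STUDY = 10      # assumed sample size per simulated study
# Two-sided critical value of Student's t for df = 9 at alpha = .05,
# taken from a standard t-table. It is only valid for N_PER_STUDY = 10.
T_CRIT_DF9 = 2.262


def false_positive_rate(draw, true_mean, n_sims=20_000, seed=0):
    """Share of one-sample t-tests that reject a TRUE null hypothesis."""
    rng = np.random.default_rng(seed)
    samples = draw(rng, (n_sims, N_PER_STUDY))   # one row per simulated study
    means = samples.mean(axis=1)
    sds = samples.std(axis=1, ddof=1)            # sample standard deviation
    t = (means - true_mean) / (sds / np.sqrt(N_PER_STUDY))
    return float(np.mean(np.abs(t) > T_CRIT_DF9))


# Researcher A: normal data with mean 1 and SD 1 (hypothetical settings).
rate_normal = false_positive_rate(
    lambda rng, size: rng.normal(1.0, 1.0, size), true_mean=1.0
)

# Researcher B: exponential (right-skewed) data, also with mean 1.
rate_skewed = false_positive_rate(
    lambda rng, size: rng.exponential(1.0, size), true_mean=1.0
)

print(f"Researcher A (normal data): {rate_normal:.1%}")
print(f"Researcher B (skewed data): {rate_skewed:.1%}")
```

Your numbers will not match the module's simulations exactly, because the sample size and the type of t-test used there may differ from the assumptions above.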

📝 Observe & Record

Question 1: Record the false positive rates you observed:

Researcher A (normal data): %
Researcher B (skewed data): %

Question 2: DISCUSS WITH PARTNER: Why do you think these rates are different? What's the key difference between the two simulations?

Our initial hypothesis:

🤔 Stuck? Here's a hint

Think about:

What type of distribution did Researcher A use? (normal)
What type did Researcher B use? (exponential/skewed)
Which assumption of the t-test might be violated?

Part 2: Investigate the Data

Let's visualize what the data from each researcher actually look like.

📝 Analyze the Visualizations

Question 3: Describe the SHAPE of each histogram:

Researcher A (blue histogram):

Researcher B (red histogram):

Question 4: Look at the Q-Q plots (points should fall along the diagonal line if the data are normal). How closely do the points follow the line in each plot?

Researcher A's Q-Q plot:

Researcher B's Q-Q plot:

Question 5: How do the mean and median compare when data are symmetric? What about when data are skewed?
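To check your answers to Questions 4 and 5 with numbers rather than by eye, here is a rough sketch. It computes the mean, the median, and a "Q-Q correlation": the correlation between the sorted data and the normal quantiles they would be plotted against, which is close to 1 when the points hug the diagonal. The sample size of 500, the distribution settings, and the name `shape_summary` are assumptions made for illustration; they are not taken from the module's own plots.

```python
from statistics import NormalDist

import numpy as np


def shape_summary(sample):
    """Return the mean, median, and Q-Q correlation against a normal curve."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    # Plotting positions (i - 0.5) / n, mapped to standard normal quantiles.
    probs = (np.arange(1, n + 1) - 0.5) / n
    theoretical = np.array([NormalDist().inv_cdf(p) for p in probs])
    # Points that fall on a straight line give a correlation close to 1.
    qq_corr = float(np.corrcoef(theoretical, x)[0, 1])
    return {
        "mean": float(x.mean()),
        "median": float(np.median(x)),
        "qq_corr": qq_corr,
    }


rng = np.random.default_rng(1)
normal_sample = rng.normal(1.0, 1.0, 500)    # stand-in for Researcher A
skewed_sample = rng.exponential(1.0, 500)    # stand-in for Researcher B

summary_a = shape_summary(normal_sample)
summary_b = shape_summary(skewed_sample)
print("Researcher A:", summary_a)
print("Researcher B:", summary_b)
```

Compare the gap between mean and median, and the Q-Q correlation, across the two samples before you write your answers.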

Part 3: Test Your Understanding

Question 6: PREDICTION TIME!

Imagine you're about to analyze reaction time data. You create histograms and Q-Q plots, and the data look like Researcher B's data (right-skewed).

If you proceed with a t-test anyway, what do you predict will happen?

The false positive rate will be about 5% (as expected)
The false positive rate will be HIGHER than 5%
The false positive rate will be LOWER than 5%

Explain your reasoning (2-3 sentences):

Question 7: Based on everything you've discovered so far, why might Researcher B's inflated false positive rate be a problem in real research?

Part 4: The Power Demonstration

Let's see another consequence: violations can also reduce statistical power (the ability to detect real effects).

💡 What does this mean?

When assumptions are violated:

❌ More false positives (the Type I error rate can rise above α)
❌ Less power to detect real effects (the Type II error rate can rise)
❌ Your research becomes less trustworthy

This is why we check assumptions!

Part 5: Real-World Application

Question 8: You're reviewing a manuscript for a journal. The Methods section says:

"We compared anxiety scores between the treatment and control groups using an independent samples t-test (α = .05). The treatment group showed significantly lower anxiety (p = .042)."

The authors did NOT report checking normality assumptions.

Part A: What would you want to see in the paper to evaluate whether their analysis was appropriate?

Part B: Why does it matter whether they checked assumptions, given that they found p = .042 (only just below α = .05)?

Part C: If you were the reviewer, what would you recommend the authors do?

Part 6: Synthesis & Reflection

Question 9: THE BIG IDEA

Complete this sentence with your partner (3-4 sentences total):

"Checking for normality before running parametric tests matters because..."

Question 10: METACOGNITIVE REFLECTION

Before this module, did you know that violating assumptions could inflate false positive rates? How does this change the way you'll approach your own data analysis in the future?

🎯 Key Discoveries

What We Discovered Today:

✓ Discovery 1: When data violate normality assumptions, the actual Type I error rate can be HIGHER than the α level we set (e.g., 12% instead of 5%)
✓ Discovery 2: Violations can also reduce statistical power, making it harder to detect real effects when they exist
✓ Discovery 3: We can SEE these violations using histograms and Q-Q plots before running our tests
✓ Discovery 4: Ignoring assumptions threatens the reliability and reproducibility of research findings

📋 Before You Submit

✅ Submission Checklist

Both partner names are filled in at the top
All 10 questions have written responses
All simulations have been run
You've discussed answers with your partner
Ready to save as PDF!

📤 How to Submit

Click "Save Progress" to ensure everything is stored
Use your browser's Print function: Ctrl+P (Windows) or Cmd+P (Mac)
Choose "Save as PDF" as the destination
Save as: module1_lastname1_lastname2.pdf
Upload the PDF to your course management system

🎉 You've completed Module 1! 🎉

Great work discovering why normality matters.