βš™οΈ Module 4: Assumptions & Advanced Topics

When Chi-Square Works (and When It Doesn't)

πŸ“š Learning Objectives

By the end of this module, you will be able to:

  • Check the assumptions of chi-square tests (categorical data, independence, expected frequencies)
  • Apply the expected frequency ≥ 5 rule and diagnose violations
  • Choose between chi-square, Fisher's exact, and McNemar's tests
  • Explain Yates' continuity correction for 2×2 tables
  • Follow up a significant result on a larger table with corrected post-hoc comparisons

βœ… Chi-Square Assumptions

Chi-square tests make several assumptions. Violating these can lead to incorrect p-values!

Assumption 1: Categorical Variables

βœ“ Required: Variables must be categorical (nominal or ordinal)

βœ— Wrong: Don't use chi-square on continuous data

Example error: "Age (years)" should be categorized into age groups first

Assumption 2: Independent Observations

βœ“ Required: Each observation can only be counted once

βœ— Wrong: Same subject measured multiple times, paired data

Example error: Testing 50 people at Time 1 and Time 2 β†’ 100 observations but only 50 independent subjects (use McNemar's test instead!)

Assumption 3: Expected Frequency Rule

βœ“ Required: Expected frequencies β‰₯ 5 in ALL cells

This is the most commonly violated assumption and the main focus of this module!

Assumption 4: Random Sampling

βœ“ Required: Data should come from random or representative sampling

Less about the statistical test and more about study design

πŸ”’ The Expected Frequency β‰₯ 5 Rule

Critical Rule: ALL expected frequencies must be β‰₯ 5

⚠️ Why This Matters

Chi-square distribution is an approximation that works well when expected frequencies are large enough. When expected frequencies are too small (< 5), the approximation breaks down and p-values become unreliable.

Result of violation: Type I error rate increases (false positives!)

How to Check:

# After running the chi-square test
result <- chisq.test(data)

# Check expected frequencies
result$expected   # Look for any values < 5

Example: Checking Expected Frequencies

              Improved   No Change
Treatment A        8.5        11.5
Treatment B        3.2         6.8

Problem: Treatment B / Improved cell has expected frequency of 3.2 < 5

Solution needed: Use Fisher's exact test instead!
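Where do these expected values come from? Each one is (row total × column total) ÷ grand total. Here is a minimal sketch of that check, written in Python for illustration, with hypothetical observed counts:

```python
# Each expected frequency is (row total * column total) / grand total.
# Hypothetical observed counts (rows: Treatment A/B, cols: Improved/No Change).
observed = [[8, 3],
            [12, 7]]

n = sum(sum(row) for row in observed)              # grand total
row_totals = [sum(row) for row in observed]        # [11, 19]
col_totals = [sum(col) for col in zip(*observed)]  # [20, 10]

expected = [[r * c / n for c in col_totals] for r in row_totals]
for row in expected:
    print([round(e, 2) for e in row])

# The same check R performs when you inspect result$expected:
too_small = min(min(row) for row in expected) < 5
print("Any expected frequency < 5?", too_small)
```

Note that the expected counts only depend on the margins, which is why a table can have healthy observed counts but still fail the rule.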

🚨 Common Misconception

WRONG: "All OBSERVED frequencies must be β‰₯ 5"

CORRECT: "All EXPECTED frequencies must be β‰₯ 5"

It's okay to have observed frequencies of 0, 1, 2, etc. The rule is about EXPECTED frequencies!

🎯 Fisher's Exact Test: The Solution for Small Samples

When expected frequencies are < 5, use Fisher's exact test instead of chi-square.

Chi-Square Test

  • Uses approximation
  • Requires expected freq β‰₯ 5
  • Works with any table size
  • Faster computation

Fisher's Exact Test

  • Calculates exact probability
  • No frequency requirements
  • Best for 2Γ—2 tables
  • Slower for large tables
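Under the hood, Fisher's test enumerates every table that keeps the observed margins fixed and sums hypergeometric probabilities. A self-contained Python sketch with hypothetical counts (R's fisher.test uses the same "no more probable than observed" two-sided rule for 2×2 tables):

```python
import math

# Fisher's exact test rests on the hypergeometric distribution: with the
# row and column totals fixed, table [[a, b], [c, d]] has probability
#   P = C(a+b, a) * C(c+d, c) / C(n, a+c)
# The two-sided p-value sums P over every table whose probability is no
# larger than the observed table's.

def fisher_2x2(a, b, c, d):
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(a_):
        # probability of the table whose top-left cell is a_
        return math.comb(r1, a_) * math.comb(r2, c1 - a_) / math.comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(a_) for a_ in range(lo, hi + 1) if prob(a_) <= p_obs + 1e-12)

# Hypothetical small table (the classic "lady tasting tea" layout)
print(round(fisher_2x2(3, 1, 1, 3), 4))   # -> 0.4857
```

Because every admissible table is enumerated, the cost grows quickly with table size, which is why R falls back to simulation for large tables.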

Running Fisher's Exact Test in R:

# Same syntax as chi-square!
data <- matrix(c(8, 3, 12, 7), nrow = 2, byrow = TRUE)
rownames(data) <- c("Treatment A", "Treatment B")
colnames(data) <- c("Improved", "No Change")

# Run Fisher's exact test
fisher.test(data)

# For tables larger than 2x2, R may need simulation:
fisher.test(data, simulate.p.value = TRUE)

Example Output

        Fisher's Exact Test for Count Data

data:  data
p-value = 0.3561
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
  0.3664 11.8439
sample estimates:
odds ratio
    1.5556

Interpretation:

  • p-value = 0.356: No significant association between treatment and outcome
  • Odds ratio = 1.56: Treatment A has 1.56Γ— the odds of improvement compared to Treatment B (but not significant)
  • 95% CI includes 1.0: Confirms non-significance
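The odds ratio itself is easy to check by hand from the 2×2 counts. A quick sketch of the cross-product version in Python (fisher.test in R actually reports a conditional maximum-likelihood estimate, which can differ slightly from this simple calculation):

```python
# Cross-product (sample) odds ratio for a 2x2 table [[a, b], [c, d]]
a, b = 8, 3    # Treatment A: Improved, No Change
c, d = 12, 7   # Treatment B: Improved, No Change

odds_A = a / b                 # odds of improvement under Treatment A
odds_B = c / d                 # odds of improvement under Treatment B
odds_ratio = odds_A / odds_B

print(round(odds_ratio, 4))    # -> 1.5556
```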

🌳 Decision Tree: Which Test Should I Use?

Is your DV categorical?
↓ YES
Are observations independent? (each subject counted once)
↓ NO → McNemar's test for paired data
↓ YES
One or two categorical variables?
↓ ONE variable → Goodness-of-Fit
    Check expected freq ≥ 5, then use chisq.test()
↓ TWO variables → Test of Independence
    Check expected frequencies:
    ↓ All ≥ 5 → Use chi-square: chisq.test()
    ↓ Any < 5 → Use Fisher's exact: fisher.test()
    Then: is it a 2×2 table?
    ↓ YES → Yates' correction applied automatically (more conservative)
    ↓ NO (larger) → Regular chi-square, no correction needed

πŸ“ Yates' Continuity Correction

For 2Γ—2 tables only, R applies Yates' continuity correction by default.

What is Yates' Correction?

Chi-square is a continuous distribution used to approximate a discrete distribution (your frequency counts). For 2Γ—2 tables with df=1, this approximation can be poor.

Yates' correction: Adjusts the chi-square formula to be more conservative (reduces Type I error).

Effect: Slightly larger p-values (harder to reject Hβ‚€)

# With Yates' correction (default for 2Γ—2)
chisq.test(data)   # Correction applied automatically

# Without Yates' correction
chisq.test(data, correct = FALSE)

# Compare the results

Example: Impact of Yates' Correction

Test Version             χ²     p-value
With Yates' correction   3.42   0.064
Without correction       4.57   0.032

Impact: Without correction β†’ significant; With correction β†’ non-significant

Recommendation: Use the default (with correction) for 2Γ—2 tables, especially with smaller samples (n < 100)
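The correction replaces (O βˆ’ E)Β² with (|O βˆ’ E| βˆ’ 0.5)Β² in each cell. A minimal sketch in Python with hypothetical counts shows why the corrected statistic is always smaller, and therefore more conservative:

```python
# Yates' continuity correction on a 2x2 table:
# uncorrected:  chi2 = sum((O - E)^2 / E)
# corrected:    chi2 = sum((|O - E| - 0.5)^2 / E)  (|O - E| - 0.5 floored at 0)
observed = [[15, 5],
            [8, 12]]   # hypothetical 2x2 counts

n = sum(map(sum, observed))
rows = [sum(r) for r in observed]
cols = [sum(c) for c in zip(*observed)]

chi2_plain = 0.0
chi2_yates = 0.0
for i in range(2):
    for j in range(2):
        e = rows[i] * cols[j] / n      # expected frequency for this cell
        o = observed[i][j]
        chi2_plain += (o - e) ** 2 / e
        chi2_yates += max(abs(o - e) - 0.5, 0) ** 2 / e

print(round(chi2_plain, 2), round(chi2_yates, 2))   # -> 5.01 3.68
```

Shrinking every |O βˆ’ E| by 0.5 can only reduce the statistic, so the corrected p-value is larger, mirroring the pattern in the table above.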

πŸ”§ Strategies When Expected Frequencies Are Too Small

If you have expected frequencies < 5, you have several options:

1. Use Fisher's Exact Test

Best option: Always valid, no assumptions about frequencies

Limitation: Computationally intensive for large tables

fisher.test(data)

2. Combine Categories

When appropriate: Categories are theoretically similar

Example: Combine "Slightly Improved" and "Greatly Improved" into "Improved"

# Original: 3 outcome categories
data_3cat <- matrix(c(5, 3, 2, 8, 7, 5), nrow = 2, byrow = TRUE)
colnames(data_3cat) <- c("Worse", "Same", "Better")

# Combine "Worse" and "Same" into "Not Better"
data_2cat <- matrix(c(8, 2, 15, 5), nrow = 2, byrow = TRUE)
colnames(data_2cat) <- c("Not Better", "Better")

# Now expected frequencies may be large enough
chisq.test(data_2cat)

⚠️ Warning

Don't combine categories JUST to get significance! Only combine when theoretically justified.

3. Collect More Data

The most straightforward solution: increase sample size

When to use: Study is ongoing, pilot data suggests small cells

4. Remove Rare Categories

When appropriate: Category has very few observations and isn't central to research question

Example: If studying 3 common species and 1 rare species with only 2 observations, might exclude the rare species

Report clearly: "One species with n=2 was excluded from analysis due to small sample size"

🎯 Practice Problem 1: Troubleshooting Small Frequencies

Scenario: Testing if habitat choice differs between two bird species

            Forest   Grassland   Wetland
Species A       18          12         2
Species B       15          10         3
# Create data
habitat_data <- matrix(c(18, 12, 2, 15, 10, 3), nrow = 2, byrow = TRUE)
rownames(habitat_data) <- c("Species A", "Species B")
colnames(habitat_data) <- c("Forest", "Grassland", "Wetland")

# Try chi-square
result <- chisq.test(habitat_data)
result

# Check expected frequencies
result$expected   # Look for values < 5

πŸ” Post-Hoc Tests for Larger Tables

When chi-square is significant with tables larger than 2Γ—2, you know variables are related, but WHERE is the association?

Strategy 1: Examine Standardized Residuals

We covered standardized residuals in Module 3; this is your first step!

# Cells with |residual| > 2 or 3 drive the effect
result$stdres

Strategy 2: Conduct Follow-Up Chi-Square Tests

Break your large table into smaller 2Γ—2 comparisons

Example: 3Γ—2 Table (Three Treatments Γ— Success/Failure)

Overall test: χ²(2) = 12.5, p = .002 (significant)

Question: Which treatments differ from each other?

# Original 3Γ—2 table
full_data <- matrix(c(45, 15, 30, 30, 20, 40), nrow = 3, byrow = TRUE)
rownames(full_data) <- c("Treatment A", "Treatment B", "Treatment C")
colnames(full_data) <- c("Success", "Failure")

# Compare Treatment A vs B
AB <- full_data[1:2, ]
chisq.test(AB)

# Compare Treatment A vs C
AC <- full_data[c(1, 3), ]
chisq.test(AC)

# Compare Treatment B vs C
BC <- full_data[2:3, ]
chisq.test(BC)

# IMPORTANT: With 3 comparisons, consider Bonferroni correction
# Adjusted alpha = 0.05 / 3 ā‰ˆ 0.017

⚠️ Multiple Comparisons Problem

Each additional test increases chance of Type I error (false positive)

Bonferroni correction: Divide alpha by number of comparisons

  • 3 comparisons: α = 0.05/3 ā‰ˆ 0.017
  • 6 comparisons: α = 0.05/6 ā‰ˆ 0.008

Only call results significant if p < adjusted alpha
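Equivalently, you can multiply each p-value by the number of comparisons (capping at 1) and compare against the original alpha. A small sketch with hypothetical pairwise p-values:

```python
# Bonferroni: either compare each p-value to alpha/m, or equivalently
# multiply each p-value by m (capped at 1) and compare to alpha.
alpha = 0.05
p_values = [0.012, 0.030, 0.41]   # hypothetical pairwise chi-square p-values
m = len(p_values)

adjusted_alpha = alpha / m                       # 0.05 / 3
adjusted_p = [min(p * m, 1.0) for p in p_values]

for p, p_adj in zip(p_values, adjusted_p):
    verdict = "significant" if p < adjusted_alpha else "not significant"
    print(f"p = {p:.3f} -> adjusted p = {p_adj:.3f} ({verdict})")
```

Note that p = 0.030 would pass an uncorrected 0.05 threshold but fails after correction, which is exactly the inflation Bonferroni guards against. (In R, p.adjust(p, method = "bonferroni") does the same multiplication.)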

Strategy 3: Focus on Planned Comparisons

Better than testing all possible pairs: decide BEFORE seeing data which comparisons matter

Example: Testing 4 treatments including a control

Planned comparisons:

  • Treatment A vs Control
  • Treatment B vs Control
  • Treatment C vs Control

Skip the treatment-to-treatment comparisons unless theoretically important

πŸ› Common Errors & Troubleshooting

Error 1: "Chi-squared approximation may be incorrect"

Cause: Expected frequencies < 5

Solution: Use Fisher's exact test or combine categories

Error 2: "x must be non-negative and finite"

Cause: Negative values or missing data in your table

Solution: Check for data entry errors, remove NAs

# Remove missing values
clean_data <- na.omit(your_data)

# Check for negatives
summary(your_data)

Error 3: "arguments imply differing number of rows"

Cause: Trying to create table with unequal vector lengths

Solution: Verify all rows have same number of columns

Warning: "'simulate.p.value' was set but ignored"

Cause: For fisher.test() on a 2Γ—2 table, the exact p-value is computed directly, so simulate.p.value is ignored; the argument only takes effect for larger tables

Solution: Drop simulate.p.value for 2Γ—2 tables; use it (in fisher.test() or chisq.test()) only when the table is larger than 2Γ—2

πŸš€ Advanced Topics

McNemar's Test for Paired Data

When observations are NOT independent (same subjects measured twice)

Example: Before/After Treatment

50 patients tested before and after treatment (Pass/Fail)

               After: Pass   After: Fail
Before: Pass            20             5
Before: Fail            18             7
# McNemar's test for paired data
paired_data <- matrix(c(20, 5, 18, 7), nrow = 2, byrow = TRUE)
rownames(paired_data) <- c("Before: Pass", "Before: Fail")
colnames(paired_data) <- c("After: Pass", "After: Fail")
mcnemar.test(paired_data)

# DO NOT use regular chi-square for paired data!
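For a hand check, McNemar's statistic depends only on the two discordant cells. A short Python sketch using the counts from the table above:

```python
# McNemar's statistic uses only the discordant cells:
# b = Pass -> Fail, c = Fail -> Pass.  With continuity correction:
#   chi2 = (|b - c| - 1)^2 / (b + c),  df = 1
b, c = 5, 18   # discordant counts from the before/after table

chi2 = (abs(b - c) - 1) ** 2 / (b + c)
print(round(chi2, 3))   # -> 6.261
```

This matches what mcnemar.test() computes in R by default (correct = TRUE); the concordant cells (20 and 7) never enter the statistic, because subjects who didn't change tell us nothing about the direction of change.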

Cochran-Mantel-Haenszel Test

Testing association while controlling for a third variable (stratified analysis)

# Example: Treatment Γ— Outcome, controlling for Sex
library(stats)   # loaded by default in R
mantelhaen.test(array_data)   # 3D array: rows Γ— cols Γ— strata

Trend Tests (Cochran-Armitage)

Testing for linear trend when one variable is ordinal

Example: Does disease prevalence increase with age category?

Age categories: Young β†’ Middle β†’ Old (natural ordering)
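The trend statistic weights each group's deviation from the overall proportion by its ordinal score. A minimal pure-Python sketch with hypothetical counts (in R, prop.trend.test() offers a ready-made χ² version of the same test):

```python
import math

# Cochran-Armitage trend test (sketch): does the proportion of "cases"
# increase linearly across ordered groups?
scores = [0, 1, 2]          # Young, Middle, Old (ordinal scores)
cases  = [10, 15, 20]       # hypothetical disease counts per age group
totals = [30, 30, 30]       # group sizes

N = sum(totals)
p_bar = sum(cases) / N      # overall prevalence

# Trend statistic and its variance under H0 (no trend)
T = sum(s * (x - n * p_bar) for s, x, n in zip(scores, cases, totals))
var_T = p_bar * (1 - p_bar) * (
    sum(n * s**2 for s, n in zip(scores, totals))
    - sum(n * s for s, n in zip(scores, totals)) ** 2 / N
)

z = T / math.sqrt(var_T)    # compare to a standard normal
print(round(z, 3))          # -> 2.582
```

A positive z here says prevalence rises with the age scores; an ordinary test of independence would ignore the ordering entirely and waste that information.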

🎯 Practice Problem 2: Complete Analysis Decision

For each scenario, decide which test to use and explain why.

Scenario A: 100 patients, 2 treatment groups, 3 outcome categories. All expected frequencies between 8-15.

Scenario B: 30 animals, 2 species, 2 habitat choices. Expected frequencies: 7.5, 7.5, 7.5, 7.5

Scenario C: 25 animals, 2 species, 4 habitat choices. Expected frequencies range from 2.1 to 5.8

🎯 Complete Analysis Workflow

1. Understand Your Research Question

  • One or two categorical variables?
  • What are you testing? (equal distribution? association?)

2. Check Assumptions

  • βœ“ Categorical variables
  • βœ“ Independent observations
  • βœ“ Random/representative sample

3. Create Table & Run Preliminary Test

result <- chisq.test(data)
result$expected   # Check this first!

4. Verify Expected Frequencies

  • All β‰₯ 5? β†’ Proceed with chi-square
  • Any < 5? β†’ Use Fisher's exact test OR combine categories

5. Interpret Results

  • Look at χ², df, p-value
  • Calculate effect size (CramΓ©r's V for test of independence)
  • Examine standardized residuals for patterns

6. Visualize

  • Bar plots for goodness-of-fit
  • Grouped bar plots or mosaic plots for independence tests

7. Report

  • Test name and purpose
  • Test statistic, df, p-value
  • Effect size
  • Pattern description with frequencies/percentages

🎯 Comprehensive Final Problem

Scenario: Testing if stress level affects immune response in 120 participants

              Strong Response   Moderate Response   Weak Response   Total
Low Stress                 28                  18               4      50
High Stress                12                  25              33      70
Total                      40                  43              37     120

Complete these tasks:

  1. Run chi-square test
  2. Check expected frequencies
  3. Calculate CramΓ©r's V
  4. Examine standardized residuals
  5. Create visualization
  6. Write complete results in APA style
# Your complete analysis here
stress_data <- matrix(c(28, 18, 4, 12, 25, 33), nrow = 2, byrow = TRUE)
rownames(stress_data) <- c("Low Stress", "High Stress")
colnames(stress_data) <- c("Strong", "Moderate", "Weak")

# Step 1: Run test
result <- chisq.test(stress_data)
result

# Step 2: Check expected frequencies
result$expected

# Step 3: Calculate CramΓ©r's V
chi_sq <- result$statistic
n <- sum(stress_data)
k <- min(nrow(stress_data), ncol(stress_data))
V <- sqrt(chi_sq / (n * (k - 1)))
V

# Step 4: Examine residuals
result$stdres

# Step 5: Visualize
barplot(stress_data, beside = TRUE,
        col = c("#c8e6c9", "#ffcdd2"),
        legend = rownames(stress_data),
        xlab = "Immune Response", ylab = "Frequency",
        main = "Immune Response by Stress Level")

# Alternative: proportions
prop_data <- prop.table(stress_data, margin = 1)
barplot(prop_data, beside = TRUE,
        col = c("#c8e6c9", "#ffcdd2"),
        legend = rownames(stress_data),
        xlab = "Immune Response", ylab = "Proportion",
        main = "Immune Response by Stress Level (Proportions)")
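As a cross-check on Steps 2-3, the expected frequencies and Cramér's V can be computed from first principles. A Python sketch for the same stress table:

```python
import math

# Hand-check for the stress example:
# expected counts E = row total * column total / n, then
# Cramer's V = sqrt(chi2 / (n * (k - 1))), where k = min(rows, cols).
observed = [[28, 18, 4],
            [12, 25, 33]]

n = sum(map(sum, observed))                    # 120 participants
row_totals = [sum(r) for r in observed]        # [50, 70]
col_totals = [sum(c) for c in zip(*observed)]  # [40, 43, 37]

chi2 = 0.0
for i, r in enumerate(row_totals):
    for j, c in enumerate(col_totals):
        e = r * c / n
        chi2 += (observed[i][j] - e) ** 2 / e

k = min(len(observed), len(observed[0]))
V = math.sqrt(chi2 / (n * (k - 1)))
print(round(chi2, 2), round(V, 2))   # -> 27.71 0.48
```

A V near 0.5 is a large effect for a 2Γ—k table, so here the statistical significance should come with a practically meaningful pattern in the residuals.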

πŸ€” Final Check Your Understanding

Question 1: You have a 2Γ—3 table with n=40. One expected frequency is 4.2. What should you do?

A) Proceed with chi-square; 4.2 is close enough to 5
B) Use Fisher's exact test instead
C) Increase alpha to 0.10 to compensate

Question 2: For a 2Γ—2 table, should you turn off Yates' correction?

A) Yes, always turn it off for more power
B) No, keep it on (default) for more conservative test
C) Only turn it off if sample size > 1000

Question 3: You test 50 patients before and after treatment (same patients). Which test?

A) Chi-square test of independence
B) McNemar's test for paired data
C) Fisher's exact test

πŸ“ Module 4 Summary

Key Takeaways:

  • Chi-square assumes categorical data, independent observations, and expected frequencies ≥ 5 in every cell
  • The frequency rule is about EXPECTED counts, not observed counts
  • When any expected frequency is < 5, switch to Fisher's exact test (or combine categories when theoretically justified)
  • Keep Yates' correction (R's default) for 2×2 tables
  • Use McNemar's test for paired data and Bonferroni-corrected follow-ups for larger tables

πŸŽ‰ Congratulations! You've completed all four Chi-Square modules!
You now have the skills to analyze categorical data correctly and confidently.

πŸ“‹ Quick Reference Card

Situation                          Test to Use              R Code
One variable, test distribution    Goodness-of-fit          chisq.test(obs, p = ...)
Two variables, all expected β‰₯ 5    Test of independence     chisq.test(data)
Two variables, any expected < 5    Fisher's exact           fisher.test(data)
Paired/repeated measures           McNemar's test           mcnemar.test(data)
Check effect size                  CramΓ©r's V               sqrt(χ² / (n * (k - 1)))
Find pattern                       Standardized residuals   result$stdres

Always remember:

  1. Check assumptions FIRST (especially expected frequencies)
  2. Report effect size, not just p-values
  3. Visualize your data
  4. Interpret patterns, don't just report statistics
  5. Consider practical significance alongside statistical significance