Chi Square Test

The chisquare independence test is a procedure for testing if two categorical variables are related in some population.

Here is handwritten example : https://github.com/arcarchit/datastories/blob/master/notes/chi2.pdf

Chi square distribution

  • Chi square distribution
    • Squaring samples from standard normal distribution [0]
    • Distribution changes with degrees of freedom
    • When DoF = 1 it is more concentrated around 0
  • It is distribution is sum of squares
    • When dice is biased sum of squares will be higher. Hence more significant.
    • When it is fair it will be closed to zero. Difference is with expected value.
d1
d2

Chi Square Test for Equality of Proportions

h1

Chi square vs T test

  • When to use which one
  • T-test is used to compare mean of two distributions
  • Chi square is used to check whether observation gathered of categorical data meets the assumption

Chi Square for goodness of fit testing

  • Chi Square Goodness of fit
    • Restaurant example
    • H0 = Percentage given by customer is correct
  • We calculate expected for each cell and calculate chi^2

Chi Square for relationship testing

  • H0 : Variables are independent of each other
  • It helps testing if two categorical variables are related
  • Calculate Chi square statistics by summing all cells and check against degree’s of freedom
  • Examples
    • Hypothesis testing :
      • H0 = Herbs1, Herb2, placebo are same
      • H0 = Herbs do nothing
      • We can’t say herb does nothing
        • We are working on accumulated data here
        • Whereas ANOVA is about variancei1
    • Homogeneity testing  :
      • H0 = Left and Right handed people have same preference for arts, science
      • H0 = Preference of arts/science is independent of natural hand left/right
      • H0 = Variables are independent
      • Filling up table
        • P(STEM | right) = P(STEM)
        • x / 60 = 40/100 => x = 40 * 60 / 100 = 24
        • We can also say that value of cell is product of marginals divide by total
      • Degrees of freedom = (r-1)*(c-1)  = 2 * 1 = 2
i3

References

[0] : https://www.khanacademy.org/math/statistics-probability/inference-categorical-data-chi-square-tests#chi-square-goodness-of-fit-tests

[1] : https://www.khanacademy.org/math/statistics-probability/inference-categorical-data-chi-square-tests#chi-square-goodness-of-fit-tests

[2] : https://biology.stackexchange.com/questions/13486/deciding-between-chi-square-and-t-test

[3] : https://fhssrsc.byu.edu/SitePages/ANOVA,%20t-tests,%20Regression,%20and%20Chi%20Square.aspx

Leave a comment