Outcome | Observed | Expected |
---|---|---|
\(1\) | \(50,611\) | \(50,000\) |
\(2\) | \(49,523\) | \(50,000\) |
\(3\) | \(49,812\) | \(50,000\) |
\(4\) | \(49,924\) | \(50,000\) |
\(5\) | \(49,672\) | \(50,000\) |
\(6\) | \(50,458\) | \(50,000\) |
Total: | \(300,000\) | \(300,000\) |
\(H_0\): the outcomes follow a uniform distribution, that is, each face of the die is equally likely.
\(H_A\): the distribution is not uniform.
\[ \displaystyle \chi^2 = \sum \frac{ (\text{observed}- \text{expected})^2}{\text{expected}}\]
Outcome | Observed | Expected | \(\frac{ (\text{observed}- \text{expected})^2}{\text{expected}}\) |
---|---|---|---|
\(1\) | \(50,611\) | \(50,000\) | \(7.46642\) |
\(2\) | \(49,523\) | \(50,000\) | \(4.55058\) |
\(3\) | \(49,812\) | \(50,000\) | \(0.70688\) |
\(4\) | \(49,924\) | \(50,000\) | \(0.11552\) |
\(5\) | \(49,672\) | \(50,000\) | \(2.15168\) |
\(6\) | \(50,458\) | \(50,000\) | \(4.19528\) |
Total: | \(300,000\) | \(300,000\) | \(19.18636\) |
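The last column of the table can be reproduced directly from the observed counts. The following is a minimal sketch in R; the variable names (`observed`, `expected`, `contributions`, `chi_sq`) are chosen here for illustration and do not come from the notes.

```r
# Observed counts for faces 1 through 6, copied from the table above
observed <- c(50611, 49523, 49812, 49924, 49672, 50458)

# Under H_0 every face is equally likely, so each expected count is total / 6
expected <- rep(sum(observed) / length(observed), length(observed))

# Per-outcome contributions: (observed - expected)^2 / expected
contributions <- (observed - expected)^2 / expected

# The chi-square statistic is the sum of the contributions
chi_sq <- sum(contributions)

print(round(contributions, 5))
print(chi_sq)
```

Building the expected counts from `sum(observed)` rather than typing in 50,000 keeps the sketch reusable for any set of equally likely bins.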
# p-value for goodness of fit: 1 - pchisq(chi-square statistic, number of bins - 1)
1 - pchisq(19.18636, 5)
## [1] 0.001774384
# The p-value of 0.00177 is much smaller than the significance level of 0.05, so the data provide statistically significant evidence that the die does not follow a uniform distribution.
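As a cross-check on the hand computation, base R's `chisq.test()` runs the same goodness-of-fit test when it is given a vector of counts and a vector of hypothesized probabilities. This is only a sketch of one way to confirm the numbers above; the object name `fit` is an illustrative choice.

```r
# Observed counts for faces 1 through 6
observed <- c(50611, 49523, 49812, 49924, 49672, 50458)

# Goodness-of-fit test against a uniform distribution over the six faces
fit <- chisq.test(observed, p = rep(1 / 6, 6))

print(fit$statistic)  # chi-square statistic
print(fit$parameter)  # degrees of freedom: number of bins - 1
print(fit$p.value)    # upper-tail p-value

# Equivalent to 1 - pchisq(...), written with the upper tail requested directly
p_manual <- pchisq(19.18636, df = 5, lower.tail = FALSE)
print(p_manual)
```

Requesting the upper tail with `lower.tail = FALSE` gives the same value as `1 - pchisq(...)` and avoids the subtraction when the p-value is extremely small.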
Example: Students in grades 4-6 were asked whether good grades, athletic ability, or popularity was most important to them. A table separating the students by grade and by choice of most important factor is shown below. Do these data provide evidence to suggest that goals vary by grade?
Grade level | Grades | Popular | Sports | Total |
---|---|---|---|---|
\(4^{th}\) | \(63\) | \(31\) | \(25\) | \(119\) |
\(5^{th}\) | \(88\) | \(55\) | \(33\) | \(176\) |
\(6^{th}\) | \(96\) | \(55\) | \(32\) | \(183\) |
Totals: | \(247\) | \(141\) | \(90\) | \(478\) |
For tests of independence, your hypotheses always look like:
\(H_0:\) grade level and the factor students find most important are independent of each other
\(H_A:\) the factor students find most important depends on grade level
To test this we again want to compute a chi-square statistic \(\chi^2\):
\[ \displaystyle \chi^2= \sum \frac{ (\text{observed}- \text{expected})^2}{\text{expected}}\]
\[ \displaystyle \text{expected} = \frac{ (\text{row total} \cdot \text{column total})}{\text{table total}}\]
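The two formulas above can be applied to the grade-level table in a few lines of R. This is a sketch under stated assumptions: the column labels (Grades, Popular, Sports) are inferred from the wording of the question, the variable names are illustrative, and the degrees of freedom use the standard rule for a test of independence, (rows - 1) × (columns - 1), which the notes have not stated at this point.

```r
# Observed counts from the grade-level table (row and column totals excluded)
observed <- matrix(
  c(63, 31, 25,
    88, 55, 33,
    96, 55, 32),
  nrow = 3, byrow = TRUE,
  dimnames = list(grade = c("4th", "5th", "6th"),
                  goal  = c("Grades", "Popular", "Sports"))
)

# expected = (row total * column total) / table total, for every cell at once
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells
chi_sq <- sum((observed - expected)^2 / expected)

# Degrees of freedom for a test of independence: (rows - 1) * (columns - 1)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Upper-tail p-value
p_value <- pchisq(chi_sq, df, lower.tail = FALSE)

print(round(expected, 2))
print(c(chi_sq = chi_sq, df = df, p_value = p_value))

# Cross-check against R's built-in test applied to the same matrix
builtin <- chisq.test(observed)
print(builtin)
```

With these counts the statistic comes out near 1.31 on 4 degrees of freedom, and the p-value is far above 0.05, so the data do not provide evidence that goals vary by grade.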
Pressure to Succeed | High Anxiety | Medium-High Anxiety | Medium Anxiety | Medium-Low Anxiety | Low Anxiety | Total |
---|---|---|---|---|---|---|
High | \(35\) | \(42\) | \(53\) | \(15\) | \(10\) | \(155\) |
Medium | \(18\) | \(48\) | \(63\) | \(33\) | \(31\) | \(193\) |
Low | \(4\) | \(5\) | \(11\) | \(15\) | \(17\) | \(52\) |
Total | \(57\) | \(95\) | \(127\) | \(63\) | \(58\) | \(400\) |
Is there sufficient evidence to conclude that a student’s anxiety level depends on the pressure to succeed?
\(H_0:\)
\(H_A:\)
Day | Monday | Tuesday | Wednesday | Thursday |
---|---|---|---|---|
number of absences | 15 | 12 | 9 | 9 |
Suppose there are \(60\) absences in an average week. Test the goodness of fit of these data to a uniform distribution at a significance level of \(0.05\).