The Chi-Squared statistic can also be used to test how well two distributions fit each other.
The hypotheses are always:
\(H_0\): the distributions are the same
\(H_A\): the distributions are different
Under \(H_0\), the expected count for each cell of the table is:
\[ \displaystyle \text{expected count} = \frac{\text{row total} \cdot \text{column total}}{\text{table total}}\]
| | Dorm | Apartment | Parents | Other |
|---|---|---|---|---|
| Men | \(72\) | \(84\) | \(49\) | \(45\) |
| Women | \(91\) | \(86\) | \(88\) | \(35\) |
\(H_0:\)
\(H_A:\)
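As a rough sketch of how the expected-count formula applies to the housing table, the R code below builds the table, computes each expected count as (row total × column total) / table total, and sums the cell contributions to get the test statistic. The object names (`housing`, `expected`, `statistic`) are illustrative choices, not part of the original exercise.

```r
# Observed counts from the housing table (rows: Men, Women).
housing <- matrix(
  c(72, 84, 49, 45,
    91, 86, 88, 35),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Men", "Women"),
                  c("Dorm", "Apartment", "Parents", "Other"))
)

# Expected count for each cell: (row total * column total) / table total.
expected <- outer(rowSums(housing), colSums(housing)) / sum(housing)

# Test statistic: sum of (observed - expected)^2 / expected over all cells.
statistic <- sum((housing - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1).
df <- (nrow(housing) - 1) * (ncol(housing) - 1)

# Upper-tail p-value from the chi-squared distribution.
p_value <- pchisq(statistic, df = df, lower.tail = FALSE)

# chisq.test() carries out the same computation on a matrix of counts.
result <- chisq.test(housing)
print(result)
```

The hand-computed `statistic` and `p_value` should agree with the values reported by `chisq.test()`, which is a convenient check on the arithmetic.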
| Type Accepted | Brown | Columbia | Cornell |
|---|---|---|---|
| Regular | \(2115\) | \(1792\) | \(5306\) |
| Early Decision | \(577\) | \(627\) | \(1228\) |
At a \(0.05\) significance level, determine whether the distributions are different.
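One possible way to set this test up in R is sketched below, using only the three schools shown in the table. The names `admits`, `alpha`, and `reject_null` are illustrative, and the decision rule simply compares the p-value to the stated significance level.

```r
# Observed counts of accepted students by admission type and school.
admits <- matrix(
  c(2115, 1792, 5306,
     577,  627, 1228),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Regular", "Early Decision"),
                  c("Brown", "Columbia", "Cornell"))
)

alpha <- 0.05

# Chi-squared test on the two-way table of counts.
result <- chisq.test(admits)
print(result)

# Reject H_0 when the p-value falls below the significance level.
reject_null <- result$p.value < alpha
print(reject_null)
```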
| | Monday | Tuesday | Wednesday | Thursday |
|---|---|---|---|---|
| Number of absences | 15 | 12 | 9 | 9 |
Suppose there are \(60\) absences in an average week. Test the goodness of fit of this data to a uniform distribution with a significance level of \(.05\).
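The sketch below shows the mechanics of a goodness-of-fit test against a uniform distribution in R. It is only an illustration under a stated assumption: it uses the four daily counts that appear in the table (which total 45, not 60), so it should not be read as the full solution to the exercise.

```r
# Assumption: only the four counts shown in the table are used here.
absences <- c(Monday = 15, Tuesday = 12, Wednesday = 9, Thursday = 9)

alpha <- 0.05

# With no p argument, chisq.test() tests against equal probabilities.
result <- chisq.test(absences)
print(result)

# Under a uniform distribution every day has the same expected count.
print(result$expected)

# Critical value for the test: upper alpha quantile with (categories - 1) df.
critical <- qchisq(1 - alpha, df = length(absences) - 1)
print(critical)
```

Comparing `result$statistic` to `critical` gives the same decision as comparing `result$p.value` to `alpha`.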
| Marital Status | Percent |
|---|---|
| never married | \(31.3\) |
| married | \(56.1\) |
| widowed | \(2.5\) |
| divorced/separated | \(10.1\) |
From a random sample of \(400\) men ages \(18\) to \(24\), the following data are collected:
| Marital Status | Count |
|---|---|
| never married | \(140\) |
| married | \(238\) |
| widowed | \(2\) |
| divorced/separated | \(20\) |
Perform a goodness of fit test with significance level of \(.05\).
```r
# If you don't supply p, chisq.test() compares the counts to a uniform distribution.
chisq.test(c(140, 238, 2, 20), p = c(.313, .561, .025, .101))
##
## Chi-squared test for given probabilities
##
## data: c(140, 238, 2, 20)
## X-squared = 19.275, df = 3, p-value = 0.0002399
```
Since the p-value is very small, we reject the null hypothesis. There is statistically significant evidence that the distribution is different from the proposed distribution.
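To see where the reported statistic comes from, the same test can be reproduced by hand. The sketch below computes the expected counts from the stated percentages, sums \((\text{observed} - \text{expected})^2 / \text{expected}\), and looks up the upper-tail probability; the variable names are illustrative.

```r
# Observed counts and hypothesized proportions from the two tables.
observed <- c(140, 238, 2, 20)
proportions <- c(0.313, 0.561, 0.025, 0.101)

# Expected counts: sample size times each hypothesized proportion.
expected <- sum(observed) * proportions

# Chi-squared statistic: sum over categories of (O - E)^2 / E.
statistic <- sum((observed - expected)^2 / expected)

# Degrees of freedom: number of categories minus one.
df <- length(observed) - 1

# Upper-tail p-value from the chi-squared distribution.
p_value <- pchisq(statistic, df = df, lower.tail = FALSE)

print(c(statistic = statistic, df = df, p_value = p_value))
```

The values should match the `X-squared`, `df`, and `p-value` fields in the `chisq.test()` output above, up to rounding.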
| Number of Televisions | Percent |
|---|---|
| \(0\) | \(10\) |
| \(1\) | \(16\) |
| \(2\) | \(55\) |
| \(3\) | \(11\) |
| \(4+\) | \(8\) |
A random sample of 600 families in North Carolina gave the following results:
| Number of Televisions | Count |
|---|---|
| \(0\) | \(66\) |
| \(1\) | \(119\) |
| \(2\) | \(340\) |
| \(3\) | \(60\) |
| \(4+\) | \(15\) |
At the \(1 \%\) significance level, does it appear that the distribution of number of televisions in North Carolina is different from the distribution for the American population as a whole?
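A minimal sketch of this test in R follows, treating the percent table as the hypothesized national distribution and the North Carolina counts as the observed data. The names `observed`, `national`, and `reject_null` are illustrative.

```r
# Observed counts from the North Carolina sample.
observed <- c("0" = 66, "1" = 119, "2" = 340, "3" = 60, "4+" = 15)

# Hypothesized proportions from the percent table.
national <- c(0.10, 0.16, 0.55, 0.11, 0.08)

alpha <- 0.01

# Goodness-of-fit test against the hypothesized proportions.
result <- chisq.test(observed, p = national)
print(result)

# Critical value at the 1% level with (categories - 1) degrees of freedom.
critical <- qchisq(1 - alpha, df = length(observed) - 1)

# Reject H_0 when the statistic exceeds the critical value.
reject_null <- unname(result$statistic) > critical
print(reject_null)
```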
| Salary | No HS diploma | HS | College | Masters |
|---|---|---|---|---|
| \(< \$30,000\) | \(15\) | \(25\) | \(10\) | \(5\) |
| \(\$30,000\) to \(\$40,000\) | \(20\) | \(40\) | \(70\) | \(30\) |
| \(\$40,000\) to \(\$50,000\) | \(10\) | \(20\) | \(40\) | \(55\) |
| \(\$50,000\) to \(\$60,000\) | \(5\) | \(10\) | \(20\) | \(60\) |
| \(> \$60,000\) | \(0\) | \(5\) | \(10\) | \(150\) |
\(H_0:\)
\(H_A:\)
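Assuming the intended exercise is a test of whether salary and education level are related, the sketch below runs the chi-squared test on the table and also checks the expected counts, since small expected counts make the chi-squared approximation less reliable. The row labels and object names are illustrative shorthand.

```r
# Observed counts: rows are salary bands, columns are education levels.
salary <- matrix(
  c(15, 25, 10,   5,
    20, 40, 70,  30,
    10, 20, 40,  55,
     5, 10, 20,  60,
     0,  5, 10, 150),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    Salary = c("<30k", "30k-40k", "40k-50k", "50k-60k", ">60k"),
    Education = c("No HS diploma", "HS", "College", "Masters")
  )
)

# chisq.test() warns when some expected counts are below 5;
# the warning is suppressed here and the check is made explicitly below.
result <- suppressWarnings(chisq.test(salary))
print(result)

# Count the cells whose expected count is below 5.
small_cells <- sum(result$expected < 5)
print(small_cells)
```

If `small_cells` is greater than zero, the result is best read with some caution, or adjacent categories can be combined before re-running the test.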