\[\bar{X}_1 - \bar{X}_2 \sim N \left( \mu_1-\mu_2, \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \right) \]
When the population standard deviations are unknown for both groups, we use the sample standard deviations in their place and replace the normal distribution with the \(t\)-distribution, with degrees of freedom equal to the smaller of \(n_1-1\) and \(n_2-1\).
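Written out, the rule above gives the following test statistic. This is a sketch in the notation of the formula above; \(s_1\) and \(s_2\) (the sample standard deviations) and \(\bar{x}_1\), \(\bar{x}_2\) (the observed sample means) are symbols assumed here for illustration, and the null value of the difference is taken to be 0, as in the examples that follow.
\[T = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}, \qquad df = \min(n_1 - 1,\; n_2 - 1)\]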
\(H_0: \mu_T = \mu_C\) or \(\mu_T - \mu_C = 0\)
\(H_A: \mu_T \neq \mu_C\) or \(\mu_T-\mu_C \neq 0\)
sT <- 2.4
sC <- 5.2
nT <- 9
nC <- 9
meanT <- -4
meanC <- -1.4
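# standard error of the difference in sample means
se <- sqrt(sT^2/nT + sC^2/nC)
# test statistic under H0: mu_T - mu_C = 0
t <- (meanT-meanC)/se
# two-sided p-value with df = min(nT-1, nC-1) = 8
2*pt(-abs(t), 8)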
## [1] 0.210327
# Our p-value is .21, which is larger than the significance level of .05, so we fail to reject the null hypothesis.
# Equivalently, the t-score of our data (about -1.36) is closer to 0 than the cut-off t-score for a 5% significance level,
# which is about 2.3 with 8 degrees of freedom: our t-score is bigger than -2.3 and smaller than 2.3.
# Our data do not provide statistically significant evidence that the medication has any effect on blood pressure.
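The cut-off comparison described in the comments above can be checked directly. This is a minimal sketch that reuses the summary statistics from this example; the names `t_crit` and `t_obs` are chosen here for illustration and are not part of the original code.

```r
# Critical t-score for a two-sided test at the 5% level with df = 8
t_crit <- qt(0.975, df = 8)

# Observed t-score from the treatment/control summary statistics
t_obs <- (-4 - (-1.4)) / sqrt(2.4^2 / 9 + 5.2^2 / 9)

t_crit
# TRUE here means the observed t-score is inside the cut-offs,
# so we fail to reject the null hypothesis
abs(t_obs) < t_crit
```

Comparing the t-score with the cut-off and comparing the p-value with .05 always lead to the same decision; they are two views of the same test.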
\(H_0: \mu_B - \mu_G =0\)
\(H_A: \mu_B - \mu_G \neq 0\)
nB <- 16
nG <- 9
sB <- 1
sG <- .866
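# standard error of the difference in sample means
se <- sqrt(sB^2/nB + sG^2/nG)
# t-score: observed difference (1.22 hours) minus the null value (0), divided by the se
(1.22-0)/se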
## [1] 3.194763
# Since our t-score of 3.19 is bigger than the cut-off t-score of about 2.3 (the critical t-score for a 5% significance level with 8 degrees of freedom), we reject the null hypothesis: our data provide statistically significant evidence that boys and girls watch a different average number of hours of tv.
How is this different from finding a 95% confidence interval?
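The same conclusion can be reached with a p-value, as in the blood pressure example. The following is a sketch using the summary statistics from this example; `t_obs`, `df`, and `p_value` are names introduced here for illustration.

```r
# Observed t-score for the boys/girls comparison
t_obs <- (1.22 - 0) / sqrt(1^2 / 16 + 0.866^2 / 9)

# Degrees of freedom: the smaller of nB - 1 and nG - 1
df <- min(16 - 1, 9 - 1)

# Two-sided p-value; -abs() makes this work for positive or negative t-scores
p_value <- 2 * pt(-abs(t_obs), df)
p_value
# TRUE here means we reject the null hypothesis at the 5% level
p_value < 0.05
```

Using `-abs(t_obs)` matters in this example: the t-score is positive, so `pt(t_obs, df) * 2` would return a value larger than 1 rather than a p-value.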
# take the sample difference, then add and subtract the se times the critical t-score, qt(.975, 8)
1.22 + qt(.975,8)*se
## [1] 2.100605
1.22 - qt(.975,8)*se
## [1] 0.3393949
# We are 95% confident that the true difference is between .34 and 2.1 hours.
# In context: we are 95% confident that boys watch between .34 hours and 2.1 hours more tv than girls on average.
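To make the link between the interval and the hypothesis test explicit, the calculation can be wrapped in a small function. This is only a sketch: `two_sample_ci` is a hypothetical helper written for these notes, not a built-in R function, and it uses the same conservative degrees of freedom as above.

```r
# Hypothetical helper: confidence interval for a difference in means
# from summary statistics, with df = min(n1 - 1, n2 - 1)
two_sample_ci <- function(diff, s1, n1, s2, n2, level = 0.95) {
  se <- sqrt(s1^2 / n1 + s2^2 / n2)      # standard error of the difference
  df <- min(n1 - 1, n2 - 1)              # conservative degrees of freedom
  t_star <- qt(1 - (1 - level) / 2, df)  # critical t-score for this level
  c(lower = diff - t_star * se, upper = diff + t_star * se)
}

ci <- two_sample_ci(1.22, s1 = 1, n1 = 16, s2 = 0.866, n2 = 9)
ci
# The interval lies entirely above 0, which agrees with rejecting
# H0: mu_B - mu_G = 0 at the 5% significance level
unname(ci["lower"] > 0)
```

A 95% interval that excludes 0 and a two-sided test that rejects at the 5% level are the same decision, because both use the same standard error and the same critical t-score.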
\[H_0:p_1 = p_2\]
\[H_A:p_1 \neq p_2\]