Today’s Agenda

  • Two Means
  • Two Proportions

Difference of means

\[\bar{X}_1 - \bar{X}_2 \sim N \left( \mu_1-\mu_2, \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \right) \]

When the population standard deviations are unknown for both groups, we use the sample standard deviations in their place and the \(t\)-distribution with degrees of freedom equal to the smaller of \(n_1-1\) and \(n_2-1\), which is a conservative choice.
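The standard error and the degrees-of-freedom rule above can be sketched as a small helper. The function name `two_sample_t` and its arguments are hypothetical, introduced only for illustration, and the sketch assumes we have summary statistics rather than raw data.

```r
# Hypothetical helper: two-sample t-test from summary statistics,
# using the conservative rule df = min(n1, n2) - 1.
two_sample_t <- function(mean1, mean2, s1, s2, n1, n2, null_diff = 0) {
  se <- sqrt(s1^2 / n1 + s2^2 / n2)            # standard error of the difference
  t_score <- (mean1 - mean2 - null_diff) / se  # test statistic
  deg_free <- min(n1, n2) - 1                  # smaller of n1 - 1 and n2 - 1
  p_value <- 2 * pt(-abs(t_score), deg_free)   # two-sided p-value
  list(se = se, t = t_score, df = deg_free, p_value = p_value)
}
```

Calling it with each group's mean, standard deviation, and size returns the standard error, t-score, degrees of freedom, and two-sided p-value in one list. Note that R's built-in `t.test` uses the Welch approximation for the degrees of freedom by default, so its p-values can differ slightly from this rule.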

  1. A scientific experiment measured the change in blood pressure due to a medication in a control group and a treatment group. In the measurements, negative values indicate a decrease in blood pressure. The control group had an average change of \(-1.4\) and the treatment group had an average change of \(-4\). With \(9\) people in each group and sample standard deviations of \(5.2\) and \(2.4\) in the control and treatment groups respectively, does this data provide statistically significant evidence of the effectiveness of the medication?

\(H_0: \mu_T = \mu_C\) or \(\mu_T - \mu_C = 0\)

\(H_A: \mu_T \neq \mu_C\) or \(\mu_T-\mu_C \neq 0\)

sT <- 2.4
sC <- 5.2

nT <- 9
nC <- 9

meanT <- -4
meanC <- -1.4

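# Standard error of the difference in sample means
se <- sqrt(sT^2/nT + sC^2/nC)
# Test statistic under H0: mu_T - mu_C = 0
t <- (meanT-meanC)/se

# Two-sided p-value with df = min(nT, nC) - 1 = 8; -abs() keeps this correct for either sign of t
2 * pt(-abs(t), 8)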
## [1] 0.210327
# Our p-value is .21, which is larger than the significance level of .05, so we fail to reject the null hypothesis.
# Equivalently, our t-score of about -1.36 lies between the critical t-scores of -2.3 and 2.3 for a 5% significance level (df = 8), so we fail to reject the null.
# Our data does not provide statistically significant evidence that the medication has any effect on blood pressure.
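The cut-off t-scores quoted in the comments above come from the t-distribution's quantile function. A minimal sketch, assuming a two-sided test at the 5% level with 8 degrees of freedom (the names `alpha`, `deg_free`, `t_crit`, and `t_score` are introduced here for illustration):

```r
alpha <- 0.05
deg_free <- 8                        # smaller of nT - 1 and nC - 1

# Upper critical value; the lower one is its negative by symmetry
t_crit <- qt(1 - alpha / 2, deg_free)
t_crit

# Test statistic recomputed from the summary statistics in this problem
t_score <- (-4 - (-1.4)) / sqrt(2.4^2 / 9 + 5.2^2 / 9)

# TRUE would mean the statistic falls in the rejection region
abs(t_score) > t_crit
```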
  2. It is thought that middle-school-age boys and girls spend an equal amount of time on average watching TV. A study of \(25\) randomly selected children included \(16\) boys and \(9\) girls. The \(16\) boys watched TV for an average of \(3.22\) hours per day with a sample standard deviation of \(1\). The \(9\) girls watched an average of \(2\) hours of TV per day with a sample standard deviation of \(.866\). Does the study suggest a statistically significant difference in the two population means at a significance level of \(.05\)?

\(H_0: \mu_B - \mu_G =0\)

\(H_A: \mu_B - \mu_G \neq 0\)

nB <- 16
nG <- 9

sB <- 1
sG <- .866

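# Standard error of the difference in sample means
se <- sqrt(sB^2/nB + sG^2/nG)

# Test statistic: observed difference is 3.22 - 2 = 1.22 hours, null difference is 0
(1.22-0)/se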
## [1] 3.194763
# Since our t-score of 3.19 is bigger than the critical t-score of about 2.3 (two-sided, 5% significance, df = 8), we reject the null hypothesis.
# Our data provides statistically significant evidence that boys and girls watch a different average number of hours of TV.
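The same decision can be reached through a p-value instead of a critical value. A sketch, assuming the two-sided alternative and df = min(16, 9) - 1 = 8 (the names `se_tv`, `t_tv`, and `p_tv` are introduced here for illustration):

```r
se_tv <- sqrt(1^2 / 16 + 0.866^2 / 9)   # standard error from the sample SDs
t_tv <- (3.22 - 2) / se_tv              # observed difference over its SE

# Two-sided p-value with 8 degrees of freedom
p_tv <- 2 * pt(-abs(t_tv), df = 8)
p_tv

# TRUE means we reject the null at the 5% level
p_tv < 0.05
```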

How is this different from finding a 95% confidence interval?

# Take the sample difference, then add and subtract the se times the critical t-score (97.5th percentile, df = 8)

1.22 + qt(.975,8)*se
## [1] 2.100605
1.22 - qt(.975,8)*se
## [1] 0.3393949
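# We are 95% confident that the true difference is between .34 and 2.1 hours.
# That is, we are 95% confident that boys watch between .34 and 2.1 hours more TV per day than girls on average.
# The interval does not contain 0, which agrees with rejecting the null hypothesis at the 5% level.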
  3. \[\hat{p}_1 = .34, \ \hat{p}_2 = .38, \ n_1 = 52, \ n_2 =65\]

\[H_0:p_1 = p_2\]

\[H_A:p_1 \neq p_2\]
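One common way to test these hypotheses is a two-proportion z-test with a pooled proportion. The notes stop at the hypotheses, so the following is only a sketch of that approach, assuming the two samples are independent and large enough for the normal approximation. All variable names are introduced here for illustration.

```r
p1_hat <- 0.34; n1 <- 52
p2_hat <- 0.38; n2 <- 65

# Pooled proportion under H0: p1 = p2
p_pool <- (p1_hat * n1 + p2_hat * n2) / (n1 + n2)

# Standard error of the difference under H0
se_prop <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

# Test statistic
z <- (p1_hat - p2_hat) / se_prop

# Two-sided p-value from the standard normal distribution
p_val <- 2 * pnorm(-abs(z))
c(z = z, p_value = p_val)
```

As with the means, a p-value above the chosen significance level means we fail to reject \(H_0\).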