Day24_Notes : Difference in means

Today’s Agenda

Review
Two Means

Pew Research asked a random sample of \(1000\) American adults whether they supported the increased usage of coal to produce energy. Their sample showed that \(46 \%\) of respondents supported increased coal usage. Set up hypotheses to evaluate whether a majority of American adults support or oppose the increased usage of coal.

\(H_0: p = 0.5\)

\(H_A: p \neq 0.5\)

# successes (assuming the null hypothesis is true)
.5*1000

## [1] 500

# failures
.5*1000

## [1] 500

# both are bigger than 10, so the success-failure condition is met

# standardize the observation
se <- sqrt(.5*.5/1000)

z <- (.46 - .5) /se

# since we are doing a two sided test, we will need to double the area in the next step

(pnorm(z))*2

## [1] 0.01141204

# this is our p-value
# that means, if the null hypothesis is true, only about 1.14% of samples would have a sample proportion at least as far from 0.5 as ours.

# since our p-value is less than the significance level of .05, we have seen something very rare assuming the null hypothesis is true

# Since the p-value is less than .05, we have statistically significant evidence for rejecting the null hypothesis.

# we have statistically significant evidence that the true proportion of American adults who support increased coal usage is different from 50%
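As a cross-check, the same two-sided p-value can be obtained from base R's `prop.test()`. This is only a sketch: it assumes that 46% of 1000 corresponds to exactly 460 supporters (the count itself is not stated in the problem), and it turns off the continuity correction so that the result matches the hand calculation.

```r
# assumed count: 46% of 1000 respondents = 460 supporters
successes <- 460
n <- 1000

# hand calculation, as in the notes
se <- sqrt(.5*.5/n)
z <- (successes/n - .5)/se
p_manual <- 2*pnorm(-abs(z))

# built-in one-proportion test without the continuity correction
p_builtin <- prop.test(successes, n, p = .5, correct = FALSE)$p.value

c(manual = p_manual, builtin = p_builtin)
```

Using `-abs(z)` means the doubling step also works when the observed proportion is above the null value, where `pnorm(z)*2` would give a number greater than 1.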

You are given the following hypotheses:

\(H_0: \mu = 60\)

\(H_A: \mu \neq 60\)

We know that the population standard deviation is \(8\) and the sample size is \(20\). For what sample mean(s) would the \(p\)-value be equal to \(0.05\)?

z <- qnorm(.975)
se <- 8/sqrt(20)
60 - z*se

## [1] 56.49391

60 + z*se

## [1] 63.50609

# A sample of size 20 would be considered statistically significant evidence against the null hypothesis if the sample mean was less than 56.5 or greater than 63.5
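One way to check these cutoffs is to plug them back in and confirm that each gives a two-sided p-value of exactly 0.05. The sketch below does that; the helper name `p_two_sided` is illustrative and not part of the notes.

```r
mu0 <- 60
sigma <- 8
n <- 20
se <- sigma/sqrt(n)

# cutoffs from the calculation above
lower <- mu0 - qnorm(.975)*se
upper <- mu0 + qnorm(.975)*se

# two-sided p-value for a given sample mean (illustrative helper)
p_two_sided <- function(xbar) 2*pnorm(-abs((xbar - mu0)/se))

p_two_sided(lower)
p_two_sided(upper)

# a sample mean between the cutoffs gives a p-value larger than .05
p_two_sided(58)
```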

You are given the following hypotheses:

\(H_0: \mu = 60\)

\(H_A: \mu \neq 60\)

We know that the sample standard deviation is \(8\) and the sample size is \(20\). For what sample mean(s) would the \(p\)-value be equal to \(0.05\)?

t <- qt(.975,19)
se <- 8/sqrt(20)
60 - t*se

## [1] 56.25588

60 + t*se

## [1] 63.74412

# A sample of size 20 would be considered statistically significant evidence against the null hypothesis if the sample mean was less than 56.26 or greater than 63.74
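The cutoffs are wider here than in the known-\(\sigma\) version because the t-distribution has heavier tails than the normal distribution when the degrees of freedom are small. A short sketch comparing the two critical values follows; the particular degrees of freedom other than 19 are chosen only for illustration.

```r
# normal critical value for a two-sided test at the .05 level
z_cut <- qnorm(.975)

# t critical values for a few degrees of freedom (19 matches the example)
dfs <- c(5, 19, 100, 1000)
t_cut <- qt(.975, dfs)

data.frame(df = dfs, t_cutoff = t_cut, z_cutoff = z_cut)
```

The t cutoff is always larger than the z cutoff and shrinks toward it as the degrees of freedom grow.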

Difference of means in groups

We have two different samples and want to test whether there is a statistically significant difference in their means.

A scientific experiment measured the change in blood pressure due to a medication in a control group and a treatment group. In their measurements, negative values indicate a decrease in blood pressure. The control group had an average change of \(-1.4\) and the treatment group had an average change of \(-4\). With \(9\) people in each group and sample standard deviations of \(5.2\) and \(2.4\) in the control and treatment groups respectively, do these data provide statistically significant evidence of the effectiveness of the medication?

\(H_0: \mu_T = \mu_C\) or \(\mu_T - \mu_C = 0\)

\(H_A: \mu_T - \mu_C < 0\)

true_diff <- 0
observed_diff <- -4 - (-1.4)

se <- sqrt(5.2^2/9 + 2.4^2/9)

# find the standardized score of the observed difference
t <- (observed_diff - true_diff)/se

# degrees of freedom will be smaller sample size - 1
# in this case, we will use the t-distribution with 8 df.

pt(t,8)

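## [1] 0.1051635

# since the p-value (about .105) is greater than .05, we do not have statistically significant evidence that the medication lowers blood pressure more than the control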
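The calculation above uses the conservative choice of degrees of freedom (smaller sample size minus 1). A different technique, not used in these notes, is the Welch-Satterthwaite approximation, which is what R's `t.test()` applies by default when it is given raw data. The sketch below computes it from the summary statistics for comparison; the variable names are illustrative.

```r
# summary statistics from the problem
s_c <- 5.2; n_c <- 9   # control
s_t <- 2.4; n_t <- 9   # treatment

v_c <- s_c^2/n_c
v_t <- s_t^2/n_t
se <- sqrt(v_c + v_t)
t_stat <- (-4 - (-1.4))/se

# Welch-Satterthwaite approximate degrees of freedom
df_welch <- (v_c + v_t)^2 / (v_c^2/(n_c - 1) + v_t^2/(n_t - 1))

# one-sided p-values under each choice of degrees of freedom
p_conservative <- pt(t_stat, 8)
p_welch <- pt(t_stat, df_welch)

c(df_welch = df_welch, p_conservative = p_conservative, p_welch = p_welch)
```

The Welch degrees of freedom come out larger than 8, so the p-value is somewhat smaller than the conservative one for the same test statistic.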

Difference of means SE

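For a difference of means, we estimate the standard error using the sample standard deviations \(s_1\) and \(s_2\) in place of the unknown population values:

\[SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}\]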
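The formula can be wrapped in a small helper so it is easy to reuse. This is a minimal sketch: the function name `se_diff_means` is made up for illustration, and it plugs in sample standard deviations as estimates, as was done in the blood pressure example.

```r
# illustrative helper (not from the notes): SE for a difference of two means
se_diff_means <- function(s1, n1, s2, n2) sqrt(s1^2/n1 + s2^2/n2)

# blood pressure example: control sd 5.2, treatment sd 2.4, 9 people per group
se_diff_means(5.2, 9, 2.4, 9)

# with four times as many people in each group, the SE is cut in half
se_diff_means(5.2, 36, 2.4, 36)
```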