Day 18–Confidence Intervals II

Population Proportion Conditions

Independence: Simple random sample
Success-Failure: There are at least 10 successes and at least 10 failures.

Insurance companies are interested in knowing the population percent of drivers who always buckle up before riding in a car.

When designing a study to determine this population proportion, what is the minimum number you would need to survey to be \(95 \%\) confident that the population proportion is estimated to within \(0.03\)?

.25/(.03^2/1.96^2)

## [1] 1067.111

# To guarantee an error bound of at most .03, we need n >= 1067.11, i.e. at least 1068 people (always round up).

Finding sample size needed for a particular EB

\[n \geq \frac{.25}{\frac{EB^2}{z^2}} = \frac{.25 \cdot z^2}{EB^2}\]
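The formula comes from requiring the error bound for a proportion to be at most the target \(EB\). A short derivation, assuming the conservative choice \(p = 0.5\) because the true proportion is unknown before sampling:

\[z\sqrt{\frac{p(1-p)}{n}} \leq EB \quad\Longrightarrow\quad n \geq \frac{p(1-p) \cdot z^2}{EB^2}, \qquad p(1-p) \leq .25 \text{ for every } p\]

Using \(.25\) in place of \(p(1-p)\) gives a sample size that is large enough whatever the true proportion turns out to be.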

More confidence (a larger \(z\)) means we need a bigger sample.
Allowing more error (a larger \(EB\)) means we need a smaller sample.
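Both effects can be checked numerically. The sketch below wraps the formula in a helper; the name `min_sample_size` and its arguments are hypothetical (not from the original notes), and it assumes the conservative value \(.25\) for \(p(1-p)\).

```r
# Hypothetical helper (not from the original notes): smallest n such that the
# error bound for a proportion is at most EB at the given confidence level.
# Assumes the conservative value p*(1-p) = 0.25, as in the formula above.
min_sample_size <- function(EB, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)  # two-sided critical value
  ceiling(0.25 * z^2 / EB^2)      # round up to a whole person
}

min_sample_size(0.03, conf = 0.95)  # baseline from the seat belt example
min_sample_size(0.03, conf = 0.99)  # more confidence -> larger n
min_sample_size(0.05, conf = 0.95)  # more allowed error -> smaller n
```

The first call reproduces the 1068 people found above. The other two calls move in the directions the two statements describe.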

If it were later determined that it was important to be more than \(95 \%\) confident and a new survey was commissioned, how would that affect the minimum number you would need to survey? Explain your reasoning.

We would need to survey more people: a higher confidence level means a larger critical value \(z\), and the minimum sample size in the formula grows with \(z^2\).

Suppose that the insurance companies did do a survey. They randomly surveyed \(400\) drivers and found that \(320\) claimed they always buckle up. We are interested in the population proportion of drivers who claim they always buckle up.
Explain why the conditions for inference are satisfied.

Independence: randomly surveyed
Success-failure: successes = 320 >= 10 and failures = 400 - 320 = 80 >= 10

Construct a \(95\%\) confidence interval and state your results within the context of the problem.

#1. Collect known values
alpha <- .05
phat <- 320/400
n <- 400
se <- sqrt(phat*(1-phat)/n)

#2. Add and subtract the error bound from phat

EB <- qnorm(1 - alpha/2)*se
phat + EB

## [1] 0.8391993

phat - EB

## [1] 0.7608007

#3. Interpret results within the context of the problem
# We are 95% confident that the true proportion of drivers who claim they always buckle up is between 76.1% and 83.9%.

Unknown Population Standard Deviation

When \(\sigma\) is unknown, we have to use \(s\), the sample standard deviation, as a substitute.
With \(s\) in place of \(\sigma\), the standardized sample mean no longer follows a normal distribution, so we cannot rely on \(N(\mu,\sigma/ \sqrt{n})\).
Instead we use the t-distribution with \(n-1\) degrees of freedom.

\[T\text{-score} = \frac{\text{observation}-\text{expected}}{SE} = \frac{\bar{X}-\mu}{s/\sqrt{n}}\]

Image source: OpenIntro
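To see what switching to the t-distribution changes in practice, the sketch below compares two-sided \(95\%\) critical values from `qt` and `qnorm`. The degrees of freedom are illustrative choices, not values from the notes.

```r
# Two-sided 95% critical values: standard normal versus t.
# The degrees of freedom below are illustrative choices, not from the notes.
z_crit <- qnorm(0.975)
df <- c(5, 15, 30, 100)
t_crit <- qt(0.975, df = df)

# Each t critical value exceeds z_crit, and the gap narrows as df increases.
data.frame(df = df, t_crit = t_crit, excess_over_z = t_crit - z_crit)
```

A larger critical value means a wider interval, which reflects the extra uncertainty from estimating the standard deviation with \(s\).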

Consider a population that is approximately normally distributed with unknown population mean and unknown population standard deviation. A simple random sample of \(16\) observations has a sample mean of \(\bar{X}=62\) and a sample standard deviation of \(s=52\). Find a \(95 \%\) confidence interval for the population mean.

# Still need normality and independence conditions.
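One way to carry out the calculation is sketched below. It follows the same steps as the proportion interval, but assumes the standard error \(s/\sqrt{n}\) and a t critical value with \(n-1\) degrees of freedom. The variable names `xbar` and `t_star` are illustrative.

```r
# Values given in the problem
n <- 16
xbar <- 62
s <- 52

# Standard error of the sample mean, using s in place of sigma
se <- s / sqrt(n)

# t critical value for 95% confidence, assuming n - 1 degrees of freedom
t_star <- qt(0.975, df = n - 1)

# Add and subtract the error bound to the sample mean
EB <- t_star * se
c(lower = xbar - EB, upper = xbar + EB)
```

Under these assumptions the interval runs from roughly 34.3 to 89.7. It is wide because the sample is small and the sample standard deviation is large.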