Population Proportion Conditions

  • Independence: Simple random sample
  • Success-Failure: There are at least 10 successes and at least 10 failures.

Insurance companies are interested in knowing the population percent of drivers who always buckle up before riding in a car.

  1. When designing a study to determine this population proportion, what is the minimum number you would need to survey to be \(95 \%\) confident that the population proportion is estimated to within \(0.03\)?
.25/(.03^2/1.96^2)
## [1] 1067.111
# .25 is the largest possible value of p*(1-p), reached at p = .5 (the conservative choice when p is unknown).
# To guarantee an error bound of at most .03, we need n >= 1067.11, i.e., at least 1068 people.

Finding the sample size needed for a particular error bound (EB)

\[n \geq \frac{.25}{\frac{EB^2}{z^2}} = \frac{.25 \cdot z^2}{EB^2}\]

  • more confidence means we need a bigger sample
  • allowing more error means we need a smaller sample
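
The formula and the two bullet points above can be checked numerically. The following is a minimal sketch: the function name `min_sample_size` and its arguments are hypothetical, and it assumes the conservative value \(p(1-p) = .25\) unless a planning value for \(p\) is supplied.

```r
# Hypothetical helper (not part of the original notes): smallest n such that
# the error bound for a proportion is at most EB at the given confidence level.
min_sample_size <- function(EB, conf = 0.95, p = 0.5) {
  z <- qnorm(1 - (1 - conf) / 2)      # critical value for the confidence level
  ceiling(p * (1 - p) * z^2 / EB^2)   # round up to a whole person
}

n_90   <- min_sample_size(EB = 0.03, conf = 0.90)  # less confidence
n_95   <- min_sample_size(EB = 0.03, conf = 0.95)  # the case worked above
n_wide <- min_sample_size(EB = 0.05, conf = 0.95)  # more error allowed
print(c(n_90 = n_90, n_95 = n_95, n_wide = n_wide))
```

Lowering the confidence level or widening the error bound should both give a smaller required sample than the \(95\%\), \(EB = 0.03\) case.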
  1. If it were later determined that it was important to be more than \(95 \%\) confident and a new survey was commissioned, how would that affect the minimum number you would need to survey? Explain your reasoning.

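We would need to survey more people. A higher confidence level requires a larger critical value \(z\), and by the formula above the minimum \(n\) grows in proportion to \(z^2\).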
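
As a worked illustration, suppose the new survey targets \(99\%\) confidence (an assumed level, since the question only says "more than \(95\%\)") and keeps \(EB = 0.03\). With \(z \approx 2.576\):

\[n \geq \frac{.25 \cdot z^2}{EB^2} = \frac{.25 \cdot 2.576^2}{.03^2} \approx 1843.3\]

Under these assumptions at least 1844 people would be needed, compared with 1068 at \(95\%\) confidence.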

  1. Suppose that the insurance companies did do a survey. They randomly surveyed \(400\) drivers and found that \(320\) claimed they always buckle up. We are interested in the population proportion of drivers who claim they always buckle up.

  2. Explain why the conditions for inference are satisfied.

  • Independence: the drivers were randomly surveyed
  • Success-failure: successes = 320 ≥ 10 and failures = 80 ≥ 10
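
The success-failure check can also be written out in R. This is only a sketch with illustrative variable names. The independence condition depends on how the data were collected and cannot be verified from the counts alone.

```r
# Counts taken from the survey described above.
n         <- 400
successes <- 320
failures  <- n - successes

# Success-failure condition: both counts should be at least 10.
success_failure_ok <- successes >= 10 && failures >= 10
print(success_failure_ok)
```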
  1. Construct a \(95\%\) confidence interval and state your results within the context of the problem.
#1. Collect known values
alpha <- .05
phat <- 320/400
n <- 400
se <- sqrt(phat*(1-phat)/n)

#2. Compute the error bound, then add it to and subtract it from phat

EB <- qnorm(1 - alpha/2)*se
phat + EB
## [1] 0.8391993
phat - EB
## [1] 0.7608007
#3. Interpret results within the context of the problem
# We are 95% confident that the true proportion of drivers who claim they always buckle up is between 76.1% and 83.9%.
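
The same steps can be wrapped in a small reusable function. This is a minimal sketch, not part of the original notes: the name `prop_ci` is hypothetical, and it assumes the normal-approximation interval used above, so the independence and success-failure conditions still need to hold.

```r
# Hypothetical helper: normal-approximation confidence interval for a
# proportion, given x successes out of n trials.
prop_ci <- function(x, n, conf = 0.95) {
  phat <- x / n
  se   <- sqrt(phat * (1 - phat) / n)        # standard error of phat
  EB   <- qnorm(1 - (1 - conf) / 2) * se     # error bound
  c(lower = phat - EB, upper = phat + EB)
}

ci <- prop_ci(320, 400)
print(round(ci, 4))
```

Calling `prop_ci(320, 400)` should reproduce the interval computed step by step above.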

Unknown Population Standard Deviation

  • We have to use \(s\), the sample standard deviation as a substitute.

  • Once \(s\) replaces \(\sigma\), the standardized sample mean no longer follows a normal distribution, so we cannot rely on \(N(\mu,\sigma/ \sqrt{n})\).

  • Instead we use the T-distribution with \(n-1\) degrees of freedom.

  • Recall that we use \(s\) to denote the sample standard deviation.

\[T\text{-score} = \frac{\text{observation}-\text{expected}}{s}\]

Image source: OpenIntro
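
One way to see the effect of using the T-distribution is to compare its critical values with the normal critical value. The sketch below assumes a \(95\%\) confidence level, and the degrees of freedom listed are arbitrary illustrative choices.

```r
# Compare t critical values with the normal critical value for 95% confidence.
df     <- c(2, 5, 15, 30, 100)
t_crit <- qt(0.975, df)     # upper 2.5% point of the t-distribution
z_crit <- qnorm(0.975)      # upper 2.5% point of the standard normal

print(data.frame(df = df, t_crit = round(t_crit, 3)))
print(round(z_crit, 3))
```

The t critical values are larger than the normal one and shrink toward it as the degrees of freedom grow, so intervals based on \(s\) are wider for small samples.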

  1. Consider a population that is approximately normally distributed with unknown population mean and unknown population standard deviation. A simple random sample of \(16\) observations has a sample mean of \(\bar{X}=62\) and a sample standard deviation of \(s=52\). Find a \(95 \%\) confidence interval for the population mean.
# Still need normality and independence conditions.