There is a quantity describing the population that we want to know (often a mean or a proportion): the population parameter.
We can take a sample of size \(n\) and measure the same quantity for the sample: the sample statistic.
Goal: estimate the population parameter using a sample statistic (sometimes referred to as a point estimate).
If we repeatedly sample from the population at a fixed sample size, the sample statistics will be centered around the population parameter.
The central limit theorem says that, for a sufficiently large sample size, the sample mean will be approximately normally distributed.
What normal distribution? What is the center? How spread out is it?
Image source: OpenIntro
\[\bar{X} \sim N \left(\mu,\frac{\sigma}{\sqrt{n}} \right)\]
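In this notation the second argument is the standard deviation of the sample mean (the standard error), \(\sigma/\sqrt{n}\), not the variance. To make the formula concrete, the sketch below simulates repeated sampling in R. It assumes, purely for illustration, a normally distributed population with mean \(5\) and standard deviation \(12\) and samples of size \(36\); the variable names and the number of repetitions are arbitrary choices, not part of the notes.

```r
set.seed(1)                      # make the simulation reproducible

mu    <- 5                       # assumed population mean
sigma <- 12                      # assumed population standard deviation
n     <- 36                      # sample size
reps  <- 10000                   # number of repeated samples (arbitrary)

# Draw `reps` samples of size n and record the mean of each one
sample_means <- replicate(reps, mean(rnorm(n, mean = mu, sd = sigma)))

mean(sample_means)               # should be close to mu
sd(sample_means)                 # should be close to sigma / sqrt(n)
sigma / sqrt(n)                  # theoretical standard error
```

Calling `hist(sample_means)` afterwards should show a roughly bell-shaped histogram centered near the population mean.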
Example: Suppose that the population mean is \(5\) and the population standard deviation is \(12\). What is the probability that a simple random sample of size \(36\) will have a sample mean greater than \(7\)?
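z_score <- (7 - 5) / (12 / sqrt(36))  # (x - mu) / (sigma / sqrt(n)) = 1
1 - pnorm(z_score)                    # upper tail as 1 minus the lower tail
## [1] 0.1586553
pnorm(z_score, lower.tail = FALSE)    # upper tail requested directly
## [1] 0.1586553
1 - pnorm(7, mean = 5, sd = 12 / sqrt(36))  # same calculation without standardizing
## [1] 0.1586553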
# There is about a 15.9% chance that a sample of size 36 from a population with mean 5 and standard deviation 12 will have a sample mean larger than 7.
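The same calculation can be wrapped in a small function so it is easy to reuse with other numbers. This is only a sketch: `prob_mean_greater` is a hypothetical helper name, not something from the notes or from base R, and it relies on the normal approximation for the sample mean.

```r
# Hypothetical helper (not from the notes): P(sample mean > x)
# under the normal approximation X-bar ~ N(mu, sigma / sqrt(n)).
prob_mean_greater <- function(x, mu, sigma, n) {
  se <- sigma / sqrt(n)                      # standard error of the sample mean
  pnorm(x, mean = mu, sd = se, lower.tail = FALSE)
}

# Reproduces the worked example: mu = 5, sigma = 12, n = 36, threshold 7
p_example <- prob_mean_greater(7, mu = 5, sigma = 12, n = 36)
p_example
```

Using `lower.tail = FALSE` avoids the explicit subtraction from 1 and is slightly more accurate for very small tail probabilities.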
Activity:
A population has mean \(\mu = 143\) and standard deviation \(\sigma = 15\). Describe the sampling distribution of the sample mean for samples of size \(150\). Draw a picture of the sampling distribution. Label the area corresponding to the probability of a sample mean greater than \(144\). Find the probability.
A population has mean \(\mu = 22\) and standard deviation \(\sigma = 1.4\). What is the standard error if the sample size is \(50\)? How many standard errors away from the population mean is a sample mean of \(\bar{X} = 23\)? Find the probability that a sample of size \(50\) has a mean less than \(23\).
A population has mean \(\mu = 22\) and standard deviation \(\sigma = 1.4\). You plan to take a sample of \(50\) observations. Find the \(2.5\)th percentile and the \(97.5\)th percentile of the sampling distribution for the sample mean.
qnorm(.975)
## [1] 1.959964
qnorm(.025)
## [1] -1.959964
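The values \(\pm 1.96\) are percentiles of the standard normal distribution. To express them on the scale of a sample mean, multiply by the standard error and add the population mean, or pass the mean and standard error to `qnorm` directly. The sketch below does this with the numbers from the earlier worked example (mean \(5\), standard deviation \(12\), \(n = 36\)) rather than the activity values; the variable names are illustrative.

```r
mu    <- 5                         # population mean from the worked example
sigma <- 12                        # population standard deviation
n     <- 36                        # sample size
se    <- sigma / sqrt(n)           # standard error of the sample mean

# Option 1: rescale the standard normal percentiles by hand
by_hand <- mu + qnorm(c(0.025, 0.975)) * se

# Option 2: let qnorm work on the sample-mean scale directly
direct <- qnorm(c(0.025, 0.975), mean = mu, sd = se)

by_hand
direct
```

Both approaches give the same pair of values, and under the normal approximation about 95% of sample means of size \(36\) fall between them.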