Unknown Population Standard Deviation

  • There is a more general equation for a t-score, but the version specific to our purposes is the following:

Given a normally distributed population with mean \(\mu\), then for a sample of size \(n\), the \(t\)-scores are:

\[t\text{-score} = \frac{\bar{x}-\mu}{s/\sqrt{n}}\]

and will follow a \(t\)-distribution with degrees of freedom \(n-1\). Said more concisely:

\[\left(\frac{\bar{x}-\mu}{s/\sqrt{n}} \right) \sim t(n-1)\]

Image source: OpenIntro
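The distributional claim above can be checked by simulation. The following is a minimal sketch, assuming a hypothetical normal population with mean 10 and standard deviation 3 (values chosen only for illustration, not taken from these notes) and samples of size 16. It draws many samples, computes the t-score of each, and checks what share falls between the middle-95% cutoffs of a t-distribution with \(n-1\) degrees of freedom.

```r
set.seed(1)                       # make the simulation reproducible

# Hypothetical population parameters and sample size (illustrative only)
mu    <- 10
sigma <- 3
n     <- 16

# Compute the t-score for each of 10,000 simulated samples
t_scores <- replicate(10000, {
  x <- rnorm(n, mean = mu, sd = sigma)
  (mean(x) - mu) / (sd(x) / sqrt(n))
})

# Share of simulated t-scores inside the middle-95% cutoffs of t(n - 1)
t_star   <- qt(.975, df = n - 1)
coverage <- mean(abs(t_scores) <= t_star)
coverage
```

If the t-scores do follow \(t(n-1)\), `coverage` should come out close to 0.95, up to simulation noise.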

  1. Find the \(t\)-scores that bound the middle \(95 \%\) of the area under a t-distribution with \(15\) degrees of freedom.
# to find the z-score bounding the middle 95% of the area (2.5% in each tail) we use
qnorm(.975)
## [1] 1.959964
# In contrast, to find the t-score bounding the middle 95% of the area
qt(.975,15)
## [1] 2.13145
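The quantile `.975` is used because the middle 95% leaves 2.5% in each tail. As a quick check of that reasoning, the sketch below uses `pt()` (the t-distribution's cumulative probability function) to recover the area between the negative and positive cutoffs.

```r
# Upper cutoff for the middle 95% with 15 degrees of freedom
t_star <- qt(.975, df = 15)

# Area between -t_star and +t_star under the t(15) curve
middle_area <- pt(t_star, df = 15) - pt(-t_star, df = 15)
middle_area
```

The result should be 0.95, confirming that the cutoffs are \(\pm\) `qt(.975, 15)`.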
  1. Consider a population that is approximately normally distributed with unknown population mean and unknown population standard deviation. A simple random sample of \(16\) observations has a sample mean of \(\bar{x}=62\) and a sample standard deviation of \(s=52\). Find a \(95 \%\) confidence interval for the population mean.
# Still need normality and independence conditions.
# Independence is satisfied since there is a simple random sample
# Normality: even though n<30, the population is normal so the condition is satisfied.

# collecting known values
xbar <- 62
s <- 52
n <- 16

# find the SE (standard error of the sample mean)
se <- s/sqrt(n)

# add and subtract SE*special t-score to sample mean
xbar + se*qt(.975,15)
## [1] 89.70884
xbar - se*qt(.975,15)
## [1] 34.29116
# So, we are 95% confident that the population mean is between 34.29 and 89.71
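The same steps can be wrapped in a small function so they are not retyped for every problem. This is only a sketch: `t_interval` is a hypothetical helper name, not a built-in R function, and it assumes the normality and independence conditions have already been checked.

```r
# Hypothetical helper: t-based confidence interval from summary statistics
t_interval <- function(xbar, s, n, conf = 0.95) {
  alpha  <- 1 - conf                        # total area in the two tails
  se     <- s / sqrt(n)                     # standard error of the mean
  t_star <- qt(1 - alpha / 2, df = n - 1)   # cutoff with n - 1 degrees of freedom
  c(lower = xbar - t_star * se, upper = xbar + t_star * se)
}

# Reusing the values from the example above
ci <- t_interval(xbar = 62, s = 52, n = 16)
round(ci, 2)
```

With these inputs the function should reproduce the interval computed above.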
  1. We will identify a confidence interval for the average mercury content in dolphin muscle using a sample of 19 dolphins from the Taiji area in Japan. The data are summarized below. The minimum and maximum observed values can be used to evaluate whether or not there are clear outliers.
| \(n\) | \(\bar{x}\) | \(s\) | min. | max. |
|---|---|---|---|---|
| 19 | 4.4 | 2.3 | 1.7 | 9.2 |

Construct a 95% confidence interval.

# Independence: we assume the dolphins were sampled independently
# Normality: n<30, so we check for outliers; the min and max show no clear outliers, so the condition is reasonably met.

xbar <- 4.4
s <- 2.3
n <- 19
df <- n-1

se <- s/sqrt(n)

xbar + se*qt(.975,df)
## [1] 5.508565
xbar - se*qt(.975,df)
## [1] 3.291435
# We are 95% confident that the population mean mercury content is between 3.29 and 5.51
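One rough way to use the minimum and maximum for the outlier check is to express them as distances from the sample mean in units of the sample standard deviation. This is a sketch of an informal screening heuristic, not a formal test, and the variable names are chosen here for illustration.

```r
# Summary statistics from the dolphin sample
xbar    <- 4.4
s       <- 2.3
obs_min <- 1.7
obs_max <- 9.2

# How many sample standard deviations each extreme is from the mean
z_min <- (obs_min - xbar) / s
z_max <- (obs_max - xbar) / s
round(c(min = z_min, max = z_max), 2)
```

The minimum is about 1.17 standard deviations below the mean and the maximum about 2.09 above it. Neither is extreme enough to suggest a clear outlier in a sample of 19.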
  1. \(\bar{x} = 20, n = 36, s = 3, \alpha = .05\)
  • Here \(n = 36\), so \(df = 35\), which may not be listed on a printed t-table.
  • For large degrees of freedom (roughly \(df > 30\)), the t-distribution is very close to the standard normal distribution, so the z-score gives a reasonable approximation.
# exact method in R: use the t-score with df = 35
20 + qt(.975,35)* 3/sqrt(36)
## [1] 21.01505
20 - qt(.975,35)* 3/sqrt(36)
## [1] 18.98495
# if you need to use a table that does not list df = 35, approximate with the z-score instead
20 + qnorm(.975)* 3/sqrt(36)
## [1] 20.97998
20 - qnorm(.975)* 3/sqrt(36)
## [1] 19.02002
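To see how quickly the t cutoff approaches the z cutoff, the sketch below compares `qt(.975, df)` with `qnorm(.975)` for several degrees of freedom. The particular `df` values are arbitrary choices for illustration.

```r
# Degrees of freedom to compare (illustrative choices)
dfs <- c(5, 15, 30, 35, 100, 1000)

t_crit <- qt(.975, df = dfs)   # t cutoffs for the middle 95%
z_crit <- qnorm(.975)          # z cutoff for the middle 95%

# Table of cutoffs and how far each t cutoff sits above the z cutoff
data.frame(df     = dfs,
           t_crit = round(t_crit, 4),
           gap    = round(t_crit - z_crit, 4))
```

The `gap` column shrinks as `df` grows but stays positive. The z-based interval is therefore always slightly narrower than the t-based one, as in the example above.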
  1. \(\bar{x} = 50, n = 16, s = 5, \alpha = .01\)
# alpha = .01 gives a 99% interval, so use the .995 quantile with df = 15
50 + qt(.995,15)* 5/sqrt(16)
## [1] 53.68339
50 - qt(.995,15)* 5/sqrt(16)
## [1] 46.31661
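The quantile passed to `qt()` comes from splitting \(\alpha\) evenly between the two tails. Writing \(t^{*}\) for the cutoff (a symbol introduced here for convenience, not used elsewhere in these notes), the interval in these examples has the form:

\[\bar{x} \pm t^{*} \cdot \frac{s}{\sqrt{n}}, \qquad \text{where } t^{*} \text{ is the } \left(1-\tfrac{\alpha}{2}\right) \text{ quantile of } t(n-1)\]

With \(\alpha = .01\), the quantile is \(1 - .01/2 = .995\), so for this problem:

\[50 \pm 2.947 \cdot \frac{5}{\sqrt{16}} \approx 50 \pm 3.683\]

which matches the R output above.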
  1. In each part decide whether the appropriate method for obtaining the confidence interval is with \(z\)-scores, \(t\)-scores, or neither.
  • A random sample of size \(17\) is taken from a population that is very nearly normal. The population standard deviation is unknown.

  • A random sample of size \(50\) is taken from a population that is roughly normal but has some outliers. The population standard deviation is known.

  • A random sample of size \(15\) is taken from a population. The population is known to have a few outliers but is otherwise roughly normal. The population standard deviation is known.