Today’s Agenda

  • Forming Hypotheses
  • How rare is rare enough?

General Strategy:

  • Write down your two hypotheses.

  • Null hypothesis: what you would expect to see if only random chance were at work (no effect, nothing unusual going on)

  • Alternative hypothesis: often the thing you seek to show

  • Then you assume that the null hypothesis is TRUE

  • this lets you draw, precisely, the sampling distribution

  • now collect data and see how rare your data is given your assumptions

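  • the significance level, usually denoted by \(\alpha\), is the threshold for deciding whether data counts as rare

  • the area under the sampling distribution at or beyond your observation, in the direction of the alternative hypothesis, is called the \(p\)-value (for a two-sided alternative, count both tails)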

  • if the p-value is smaller than the significance level then we reject the null hypothesis.

  • if the p-value is larger than the significance level we fail to reject the null hypothesis.

  • In this case, your observation is NOT evidence FOR the null hypothesis.
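
The steps above can be sketched in R for the simplest case: an upper-tail test of a mean when the population standard deviation is known. The function name `z_test_upper`, its arguments, and the numbers in the example call are made up for this illustration; none of them come from R itself or from the problems below.

```r
# Sketch: upper-tail z-test of H_0: mu <= mu0 against H_A: mu > mu0.
# Assumes sigma is known and the sample mean is normally distributed.
z_test_upper <- function(barx, mu0, sigma, n, alpha = 0.05) {
  se <- sigma / sqrt(n)                     # spread of the sampling distribution under H_0
  z <- (barx - mu0) / se                    # standard errors between the data and mu0
  p_value <- pnorm(z, lower.tail = FALSE)   # area past the observation
  decision <- if (p_value < alpha) "reject H_0" else "fail to reject H_0"
  list(z = z, p_value = p_value, decision = decision)
}

# Made-up numbers, only to show the call:
res <- z_test_upper(barx = 52, mu0 = 50, sigma = 4, n = 16)
res
```

The function only compares the \(p\)-value to \(\alpha\); a "fail to reject" result still says nothing in favor of the null hypothesis.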

  1. Suppose a baker claims that his cookie diameter is more than \(10\) cm, on average. Several of his customers do not believe him. To persuade his customers that he is right, the baker decides to do a hypothesis test. He bakes \(10\) cookies. The mean diameter of the sample is \(12\) cm. The baker knows from baking hundreds of cookies that the standard deviation for the diameter is \(0.5\) cm and that the distribution of diameters is normal. Perform a hypothesis test with a \(5 \%\) significance level.

\(H_0: \mu \leq 10\) average diameter is less than or equal to 10

\(H_A: \mu>10\) average diameter of cookies is more than 10

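mu <- 10      # mean diameter under the null hypothesis (cm)
sigma <- .5   # known population standard deviation (cm)
n <- 10       # number of cookies in the sample
barx <- 12    # observed sample mean (cm)

se <- sigma/sqrt(n)   # standard error of the sample mean

# find the area past 12 on the curve (upper tail, since H_A is mu > 10)

z <- (barx-mu)/se
1-pnorm(z)
## [1] 0
1-pnorm(barx,mu,se)
## [1] 0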
# The baker says to their customers: if the average diameter of my cookies were really 10 cm, then a batch of 10 cookies with a mean diameter of 12 cm or more would happen essentially 0% of the time. So y'all are crazy.

# Since the p-value (0 to the precision shown) is less than the significance level of .05, this data provides statistically significant evidence against the null hypothesis.

# This data provides statistically significant evidence that the average diameter of cookies is more than 10cm.
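
The printed 0 above is a rounding effect rather than a true zero: `pnorm(z)` is so close to 1 that `1-pnorm(z)` cannot be told apart from 0 in floating point. As a small sketch, the `lower.tail = FALSE` argument of `pnorm` asks for the upper-tail area directly. The name `p_upper` is introduced here only for illustration.

```r
mu <- 10; sigma <- .5; n <- 10; barx <- 12
se <- sigma / sqrt(n)
z <- (barx - mu) / se

# upper-tail area computed directly, without subtracting from 1
p_upper <- pnorm(z, lower.tail = FALSE)
p_upper

# tiny, but strictly positive
p_upper > 0
```

The conclusion of the test does not change; the point is only that the probability is extremely small, not literally zero.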

What if, instead, the baker baked a batch of 10 cookies and found an average diameter of 10.2 cm?

z <- (10.2-10)/se
1-pnorm(z)
## [1] 0.1029516
# Since the p-value of .10295 is bigger than the significance level of .05, we fail to reject the null hypothesis.

# This batch of cookies does not provide statistically significant evidence that the baker's average cookie size is more than 10 cm.

# This DOES NOT provide evidence that the average diameter is less than or equal to 10 cm.
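
Another way to look at "how rare is rare enough" for the cookie example is to ask which sample mean would sit exactly at the 5% cutoff. The sketch below uses `qnorm` to find that value; the names `z_crit` and `cutoff` are introduced here for illustration only.

```r
mu <- 10; sigma <- .5; n <- 10; alpha <- .05
se <- sigma / sqrt(n)

z_crit <- qnorm(1 - alpha)     # z-score with 5% of the area above it
cutoff <- mu + z_crit * se     # smallest sample mean that counts as "rare enough"
cutoff

# which of the two batches (mean 12 and mean 10.2) clears the cutoff?
c(12, 10.2) > cutoff
```

Comparing a sample mean to this cutoff gives the same decision as comparing its p-value to \(\alpha\).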

  1. Some people claim that they can tell the difference between a diet soda and a regular soda in the first sip. A researcher wanting to test this claim randomly sampled \(80\) such people. He then filled \(80\) plain white cups with soda, half diet and half regular through random assignment, and asked each person to take one sip from their cup and identify the soda as diet or regular. \(53\) participants correctly identified the soda. Does this data provide strong evidence that people can tell the difference between regular and diet soda? (when a significance level is not specified use \(\alpha=.05\)).

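\(H_0: p = .5\) where \(p\) is the proportion of people who correctly identify the soda; \(.5\) is what pure guessing would give

\(H_A: p \neq .5\) people do not perform like pure guessers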

p <- .5

phat <- 53/80

se <- sqrt(p*(1-p)/80)

z <- (phat-p)/se
(1-pnorm(z))*2
## [1] 0.003650434
# about .0036 as a proportion, i.e. about 0.36 percent
# Since our p-value of .0036 is less than the significance level of .05, we reject the null hypothesis.
# Our data provides statistically significant evidence that people can tell the difference.
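
The calculation above relies on the normal approximation to the binomial. As a hedged cross-check, the sketch below verifies the usual success-failure condition (at least 10 expected successes and failures, a common rule of thumb) and then runs an exact binomial test with `binom.test` from base R. The variable names are introduced here for illustration.

```r
n <- 80; x <- 53; p0 <- .5

# success-failure check under H_0: both expected counts should be at least 10
c(n * p0, n * (1 - p0))

# exact two-sided binomial test of H_0: p = .5
exact <- binom.test(x, n, p = p0, alternative = "two.sided")
exact$p.value

# same decision rule as before
exact$p.value < .05
```

The exact p-value need not match the normal-approximation value digit for digit; what matters is which side of \(\alpha\) it falls on.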

  1. It is believed that \(40 \%\) of people pass their driving test on the first attempt. Suppose you think the percentage is greater than \(40 \%\). So, you perform a hypothesis test and sample \(100\) people. Of the sampled people, \(47\) reply that they passed on their first attempt. Set up a hypothesis test and make a conclusion with a \(10 \%\) significance level.

  2. A child is seeing how long they can hold their breath under water. The child thinks they can hold their breath for \(150\) seconds on average. The child’s dad thinks it is less than that. He samples his daughter holding her breath eight times and the results are \(144\), \(152\), \(138\), \(144\), \(136\), \(162\), \(158\), and \(142\). Perform a hypothesis test (from the dad’s perspective) using a \(5 \%\) level of significance. Does the data provide sufficient evidence to reject the null hypothesis?

\(H_0=\)

\(H_A=\)
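
One possible way to set up the computation for the breath-holding problem is sketched below. It rests on assumptions not stated in the problem: that the dad's hypotheses are \(H_0: \mu = 150\) and \(H_A: \mu < 150\), that the times are roughly normal, and that a \(t\) distribution with \(n-1\) degrees of freedom is used because the standard deviation is estimated from only eight observations. The variable names are illustrative.

```r
times <- c(144, 152, 138, 144, 136, 162, 158, 142)
mu0 <- 150                  # value claimed by the child
n <- length(times)

se <- sd(times) / sqrt(n)   # standard error using the sample standard deviation
t_stat <- (mean(times) - mu0) / se

# lower-tail area, because the assumed H_A is mu < 150
p_value <- pt(t_stat, df = n - 1)
p_value

# compare to the 5% significance level
p_value < .05

# base R's t.test should give the same p-value
t.test(times, mu = mu0, alternative = "less")$p.value
```

If the normal curve were used here instead of the \(t\) distribution, the tail area would come out somewhat smaller, which is why the small sample size matters.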