In a confidence interval, we have no real idea of what the sampling distribution looks like (where it is centered, how spread out it is).
In a hypothesis test, we will want a clear picture of the sampling distribution.
Goals: Reformulate our methods to account for
Outline of the process of a hypothesis test:
+ Make two hypotheses about what we expect to happen:
  + one is what you expect due to randomness;
  + the other is meant to notice some trend that would not happen randomly.
+ Then become a skeptic: assume the thing you want to observe is false.
+ Then collect data.
+ If the data is rare, considering your assumptions, then maybe your assumption is wrong.
A worker at a company feels that there is gender bias in who gets promoted. Let \(p\) be the proportion of people getting promotions who are men:
\(H_0:\) the hypothesis that nothing weird is going on, that everything is just due to randomness, is called the null hypothesis.
\(H_0: p = 0.5\), half of the people promoted are men and half are women.
\(H_A:\) the hypothesis that something weird is going on (there is bias in promotions, it's not just random) is called the alternative hypothesis.
\(H_A: p > 0.5\), more than half of the people promoted are men.
Possible Hypotheses

\(H_0\) | \(H_A\)
---|---
\(=\) | \(\neq\), \(>\), or \(<\)
\(\geq\) | \(<\)
\(\leq\) | \(>\)
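
The direction of \(H_A\) decides which tail of the sampling distribution counts as "rare". The following is a minimal sketch, assuming a z-score has already been computed from the data; the helper name `p_value_from_z` is hypothetical and only meant for illustration.

```r
# Hypothetical helper (not a base R function): turn a z-score into a p-value,
# depending on the direction of the alternative hypothesis.
p_value_from_z <- function(z, alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    greater   = 1 - pnorm(z),            # H_A uses >  : upper tail
    less      = pnorm(z),                # H_A uses <  : lower tail
    two.sided = 2 * (1 - pnorm(abs(z)))  # H_A uses != : both tails
  )
}

# The same z-score gives three different p-values, one per alternative
p_value_from_z(1.5, "greater")
p_value_from_z(1.5, "less")
p_value_from_z(1.5, "two.sided")
```

For a fixed z-score, the "greater" and "less" p-values always add up to 1, which is a quick way to check that the right tail was chosen.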
Promotion Data

Gender | Promoted | Not Promoted | Total
---|---|---|---
Male | \(21\) | \(3\) | \(24\)
Female | \(14\) | \(10\) | \(24\)
Total | \(35\) | \(13\) | \(48\)
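# Standard error of the sample proportion under H_0: p = 0.5, with n = 35 promotions
se <- sqrt(.5*.5/35)
# z-score of the observed proportion of promotions that went to men (21 out of 35)
z <- ((21/35)-0.5)/se
# The z-score is 1.183216
# One-sided p-value: the upper tail, because H_A is p > 0.5
1-pnorm(z)
## [1] 0.1183618
# If we could take a sample of 35 promotions over and over and it is TRUE that 50% of promotions go to men, then we would see 21 or more of them go to men only about 11.8% of the time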
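
As a rough cross-check on the normal approximation, the same upper-tail probability can be computed from the binomial distribution directly, or estimated by simulating the "take a sample over and over" idea. This is a sketch that assumes the 35 promotions are independent and each goes to a man with probability 0.5 under \(H_0\); the variable names, the seed, and the number of simulations are arbitrary choices. The results need not match the normal-approximation value exactly.

```r
n  <- 35    # number of promotions
x  <- 21    # promotions that went to men
p0 <- 0.5   # proportion assumed under H_0

# Exact binomial tail: P(X >= 21) when X ~ Binomial(35, 0.5)
p_exact <- 1 - pbinom(x - 1, size = n, prob = p0)

# Simulation: draw many samples of 35 promotions under H_0 and
# record how often 21 or more go to men
set.seed(1)
sims  <- rbinom(10000, size = n, prob = p0)
p_sim <- mean(sims >= x)

p_exact
p_sim
```

The exact tail is the one-sided p-value reported by `binom.test(x, n, p0, alternative = "greater")`.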
As the experimenter, you need to decide how rare is rare enough. This is called the significance level.
Often, we use a significance level of \(\alpha = .05\).
So, if you observe data that is expected to happen less than \(5\%\) of the time when the null hypothesis is true, revise your hypothesis.
If this happens we reject the null hypothesis.
Otherwise, we fail to reject the null hypothesis.
Since we observed data expected to happen about 11.8% of the time, and this is bigger than the significance level of 5%, we fail to reject the null hypothesis.
The data we collected do not provide statistically significant evidence that the proportion of men getting promotions is greater than 0.5.
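
The decision rule can be written as a single comparison. This is a small sketch that reuses the p-value from the promotion example; the variable names `alpha`, `p_value`, and `decision` are just illustrative.

```r
alpha   <- 0.05        # significance level chosen before looking at the data
p_value <- 0.1183618   # one-sided p-value from the promotion example

# Reject H_0 only when the observed data would be rarer than alpha under H_0
decision <- if (p_value < alpha) "reject H_0" else "fail to reject H_0"
decision
## [1] "fail to reject H_0"
```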
\(H_0=\)
\(H_A=\)
\(H_0=\)
\(H_A=\)
# Insert code here
Write your answer here