To begin, create a data frame that contains only the Chinstrap penguins:
# store a copy of the data
penguins_data <- penguins
# Filter Chinstrap
chinstrap <- filter(penguins, species == "Chinstrap")
Now draw a box plot of the body mass of Chinstrap penguins:
# Body mass boxplot for Chinstrap penguins
ggplot(chinstrap, aes(y = body_mass_g)) + geom_boxplot()
Now compute the quantiles of the body mass of Chinstrap penguins:
# quantiles for body mass of Chinstrap penguins
quantile(chinstrap$body_mass_g)
## 0% 25% 50% 75% 100%
## 2700.0 3487.5 3700.0 3950.0 4800.0
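Finally, compute the whisker limits: the upper limit is \(Q_3 + 1.5 \times IQR\) and the lower limit is \(Q_1 - 1.5 \times IQR\), where \(IQR = Q_3 - Q_1\). Each whisker extends to the most extreme observation that still lies within its limit: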
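# compute the whisker limits
iqr <- 3950 - 3487.5  # Q3 - Q1; named iqr so it does not mask the built-in IQR()
IQR(chinstrap$body_mass_g)  # the built-in function gives the same value
## [1] 462.5
3950 + 1.5*iqr  # upper limit
## [1] 4643.75
3487.5 - 1.5*iqr  # lower limit
## [1] 2793.75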
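The same limits can also be computed directly from the data rather than typed in by hand. The following is a minimal sketch, assuming the palmerpenguins and dplyr packages are installed; the names `q1`, `q3`, `lower_limit`, `upper_limit`, `outliers`, and `whisker_ends` are illustrative and do not appear in the original code.

```r
library(palmerpenguins)  # assumed source of the penguins data
library(dplyr)

chinstrap <- filter(penguins, species == "Chinstrap")
mass <- chinstrap$body_mass_g

# first and third quartiles; unname() drops the "25%" / "75%" labels
q1 <- unname(quantile(mass, 0.25))
q3 <- unname(quantile(mass, 0.75))
iqr <- q3 - q1

lower_limit <- q1 - 1.5 * iqr
upper_limit <- q3 + 1.5 * iqr

# observations outside the limits are drawn as individual points
outliers <- mass[mass < lower_limit | mass > upper_limit]

# the whiskers end at the most extreme observations inside the limits
whisker_ends <- range(mass[mass >= lower_limit & mass <= upper_limit])
```

Since the smallest (2700) and largest (4800) Chinstrap body masses fall outside the limits 2793.75 and 4643.75, the box plot should show outlier points at both ends.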
ggplot(penguins, aes(y=body_mass_g)) +
geom_boxplot() +
facet_wrap(~species)
## Warning: Removed 2 rows containing non-finite values (stat_boxplot).
ggplot(penguins, aes(y=body_mass_g, color=species)) +
geom_boxplot()
## Warning: Removed 2 rows containing non-finite values (stat_boxplot).
ggplot(penguins, aes(x = body_mass_g)) + geom_histogram(binwidth = 50)
## Warning: Removed 2 rows containing non-finite values (stat_bin).
\[\text{standard deviation} = \sqrt{\text{variance}} = \sqrt{\sigma^2} = \sigma = \sqrt{\displaystyle \frac{\sum_{i=1}^N (\mu-x_i)^2}{N}}\]
\[\text{standard deviation} = \sqrt{\text{variance}} = \sqrt{s^2} = s = \sqrt{\displaystyle \frac{\sum_{i=1}^n (\bar{x}-x_i)^2}{n-1}}\]
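As a quick check of the two formulas, the sketch below evaluates both by hand on a small made-up vector (the numbers are illustrative only, not penguin data) and compares the sample version with R's built-in `sd()`, which uses the \(n-1\) denominator. The names `pop_sd` and `sample_sd` are introduced here for illustration.

```r
# small made-up vector, for illustration only
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

# population formula: divide the sum of squared deviations by N
pop_sd <- sqrt(sum((mean(x) - x)^2) / n)

# sample formula: divide by n - 1 instead
sample_sd <- sqrt(sum((mean(x) - x)^2) / (n - 1))

pop_sd                        # 2 for this vector
all.equal(sample_sd, sd(x))   # TRUE: sd() uses the sample formula
```

For the penguin data, a call such as `sd(penguins$body_mass_g, na.rm = TRUE)` would be needed, since two body mass values are missing.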
Find a value of body mass for which \(30 \%\) of penguins are below that mass.
Find a value of body mass for which \(60 \%\) of penguins are above that mass.
Make a box plot for flipper length.
Make a boxplot for body mass where each island is separated into its own boxplot.
How many penguins in the study are not from Torgersen?
How many penguins in the study have a flipper length less than \(190\) and a body mass greater than \(4700\)?