library(palmerpenguins)
mean(penguins$flipper_length_mm, na.rm = TRUE)
## [1] 200.9152
# The mean flipper length is 200.9 mm
The mean gives us a rough sense of where the center of the data lies.
# find the flipper length for which 20% of penguins have a flipper length less than that value
quantile(penguins$flipper_length_mm, 0.20, na.rm = TRUE)
## 20%
## 188
# also known as the 20th percentile = 188mm
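One way to check what a percentile means is to compute the share of values that fall below it. The sketch below uses a small made-up vector (toy_lengths is hypothetical, not the penguin data) so the arithmetic is easy to follow by hand.

```r
# Hypothetical toy measurements in mm (NOT from the penguins dataset)
toy_lengths <- c(172, 181, 185, 188, 190, 193, 197, 205, 213, 231, NA)

# 20th percentile, ignoring the missing value
q20 <- quantile(toy_lengths, 0.20, na.rm = TRUE)

# Share of the non-missing values that fall strictly below q20.
# mean() of a logical vector is the proportion of TRUEs.
share_below <- mean(toy_lengths < q20, na.rm = TRUE)
share_below
```

For this toy vector the share is 0.2. With real data, ties and interpolation mean the share is usually close to 20% rather than exactly 20%.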
Typing na.rm = TRUE over and over gets tedious, so let's remove the missing data all at once.
clean_flipper_length <- na.omit(penguins$flipper_length_mm)
# na.omit() on the whole data frame drops every row that has a missing value in any column
clean_penguins <- na.omit(penguins)
quantile(clean_flipper_length, 0.2)
## 20%
## 188
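Calling na.omit() on a single column and on a whole data frame can remove different numbers of observations. Here is a minimal sketch on a made-up data frame (toy and its columns are hypothetical, chosen only to show the difference).

```r
# Hypothetical toy data frame (NOT the penguins dataset)
toy <- data.frame(
  flipper = c(181, NA, 195, 210),
  sex     = c("female", "male", NA, "male")
)

# On a vector: only the missing flipper value is dropped
clean_flipper <- na.omit(toy$flipper)
length(clean_flipper)

# On a data frame: any row with an NA in ANY column is dropped,
# so the row with a missing sex is removed as well
clean_toy <- na.omit(toy)
nrow(clean_toy)
```

Three flipper values survive in the vector, but only two complete rows survive in the data frame, so the two cleaned objects are not guaranteed to describe the same set of observations.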
# the 25th, 50th, and 75th percentiles are wanted so often that quantile() returns them by default,
# along with the minimum (0%) and the maximum (100%):
quantile(clean_flipper_length)
## 0% 25% 50% 75% 100%
## 172 190 197 213 231
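quantile() also accepts a vector of probabilities, which is handy when only some percentiles are needed. The sketch below uses a made-up vector (toy_values is hypothetical) and also shows IQR(), which returns the distance between the 25th and 75th percentiles.

```r
# Hypothetical toy values (NOT from the penguins dataset)
toy_values <- c(10, 20, 30, 40, 50)

# Ask for several percentiles in one call
quartiles <- quantile(toy_values, c(0.25, 0.50, 0.75))
quartiles

# Interquartile range: 75th percentile minus 25th percentile
spread <- IQR(toy_values)
spread
```

For these five values the quartiles are 20, 30, and 40, so the interquartile range is 20.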
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.7 ✔ dplyr 1.0.9
## ✔ tidyr 1.2.0 ✔ stringr 1.4.0
## ✔ readr 2.1.2 ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
# boxplot for flipper length
ggplot(penguins, aes(y = flipper_length_mm)) + geom_boxplot()
## Warning: Removed 2 rows containing non-finite values (stat_boxplot).
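To see the numbers behind a boxplot, base R's boxplot.stats() returns the whisker ends, the box edges (hinges), the median, and any points flagged as outliers. This is a sketch on a made-up vector (toy_box is hypothetical). It follows base R's rule, and as far as I know geom_boxplot() uses the same 1.5 x IQR whisker convention, though its box edges can differ slightly because it computes them with quantile().

```r
# Hypothetical toy values with one unusually large point (NOT penguin data)
toy_box <- c(10, 20, 30, 40, 50, 200)

box_numbers <- boxplot.stats(toy_box)

# lower whisker, lower hinge, median, upper hinge, upper whisker
box_numbers$stats

# points farther than 1.5 times the box height from the box are drawn individually
box_numbers$out
```

Here the value 200 is flagged as an outlier, so the upper whisker stops at 50, the largest value that is not an outlier.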
# to get a subset of your data use the filter() function
biscoe_penguins <- filter(penguins, island == "Biscoe")
# all penguins from biscoe whose flipper length is greater than 213
big_flipper_biscoe <- filter(penguins, island == "Biscoe" & flipper_length_mm > 213)
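# all penguins that are from Biscoe or have a flipper length greater than 213 (or both)
big_flipper_or_biscoe <- filter(penguins, island == "Biscoe" | flipper_length_mm > 213)
# all penguins that are not from Biscoe
other_islands <- filter(penguins, island != "Biscoe")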
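The difference between & (and), | (or), and != (not equal) is easiest to see on a tiny example, and nrow() counts how many rows a filter keeps. The data frame below is made up for illustration (toy_birds and its columns are hypothetical). Note that filter() drops rows where the condition evaluates to NA.

```r
library(dplyr)

# Hypothetical toy data frame (NOT the penguins dataset)
toy_birds <- data.frame(
  island     = c("Biscoe", "Biscoe", "Dream", "Torgersen", "Dream"),
  flipper_mm = c(220, 190, 215, 185, NA)
)

# Both conditions must hold
n_both <- nrow(filter(toy_birds, island == "Biscoe" & flipper_mm > 213))

# At least one condition must hold; the Dream row with a missing
# flipper length is dropped because FALSE | NA is NA
n_either <- nrow(filter(toy_birds, island == "Biscoe" | flipper_mm > 213))

# Everything that is not from Biscoe
n_not_biscoe <- nrow(filter(toy_birds, island != "Biscoe"))

c(both = n_both, either = n_either, not_biscoe = n_not_biscoe)
```

In this toy data one row satisfies both conditions, three satisfy at least one, and three are not from Biscoe.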
Find the value of body mass for which \(30 \%\) of penguins are below that mass.
Find the value of body mass for which \(60 \%\) of penguins are above that mass.
Make a box plot for flipper length.
Make a boxplot for body mass where each island is separated into its own boxplot.
How many penguins in the study are not from Torgersen?
How many penguins in the study have a flipper length less than \(190\) mm and a body mass greater than \(4700\) g?