Today’s Agenda

  • Packages
  • Tables

Packages

Installing Packages

  • We can install packages from the console, in the bottom left panel

  • Once the package is installed it will live in your JupyterHub, ready to be used any time you want

  • Install the package palmerpenguins now by typing install.packages("palmerpenguins") in the console

  • Inside this package is a data frame called penguins

  • Type penguins in the console to load it…

  • You should receive a message: “Error: object ‘penguins’ not found”

  • That’s because you’ve only installed the package and have not yet loaded it into your workspace.
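
The difference between installed and loaded can be checked from the console. The sketch below is only an illustration: it uses the tools package as a stand-in (an assumption, chosen because it ships with R but is not loaded by default) so that it runs even where palmerpenguins is missing. The variable names are made up for this example.

```r
# "tools" ships with every R installation but is not attached by default;
# it stands in for palmerpenguins so the example runs anywhere
pkg <- "tools"
search_name <- paste0("package:", pkg)

# Installed: R can find the package in one of your libraries
is_installed <- requireNamespace(pkg, quietly = TRUE)

# Attached: the package is on the search path, so its objects can be used by name
attached_before <- search_name %in% search()

# library() attaches the package for the rest of the session
library(pkg, character.only = TRUE)
attached_after <- search_name %in% search()

c(installed = is_installed, before = attached_before, after = attached_after)
```

In a fresh session the package should show as installed throughout, but attached only after the library() call. Swapping "tools" for "palmerpenguins" runs the same check on the package from these notes.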

Using a Package

  • We don’t want to load every package that we’ve ever installed; this could slow down our computations significantly
  • Instead, we load a package into our workspace each time we need it; use the following code to do so: library(palmerpenguins)
  • Now let’s try to look at the penguins data set again by typing penguins into the console
  • Now let’s add a code chunk to do the same (that way it is in our notes)
library(palmerpenguins)
penguins
## # A tibble: 344 × 8
##    species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
##    <fct>   <fct>              <dbl>         <dbl>             <int>       <int>
##  1 Adelie  Torgersen           39.1          18.7               181        3750
##  2 Adelie  Torgersen           39.5          17.4               186        3800
##  3 Adelie  Torgersen           40.3          18                 195        3250
##  4 Adelie  Torgersen           NA            NA                  NA          NA
##  5 Adelie  Torgersen           36.7          19.3               193        3450
##  6 Adelie  Torgersen           39.3          20.6               190        3650
##  7 Adelie  Torgersen           38.9          17.8               181        3625
##  8 Adelie  Torgersen           39.2          19.6               195        4675
##  9 Adelie  Torgersen           34.1          18.1               193        3475
## 10 Adelie  Torgersen           42            20.2               190        4250
## # ℹ 334 more rows
## # ℹ 2 more variables: sex <fct>, year <int>
  • If you run the code it works
  • Now try to knit.

Data frame basics

  • The penguins data set exists and we have access to it, but it is still not in our environment.
# putting the penguin data set into our environment (saving it locally)
df_1 <- penguins

You can name a data frame anything you want:

# saving the penguin data frame again but with a different name
penguins_df <- penguins
# it is best practice to use descriptive names for things that you save to your environment
# names can't contain spaces or most special characters; stick to letters, numbers, and underscores
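
The naming rules in the comments above can be checked with base R's make.names(), which turns any string into a syntactically valid name. A string that comes back unchanged was already valid. The candidate names below are made up for illustration.

```r
# Hypothetical candidate names, some valid and some not
candidates <- c("penguins_df", "penguin 3", "3a", "mass-g")

# make.names() repairs invalid names; a name returned unchanged was already valid
repaired <- make.names(candidates)
is_valid <- repaired == candidates

# Show each candidate next to its repaired version
data.frame(candidates, repaired, is_valid)
```

Only penguins_df passes: the others contain a space, start with a number, or contain a hyphen.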

Accessing particular elements in your data frame is kind of like playing Battleship:

# this code will get the fifth row and third column of the penguins data frame
penguins_df[5,3]
## # A tibble: 1 × 1
##   bill_length_mm
##            <dbl>
## 1           36.7
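
The same row-then-column pattern works on any data frame. Here is a minimal sketch on a small made-up data frame (toy and its columns are hypothetical, not part of palmerpenguins), so it runs without installing anything.

```r
# A small made-up data frame so the example runs without any packages
toy <- data.frame(
  name   = c("a", "b", "c"),
  height = c(10.5, 12.0, 9.8),
  weight = c(100L, 120L, 95L)
)

# Row first, then column: second row, third column
by_position <- toy[2, 3]

# The column can also be picked by name instead of position
by_name <- toy[2, "weight"]

by_position
by_name
```

Note that a base data frame like toy returns a single cell as a plain value, while the penguins tibble prints a 1 × 1 table instead.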

To extract all of one row or column, leave the other value blank:

# this code extracts all the data for the third penguin
penguins_df[3,]
## # A tibble: 1 × 8
##   species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
##   <fct>   <fct>              <dbl>         <dbl>             <int>       <int>
## 1 Adelie  Torgersen           40.3            18               195        3250
## # ℹ 2 more variables: sex <fct>, year <int>
# we can also save this to our environment
# variable names can't start with a number, so the next line would cause an error if uncommented
# 3a <- penguins_df[3,]
penguin_3 <- penguins_df[3,]
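
Leaving one side of the comma blank works for rows and for columns. The sketch below again uses a made-up data frame (toy is hypothetical) so it is self-contained.

```r
# A small made-up data frame standing in for penguins_df
toy <- data.frame(
  name   = c("a", "b", "c"),
  height = c(10.5, 12.0, 9.8),
  weight = c(100L, 120L, 95L)
)

# Blank after the comma: every column of the third row
third_row <- toy[3, ]

# Blank before the comma: every row of the second column
second_col <- toy[, 2]

third_row
second_col
```

With a base data frame the single column comes back as a plain vector; a tibble such as penguins keeps it as a one-column table.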

You can obtain a column in the same way, by leaving the row value blank. However, you often know the name of a column without knowing its position. For this, you can use the $ operator:

# the following code extracts the column with body_mass_g
#penguins_df$body_mass_g
penguin_mass <- penguins_df$body_mass_g
# notice that saving it to a variable doesn't display it when you knit
# to display the values, call the variable by name
penguin_mass
##   [1] 3750 3800 3250   NA 3450 3650 3625 4675 3475 4250 3300 3700 3200 3800 4400
##  [16] 3700 3450 4500 3325 4200 3400 3600 3800 3950 3800 3800 3550 3200 3150 3950
##  [31] 3250 3900 3300 3900 3325 4150 3950 3550 3300 4650 3150 3900 3100 4400 3000
##  [46] 4600 3425 2975 3450 4150 3500 4300 3450 4050 2900 3700 3550 3800 2850 3750
##  [61] 3150 4400 3600 4050 2850 3950 3350 4100 3050 4450 3600 3900 3550 4150 3700
##  [76] 4250 3700 3900 3550 4000 3200 4700 3800 4200 3350 3550 3800 3500 3950 3600
##  [91] 3550 4300 3400 4450 3300 4300 3700 4350 2900 4100 3725 4725 3075 4250 2925
## [106] 3550 3750 3900 3175 4775 3825 4600 3200 4275 3900 4075 2900 3775 3350 3325
## [121] 3150 3500 3450 3875 3050 4000 3275 4300 3050 4000 3325 3500 3500 4475 3425
## [136] 3900 3175 3975 3400 4250 3400 3475 3050 3725 3000 3650 4250 3475 3450 3750
## [151] 3700 4000 4500 5700 4450 5700 5400 4550 4800 5200 4400 5150 4650 5550 4650
## [166] 5850 4200 5850 4150 6300 4800 5350 5700 5000 4400 5050 5000 5100 4100 5650
## [181] 4600 5550 5250 4700 5050 6050 5150 5400 4950 5250 4350 5350 3950 5700 4300
## [196] 4750 5550 4900 4200 5400 5100 5300 4850 5300 4400 5000 4900 5050 4300 5000
## [211] 4450 5550 4200 5300 4400 5650 4700 5700 4650 5800 4700 5550 4750 5000 5100
## [226] 5200 4700 5800 4600 6000 4750 5950 4625 5450 4725 5350 4750 5600 4600 5300
## [241] 4875 5550 4950 5400 4750 5650 4850 5200 4925 4875 4625 5250 4850 5600 4975
## [256] 5500 4725 5500 4700 5500 4575 5500 5000 5950 4650 5500 4375 5850 4875 6000
## [271] 4925   NA 4850 5750 5200 5400 3500 3900 3650 3525 3725 3950 3250 3750 4150
## [286] 3700 3800 3775 3700 4050 3575 4050 3300 3700 3450 4400 3600 3400 2900 3800
## [301] 3300 4150 3400 3800 3700 4550 3200 4300 3350 4100 3600 3900 3850 4800 2700
## [316] 4500 3950 3650 3550 3500 3675 4450 3400 4300 3250 3675 3325 3950 3600 4050
## [331] 3350 3450 3250 4050 3800 3525 3950 3650 3650 4000 3400 3775 4100 3775
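
What $ hands back is a plain vector with one value per row. A minimal sketch on a made-up data frame (toy is hypothetical) shows this, along with the equivalent double-bracket form from base R.

```r
# A small made-up data frame standing in for penguins_df
toy <- data.frame(
  species     = c("Adelie", "Gentoo", "Chinstrap"),
  body_mass_g = c(3750L, 5000L, NA)
)

# Extract the column by name with $
by_dollar <- toy$body_mass_g

# Double brackets with the name in quotes give the same vector
by_brackets <- toy[["body_mass_g"]]

# One value per row, missing values included
length(by_dollar) == nrow(toy)
identical(by_dollar, by_brackets)
```

The double-bracket form is handy when the column name is stored in a variable, since $ only accepts the name typed out directly.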