Tutorial

Slicing Data by Position (row number)

There are times when it is helpful to have just a slice of your data, often to peruse it as a quality check on your data wrangling. This is when the head() (first 6 observations) and tail() (last 6 observations) functions are often used. The slice family of functions similarly slices rows from your dataset by position, but is not limited to 6 rows. Slice functions can pick out any set of N contiguous rows, slice a proportion of rows, and pick out the N rows (or a proportion of rows) with the highest or lowest values for a particular variable.
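As a rough illustration of the slice family, here is a minimal sketch on a small made-up data frame. The names toy, id, and value are hypothetical and are not part of any of the tutorial datasets.

```r
library(dplyr)

# Hypothetical toy data: 20 rows with an id and a made-up measurement.
# All values are distinct, so there are no ties to worry about here.
toy <- tibble(
  id = 1:20,
  value = c(12, 7, 30, 18, 25, 3, 41, 9, 16, 22,
            35, 5, 28, 14, 39, 1, 20, 33, 11, 27)
)

first_six <- toy %>% slice_head(n = 6)             # same rows as head(toy)
last_six  <- toy %>% slice_tail(n = 6)             # same rows as tail(toy)
rows_5_10 <- toy %>% slice(5:10)                   # contiguous rows by position
top_three <- toy %>% slice_max(value, n = 3)       # rows with the 3 highest values
low_quart <- toy %>% slice_min(value, prop = 0.25) # lowest 25% of rows (5 of 20)
```

The exercises below use the same functions on the tutorial datasets.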

Exercise 1

Write the R code required to slice the smartpill dataset to give you rows 22 to 95, the participants studied after you switched to the newer model of smartpill (the switch came after study participant 21). Check the number of rows when you are done.

smartpill %>% 
  slice(-- : --)

smartpill %>% 
  slice(22:95)
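The exercise also asks you to check the number of rows. One way to do this is to pass the sliced result to nrow(). The sketch below uses a hypothetical stand-in data frame (demo, with an assumed 95 rows) rather than the real smartpill data, so the names are illustrative only.

```r
library(dplyr)

# Hypothetical stand-in for smartpill: assumed 95 rows, one per participant.
demo <- tibble(participant = 1:95)

newer_model <- demo %>%
  slice(22:95)

# Rows 22 through 95 inclusive: 95 - 22 + 1 = 74 rows.
n_rows <- nrow(newer_model)
n_rows
```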

Exercise 2

Write the R code required to filter the cmv dataset to the rows containing the 10 highest values for time_to_transplant in months.

cmv %>% 
  ----(time_to_transplant, ---)

cmv %>% 
  slice_max(time_to_transplant, ---)

cmv %>% 
  slice_max(time_to_transplant, n = 10)
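One thing to watch for: by default, slice_max() and slice_min() keep tied values (with_ties = TRUE), so asking for n rows can return more than n. A minimal sketch on made-up data (toy, id, and months are hypothetical names):

```r
library(dplyr)

# Hypothetical data with a three-way tie at the second-highest value.
toy <- tibble(id = 1:6, months = c(10, 8, 8, 8, 5, 2))

# Default behaviour keeps all tied rows: the 10 plus all three 8s.
with_ties_kept <- toy %>%
  slice_max(months, n = 2)

# with_ties = FALSE returns exactly n rows, breaking ties by row order.
exactly_two <- toy %>%
  slice_max(months, n = 2, with_ties = FALSE)
```

If you need exactly 10 rows regardless of ties, adding with_ties = FALSE is one option.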

Exercise 3

Write the R code required to filter the smartpill dataset to the rows containing the shortest decile of whole gut transit times (wg_time) in hours.

smartpill %>% 
  --(wg_time, ---)

smartpill %>% 
  slice_min(wg_time, prop = )

smartpill %>% 
  slice_min(wg_time, prop = 0.1)
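When the number of rows is not a multiple of 10, prop = 0.1 does not give a fractional row: the dplyr documentation states that prop is rounded towards zero to get a whole number of rows. A small sketch on made-up data (toy, id, and hours are hypothetical names):

```r
library(dplyr)

# Hypothetical data: 25 rows with distinct values, so ties play no role.
toy <- tibble(id = 1:25, hours = seq(10, 58, by = 2))

# 0.1 * 25 = 2.5, which is rounded towards zero, giving 2 rows.
shortest_decile <- toy %>%
  slice_min(hours, prop = 0.1)

nrow(shortest_decile)
```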

Exercise 4

Write the R code required to filter for the last 35 rows of the supra dataset.

supra %>% 
  ---(---)

supra %>% 
  slice_tail(n = 35)