Filtering Numeric Variables
We will start with the prostate dataset, seen here:
## Rows: 316
## Columns: 20
## $ rbc_age_group <dbl> 3, 3, 3, 2, 2, 3, 3, 1, 1, 2, 2, 1, 1, 1, 3, 1, 1, …
## $ median_rbc_age <dbl> 25, 25, 25, 15, 15, 25, 25, 10, 10, 15, 15, 10, 10,…
## $ age <dbl> 72.1, 73.6, 67.5, 65.8, 63.2, 65.4, 65.5, 67.1, 63.…
## $ aa <dbl> 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, …
## $ fam_hx <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ p_vol <dbl> 54.0, 43.2, 102.7, 46.0, 60.0, 45.9, 42.6, 40.7, 45…
## $ t_vol <dbl> 3, 3, 1, 1, 2, 2, 2, 3, 2, 2, 1, 3, 2, 2, 2, 2, 1, …
## $ t_stage <dbl> 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, NA,…
## $ b_gs <dbl> 3, 2, 3, 1, 2, 1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 2, 1, …
## $ bn <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ organ_confined <dbl> 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, …
## $ preop_psa <dbl> 14.08, 10.50, 6.98, 4.40, 21.40, 5.10, 6.03, 8.70, …
## $ preop_therapy <dbl> 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, …
## $ units <dbl> 6, 2, 1, 2, 3, 1, 2, 4, 1, 2, 2, 2, 2, 4, 2, 4, 5, …
## $ s_gs <dbl> 1, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 1, 3, 1, 2, …
## $ any_adj_therapy <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ adj_rad_therapy <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ recurrence <dbl> 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ censor <dbl> 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ time_to_recurrence <dbl> 2.67, 47.63, 14.10, 59.47, 1.23, 74.70, 13.87, 8.37…
Exercise 1
Write the R code required to filter the prostate dataset to rows with a prostate volume (p_vol) greater than or equal to 90:
prostate %>%
  filter(-- >= --)
prostate %>%
  filter(p_vol >= 90)
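To see what the comparison does row by row, here is a minimal sketch on a small made-up data frame. The `toy` tibble and its values are invented for illustration and are not taken from prostate; the sketch assumes dplyr is installed. filter() keeps only rows where the condition is TRUE, so the row with a missing p_vol is dropped along with the rows below 90.

```r
library(dplyr)

# Hypothetical toy data standing in for prostate (values are made up)
toy <- tibble(
  id    = 1:5,
  p_vol = c(54.0, 90.0, 102.7, NA, 89.9)
)

# Keep rows with p_vol of 90 or more; the NA row is not kept
big <- toy %>%
  filter(p_vol >= 90)

big$id
```

Because the operator is >= rather than >, the row with p_vol exactly 90 is retained.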
Exercise 2
Write the R code required to filter the prostate dataset to rows with a family history (fam_hx) of prostate cancer.
Hint: a test for equality uses two equals signs (==), not one.
prostate %>%
  select(age, t_vol, fam_hx) %>%
  filter(fam_hx --)
prostate %>%
  select(age, t_vol, fam_hx) %>%
  filter(fam_hx == 1)
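The equality test can be tried on a small made-up data frame. The `toy` tibble below is hypothetical, and the sketch assumes fam_hx is coded 1 for a family history and 0 otherwise, as the solution above implies. The expression `fam_hx == 1` produces one TRUE or FALSE per row, and filter() keeps the TRUE rows.

```r
library(dplyr)

# Hypothetical toy data; fam_hx is assumed to be coded 0 = no, 1 = yes
toy <- tibble(
  age    = c(72.1, 63.2, 65.5, 67.1),
  fam_hx = c(0, 1, 0, 1)
)

# The condition on its own is just a logical vector, one value per row
has_hx <- toy$fam_hx == 1

# filter() keeps the rows where that vector is TRUE
with_hx <- toy %>%
  filter(fam_hx == 1)
```

A single = inside filter() is read as a named argument rather than a comparison, and recent versions of dplyr stop with an error suggesting == instead.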
Exercise 3
Write the R code required to filter the prostate dataset to rows with a preoperative PSA (preop_psa) near 12 (within 1).
prostate %>%
  select(age, aa, preop_psa) %>%
  filter(preop_psa)
prostate %>%
  select(age, aa, preop_psa) %>%
  filter(near(preop_psa, 12, tol = 1))
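It can help to spell out what near() computes. As a rough sketch on made-up values (the `toy` tibble is hypothetical), near(x, y, tol) behaves like `abs(x - y) < tol`, so the two filters below should select the same rows. The comparison is strict, which means values exactly 1 away from 12 (here 11.0 and 13.0) are not kept.

```r
library(dplyr)

# Hypothetical toy data (values are made up)
toy <- tibble(preop_psa = c(10.5, 11.0, 11.2, 12.0, 12.9, 13.0, 14.08))

# Using near() with a tolerance of 1
via_near <- toy %>%
  filter(near(preop_psa, 12, tol = 1))

# The same condition written out by hand
via_abs <- toy %>%
  filter(abs(preop_psa - 12) < 1)
```

If the endpoints 11 and 13 should be included, between(preop_psa, 11, 13) is one alternative.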
Exercise 4
Write the R code required to filter the prostate dataset to rows where age is exactly 60, 63, or 69.
prostate %>%
  select(age, preop_psa, fam_hx) %>%
  filter()
prostate %>%
  select(age, preop_psa, fam_hx) %>%
  filter(age %in% c(60, 63, 69))
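The %in% operator is shorthand for a chain of equality tests joined with | (or). A minimal sketch on made-up ages (the `toy` tibble is hypothetical) shows the two forms selecting the same rows.

```r
library(dplyr)

# Hypothetical toy data (values are made up)
toy <- tibble(age = c(60, 60.4, 63, 65.8, 69, 72.1))

# Membership test against a set of values
via_in <- toy %>%
  filter(age %in% c(60, 63, 69))

# The same condition written as chained equality tests
via_or <- toy %>%
  filter(age == 60 | age == 63 | age == 69)
```

Matching is exact, so an age of 60.4 is not kept. Since age in prostate is recorded with decimals, only rows whose age is exactly 60, 63, or 69 will match.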
Exercise 5
Write the R code required to filter the prostate dataset to rows with preop_psa between 9 and 11, inclusive.
prostate %>%
  select(age, preop_psa, fam_hx) %>%
  filter()
prostate %>%
  select(age, preop_psa, fam_hx) %>%
  filter(between(preop_psa, 9, 11))
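between() includes both endpoints, so it is equivalent to combining >= and <= with & (and). The sketch below checks this on made-up values; the `toy` tibble is hypothetical.

```r
library(dplyr)

# Hypothetical toy data (values are made up)
toy <- tibble(preop_psa = c(8.7, 9, 10.5, 11, 11.01))

# Inclusive range test
via_between <- toy %>%
  filter(between(preop_psa, 9, 11))

# The same condition written with two comparisons
via_and <- toy %>%
  filter(preop_psa >= 9 & preop_psa <= 11)
```

Rows with preop_psa of exactly 9 or exactly 11 are kept, while 8.7 and 11.01 fall outside the range.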