- fundamentals

na_if() replaces a specified value in a column with NA.

df %>%
  filter(StockCode == "17011F") %>%
  mutate(New_Country = na_if(Country, "Unspecified"), .keep = "used")
A tibble: 14 x 2
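Because df itself is not reproduced here, the same call can be tried on a small stand-in. This is only a sketch: `toy` is a hypothetical tibble whose column names mirror the ones used above, and its values are made up.

```r
library(dplyr)

# Hypothetical stand-in for df: same column names, invented values
toy <- tibble(
  StockCode = c("17011F", "17011F", "22138"),
  Country   = c("United Kingdom", "Unspecified", "France")
)

toy_result <- toy %>%
  filter(StockCode == "17011F") %>%
  # .keep = "used" retains only Country (the input) and New_Country (the output)
  mutate(New_Country = na_if(Country, "Unspecified"), .keep = "used")

toy_result
```

Only the row holding "Unspecified" should end up with NA in New_Country; the other values are left untouched.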

It works by element-wise matching: wherever the two vectors hold the same value at the same position, that value in the first vector becomes NA.

na_if(c(1, 2, 3, 4, 5), 
      c(5, 4, 3, 2, 1))
## [1]  1  2 NA  4  5
na_if(c(1, 2, 2, 3), 
      c(3, 2, 2, 1))
## [1]  1 NA NA  3
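For vectors without missing values, the matching described above can be reproduced by hand in base R. The sketch below is an illustration of the idea, not na_if()'s actual implementation; `by_hand` is a name introduced here.

```r
library(dplyr)

x <- c(1, 2, 2, 3)
y <- c(3, 2, 2, 1)

# Set to NA the positions where the two vectors agree
by_hand <- replace(x, x == y, NA)

# Compare the manual version with na_if()
same_result <- identical(na_if(x, y), by_hand)
same_result
```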

The documentation states that the second vector (y) is cast to the type of the first (x). In my tests this worked for the simple conversion between TRUE/FALSE and 1/0, but not between integers and character strings.

na_if(0:1, c(FALSE, TRUE))
## [1] NA NA
na_if(c(FALSE, TRUE), 0:1)
## [1] NA NA
na_if(as.character(1:5), 1:5)
## Error in `na_if()`:
## ! Can't convert `y` <integer> to match type of `x` <character>.
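Another cast that should go through is a whole-number double compared against an integer vector, since nothing is lost in the conversion. A minimal sketch, assuming a recent dplyr (1.1.0 or later, where the cast is handled by vctrs); `int_result` is a name introduced here:

```r
library(dplyr)

# x is an integer vector, y is a double that holds a whole number
int_result <- na_if(1:3, 2)
int_result
```

A fractional value such as 2.5 cannot be converted to an integer without loss, so I would expect that call to fail rather than silently return x.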

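The example at the start works because the single value “Unspecified” is recycled to the number of rows of the filtered df. Only vectors of length 1 are recycled, though, so to substitute several values in a data frame column, the vector passed to na_if() must be as long as the column. The comparison is still made position by position: below, the first row becomes NA only if it holds “United Kingdom”, and the second only if it holds “France”.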

df %>%
  slice(71:72) %>%
  mutate(New_Country = na_if(Country, c("United Kingdom", "France")), .keep = "used")
A tibble: 2 x 2
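When the goal is to blank out several values wherever they occur, regardless of position, na_if() is not the right fit. One possible alternative, sketched here with base R's replace() and %in% on a made-up tibble (`toy`, `unwanted` and `multi` are hypothetical names):

```r
library(dplyr)

# Hypothetical data: the values are invented for illustration
toy <- tibble(Country = c("France", "United Kingdom", "Spain", "France"))

unwanted <- c("United Kingdom", "France")

multi <- toy %>%
  # %in% tests every row against the whole set, so lengths need not match
  mutate(New_Country = replace(Country, Country %in% unwanted, NA))

multi
```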

na_if() works with column names, not positions: a number passed as its first argument is treated as a literal value rather than as a column index.

df %>%
  mutate(Country = na_if(8, "Unspecified"))
Error in `mutate()`:
ℹ In argument: `Country = na_if(8, "Unspecified")`.
Caused by error in `na_if()`:
! Can't convert `y` <character> to match type of `x` <double>.
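If a column really has to be addressed by position, one option is to let across() do the selection, since it accepts tidyselect positions. A minimal sketch on a hypothetical two-column tibble (`toy` and `by_position` are names introduced here):

```r
library(dplyr)

# Hypothetical data: Country is the second column
toy <- tibble(
  StockCode = c("17011F", "22138"),
  Country   = c("Unspecified", "France")
)

by_position <- toy %>%
  # 2 is interpreted by across() as "the second column"
  mutate(across(2, ~ na_if(.x, "Unspecified")))

by_position
```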

We can, however, pass an expression to it instead of a bare column.

df %>%
  mutate(Price_Rank = na_if(min_rank(Price), 1)) %>%
  count(Price_Rank) %>%
  arrange(!is.na(Price_Rank))
A tibble: 1606 x 2

- with group_by()

Grouping a data frame doesn’t change how na_if() works.

df %>%
  group_by(Country) %>%
  mutate(Country = na_if(Country, "Unspecified"), .keep = "used")
A tibble: 525461 x 1
Groups: Country [40]
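na_if() itself behaves the same on grouped data, but, as with any call inside mutate(), an expression passed to it is evaluated once per group. The sketch below uses made-up data to replace each group's highest price with NA; `toy`, `per_group` and `Price_No_Max` are hypothetical names.

```r
library(dplyr)

# Hypothetical data: two countries with two prices each
toy <- tibble(
  Country = c("France", "France", "Spain", "Spain"),
  Price   = c(1, 5, 2, 3)
)

per_group <- toy %>%
  group_by(Country) %>%
  # max(Price) is computed within each country, then recycled inside the group
  mutate(Price_No_Max = na_if(Price, max(Price))) %>%
  ungroup()

per_group
```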