Given a dataset and a set of one or more variables of interest (e.g. variables that are potential indirect identifiers), the function returns the minimum observed value of k in the dataset, i.e. the number of observations sharing the rarest combination of values of the identifying variables.
Arguments
- x
A data frame
- vars
A character vector containing the name(s) of the variable(s) in x to be included in the k-anonymity calculation
Value
The minimum observed value of k in the dataset, corresponding to the unique combination of identifying variables with the fewest observations
Examples
# read example dataset
path_data <- system.file("extdata", package = "datadict")
dat <- readxl::read_xlsx(file.path(path_data, "linelist_cleaned.xlsx"))
# find minimum observed k for potential indirect identifiers gender and age_cat
k_anonymity(dat, vars = c("gender", "age_cat"))
#> [1] 1
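A result of 1 means that at least one combination of the selected variables occurs only once in the data. The following is a minimal base R sketch of the kind of calculation described above, not the package's actual implementation. The `toy` data frame and its values are hypothetical, and missing values are simply treated as their own category here, which may differ from how k_anonymity() handles them.

```r
# Toy data frame (hypothetical values, not taken from the package's example data)
toy <- data.frame(
  gender  = c("f", "f", "m", "m", "m"),
  age_cat = c("0-4", "0-4", "0-4", "5-14", "5-14"),
  stringsAsFactors = FALSE
)

# Variables treated as potential indirect identifiers
vars <- c("gender", "age_cat")

# Build one key per row from the selected columns, then count rows per key
keys <- do.call(paste, c(toy[vars], sep = "\r"))
counts <- table(keys)

# The smallest group size is the minimum observed k
min_k <- min(counts)
print(min_k)
#> [1] 1
```

Here the combination gender = "m", age_cat = "0-4" occurs once, so the minimum observed k is 1, while the other two combinations each occur twice.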