valid_dict() runs the following checks on a data dictionary:

  • contains required columns (variable_name, short_label, type, choices, origin, status)

  • required columns complete (no missing values)

  • no duplicated values in column variable_name

  • no invalid values in columns type, origin, status, indirect_identifier

  • for coded-list type variables:

    • no missing choices

    • no incorrectly formatted choices (expected format is "value1, Label 1 | value2, Label 2 | ...")
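
To make two of the checks above concrete, here is a minimal base R sketch of the duplicate-name check and the choices-format check. It is an illustration only, not the package's implementation: the helper choices_well_formed(), the toy data frame, and the is_coded flag are all hypothetical. In a real dictionary, coded-list rows are identified through the type column, whose valid values are not reproduced here.

```r
# Hypothetical helper (not part of the package): TRUE when a choices string
# follows the documented format "value1, Label 1 | value2, Label 2 | ..."
choices_well_formed <- function(x) {
  # missing or empty choices are never well formed
  if (is.na(x) || !nzchar(trimws(x))) return(FALSE)
  # split on the pipe separator between individual choices
  parts <- trimws(strsplit(x, "|", fixed = TRUE)[[1]])
  # each choice needs a non-empty value, a comma, then a non-empty label
  all(grepl("^[^,]+,\\s*\\S.*$", parts))
}

# Toy dictionary fragment, invented for illustration
toy <- data.frame(
  variable_name = c("sex", "age", "sex"),
  choices = c("m, Male | f, Female", NA, "m Male | f Female"),
  stringsAsFactors = FALSE
)

# Assumption: rows 1 and 3 are coded-list variables
is_coded <- c(TRUE, FALSE, TRUE)

# Duplicated values in column variable_name
dup_names <- unique(toy$variable_name[duplicated(toy$variable_name)])

# Coded-list rows whose choices are missing or incorrectly formatted
well_formed <- unname(vapply(toy$choices, choices_well_formed, logical(1)))
bad_choices <- is_coded & !well_formed
```

In this toy example, dup_names holds the repeated name and bad_choices flags only the third row, whose choices lack the comma between value and label. The uncoded row is not flagged even though its choices are missing.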

Usage

valid_dict(dict, verbose = TRUE)

Arguments

dict

A data frame representing the data dictionary to validate

verbose

Logical indicating whether to issue a warning describing the checks that failed. Defaults to TRUE.

Value

TRUE if all checks pass, FALSE if any check fails

Examples

# read example dataset
path_data <- system.file("extdata", package = "datadict")
dat <- readxl::read_xlsx(file.path(path_data, "linelist_cleaned.xlsx"))

# generate data dictionary template from dataset
dict <- dict_from_data(dat, factor_values = "string")

# dictionary column 'indirect_identifier' must be manually specified (yes/no)
dict$indirect_identifier <- "no"

# check for validity
valid_dict(dict)
#> [1] TRUE