Separate a data frame column containing hierarchical codes into multiple columns, one for each level within the hierarchical code.

Like tidyr::separate, except that successive levels are cumulative rather than independent. For example, the code "canada__ontario__toronto" would be split into three levels:

  1. "canada"

  2. "canada__ontario"

  3. "canada__ontario__toronto"
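
The cumulative split listed above can be reproduced for a single code in base R. This is only an illustrative sketch: cumulative_levels is a hypothetical helper, not part of the package, and it does not claim to mirror how separate_hcode is implemented.

```r
# Hypothetical helper (not part of the package): return the cumulative
# levels of a single hierarchical code.
cumulative_levels <- function(code, sep = "__") {
  # Split on the literal separator, e.g. "canada", "ontario", "toronto"
  parts <- strsplit(code, sep, fixed = TRUE)[[1]]
  # Level i is the first i parts pasted back together with the separator
  vapply(
    seq_along(parts),
    function(i) paste(parts[seq_len(i)], collapse = sep),
    character(1)
  )
}

cumulative_levels("canada__ontario__toronto")
#> [1] "canada"                   "canada__ontario"
#> [3] "canada__ontario__toronto"
```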

Usage

separate_hcode(
  x,
  col,
  into,
  sep = "__",
  extra = c("warn", "drop"),
  remove = FALSE
)

Arguments

x

A data.frame containing a column with hierarchical codes.

col

Name of the column within x containing hierarchical codes.

into

Vector of names for the new columns that col is separated into, one per level.

sep

Separator between levels in the hierarchical codes. Defaults to "__".

extra

What to do if a hierarchical code contains more levels than are implied by argument into.

  • "warn" (the default): emit a warning and drop extra values

  • "drop": drop any extra values without a warning
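
To see which codes would trigger this behaviour, the number of levels in each code can be compared with the length of into. The following is a base R sketch under the assumption that levels are counted by splitting on sep; the variable names are illustrative and the check is not taken from the package source.

```r
# Illustrative data: the third code has three levels, but `into` names only two
codes <- c("can", "can__ontario", "can__ontario__toronto")
into <- c("adm0_pcode", "adm1_pcode")

# Count the levels in each code by splitting on the literal separator
n_levels <- lengths(strsplit(codes, "__", fixed = TRUE))

# TRUE where a code has more levels than there are names in `into`
has_extra <- n_levels > length(into)
codes[has_extra]
#> [1] "can__ontario__toronto"
```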

remove

Logical indicating whether to remove col from the output. Defaults to FALSE.

Value

The original data.frame x with additional columns for each level of the hierarchical code.
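
As a rough picture of the shape of the return value, the base R sketch below appends one cumulative column per name in into and fills levels that a code does not reach with NA, which matches the output shown in the Examples. separate_sketch is a hypothetical stand-in written for illustration; it silently ignores extra levels and is not the package's implementation.

```r
# Hypothetical stand-in (not the package function): append cumulative
# level columns to a data.frame, padding missing levels with NA.
separate_sketch <- function(x, col, into, sep = "__") {
  parts <- strsplit(as.character(x[[col]]), sep, fixed = TRUE)
  for (i in seq_along(into)) {
    x[[into[i]]] <- vapply(parts, function(p) {
      # A code with fewer than i levels (or a missing code) has no level i
      if (length(p) < i || anyNA(p)) {
        NA_character_
      } else {
        paste(p[seq_len(i)], collapse = sep)
      }
    }, character(1))
  }
  x
}

df <- data.frame(pcode = c("can", "can__ontario"), stringsAsFactors = FALSE)
out <- separate_sketch(df, "pcode", c("adm0_pcode", "adm1_pcode"))
out
#>          pcode adm0_pcode   adm1_pcode
#> 1          can        can         <NA>
#> 2 can__ontario        can can__ontario
```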

Examples

data(ne_ref)

# generate pcode
ne_ref$pcode <- hcodes_str(ne_ref, pattern = "^adm\\d")

# separate pcode into constituent levels
separate_hcode(
  ne_ref,
  col = "pcode",
  into = c("adm0_pcode", "adm1_pcode", "adm2_pcode")
)
#>    level adm0         adm1         adm2 hcode                           pcode
#> 1   adm0  CAN         <NA>         <NA>   100                             can
#> 2   adm0  USA         <NA>         <NA>   200                             usa
#> 3   adm1  CAN      Ontario         <NA>   110                    can__ontario
#> 4   adm1  USA   New Jersey         <NA>   210                 usa__new_jersey
#> 5   adm1  USA     New York         <NA>   220                   usa__new_york
#> 6   adm1  USA Pennsylvania         <NA>   230               usa__pennsylvania
#> 7   adm2  CAN      Ontario       Durham   111            can__ontario__durham
#> 8   adm2  CAN      Ontario       Halton   112            can__ontario__halton
#> 9   adm2  CAN      Ontario         Peel   113              can__ontario__peel
#> 10  adm2  CAN      Ontario      Toronto   114           can__ontario__toronto
#> 11  adm2  CAN      Ontario         York   115              can__ontario__york
#> 12  adm2  USA   New Jersey       Bergen   211         usa__new_jersey__bergen
#> 13  adm2  USA   New Jersey        Essex   212          usa__new_jersey__essex
#> 14  adm2  USA   New Jersey       Hudson   213         usa__new_jersey__hudson
#> 15  adm2  USA   New Jersey    Middlesex   214      usa__new_jersey__middlesex
#> 16  adm2  USA   New Jersey     Monmouth   215       usa__new_jersey__monmouth
#> 17  adm2  USA     New York    Jefferson   222        usa__new_york__jefferson
#> 18  adm2  USA     New York        Bronx   221            usa__new_york__bronx
#> 19  adm2  USA     New York        Kings   223            usa__new_york__kings
#> 20  adm2  USA     New York       Nassau   224           usa__new_york__nassau
#> 21  adm2  USA     New York     New York   225         usa__new_york__new_york
#> 22  adm2  USA     New York       Queens   226           usa__new_york__queens
#> 23  adm2  USA     New York      Suffolk   227          usa__new_york__suffolk
#> 24  adm2  USA Pennsylvania    Allegheny   231    usa__pennsylvania__allegheny
#> 25  adm2  USA Pennsylvania        Bucks   232        usa__pennsylvania__bucks
#> 26  adm2  USA Pennsylvania      Chester   233      usa__pennsylvania__chester
#> 27  adm2  USA Pennsylvania     Delaware   234     usa__pennsylvania__delaware
#> 28  adm2  USA Pennsylvania    Jefferson   235    usa__pennsylvania__jefferson
#> 29  adm2  USA Pennsylvania    Lancaster   236    usa__pennsylvania__lancaster
#> 30  adm2  USA Pennsylvania Philadelphia   237 usa__pennsylvania__philadelphia
#> 31  adm2  USA Pennsylvania         York   238         usa__pennsylvania__york
#>    adm0_pcode        adm1_pcode                      adm2_pcode
#> 1         can              <NA>                            <NA>
#> 2         usa              <NA>                            <NA>
#> 3         can      can__ontario                            <NA>
#> 4         usa   usa__new_jersey                            <NA>
#> 5         usa     usa__new_york                            <NA>
#> 6         usa usa__pennsylvania                            <NA>
#> 7         can      can__ontario            can__ontario__durham
#> 8         can      can__ontario            can__ontario__halton
#> 9         can      can__ontario              can__ontario__peel
#> 10        can      can__ontario           can__ontario__toronto
#> 11        can      can__ontario              can__ontario__york
#> 12        usa   usa__new_jersey         usa__new_jersey__bergen
#> 13        usa   usa__new_jersey          usa__new_jersey__essex
#> 14        usa   usa__new_jersey         usa__new_jersey__hudson
#> 15        usa   usa__new_jersey      usa__new_jersey__middlesex
#> 16        usa   usa__new_jersey       usa__new_jersey__monmouth
#> 17        usa     usa__new_york        usa__new_york__jefferson
#> 18        usa     usa__new_york            usa__new_york__bronx
#> 19        usa     usa__new_york            usa__new_york__kings
#> 20        usa     usa__new_york           usa__new_york__nassau
#> 21        usa     usa__new_york         usa__new_york__new_york
#> 22        usa     usa__new_york           usa__new_york__queens
#> 23        usa     usa__new_york          usa__new_york__suffolk
#> 24        usa usa__pennsylvania    usa__pennsylvania__allegheny
#> 25        usa usa__pennsylvania        usa__pennsylvania__bucks
#> 26        usa usa__pennsylvania      usa__pennsylvania__chester
#> 27        usa usa__pennsylvania     usa__pennsylvania__delaware
#> 28        usa usa__pennsylvania    usa__pennsylvania__jefferson
#> 29        usa usa__pennsylvania    usa__pennsylvania__lancaster
#> 30        usa usa__pennsylvania usa__pennsylvania__philadelphia
#> 31        usa usa__pennsylvania         usa__pennsylvania__york