The basic join types used in the hmatch package ("left", "inner", "anti") are conceptually equivalent to dplyr's join types.
For each of the three join types there is also a counterpart prefixed by "resolve_" ("resolve_left", "resolve_inner", "resolve_anti"). In a resolve join, rows of raw with matches to multiple rows of ref are resolved either to a single best match or to no match before the corresponding basic join is applied. In a resolve join, rows of raw are never duplicated.
The exact details of match resolution vary somewhat among functions, and are explained within each function's documentation.
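As a rough illustration of the three basic join types, the sketch below reproduces their row-level behaviour in plain Python. It is not how hmatch is implemented: the example data, the exact-equality match on a column named adm1, and the helper names matches() and basic_join() are all assumptions made for the sake of the example. Its main purpose is to show how a row of raw with several matches in ref is duplicated by "left" and "inner", which is the situation that resolve joins avoid.

```python
# Illustrative only: plain-Python join semantics, not hmatch's implementation.
# The data, the "adm1" key, and the exact-equality match are hypothetical.
raw = [
    {"id": 1, "adm1": "Ontario"},  # one match in ref
    {"id": 2, "adm1": "York"},     # two matches in ref
    {"id": 3, "adm1": "Mars"},     # no match in ref
]
ref = [
    {"adm1": "Ontario", "code": "ca_on"},
    {"adm1": "York", "code": "ca_york"},
    {"adm1": "York", "code": "uk_york"},
]
REF_ONLY_COLS = ["code"]  # columns that exist only in ref


def matches(row):
    """Return every row of ref that matches the given row of raw."""
    return [r for r in ref if r["adm1"] == row["adm1"]]


def basic_join(raw, how):
    """Sketch of the 'left', 'inner', and 'anti' join types."""
    if how not in ("left", "inner", "anti"):
        raise ValueError(f"unknown join type: {how}")
    out = []
    for row in raw:
        hits = matches(row)
        if how == "anti":
            # Keep only unmatched rows, with columns from raw alone.
            if not hits:
                out.append(dict(row))
        elif hits:
            # One output row per match, so multi-match rows are duplicated.
            out.extend({**row, **hit} for hit in hits)
        elif how == "left":
            # Unmatched rows are kept, with missing values for ref columns.
            out.append({**row, **{c: None for c in REF_ONLY_COLS}})
    return out


left = basic_join(raw, "left")
inner = basic_join(raw, "inner")
anti = basic_join(raw, "anti")
print(len(left), len(inner), len(anti))  # 4 3 1
```

Here the row with id 2 appears twice in both the "left" and "inner" results, once per matching row of ref.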
Value
- left
return all rows from raw, and all columns from raw and ref. Rows in raw with no match in ref will have NA values in the new columns taken from ref. If there are multiple matches between raw and ref, all combinations of the matches are returned.
- inner
return only the rows of raw that have matches in ref, and all columns from raw and ref. If there are multiple matches between raw and ref, all combinations of the matches are returned.
- anti
return all rows from raw where there are no matches in ref, keeping just columns from raw.
- resolve_left
similar to "left", except that any row of raw that initially has multiple matches to ref is resolved to either a single 'best' match or no match. All rows of raw are returned, and rows of raw are never duplicated.
- resolve_inner
similar to "inner", except that any row of raw that initially has multiple matches to ref is resolved to either a single 'best' match or no match. Only the rows of raw that can be resolved to a single best match are returned, and rows of raw are never duplicated.
- resolve_anti
similar to "anti", except that any row of raw that initially has multiple matches to ref is considered non-matching (along with rows of raw that initially have no matches to ref), and returned as a single row. Rows of raw are never duplicated.
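The resolve variants listed above can be sketched in the same spirit. This is a minimal illustration in plain Python, not hmatch's resolution logic, which differs between functions. The example data, the score column, and the pick_best() rule (take the unique top-scoring match, otherwise no match) are hypothetical stand-ins for whatever rule a given function applies. The "resolve_anti" branch follows the description above literally: any row of raw with zero or multiple initial matches is treated as non-matching.

```python
# Illustrative only: not hmatch's resolution logic, which varies by function.
# The data, the "score" column, and pick_best() are hypothetical.
raw = [
    {"id": 1, "adm1": "Ontario"},  # one match in ref
    {"id": 2, "adm1": "York"},     # two tied matches in ref
    {"id": 3, "adm1": "Mars"},     # no match in ref
]
ref = [
    {"adm1": "Ontario", "code": "ca_on", "score": 1.0},
    {"adm1": "York", "code": "ca_york", "score": 0.5},
    {"adm1": "York", "code": "uk_york", "score": 0.5},
]
REF_ONLY_COLS = ["code", "score"]  # columns that exist only in ref


def pick_best(hits):
    """Assumed rule: the unique top-scoring match, or None on a tie or no hits."""
    if not hits:
        return None
    top = max(h["score"] for h in hits)
    best = [h for h in hits if h["score"] == top]
    return best[0] if len(best) == 1 else None


def resolve_join(raw, ref, how):
    """Sketch of 'resolve_left', 'resolve_inner', and 'resolve_anti'."""
    if how not in ("resolve_left", "resolve_inner", "resolve_anti"):
        raise ValueError(f"unknown join type: {how}")
    out = []
    for row in raw:
        hits = [r for r in ref if r["adm1"] == row["adm1"]]
        if how == "resolve_anti":
            # Zero or multiple initial matches both count as non-matching.
            if len(hits) != 1:
                out.append(dict(row))
            continue
        best = pick_best(hits)
        if best is not None:
            # At most one output row per row of raw, so no duplication.
            out.append({**row, **best})
        elif how == "resolve_left":
            # Unresolved rows are kept, with missing values for ref columns.
            out.append({**row, **{c: None for c in REF_ONLY_COLS}})
    return out


res_left = resolve_join(raw, ref, "resolve_left")
res_inner = resolve_join(raw, ref, "resolve_inner")
res_anti = resolve_join(raw, ref, "resolve_anti")
print(len(res_left), len(res_inner), len(res_anti))  # 3 1 2
```

With this data, "resolve_left" returns every row of raw exactly once, "resolve_inner" keeps only the row with id 1, and "resolve_anti" returns the rows with id 2 and id 3 once each.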