read_health_dict reads pre-formatted popdata data dictionaries into R, while health_dict_to_spec converts a health dict to a readr spec.
read_health_dict(path, sheet, ...)
health_dict_to_spec(health_dict, special = NULL)
path: Path to the xls/xlsx file.
sheet: Sheet to read. Either a string (the name of a sheet), or an integer (the position of the sheet). Ignored if the sheet is specified via range. If neither argument specifies the sheet, defaults to the first sheet.
...: Additional arguments passed to readxl::read_excel.
health_dict: A data.frame, the output of read_health_dict().
special: A named list of readr column specifications for columns where you want to override the format given in the dictionary file.
read_health_dict: A clean data.frame of the health data dictionary.
health_dict_to_spec: A named list of readr column specifications that can be passed to the col_types argument of any of the readr functions, or to dat_to_parquet() and friends.
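To illustrate how such a named list is used, the sketch below passes a hand-written list of readr collectors to the col_types argument of readr::read_fwf. The list, the column names and the two sample records are made up for this example; in practice the list would come from health_dict_to_spec().

```r
library(readr)

# Hand-written stand-in for the output of health_dict_to_spec() (assumption)
spec <- list(
  code = col_character(),
  anum = col_double()
)

# Two made-up fixed-width records: 1 character of code, then 5 of anum
path <- tempfile(fileext = ".dat")
writeLines(c("A 12.5", "B  3.0"), path)

# The named list is matched to the columns by name
dat <- read_fwf(
  path,
  col_positions = fwf_widths(c(1, 5), col_names = names(spec)),
  col_types = spec
)
```

Because the list is named, each collector is matched to the column with the same name, so the column names given in col_positions need to agree with the names in the spec.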
read_health_dict: Files are in .xlsx format and therefore require both a path and a sheet argument. The rest of the function is a thin wrapper around readxl::read_excel with some formatting applied to the result.
health_dict_to_spec converts a health dict created by read_health_dict to a readr spec. It uses the dictionary to create specifications even for date and datetime columns, and allows overriding the default column specs through the special parameter.
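The override behaviour described above can be pictured with a small sketch. This is not the package's implementation: sketch_dict_to_spec is a hypothetical helper, the miniature dictionary is made up, and only the "c" and "d" type codes are handled. It shows one plausible way to map a col_type column to readr collectors and then let special replace entries by name.

```r
library(readr)

# Made-up miniature dictionary, shaped like the output of read_health_dict()
dict <- data.frame(
  name = c("code", "date", "anum"),
  col_type = c("c", "c", "d"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, for illustration only
sketch_dict_to_spec <- function(dict, special = NULL) {
  # Map the one-letter type codes to readr collectors
  lookup <- list(c = col_character(), d = col_double())
  spec <- lapply(dict$col_type, function(x) lookup[[x]])
  names(spec) <- dict$name
  # Entries in `special` replace the dictionary-derived collectors by name
  if (!is.null(special)) {
    spec[names(special)] <- special
  }
  spec
}

spec <- sketch_dict_to_spec(dict, special = list(anum = col_integer()))
```

Columns not named in special keep the collector derived from the dictionary, which is the behaviour the special parameter is described as providing.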
dict <- read_health_dict(dipr_example("sample_hlth_dict.csv"))
#> Rows: 12 Columns: 7
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (4): Name Abbrev, Name, Data Type, Data Format
#> dbl (3): Start, Stop, Length
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
dict
#> # A tibble: 12 × 7
#>    start  stop length name     data_type data_format         col_type
#>    <dbl> <dbl>  <dbl> <chr>    <chr>     <chr>               <chr>
#>  1     1     1      1 code     char      NA                  c
#>  2     2     9      8 date     date      ccyymmdd            c
#>  3    10    14      5 anum     number    NA                  d
#>  4    15    16      2 spec     number    NA                  d
#>  5    17    22      6 expl_cd  char      Three 2-digit codes c
#>  6    31    43     13 amt1     NA        NA                  c
#>  7    44    56     13 code1    num       NA                  d
#>  8    57    61      5 code2    char      NA                  c
#>  9    62    63      2 type     varchar   NA                  c
#> 10    64    73     10 date2    datetime  YYYY-MM-DD HH:MM:SS c
#> 11    74    83     10 studyid  NA        NA                  c
#> 12    84    84      1 linefeed NA        NA                  c
health_dict_to_spec(dict, special = list(code1 = readr::col_integer()))
#> $code
#> <collector_character>
#>
#> $date
#> <collector_date>
#>
#> $anum
#> <collector_double>
#>
#> $spec
#> <collector_double>
#>
#> $expl_cd
#> <collector_character>
#>
#> $amt1
#> <collector_character>
#>
#> $code1
#> <collector_integer>
#>
#> $code2
#> <collector_character>
#>
#> $type
#> <collector_character>
#>
#> $date2
#> <collector_datetime>
#>
#> $studyid
#> <collector_character>
#>
#> $linefeed
#> <collector_skip>
#>