Aligning strings with regex.

regex_valign(stringvec, regex_ai, sep_str = "")

Arguments

stringvec

A character vector with one element for each line.

regex_ai

A regular expression that matches the position in each line at which to align.

sep_str

Optional string inserted at the positions matched by the regular expression. Defaults to "" (no separator).

Value

A character vector with one element for each line, with padding inserted at the matched positions so that elements are vertically aligned across lines.

Details

Written mainly for reading fixed-width files, plain text, or tables parsed from PDFs, where columns do not line up across lines.
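The general idea of aligning at a regex match can be sketched in base R. This is a minimal illustration, not the implementation behind regex_valign(): the helper name valign_sketch() and its arguments are hypothetical, and it assumes that only the first match on each line is used, that padding goes immediately before the match, and that lines without a match are returned unchanged.

```r
# Hypothetical helper (not the package's code): pad each line just before
# its first regex match so that the matches start in the same column.
valign_sketch <- function(lines, pattern, sep = "") {
  # Start of the first match on each line; -1 when there is no match.
  pos <- as.integer(regexpr(pattern, lines, perl = TRUE))
  hit <- pos > 0
  if (!any(hit)) return(lines)

  # The rightmost match position becomes the shared alignment column.
  target <- max(pos[hit])

  left  <- substr(lines[hit], 1, pos[hit] - 1)
  right <- substr(lines[hit], pos[hit], nchar(lines[hit]))
  pad   <- strrep(" ", target - pos[hit])

  out <- lines
  # Assumption: the separator is placed after the padding.
  out[hit] <- paste0(left, pad, sep, right)
  out
}

demo <- c("6 COAHUILA", "18 BAJA CALIFORNIA", "1230 QUERETARO")
aligned <- valign_sketch(demo, "\\b(?=[A-Z])")
print(aligned)
```

Padding up to the rightmost match keeps every line's existing text intact, at the cost of widening lines whose match occurs early.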

See also

This function is based loosely on textutils::valign().

Examples

guests <- unlist(strsplit(c("6 COAHUILA 20/03/2020
7 COAHUILA 20/03/2020
18 BAJA CALIFORNIA 16/03/2020
109 CDMX 12/03/2020
1230 QUERETARO 21/03/2020"), "\n"))

# align at first uppercase word boundary, inserting a separator
regex_valign(guests, "\\b(?=[A-Z])", " - ")
#> [1] "6 - COAHUILA 20/03/2020"
#> [2] "7 - COAHUILA 20/03/2020"
#> [3] "18 - BAJA CALIFORNIA 16/03/2020"
#> [4] "109 - CDMX 12/03/2020"
#> [5] "1230 - QUERETARO 21/03/2020"

# align dates at end of string
regex_valign(guests, "\\b(?=[0-9]{2}[\\/]{1}[0-9]{2}[\\/]{1}[0-9]{4}$)")
#> [1] "6 COAHUILA 20/03/2020"
#> [2] "7 COAHUILA 20/03/2020"
#> [3] "18 BAJA CALIFORNIA 16/03/2020"
#> [4] "109 CDMX 12/03/2020"
#> [5] "1230 QUERETARO 21/03/2020"