
Unbreak values using regex to match the lagging half of the broken value


unbreak_vals(df, regex, ogcol, newcol, sep = " ", slice_groups)



df
A data frame with one or more values within a variable broken up across two rows.


regex
Regular expression for matching the trailing (lagging) half of the broken values.


ogcol
Variable to unbreak.


newcol
Name of the new variable with the unified values.


sep
Character string to separate the unified values (default is a space).


slice_groups
Deprecated. See details and Package News.


A tibble with 'unbroken' values. The variable that originally contained the broken values is dropped, and the new variable with the unified values is placed as the first column. The slice_groups argument is now deprecated; the extra rows and the variable with broken values are always dropped.


This function is limited to quite specific cases, but useful when dealing with tables that contain, for example, scientific names broken across two rows. For unwrapping values, see unwrap_cols.
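To make the behaviour described above concrete, here is a minimal base R sketch of the general idea: find rows whose value matches the regex, paste that value onto the row before it, then drop the trailing rows and the original column. It is an illustration under stated assumptions, not the package's implementation. The helper name unbreak_sketch and the toy data frame are hypothetical, and unlike unbreak_vals the helper takes column names as quoted strings and returns a plain data frame rather than a tibble.

```r
# Hypothetical helper for illustration only; NOT the unheadr implementation.
# Column names are passed as strings here (an assumption made for simplicity).
unbreak_sketch <- function(df, regex, ogcol, newcol, sep = " ") {
  vals <- df[[ogcol]]
  # Rows that hold the trailing (lagging) half of a broken value
  is_trailing <- !is.na(vals) & grepl(regex, vals)
  # Value and trailing flag of the following row, aligned with the current row
  next_val <- c(vals[-1], NA)
  next_is_trailing <- c(is_trailing[-1], FALSE)
  # Join each leading half with the trailing half found on the next row
  unified <- ifelse(next_is_trailing, paste(vals, next_val, sep = sep), vals)
  # Drop the trailing rows and the original column; put the new column first
  keep <- !is_trailing
  rest <- df[keep, setdiff(names(df), ogcol), drop = FALSE]
  first <- data.frame(unified[keep], stringsAsFactors = FALSE)
  names(first) <- newcol
  out <- cbind(first, rest)
  rownames(out) <- NULL
  out
}

# Toy input: two scientific names are split across consecutive rows
broken <- data.frame(
  scientific_name = c("Macaca", "mulatta", "Hylobates klossii",
                      "Nomascus", "concolor"),
  common_name = c("Rhesus Macaque", NA, "Kloss's Gibbon",
                  "Western Black Crested Gibbon", NA),
  stringsAsFactors = FALSE
)

# The regex matches values starting in lowercase (the broken epithets)
fixed <- unbreak_sketch(broken, "^[a-z]", "scientific_name", "sciname_new")
print(fixed)
```

The sketch only handles values broken across exactly two consecutive rows, which matches the case this function is documented for.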


# regex matches strings starting in lowercase (broken species epithets)
unbreak_vals(primates2017_broken, "^[a-z]", scientific_name, sciname_new)
#> # A tibble: 16 × 4
#>    sciname_new                  common_name              red_list_status mass_kg
#>    <chr>                        <chr>                    <chr>           <chr>  
#>  1 Semnopithecus johnii         Nilgiri Langur           VU              11.45  
#>  2 Trachypithecus obscurus      Dusky Langur             NT              7.13   
#>  3 Presbytis sumatra            Black Sumatran Langur    EN              6      
#>  4 Trachypithecus auratus       East Javan Langur        VU              6.25   
#>  5 Trachypithecus delacouri     Delacour's Langur        CR              NA     
#>  6 Trachypithecus leucocephalus White-headed Langur      CR              8      
#>  7 Presbytis comata             Javan Langur             EN              6.7    
#>  8 Macaca pagensis              Pagai Macaque            CR              4.5    
#>  9 Trachypithecus germaini      Germain's Langur         EN              8.83   
#> 10 Macaca munzala               Arunachal Macaque        EN              NA     
#> 11 Macaca mulatta               Rhesus Macaque           LC              9.9    
#> 12 Semnopithecus hector         Terai Sacred Langur      NT              15.2   
#> 13 Hylobates klossii            Kloss's Gibbon           EN              5.8    
#> 14 Nycticebus menagensis        Philippine Slow Loris    VU              0.28   
#> 15 Nycticebus bengalensis       Bengal Slow Loris        VU              1.21   
#> 16 Nomascus concolor            Western Black Crested G… CR              7.71