Make many header rows into column names
mash_colnames( df, n_name_rows, keep_names = TRUE, sliding_headers = FALSE, sep = "_" )
Argument | Description |
---|---|
df | A |
n_name_rows | Number of rows at the top of the data to be used to create the new variable (column) names. Must be >= 1. |
keep_names | If TRUE, existing names will be included in building the new variable names. Defaults to TRUE. |
sliding_headers | If TRUE, empty values in the first (topmost) header row will be filled with the nearest non-empty value to their left. Defaults to FALSE. See details. |
sep | Character string to separate the unified values (default is underscore). |
The original data frame, but with new column names and without the top n rows that held the broken up names.
Tables are often shared with the column names broken up across the
first few rows. This function takes the number of rows at the top of a
table that hold the broken-up names, along with whether or not to include
the existing names, and mashes the values column-wise into a single string
for each column. The keep_names argument can be helpful for tables that
were imported using a skip argument. If keep_names is set to FALSE, adjust
the value of n_name_rows accordingly.
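The column-wise mashing described above can be approximated in base R. This is only a minimal sketch to illustrate the idea, not the package's implementation: mash_sketch is a hypothetical helper, and it assumes NA header cells are simply skipped (it does not reproduce the warning behaviour of mash_colnames).

```r
# Hypothetical helper (not part of the package): collapse the top
# n_name_rows of every column into a single name.
mash_sketch <- function(df, n_name_rows, keep_names = TRUE, sep = "_") {
  header <- df[seq_len(n_name_rows), , drop = FALSE]
  new_names <- vapply(seq_along(df), function(j) {
    parts <- as.character(header[[j]])
    # Optionally put the existing column name in front of the header values
    if (keep_names) parts <- c(names(df)[j], parts)
    # Assumption: NA header cells are dropped before pasting
    paste(parts[!is.na(parts)], collapse = sep)
  }, character(1))
  # Drop the header rows and apply the new names
  out <- df[-seq_len(n_name_rows), , drop = FALSE]
  names(out) <- new_names
  out
}

babies_small <- data.frame(
  Baby = c(NA, NA, "Angie"),
  Age = c("in", "months", "11"),
  stringsAsFactors = FALSE
)

# Existing names are kept, so "Age", "in" and "months" become one name
names(mash_sketch(babies_small, n_name_rows = 2))
```

With keep_names = FALSE the existing names would be left out of the pasted result, which is why n_name_rows has to count every row that carries part of a name.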
This function will throw a warning when possible NA values end up in the
variable names. sliding_headers can be used for tables with ragged names
in which not every column has a value in the very first row. In these
cases attribution by adjacency is assumed, and when sliding_headers is set
to TRUE the names in the topmost row are filled row-wise, so that each
empty cell takes the nearest non-empty value to its left. This can be
useful for tables reporting survey data or experimental designs in an
untidy manner.
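The filling step for ragged headers can be pictured with a small base R sketch. fill_from_left is a hypothetical helper written for illustration only, under the assumption that empty header cells are stored as NA; it is not taken from the package.

```r
# Hypothetical helper (not part of the package): carry the last
# non-NA value forward along the topmost header row.
fill_from_left <- function(x) {
  for (i in seq_along(x)[-1]) {
    if (is.na(x[i])) x[i] <- x[i - 1]
  }
  x
}

# Topmost header row of a ragged table: group labels appear only once
top_row <- c("Sample", "Larvae", NA, NA, "Adult", NA, NA)

# "Larvae" is carried into positions 3-4 and "Adult" into positions 6-7
fill_from_left(top_row)
```

After this step every column has a value in the first header row, so the header rows can be pasted together column by column as usual.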
babies <- data.frame(
  stringsAsFactors = FALSE,
  Baby = c(NA, NA, "Angie", "Yean", "Pierre"),
  Age = c("in", "months", "11", "9", "7"),
  Weight = c("kg", NA, "2", "3", "4"),
  Ward = c(NA, NA, "A", "B", "C")
)
# Including the object names
mash_colnames(babies, n_name_rows = 2, keep_names = TRUE)
#>     Baby Age_in_months Weight_kg Ward
#> 3  Angie            11         2    A
#> 4   Yean             9         3    B
#> 5 Pierre             7         4    C

babies_skip <- data.frame(
  stringsAsFactors = FALSE,
  X1 = c("Baby", NA, NA, "Jennie", "Yean", "Pierre"),
  X2 = c("Age", "in", "months", "11", "9", "7"),
  X3 = c("Hospital", NA, NA, "A", "B", "A")
)
# Discarding the automatically-generated names (X1, X2, etc...)
mash_colnames(babies_skip, n_name_rows = 3, keep_names = FALSE)
#>     Baby Age_in_months Hospital
#> 4 Jennie            11        A
#> 5   Yean             9        B
#> 6 Pierre             7        A

fish_experiment <- data.frame(
  stringsAsFactors = FALSE,
  X1 = c("Sample", NA, "Pacific", "Atlantic", "Freshwater"),
  X2 = c("Larvae", "Control", "12", "11", "10"),
  X3 = c(NA, "Low Dose", "11", "12", "8"),
  X4 = c(NA, "High Dose", "8", "7", "9"),
  X5 = c("Adult", "Control", "13", "13", "8"),
  X6 = c(NA, "Low Dose", "13", "12", "7"),
  X7 = c(NA, "High Dose", "10", "10", "9")
)
# Ragged names
mash_colnames(fish_experiment,
  n_name_rows = 2,
  keep_names = FALSE,
  sliding_headers = TRUE
)
#>       Sample Larvae_Control Larvae_Low Dose Larvae_High Dose Adult_Control
#> 3    Pacific             12              11                8            13
#> 4   Atlantic             11              12                7            13
#> 5 Freshwater             10               8                9             8
#>   Adult_Low Dose Adult_High Dose
#> 3             13              10
#> 4             12              10
#> 5              7               9