wide_to_long
Converting data from wide to long format is useful, and often necessary, before analyzing or visualizing it. The gather function from the tidyr package is a common way to do this.
First, we'll load our library and create some data:
library(tidyr)
## Create some data ##
patient <- c('Randy', 'Anne', 'Sawyer', 'Taylor')
group <- c('testA', 'testA', 'testB', 'testB')
first_measure <- c(22, 24, 20, 20)
second_measure <- c(24, 22, 19, 18)
third_measure <- c(25, 21, 16, 15)
fourth_measure <- c(20, 19, 14, 12)
patient_data <- data.frame(patient, group, first_measure, second_measure, third_measure, fourth_measure)
patient_data
##   patient group first_measure second_measure third_measure fourth_measure
## 1   Randy testA            22             24            25             20
## 2    Anne testA            24             22            21             19
## 3  Sawyer testB            20             19            16             14
## 4  Taylor testB            20             18            15             12
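Using Gather
Use gather to convert the data from wide to long. The first two arguments, key and value, name the two new columns that will hold the former column names and their values. The remaining arguments select the columns to gather. A negative selection such as -c(patient, group) leaves those columns in place as identifiers and gathers everything else: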
patient_data %>% gather(key=measure, value=value, -c(patient, group))
##    patient group        measure value
## 1    Randy testA  first_measure    22
## 2     Anne testA  first_measure    24
## 3   Sawyer testB  first_measure    20
## 4   Taylor testB  first_measure    20
## 5    Randy testA second_measure    24
## 6     Anne testA second_measure    22
## 7   Sawyer testB second_measure    19
## 8   Taylor testB second_measure    18
## 9    Randy testA  third_measure    25
## 10    Anne testA  third_measure    21
## 11  Sawyer testB  third_measure    16
## 12  Taylor testB  third_measure    15
## 13   Randy testA fourth_measure    20
## 14    Anne testA fourth_measure    19
## 15  Sawyer testB fourth_measure    14
## 16  Taylor testB fourth_measure    12
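Since tidyr 1.0.0, gather has been superseded by pivot_longer, so the same reshape can also be written with the newer function. The following is a minimal sketch, assuming tidyr 1.0.0 or later is installed. The name long_data is only illustrative, and the example data is rebuilt so the block runs on its own.

```r
library(tidyr)

# Same example data as above, rebuilt so this block is self-contained
patient_data <- data.frame(
  patient = c("Randy", "Anne", "Sawyer", "Taylor"),
  group = c("testA", "testA", "testB", "testB"),
  first_measure = c(22, 24, 20, 20),
  second_measure = c(24, 22, 19, 18),
  third_measure = c(25, 21, 16, 15),
  fourth_measure = c(20, 19, 14, 12)
)

# Pivot every column except patient and group into measure/value pairs
long_data <- pivot_longer(
  patient_data,
  cols = -c(patient, group),   # negative selection, as with gather
  names_to = "measure",        # plays the role of gather's key
  values_to = "value"          # plays the role of gather's value
)
long_data
```

One difference to be aware of: pivot_longer returns a tibble and keeps each patient's measurements together (all four rows for Randy come first), whereas gather stacks the result one measure column at a time.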
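Using data.table / melt
I am a big fan of data.table, which has its own way to convert wide to long: melt. The id argument lists the columns to keep as identifiers and measure lists the columns to stack. The fourth and fifth arguments, variable.name and value.name, name the resulting variable and value columns.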
library(data.table)
## Convert to data table
patient_data <- data.table(patient_data)
## Melt (the data is passed once, as the first argument)
melt(patient_data, id.vars=c("patient", "group"), measure.vars=c("first_measure", "second_measure", "third_measure", "fourth_measure"), variable.name="measure", value.name="value")
##     patient group        measure value
##  1:   Randy testA  first_measure    22
##  2:    Anne testA  first_measure    24
##  3:  Sawyer testB  first_measure    20
##  4:  Taylor testB  first_measure    20
##  5:   Randy testA second_measure    24
##  6:    Anne testA second_measure    22
##  7:  Sawyer testB second_measure    19
##  8:  Taylor testB second_measure    18
##  9:   Randy testA  third_measure    25
## 10:    Anne testA  third_measure    21
## 11:  Sawyer testB  third_measure    16
## 12:  Taylor testB  third_measure    15
## 13:   Randy testA fourth_measure    20
## 14:    Anne testA fourth_measure    19
## 15:  Sawyer testB fourth_measure    14
## 16:  Taylor testB fourth_measure    12
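Typing out every measure column gets tedious when there are many of them. When the columns share a naming pattern, as they do here, one option is to build the list from the column names. This is a minimal sketch rather than the only way to do it. The names patient_dt, measure_cols and long_dt are illustrative, and it assumes every column to melt ends in "_measure".

```r
library(data.table)

# Same example data as above, built directly as a data.table
patient_dt <- data.table(
  patient = c("Randy", "Anne", "Sawyer", "Taylor"),
  group = c("testA", "testA", "testB", "testB"),
  first_measure = c(22, 24, 20, 20),
  second_measure = c(24, 22, 19, 18),
  third_measure = c(25, 21, 16, 15),
  fourth_measure = c(20, 19, 14, 12)
)

# Collect the names of all columns ending in "_measure"
measure_cols <- grep("_measure$", names(patient_dt), value = TRUE)

# Melt using the computed column names instead of a hand-typed vector
long_dt <- melt(
  patient_dt,
  id.vars = c("patient", "group"),
  measure.vars = measure_cols,
  variable.name = "measure",
  value.name = "value"
)
long_dt
```

By default melt returns the measure column as a factor whose levels follow the order of measure.vars, which keeps first_measure through fourth_measure in their original order when plotting.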