wide_to_long

Reshaping data from wide to long format is useful, and often necessary, for analysis and visualization. The gather() function from the tidyr package is a common way to do it.

First, we’ll load the library and create some data.

library(tidyr)

## Create some data ##

patient <- c('Randy', 'Anne', 'Sawyer', 'Taylor')
group <- c('testA', 'testA', 'testB', 'testB')
first_measure <- c(22, 24, 20, 20)
second_measure <- c(24, 22, 19, 18)
third_measure <- c(25, 21, 16, 15)
fourth_measure <- c(20, 19, 14, 12)

patient_data <- data.frame(patient, group, first_measure, second_measure, third_measure, fourth_measure)


patient_data
##   patient group first_measure second_measure third_measure fourth_measure
## 1   Randy testA            22             24            25             20
## 2    Anne testA            24             22            21             19
## 3  Sawyer testB            20             19            16             14
## 4  Taylor testB            20             18            15             12

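Using Gather

Use gather to convert the data from wide to long. The key and value arguments name the two new columns: key holds the former column names and value holds the cell values. The remaining arguments select which columns to gather. A leading minus sign excludes columns, so -c(patient, group) keeps patient and group as identifier columns and gathers everything else: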

patient_data %>% gather(key=measure, value=value, -c(patient, group))
##    patient group        measure value
## 1    Randy testA  first_measure    22
## 2     Anne testA  first_measure    24
## 3   Sawyer testB  first_measure    20
## 4   Taylor testB  first_measure    20
## 5    Randy testA second_measure    24
## 6     Anne testA second_measure    22
## 7   Sawyer testB second_measure    19
## 8   Taylor testB second_measure    18
## 9    Randy testA  third_measure    25
## 10    Anne testA  third_measure    21
## 11  Sawyer testB  third_measure    16
## 12  Taylor testB  third_measure    15
## 13   Randy testA fourth_measure    20
## 14    Anne testA fourth_measure    19
## 15  Sawyer testB fourth_measure    14
## 16  Taylor testB fourth_measure    12
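
In tidyr 1.0.0 and later, gather() is marked as superseded by pivot_longer(). The sketch below shows what the equivalent call might look like, assuming a recent tidyr is installed. The data is rebuilt inside the block so it runs on its own, and the name patient_long is only illustrative. Note that pivot_longer() returns a tibble and orders the rows by original row (all four measures for Randy first) rather than by measure.

```r
library(tidyr)

# Same example data as above, rebuilt so this block runs on its own
patient_data <- data.frame(
  patient = c("Randy", "Anne", "Sawyer", "Taylor"),
  group = c("testA", "testA", "testB", "testB"),
  first_measure = c(22, 24, 20, 20),
  second_measure = c(24, 22, 19, 18),
  third_measure = c(25, 21, 16, 15),
  fourth_measure = c(20, 19, 14, 12)
)

# Assumes tidyr >= 1.0.0, where pivot_longer() was introduced
patient_long <- pivot_longer(
  patient_data,
  cols = -c(patient, group),  # every column except the identifiers
  names_to = "measure",       # plays the role of gather's key
  values_to = "value"         # plays the role of gather's value
)

patient_long
```

The names_to and values_to arguments take quoted strings, unlike the bare names accepted by gather's key and value.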

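Using data.table / melt

I am a big fan of data.table, which has its own way to convert wide to long: melt. The id.vars argument lists the identifier columns and measure.vars lists the columns to stack. The variable.name and value.name arguments (the fourth and fifth in the data.table method's signature) name the two new columns.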

library(data.table)
## Convert to data table
patient_data <- data.table(patient_data)

## Melt: pass the table once, and spell out id.vars / measure.vars in full
melt(patient_data, id.vars=c("patient", "group"), measure.vars=c("first_measure", "second_measure", "third_measure", "fourth_measure"), variable.name="measure", value.name='value')
##     patient group        measure value
##  1:   Randy testA  first_measure    22
##  2:    Anne testA  first_measure    24
##  3:  Sawyer testB  first_measure    20
##  4:  Taylor testB  first_measure    20
##  5:   Randy testA second_measure    24
##  6:    Anne testA second_measure    22
##  7:  Sawyer testB second_measure    19
##  8:  Taylor testB second_measure    18
##  9:   Randy testA  third_measure    25
## 10:    Anne testA  third_measure    21
## 11:  Sawyer testB  third_measure    16
## 12:  Taylor testB  third_measure    15
## 13:   Randy testA fourth_measure    20
## 14:    Anne testA fourth_measure    19
## 15:  Sawyer testB fourth_measure    14
## 16:  Taylor testB fourth_measure    12
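
When every non-identifier column should be melted, listing them all in measure.vars is optional: if id.vars is given and measure.vars is omitted, melt treats all remaining columns as measure columns. The sketch below relies on that default and also sets variable.factor = FALSE so the measure column comes back as character instead of a factor. The data is rebuilt inside the block so it runs on its own, and the name patient_long is only illustrative.

```r
library(data.table)

# Same example data as above, built directly as a data.table
patient_data <- data.table(
  patient = c("Randy", "Anne", "Sawyer", "Taylor"),
  group = c("testA", "testA", "testB", "testB"),
  first_measure = c(22, 24, 20, 20),
  second_measure = c(24, 22, 19, 18),
  third_measure = c(25, 21, 16, 15),
  fourth_measure = c(20, 19, 14, 12)
)

# measure.vars is left out, so all non-id columns are melted
patient_long <- melt(
  patient_data,
  id.vars = c("patient", "group"),
  variable.name = "measure",
  value.name = "value",
  variable.factor = FALSE  # keep measure as character, not factor
)

patient_long
```

Relying on the default is convenient for small tables like this one, but naming measure.vars explicitly is safer when the table may later gain columns that should not be melted.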