List Handling

This tutorial gives a basic introduction to lists in R. It shows how to create a list and access its elements. It then explains the key concept of applying a function to each element of a list, using the map() function from the purrr package.

Introduction

A list is a very flexible data type in R. It can collect elements of different types and lengths. It is closely related to the dataframe: a dataframe is itself a list with additional restrictions.

In a dataframe, each variable is a vector, all variables have the same length, and variable names are expected to be unique.
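
The relationship between the two types can be checked directly. The following is a minimal sketch in base R; the object names df, lst, and res are chosen only for illustration and do not appear elsewhere in this tutorial.

```r
# A dataframe is itself a list whose elements (columns) all have the same length.
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
is.list(df)    # TRUE
length(df)     # 2, the number of columns

# A plain list accepts elements of different lengths ...
lst <- list(x = 1:3, y = c("a", "b"))
lengths(lst)

# ... whereas data.frame() refuses them (3 and 2 values cannot be recycled).
res <- tryCatch(
  data.frame(x = 1:3, y = c("a", "b")),
  error = function(e) "error: differing number of rows"
)
res
```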

Creation and Accessing of Elements

A list is created with the list() function. Here, three vectors of different types and lengths are combined:

char_vec <- c("a", "first", "test")
num_vec <- c(1, 2.5, 4, 7)
bool_vec <- c(TRUE, FALSE)
first_list <- list(char_vec, num_vec, bool_vec)
first_list
## [[1]]
## [1] "a"     "first" "test" 
## 
## [[2]]
## [1] 1.0 2.5 4.0 7.0
## 
## [[3]]
## [1]  TRUE FALSE

An element of this list can be accessed with square brackets. Single brackets return a list of length one that contains the requested item. The first item of the list is accessed with:

first_list[1]
## [[1]]
## [1] "a"     "first" "test"

To access the third member of this first item, use double brackets to extract the item itself, followed by the index in single brackets.

first_list[[1]][3]
## [1] "test"
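
The difference between single and double brackets can be made visible by checking the class of each result. This is a small sketch in base R; named_list is a hypothetical variant of the list above with element names added, to show the $ shorthand.

```r
first_list <- list(c("a", "first", "test"), c(1, 2.5, 4, 7), c(TRUE, FALSE))

# Single brackets keep the list wrapper: the result is a list of length one.
class(first_list[1])      # "list"

# Double brackets return the element itself, here a character vector.
class(first_list[[1]])    # "character"

# With named elements, $ is a shorthand for double brackets.
named_list <- list(chars = c("a", "first", "test"), nums = c(1, 2.5, 4, 7))
named_list$chars[3]       # "test"
named_list[["nums"]][2]   # 2.5
```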

View a List

The default printed view of a list is not very readable. For a nicer, interactive view, the listviewer package can be used. Please make sure it is installed before loading it.

library(listviewer)
jsonedit(first_list, mode = "view", width = 640, height = 250)
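
If no interactive viewer is available, base R's str() offers a compact text overview of a list. This sketch is an alternative that is not part of the original tutorial; nested is a hypothetical example list.

```r
first_list <- list(c("a", "first", "test"), c(1, 2.5, 4, 7), c(TRUE, FALSE))

# str() prints one line per element with its type, length, and first values.
str(first_list)

# For nested lists, max.level limits how deep the structure is printed.
nested <- list(a = list(b = list(c = 1)))
str(nested, max.level = 1)
```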

Applying a Function to each Element

The most interesting part of this tutorial is how to apply a function to each element of a list. For this, the purrr package needs to be loaded. The listviewer package is again used to create a nicer view. Please make sure both packages are installed before loading them.

Simple Example

We will create a list of the five highest mountains with their names and their heights in meters. Each mountain is stored as a named list with the elements “name” and “height”.

library(purrr)
library(listviewer)
Nr1 <- list(name = "Mount Everest",
            height = 8848)
Nr2 <- list(name = "K2",
            height = 8611)
Nr3 <- list(name = "Kangchenjunga",
            height = 8586)
Nr4 <- list(name = "Lhotse",
            height = 8516)
Nr5 <- list(name = "Makalu",
            height = 8485)

mountains <- list(Nr1, Nr2, Nr3, Nr4, Nr5)
jsonedit(mountains, mode = "view", width = 640, height = 250)

It is not easy to extract, for example, all mountain names from this format. In base R you would need a for loop or lapply(). With map() this task is very easy. The first parameter, .x, is the list mountains; it may also be a vector or a column of a dataframe. The second parameter, .f, is a function. If a character string is passed instead, map() extracts the element with that name from each item. In our case the element name should be shown.

map(.x = mountains, .f = "name")
## [[1]]
## [1] "Mount Everest"
## 
## [[2]]
## [1] "K2"
## 
## [[3]]
## [1] "Kangchenjunga"
## 
## [[4]]
## [1] "Lhotse"
## 
## [[5]]
## [1] "Makalu"
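
To see what the string shorthand does, it can be compared with the base R approach. The sketch below uses a shortened two-mountain list to stay compact; names_base and names_purrr are illustrative names.

```r
library(purrr)

mountains <- list(
  list(name = "Mount Everest", height = 8848),
  list(name = "K2", height = 8611)
)

# Base R: an explicit anonymous function pulls out one field per element.
names_base <- lapply(mountains, function(m) m[["name"]])

# purrr: a character string passed as .f acts as an extractor by name.
names_purrr <- map(mountains, "name")

identical(names_base, names_purrr)   # TRUE

# An integer passed as .f extracts by position instead of by name.
map(mountains, 2)
```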

The output of map() is itself a list. In our example it would be more intuitive to return a character vector. For this there are several map functions that return a specific type (e.g. map_int() for an integer vector, map_df() for a dataframe). We use map_chr() to return a character vector.

map_chr(.x = mountains, .f = "name")
## [1] "Mount Everest" "K2"            "Kangchenjunga" "Lhotse"       
## [5] "Makalu"
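
The typed variants also make it easy to turn a list of records into vectors for further work. As a sketch, map_dbl() can extract the heights, and the two vectors can then be combined into a dataframe. The list is shortened to three mountains here, and heights and peaks are illustrative names.

```r
library(purrr)

mountains <- list(
  list(name = "Mount Everest", height = 8848),
  list(name = "K2", height = 8611),
  list(name = "Kangchenjunga", height = 8586)
)

# map_dbl() returns a plain numeric vector, ready for arithmetic.
heights <- map_dbl(mountains, "height")
mean(heights)

# One vector per field can be assembled into a dataframe.
peaks <- data.frame(
  name = map_chr(mountains, "name"),
  height = heights
)

# Ordinary dataframe operations now apply, e.g. filtering by height.
peaks[peaks$height > 8600, ]
```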

Advanced Example

More complex functions can be used as well. This example uses the famous iris dataset, which contains measurements of iris flowers in the variables Sepal.Length, Sepal.Width, Petal.Length, and Petal.Width, together with the Species.

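The iris dataset ships with R, so it does not need to be loaded separately. The data is piped to the split() function, which returns one dataframe per species, three in total. (The piping concept is explained in the dplyr tutorial; the %>% operator must be available, for example by loading dplyr.) Then map() comes into play. A one-sided formula containing the lm() call is passed as .f; purrr turns it into a function in which the dot stands for the current dataframe, so one linear model is fitted per species. A second map() call applies summary() to each model. Finally, coef() extracts the coefficient table from each summary.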

iris %>%
    split(.$Species) %>%
    map(~lm(formula = Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = .)) %>%
    map(summary) %>%
    map(coef)
## $setosa
##               Estimate Std. Error   t value     Pr(>|t|)
## (Intercept)  2.3518898 0.39286751 5.9864707 3.034183e-07
## Sepal.Width  0.6548350 0.09244742 7.0833236 6.834434e-09
## Petal.Length 0.2375602 0.20801921 1.1420107 2.593594e-01
## Petal.Width  0.2521257 0.34686362 0.7268727 4.709870e-01
## 
## $versicolor
##                Estimate Std. Error   t value     Pr(>|t|)
## (Intercept)   1.8955395  0.5070552  3.738329 5.112246e-04
## Sepal.Width   0.3868576  0.2045449  1.891309 6.488965e-02
## Petal.Length  0.9083370  0.1654325  5.490681 1.666695e-06
## Petal.Width  -0.6792238  0.4353821 -1.560064 1.255990e-01
## 
## $virginica
##                Estimate Std. Error    t value     Pr(>|t|)
## (Intercept)   0.6998830 0.53360089  1.3116227 1.961563e-01
## Sepal.Width   0.3303370 0.17432873  1.8949086 6.439972e-02
## Petal.Length  0.9455356 0.09072204 10.4223360 1.074269e-13
## Petal.Width  -0.1697527 0.19807243 -0.8570233 3.958750e-01
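
The same pattern can be combined with the string shorthand from the simple example. The following sketch extracts one number per model, the R-squared value stored in each summary, using map_dbl(). It is written without the pipe so that each intermediate step is visible; by_species, models, and r_squared are illustrative names.

```r
library(purrr)

# One dataframe per species.
by_species <- split(iris, iris$Species)

# Fit one linear model per species; .x stands for the current dataframe.
models <- map(
  by_species,
  ~ lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = .x)
)

# summary() of a linear model is a list, so "r.squared" can be extracted by name.
r_squared <- map_dbl(map(models, summary), "r.squared")
r_squared
```

The result is a named numeric vector with one entry per species, which is easier to compare or plot than three separate coefficient tables.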
