Deep Learning Introduction to neuralnet Package

In this tutorial we will take a look at deep learning with the neuralnet package, an R package for creating neural networks.

Objectives: deep learning with neuralnet package; create a model for mushroom classification
Requirements: R Basics

Data Understanding and Preparation

The data is taken from the UCI Machine Learning Repository. It contains 8124 records describing mushrooms and their properties. The target variable is “edibility”, and there are 22 attributes. More details are found on the UCI homepage (see link at the end of the article). Our task is to create a model that predicts edibility.

We need to load the neuralnet package for creating the deep learning model; caret is loaded for the dummy coding used later. The data is downloaded directly from the UCI server. It does not include header information, so we need to add the column names manually.

library(neuralnet)
## 
## Attaching package: 'neuralnet'
## The following object is masked from 'package:ROCR':
## 
##     prediction
## The following object is masked from 'package:dplyr':
## 
##     compute
library(caret)
url <- "http://archive.ics.uci.edu/ml/machine-learning-databases/mushroom/agaricus-lepiota.data"
# read all columns as factors (this was the default before R 4.0)
mushrooms <- read.csv(file = url, header = FALSE, stringsAsFactors = TRUE)
colnames(mushrooms) <- c("edibility", "cap_shape", "cap_surface", "cap_color", "bruises", "odor", "gill_att", "gill_spacing", "gill_size", "gill_color", "stalk_shape", "stalk_root", "stalk_surf_above", "stalk_surf_below", "stalk_color_above", "stalk_color_below", "veil_type", "veil_color", "ring_nr", "ring_type", "spore_print_color", "population", "habitat")

All data is categorical, so we need to transform it into numerical dummy variables, which is done with the dummyVars() function from the caret package. The attribute “veil_type” has only one factor level and therefore carries no information. More critically, it causes an error when the “mushrooms_dummy” dataframe is created, so we need to remove this attribute first. The target variable is then added to “mushrooms_dummy” from the original “mushrooms” dataframe and coded as 0 (poisonous) and 1 (edible).

mushrooms$veil_type <- NULL
dummy_vars <- dummyVars(edibility ~ ., data = mushrooms, fullRank = TRUE)
mushrooms_dummy <- as.data.frame(predict(dummy_vars, newdata = mushrooms))
mushrooms_dummy$edibility <- ifelse(mushrooms$edibility == "e", 1, 0)
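
To see what full-rank dummy coding does, and why a single-level attribute such as “veil_type” gets in the way, here is a minimal sketch on made-up toy data. It uses base R's model.matrix() instead of caret, on the assumption that both rely on the same kind of treatment-contrast encoding; the toy dataframe and its values are purely illustrative.

```r
# Toy data (made up for illustration): two categorical attributes
toy <- data.frame(
  cap_shape = factor(c("b", "f", "x", "f")),
  bruises   = factor(c("t", "f", "t", "t"))
)

# Full-rank coding: a factor with k levels yields k - 1 indicator columns,
# because the first level serves as the reference category
toy_dummy <- as.data.frame(model.matrix(~ ., data = toy))[, -1]  # drop the intercept
print(toy_dummy)

# A factor with a single level cannot be encoded this way and raises an error
toy$veil_type <- factor(rep("p", 4))
encode_error <- tryCatch(
  {
    model.matrix(~ ., data = toy)
    NA_character_                      # only reached if no error occurred
  },
  error = function(e) conditionMessage(e)
)
print(encode_error)
```

The three-level attribute produces two columns and the two-level attribute one column, which is why the 21 remaining mushroom attributes expand to 95 dummy variables rather than one column per level.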

Creating the Model

The data is split into training and test data: 70 % will be used for training and 30 % for testing. For this we use the sample() function to draw the training data index and setdiff() to derive the test data index.

set.seed(2345)
n_mushrooms <- nrow(mushrooms)
train_index <- sample(seq_len(n_mushrooms), floor(0.7 * n_mushrooms))  # 70 % of the rows
test_index <- setdiff(seq_len(n_mushrooms), train_index)               # remaining 30 %
mushrooms_train <- mushrooms_dummy[train_index, ]
mushrooms_test <- mushrooms_dummy[test_index, ]
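
A quick sanity check of such a split can be sketched as follows. The snippet works on row indices only, so it runs without the data; the variable name n is an assumption that stands in for nrow(mushrooms).

```r
set.seed(2345)
n <- 8124                                        # number of records in the data set
train_index <- sample(seq_len(n), floor(0.7 * n))
test_index  <- setdiff(seq_len(n), train_index)

# The two index sets must not overlap and must cover every row exactly once
stopifnot(length(intersect(train_index, test_index)) == 0)
stopifnot(length(train_index) + length(test_index) == n)

split_sizes <- c(train = length(train_index), test = length(test_index))
print(split_sizes)   # 5686 training rows and 2438 test rows
```

These sizes match the row totals of the confusion matrices shown later.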

neuralnet() requires a formula, so we define this first. Since the “.” operator does not work with neuralnet() (at least in the version used here), we need to add all attributes with paste(). If there are only a few attributes you might type the formula manually. In our case there are 95 attributes, so it is wise to build it with an R function.

mushrooms_names <- names(mushrooms_dummy)
form <- as.formula(paste("edibility ~", paste(mushrooms_names[!mushrooms_names %in% "edibility"], collapse = "+")))
form
## edibility ~ cap_shape.c + cap_shape.f + cap_shape.k + cap_shape.s + 
##     cap_shape.x + cap_surface.g + cap_surface.s + cap_surface.y + 
##     cap_color.c + cap_color.e + cap_color.g + cap_color.n + cap_color.p + 
##     cap_color.r + cap_color.u + cap_color.w + cap_color.y + bruises.t + 
##     odor.c + odor.f + odor.l + odor.m + odor.n + odor.p + odor.s + 
##     odor.y + gill_att.f + gill_spacing.w + gill_size.n + gill_color.e + 
##     gill_color.g + gill_color.h + gill_color.k + gill_color.n + 
##     gill_color.o + gill_color.p + gill_color.r + gill_color.u + 
##     gill_color.w + gill_color.y + stalk_shape.t + stalk_root.b + 
##     stalk_root.c + stalk_root.e + stalk_root.r + stalk_surf_above.k + 
##     stalk_surf_above.s + stalk_surf_above.y + stalk_surf_below.k + 
##     stalk_surf_below.s + stalk_surf_below.y + stalk_color_above.c + 
##     stalk_color_above.e + stalk_color_above.g + stalk_color_above.n + 
##     stalk_color_above.o + stalk_color_above.p + stalk_color_above.w + 
##     stalk_color_above.y + stalk_color_below.c + stalk_color_below.e + 
##     stalk_color_below.g + stalk_color_below.n + stalk_color_below.o + 
##     stalk_color_below.p + stalk_color_below.w + stalk_color_below.y + 
##     veil_color.o + veil_color.w + veil_color.y + ring_nr.o + 
##     ring_nr.t + ring_type.f + ring_type.l + ring_type.n + ring_type.p + 
##     spore_print_color.h + spore_print_color.k + spore_print_color.n + 
##     spore_print_color.o + spore_print_color.r + spore_print_color.u + 
##     spore_print_color.w + spore_print_color.y + population.c + 
##     population.n + population.s + population.v + population.y + 
##     habitat.g + habitat.l + habitat.m + habitat.p + habitat.u + 
##     habitat.w
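
As an alternative to pasting strings together, base R's reformulate() builds the same kind of formula from a character vector. The sketch below uses a handful of hypothetical column names in place of names(mushrooms_dummy).

```r
# Hypothetical column names standing in for names(mushrooms_dummy)
col_names <- c("cap_shape.c", "cap_shape.f", "odor.n", "edibility")

# Every column except the target becomes a predictor term
predictors <- setdiff(col_names, "edibility")
form_alt <- reformulate(termlabels = predictors, response = "edibility")
print(form_alt)
```

Either approach yields an ordinary formula object, so the choice is mostly a matter of readability.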

Now everything is in place to create the model. We call the neuralnet() function with the formula and the training data. To apply the activation function to the output neuron as well, which is what we want for a 0/1 target, we need to set “linear.output” to FALSE. There are many more parameters for tuning the model, but for now we will stick to the defaults.

fit_dl <- neuralnet(formula = form,
                    data = mushrooms_train,
                    linear.output = FALSE)
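
To make the effect of “linear.output” concrete, here is a hand-written forward pass through a network with one hidden neuron and one output neuron. It is only a sketch of the idea, not neuralnet's internal code; the function forward() and all weights are made up for illustration.

```r
# Logistic (sigmoid) activation, the default activation function in neuralnet
logistic <- function(z) 1 / (1 + exp(-z))

# Forward pass: inputs -> one hidden neuron -> one output neuron.
# With linear_output = TRUE the output neuron returns its raw weighted sum;
# with FALSE the activation function is applied to the output as well.
forward <- function(x, w_hidden, b_hidden, w_out, b_out, linear_output) {
  hidden <- logistic(sum(w_hidden * x) + b_hidden)
  z_out  <- w_out * hidden + b_out
  if (linear_output) z_out else logistic(z_out)
}

x <- c(1, 0, 1)      # three dummy-coded inputs (made up)
w <- c(2, -1, 3)     # hidden-neuron weights (made up)

out_linear   <- forward(x, w, b_hidden = -1, w_out = 8, b_out = -2, linear_output = TRUE)
out_logistic <- forward(x, w, b_hidden = -1, w_out = 8, b_out = -2, linear_output = FALSE)
print(c(linear = out_linear, logistic = out_logistic))
```

With these numbers the linear output is larger than 1, while the logistic output stays between 0 and 1 and can be compared directly with a 0.5 threshold.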

Model Visualisation

We visualise the model. Since there are many attributes, the details are not readable, but we can see that there is only one hidden layer with a single neuron, which is the default setting (hidden = 1).

plot(fit_dl)

Calculate Results

Predictions are calculated with compute(). All attributes except “edibility” are passed, i.e. the first 95 columns. The predictions are continuous values, which are mapped to 1 or 0 using a threshold of 0.5. Finally, the confusion matrix is shown with table().

train_results <- compute(fit_dl, mushrooms_train[, 1:95])
train_prediction <- train_results$net.result
train_prediction <- ifelse(train_prediction > 0.5, 1, 0)
table(train_prediction, mushrooms_train$edibility)
##                 
## train_prediction    0    1
##                0 2727    0
##                1    0 2959
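
The thresholding step can be tried out in isolation. The sketch below uses made-up network outputs and labels for six mushrooms in place of the real compute() result.

```r
# Made-up network outputs and true labels for six mushrooms
net_result <- c(0.98, 0.03, 0.51, 0.49, 0.999, 0.001)
actual     <- c(1,    0,    1,    0,    1,     0)

# Outputs above the 0.5 threshold count as edible (1), all others as poisonous (0)
predicted <- ifelse(net_result > 0.5, 1, 0)

# Rows are predictions, columns are true classes;
# correctly classified cases sit on the diagonal
confusion <- table(predicted, actual)
print(confusion)
```

Values close to the threshold, such as 0.51 and 0.49 here, are the ones whose class would flip if a different cut-off were chosen.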

The accuracy on the training data is 100 %, so there is no need to further fine-tune the model parameters. Such an extremely good result might be a sign of overfitting, so let’s find out how well the model performs on test data.

test_results <- compute(fit_dl, mushrooms_test[, 1:95])
test_prediction <- test_results$net.result
test_prediction <- ifelse(test_prediction > 0.5, 1, 0)
table(test_prediction, mushrooms_test$edibility)
##                
## test_prediction    0    1
##               0 1189    0
##               1    0 1249

The result could not be better. Accuracy is 100 % on test data as well.
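
Accuracy is the share of cases on the diagonal of the confusion matrix. As a small sketch, the counts from the two tables above can be entered by hand and checked; the helper function accuracy() is an illustrative name, not part of neuralnet or caret.

```r
# Confusion matrices with the counts reported above
# (rows: prediction, columns: true class)
labels <- list(predicted = c("0", "1"), actual = c("0", "1"))
train_cm <- matrix(c(2727, 0, 0, 2959), nrow = 2, dimnames = labels)
test_cm  <- matrix(c(1189, 0, 0, 1249), nrow = 2, dimnames = labels)

# Accuracy = correctly classified cases / all cases
accuracy <- function(cm) sum(diag(cm)) / sum(cm)

acc <- c(train = accuracy(train_cm), test = accuracy(test_cm))
print(acc)   # both values are 1, i.e. 100 %
```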

Summary

We learned the basics of deep learning with the neuralnet package. The data was split: 70 % was used for training and 30 % for testing. We created a model for predicting mushroom edibility. The confusion tables for training and test data show perfect model results.

More Information
