Natural Language Processing: Sentiments of All Inauguration Speeches

In this article I analyse all US presidential inauguration speeches in terms of sentiment, sentence length, the president's party and speech length. All of this information is presented in a single graph.

  • Objectives: text mining introduction
  • Requirements: R Data-Mining

Introduction

As always, we start by loading the required packages. We need

  • rJava, qdap for text mining
  • dplyr for data preparation
  • ggplot2, ggrepel for visualisation
  • rio for data import

library(rJava)
suppressPackageStartupMessages(library(qdap))
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(ggplot2))
library(ggrepel)
library(rio)

Data Import

I prepared a file that matches each president to a party and load it first. Next, I load all inauguration speeches, which I had previously downloaded and saved as individual csv-files.

# Import Data
pres_party <- import("./data/further_files/Presidents_Party.csv")

# list all entries in the data folder and drop the last one
# (presumably the "further_files" sub-directory, which is not a speech)
files <- list.files("./data/")
files <- files[-length(files)]
lng_files <- length(files)

# one row per speech: full text plus a year/president label
sentences <- data.frame(speech = rep(NA, lng_files),
            year_president = rep(NA, lng_files))

for (i in seq_len(lng_files)) {
    # import each individual speech
    file_path <- paste0("./data/", files[i])
    temp <- readLines(file_path)
    temp <- paste(temp, collapse = " ")  # concatenate all lines into one string
    sentences$speech[i] <- temp
    # the label is the file name up to the first "."
    sentences$year_president[i] <- strsplit(files[i], split = ".", fixed = TRUE)[[1]][1]
}

The sentSplit() function from the qdap package splits each speech into sentences. After this step, each sentence is represented in a single row.

sentences <- sentSplit(sentences, "speech", verbose = FALSE)
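To make the effect of this step concrete, here is a minimal base-R sketch of sentence splitting. It is not how sentSplit() works internally; it simply assumes that a sentence ends at ".", "!" or "?" followed by whitespace, so abbreviations such as "Mr." would be split incorrectly. The helper name split_sentences() and the toy data are made up for illustration.

```r
# Simplified stand-in for qdap::sentSplit(), using base R only.
# Assumption: a sentence ends at ".", "!" or "?" followed by whitespace.
split_sentences <- function(df, text_col, id_col) {
  # split every text at whitespace that directly follows end punctuation
  pieces <- strsplit(df[[text_col]], "(?<=[.!?])\\s+", perl = TRUE)
  out <- data.frame(
    rep(df[[id_col]], lengths(pieces)),  # repeat the id once per sentence
    trimws(unlist(pieces)),
    stringsAsFactors = FALSE
  )
  names(out) <- c(id_col, text_col)
  out
}

# toy input (invented text, not taken from the speeches)
toy <- data.frame(
  year_president = c("toy_1", "toy_2"),
  speech = c("We the people. Thank you!", "One sentence only."),
  stringsAsFactors = FALSE
)
toy_sentences <- split_sentences(toy, "speech", "year_president")
print(toy_sentences)
```

The result has one row per sentence, and the year_president label is repeated for every sentence of the same speech.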

Now we start the sentiment (polarity) analysis. The polarity() function from qdap scores each sentence and also summarises the scores per group, here per speech. We then join the grouped data frame with the party information.

pol <- polarity(sentences$speech, sentences$year_president)
pol_df <- pol$all
pol_df <- pol_df %>% dplyr::filter(!is.na(year_president))
pol_df$year_president <- as.factor(pol_df$year_president)
pol_df$pos.words <- NULL
pol_df$neg.words <- NULL

pol_group <- pol$group

# get party information
pol_group <- left_join(pol_group, pres_party, by = "year_president")
pol_group$party <- as.factor(pol_group$party)

pol_group$year_president <- gsub("_", " ", pol_group$year_president)
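The idea behind a dictionary-based polarity score can be sketched in a few lines of base R. This is a strongly simplified illustration, not qdap's algorithm: polarity() uses a much larger dictionary and also weights negators and amplifiers around each polarised word. The two word lists and the function simple_polarity() are hypothetical. Dividing by the square root of the word count follows the description in the qdap documentation.

```r
# Hypothetical mini-lexicons, for illustration only;
# qdap ships its own, much larger polarity dictionary.
positive_words <- c("good", "great", "hope", "peace")
negative_words <- c("bad", "fear", "war", "crisis")

simple_polarity <- function(sentence) {
  # lower-case, strip everything except letters and spaces, split into words
  cleaned <- gsub("[^[:alpha:] ]", "", tolower(sentence))
  words <- strsplit(cleaned, "\\s+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) == 0) return(NA_real_)
  # +1 for each positive word, -1 for each negative word
  score <- sum(words %in% positive_words) - sum(words %in% negative_words)
  # scale by the square root of the sentence length
  score / sqrt(length(words))
}

simple_polarity("We hope for peace.")
simple_polarity("War and fear.")
```

A sentence with more positive than negative words gets a score above zero, and the scaling keeps long sentences from dominating purely because of their length.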

Visualisation of Result

We use ggplot() for visualisation and pack as much information as possible into the plot:

  • x axis represents average number of words per sentence
  • y axis is sentiment
  • color is party membership
  • size of circles represents the length of speech

Each point is labelled with its year and president using the geom_text_repel() function.

# Polarity vs. Mean Sentence Length
color_party <- c("blue", "green", "orange", "red", "grey", "brown")
g <- ggplot(pol_group, aes(x = total.words / total.sentences,
               y = stan.mean.polarity))
g <- g + geom_point(aes(color = party,
            size = total.words / 250),
            alpha = .9)
g <- g + geom_text_repel(aes(x = total.words / total.sentences,
                 y = stan.mean.polarity,
                 label = factor(year_president)))
g <- g + scale_color_manual(values = color_party)
g <- g + xlab("Mean Words in Sentence [-]")
g <- g + ylab("Sentiment [-]")
g <- g + ggtitle("Sentiment, Average Sentence Length, Speech Length and Party of US Inaugurations")
g <- g + theme_bw()
g
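The two plotted quantities can be reproduced from sentence-level scores. The sketch below assumes that stan.mean.polarity is the mean polarity of a speech divided by the standard deviation of its sentence polarities, which is how the qdap documentation describes it, and that the x axis is simply total words divided by total sentences. The toy numbers and the helper summarise_group() are invented for illustration.

```r
# Toy sentence-level scores (invented numbers, not taken from the speeches).
# Column names mirror pol$all: wc = word count, polarity = sentence score.
toy <- data.frame(
  year_president = c("toy_1", "toy_1", "toy_1", "toy_2", "toy_2"),
  wc       = c(10, 20, 30, 5, 15),
  polarity = c(0.2, 0.4, 0.6, -0.1, 0.3)
)

# summarise one speech; column names mirror pol$group
summarise_group <- function(df) {
  data.frame(
    total.sentences    = nrow(df),
    total.words        = sum(df$wc),
    ave.polarity       = mean(df$polarity),
    sd.polarity        = sd(df$polarity),
    stan.mean.polarity = mean(df$polarity) / sd(df$polarity)
  )
}

toy_group <- do.call(rbind, lapply(split(toy, toy$year_president), summarise_group))

# x axis of the plot: average number of words per sentence
toy_group$mean_sentence_length <- toy_group$total.words / toy_group$total.sentences
print(toy_group)
```

Standardising by the spread means that a speech with consistently mild positive sentences can score higher than one that mixes very positive and very negative sentences.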

[Figure: sentiment versus mean sentence length for each inauguration speech, coloured by party and sized by speech length]

ggsave(g, filename = "./Inauguration_Speeches.png")

There is a lot to discover. For example, Trump uses the shortest sentences of all presidents. His speech has a higher sentiment score than either of Obama's speeches, and it is comparatively short, though not the shortest. The earliest presidents used the longest sentences, which suggests that sentences have become shorter over time. Explore the graph and find out what is interesting for you.
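One way to check the impression that sentences became shorter over time is a rank correlation between year and mean sentence length. The following is only a sketch: the function sentence_length_trend() is hypothetical, and it assumes that every year_president label starts with a four-digit year, for example "1789 Washington".

```r
# Assumption: labels start with a four-digit year, e.g. "1789 Washington".
sentence_length_trend <- function(group_df) {
  year <- as.integer(substr(group_df$year_president, 1, 4))
  mean_len <- group_df$total.words / group_df$total.sentences
  # Spearman rank correlation: -1 means strictly shorter sentences over time
  cor(year, mean_len, method = "spearman")
}

# Usage with the grouped data frame from the analysis:
# sentence_length_trend(pol_group)
```

A value close to -1 would support the idea of a steady decline, while a value near 0 would indicate no monotonic trend.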
