In ancient Rome, Caesar used a simple encryption technique for secure communication, which later became known as the Caesar cipher. We will implement it in R. For this we need to work with named vectors and make use of the map_chr() function.
Objectives: named vectors, use of the map_chr() function, modulo operator, string handling; encryption and decryption with the Caesar cipher
Requirements: None
Introduction
The Caesar cipher is very simple: each character of the message is replaced by another character. There are two alphabets, a plain alphabet and a cipher alphabet, which are shifted against each other by a fixed number of positions. This number is the secret key required to decrypt the message. The technique is not safe, because there are only 26 possible shifts (one of which leaves the text unchanged), so it is easy to find the right one by trying them all.
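To illustrate why so few possible keys are a problem, here is a minimal sketch of a brute-force attack. It is not the implementation developed in this article: the helper shift_letters() is a hypothetical name, and it uses base R's chartr() instead of the named vector built below. The cipher text is "smartdatawithr" shifted by 5.

```r
# Hypothetical helper (not part of the article's code):
# shift all lowercase letters in `text` by `offset` positions.
shift_letters <- function(text, offset) {
  # 0-based index, wrap around with %%, back to 1-based
  shifted <- letters[((seq_along(letters) - 1 + offset) %% 26) + 1]
  chartr(old = paste(letters, collapse = ""),
         new = paste(shifted, collapse = ""),
         x = text)
}

cipher_text <- "xrfwyifyfbnymw"

# Undo every possible shift; an attacker only has to read 26 candidates.
candidates <- vapply(0:25, function(k) shift_letters(cipher_text, -k), character(1))
names(candidates) <- 0:25

candidates[["5"]]
## [1] "smartdatawithr"
```

Printing the whole candidates vector shows that only one entry is readable, which reveals both the message and the key.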
Creating the Plain and Cipher Alphabet
We start by loading the required packages. We need purrr for the map functions and stringr for some string functions. Please make sure both packages are installed before trying to load them.
First, we create our inputs: “plain_text”, the message to be encrypted, and “offset”, the secret key, which describes the shift between the two alphabets.
library(purrr)
library(stringr)
plain_text <- "SmartDataWithR"
offset <- 5
The idea is to build the alphabet as a named character vector. In the finished vector, each name is a letter of the message and its value is the letter it is encrypted to. How do we do that?
We start with the plain alphabet. Only lowercase letters are taken into account. It is assigned to “plain_letters”. For now it gets the numbers 1 to 26 as names; these are replaced in a later step.
plain_letters <- letters
names(plain_letters) <- 1:26
Next, we do the same for the cipher letters. The only difference is that the names now include the offset. In our example the offset is 5, which means that “a” (first letter) is encrypted to “f” (sixth letter). How is, for example, the plain letter “x” handled? It is letter number 24; shifted by 5 we end up with 29. This is a problem, because the cipher letters must map to the numbers 1 to 26 as well. So we need to wrap around: “x” shifted five times ends up at “c”, because we start again at the beginning of the alphabet. We can implement this behavior with the %% (modulo) operator. Since 26 %% 26 is 0, we also need to replace the resulting 0 with 26.
cipher_letters <- letters
names(cipher_letters) <- (seq(1, 26, 1) + offset) %% 26
names(cipher_letters)[names(cipher_letters) == 0] <- 26
cipher_letters
##   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23
## "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r"
##  24  25  26   1   2   3   4   5
## "s" "t" "u" "v" "w" "x" "y" "z"
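As a side note, the separate correction of 0 can be avoided by shifting to a 0-based position before applying %% and shifting back afterwards. The following is only a sketch of that arithmetic; the variable names positions and shifted_positions are illustrative and not used elsewhere in this article.

```r
offset <- 5
positions <- 1:26

# Shift to a 0-based index, wrap around with %%, then go back to 1-based.
shifted_positions <- ((positions - 1 + offset) %% 26) + 1

shifted_positions[24]            # "x" is letter 24
## [1] 3
letters[shifted_positions[24]]   # letter 3 is "c"
## [1] "c"
shifted_positions[21]            # "u" lands on 26, not on 0
## [1] 26
```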
We now want to sort this vector by its names.
cipher_letters <- cipher_letters[sort(names(cipher_letters))]
cipher_letters
##   1  10  11  12  13  14  15  16  17  18  19   2  20  21  22  23  24  25
## "v" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "w" "o" "p" "q" "r" "s" "t"
##  26   3   4   5   6   7   8   9
## "u" "x" "y" "z" "a" "b" "c" "d"
This is not completely right. Names are character strings, so they are sorted alphabetically and 1 is followed by 10, 11, … We can correct this by padding the numbers with a leading zero.
names(cipher_letters) <- str_pad(as.numeric(names(cipher_letters)), width = 2, pad = "0")
cipher_letters <- cipher_letters[sort(names(cipher_letters))]
cipher_letters
##  01  02  03  04  05  06  07  08  09  10  11  12  13  14  15  16  17  18
## "v" "w" "x" "y" "z" "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m"
##  19  20  21  22  23  24  25  26
## "n" "o" "p" "q" "r" "s" "t" "u"
Perfect, this is what we need. With the cipher letters in the right order, we only need to set them as the names of “plain_letters”.
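Padding with zeros is one way to get the order right. Another option, sketched here under the assumption that the names are still the unpadded numbers, is to order by the numeric value of the names with base R's order(). The variable names shifted and cipher_sorted are illustrative.

```r
offset <- 5

# Rebuild the unsorted cipher letters so that this snippet runs on its own.
cipher_letters <- letters
shifted <- (1:26 + offset) %% 26
shifted[shifted == 0] <- 26
names(cipher_letters) <- shifted

# order() on the numeric values sorts 1, 2, ..., 26
# instead of "1", "10", "11", ...
cipher_sorted <- cipher_letters[order(as.numeric(names(cipher_letters)))]

unname(cipher_sorted)[1:6]
## [1] "v" "w" "x" "y" "z" "a"
```

The resulting letter order is the same as with str_pad(); only the names differ ("1" instead of "01").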
names(plain_letters) <- cipher_letters
plain_letters
##   v   w   x   y   z   a   b   c   d   e   f   g   h   i   j   k   l   m
## "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r"
##   n   o   p   q   r   s   t   u
## "s" "t" "u" "v" "w" "x" "y" "z"
This is our assignment rule and the core of the encryption and decryption process.
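For comparison, the same assignment rule can be written in a single step. This is a sketch rather than the approach taken in this article: encrypt_lookup is a hypothetical name, and the vector is ordered by message letter ("a" to "z") instead of by encrypted letter.

```r
offset <- 5

# Names: the letters as they appear in the message.
# Values: the alphabet rotated by `offset`, i.e. the encrypted letters.
encrypt_lookup <- setNames(letters[((0:25 + offset) %% 26) + 1], letters)

encrypt_lookup[c("a", "x")]           # look up by name
##   a   x
## "f" "c"
unname(encrypt_lookup[c("s", "m")])   # first two letters of "smart"
## [1] "x" "r"
```

Indexing a named vector by name is vectorised, so several letters can be looked up at once.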
Encrypt the Message
At this stage our “plain_text” is a single string. We need to split it into a character vector of single characters with strsplit(). It returns a list, from which we only need the first element, [[1]]. Our algorithm does not distinguish between upper- and lowercase characters, so all characters are converted to lowercase with the tolower() function.
plain_single_char <- strsplit(plain_text, "")[[1]]
plain_single_char <- tolower(plain_single_char)
Now the encryption starts. We use the map_chr() function, which returns a character vector. The input is the “plain_single_char” vector. The applied function looks up the cipher text character that corresponds to each plain text character.
cipher_single_char <- map_chr(.x = plain_single_char,
                              .f = ~ plain_letters[names(plain_letters) == .x])
cipher_single_char
## [1] "x" "r" "f" "w" "y" "i" "f" "y" "f" "b" "n" "y" "m" "w"
So in our case the message s, m, a, r, t, d, a, t, a, w, i, t, h, r is encrypted to x, r, f, w, y, i, f, y, f, b, n, y, m, w.
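Note that the lookup only works for the 26 letters. For a space or a punctuation mark there is no matching name, the lookup returns an empty result, and map_chr() stops with an error. One possible way to handle this, sketched here with the hypothetical helper encrypt_char() and a standalone lookup vector equivalent to “plain_letters”, is to pass such characters through unchanged.

```r
library(purrr)

offset <- 5

# Same letter mapping as "plain_letters", rebuilt so the snippet runs on its own.
lookup <- setNames(letters[((0:25 + offset) %% 26) + 1], letters)

# Hypothetical helper: encrypt one character, keep non-letters as they are.
encrypt_char <- function(ch) {
  if (ch %in% names(lookup)) lookup[[ch]] else ch
}

chars <- strsplit(tolower("Smart Data, with R!"), "")[[1]]
paste(map_chr(chars, encrypt_char), collapse = "")
## [1] "xrfwy ifyf, bnym w!"
```

Keeping spaces and punctuation makes the word structure visible in the cipher text, which makes the message even easier to break; removing them before encryption would be an alternative design.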
Decrypt the Message
Decryption applies the same technique. We use the assignment rule backwards: we look up the value and return its name.
plain_single_char_decrypt <- map_chr(.x = cipher_single_char,
                                     .f = ~ names(plain_letters[plain_letters == .x]))
The cipher text x, r, f, w, y, i, f, y, f, b, n, y, m, w is decrypted to s, m, a, r, t, d, a, t, a, w, i, t, h, r. This works pretty well. It is up to you to wrap encryption and decryption in functions for repeated use.
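As a starting point for such functions, here is one possible sketch. All function names (caesar_lookup(), caesar_translate(), caesar_encrypt(), caesar_decrypt()) are hypothetical. It uses only base R with vectorised indexing by name instead of map_chr(), and it leaves characters outside a to z untouched, which is a design assumption and not something the steps above do.

```r
# Hypothetical helpers, not part of any package.

# Named vector: names are message letters, values are encrypted letters.
caesar_lookup <- function(offset) {
  setNames(letters[((0:25 + offset) %% 26) + 1], letters)
}

# Translate every known character of `text` via `lookup`.
caesar_translate <- function(text, lookup) {
  chars <- strsplit(tolower(text), "")[[1]]
  known <- chars %in% names(lookup)
  chars[known] <- lookup[chars[known]]   # vectorised lookup by name
  paste(chars, collapse = "")
}

caesar_encrypt <- function(text, offset) {
  caesar_translate(text, caesar_lookup(offset))
}

caesar_decrypt <- function(text, offset) {
  lookup <- caesar_lookup(offset)
  # Swap names and values to use the assignment rule backwards.
  caesar_translate(text, setNames(names(lookup), lookup))
}

secret <- caesar_encrypt("SmartDataWithR", offset = 5)
secret
## [1] "xrfwyifyfbnymw"
caesar_decrypt(secret, offset = 5)
## [1] "smartdatawithr"
```

Because the message is converted to lowercase before encryption, the original capitalisation is not recovered by decryption.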
More Information
You can find more information on the Caesar cipher in the Wikipedia article.
- Wikipedia article on the Caesar cipher: https://en.wikipedia.org/wiki/Caesar_cipher