Introduction
In this tutorial we will write a function that tests if a number is a prime number and find all prime number within 1 and 1000. Recap: a prime number is a number that is greater than one and has no positive divisors except one or itself.
- Objectives: function, control structures, sapply
- Requirements: none
Function for Prime Number Checking
There are several ways to tackle that problem, I will present one.
We will divide a number num by all numbers from 2 to (num-1). Assume we want to check 7, We need to create a vector from 2 to 6 as the denominator.
num <- 7 # for testing
denominator <- 2:(num-1)
The result is a sequence of increasing numbers. The only exception is 2, because then it would be a decreasing vector with (2, 1). So we need a special treatment for 2.
if (num == 2) {
prime <- TRUE
}
Then we will divide the number by its denominator. It is interesting to see what happens here. We divide a number by several numbers (vector) and get as a result a vector, including the result for each of these divisions. We are using modulo-division, which means that only the integer residual is provided.
number_denominator <- num %% denominator
If any of these modulo-division is zero, it can’t be a prime number. Function any is used, which returns a boolean and checks the complete vector. if this condition is fulfilled prime is assigned FALSE.
if (any(number_denominator == 0)) {
prime <- FALSE
}
Let’s put all pieces together and create the function. The function is called IsPrimeNumber. It has only one parameter num and returns a boolean prime, which is either TRUE or FALSE.
IsPrimeNumber <- function(num) {
denominator <- 2:(num-1)
number_denominator <- num %% denominator
if (num == 2) {
prime <- TRUE
} else if (any(number_denominator == 0)) {
prime <- FALSE
} else {
prime <- TRUE
}
return (prime)
}
Check 1 to 1000 for Prime Numbers
We create a dataframe df with column x that includes numbers from 1 to 1000.
df <- data.frame(x = seq(1, 1000, 1))
A new column prime is created that applies our function to each sample. The function to perform such a request is sapply.
df$prime <- sapply(X = df$x, FUN = IsPrimeNumber)
We still have 1000 datasets in df, but we want to keep only prime numbers. So we filter and keep only datasets where prime is TRUE.
df <- df[df$prime, ]
Finally, we count the number of rows with nrow and find out there are 168 prime numbers between 1 and 1000.
nrow(df)
## [1] 168
Bibliography
Prime Numbers https://en.wikipedia.org/wiki/Prime_number
Modulo Operator https://en.wikipedia.org/wiki/Modulo_operation