R Data Types
Overview
Teaching: 10 min
Exercises: 5 min
Questions
What types of data does the R language have?
Objectives
Learn the types of data that we can manage in R.
Types of data
We have already used numbers to generate a result, but numbers are not the only type of data that R can manage. We can use the function typeof() to check the data type of our object addition:
> typeof(addition)
[1] "double"
R has five commonly used basic types of data:
- Double
- Integer
- Complex
- Logical
- Character
> typeof(5L) # An integer is a whole number written with a capital L suffix
[1] "integer"
> typeof(72+5i)
[1] "complex"
> addition == subtraction
[1] FALSE
> typeof(addition == subtraction)
[1] "logical"
> result <- "4 and 3 are not the same on Earth. On Mars maybe... "
> typeof(result)
[1] "character"
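The difference between double and integer is easy to miss, so here is a minimal sketch of it. The object names whole_double and whole_integer are made up for illustration and do not come from the lesson.

```r
# Hypothetical objects, used only to compare double and integer.
whole_double  <- 5    # no suffix: R stores this as a double
whole_integer <- 5L   # capital L suffix: R stores this as an integer

print(typeof(whole_double))
print(typeof(whole_integer))

# The two values compare as equal even though their types differ.
print(whole_double == whole_integer)

# as.integer() converts a double to an integer.
converted <- as.integer(whole_double)
print(typeof(converted))
```

In other words, a number typed without the L suffix is a double, even when it has no decimal part.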
No matter how complicated our analysis becomes, nearly all the data we handle in R will belong to one of these five data types (R also has a sixth basic type, raw, for bytes, which is rarely needed). Data types are important because we want to know “who is who, and what is what”. This concept will also help us learn one of the most powerful tools in R: the manipulation of diverse types of data together in what is called a data frame.
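As a preview of that idea, the following is a minimal sketch of a data frame whose columns have different data types. The object samples and all of its values are invented for illustration; they are not part of the lesson's data.

```r
# Hypothetical data frame: each column holds one data type.
samples <- data.frame(
  id      = c(1L, 2L, 3L),           # integer
  weight  = c(4.5, 3.2, 5.1),        # double
  species = c("cat", "dog", "cat"),  # character
  adult   = c(TRUE, FALSE, TRUE),    # logical
  stringsAsFactors = FALSE           # keep text columns as character
)

# Ask for the data type of every column at once.
column_types <- sapply(samples, typeof)
print(column_types)
```

Each column is itself a vector of a single type, which is why typeof() can be applied column by column.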
Data structures
Besides the data types, there are different ways of organizing data in R, called data structures. The simplest data structure is the vector, which is a sequence of values of the same type. We can create a vector with the function c().
> char_vector <- c("a", "a", "b", "b", "c", "c")
> typeof(char_vector)
[1] "character"
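Because a vector can hold only one type, c() converts its arguments to a common type when they differ. The sketch below illustrates this behavior; the object names are hypothetical and chosen only for this example.

```r
# Hypothetical vectors that mix types on purpose.
mixed_num <- c(1L, 2.5)        # integer + double
mixed_lgl <- c(TRUE, 3L)       # logical + integer
mixed_chr <- c(1, "a", TRUE)   # anything + character

print(typeof(mixed_num))   # the integer is promoted to double
print(typeof(mixed_lgl))   # the logical is promoted to integer
print(typeof(mixed_chr))   # everything becomes character
print(mixed_chr)
```

The conversion always goes toward the more general type, in the order logical, integer, double, complex, character.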
A more complex data structure is the factor, which holds the names of categories (called levels) and a sequence recording which category occurs at each position. Here we can see how a factor works:
> char_factor <- as.factor(char_vector)
> char_factor
[1] a a b b c c
Levels: a b c
We can also ask for the structure of the object with str():
> str(char_factor)
Factor w/ 3 levels "a","b","c": 1 1 2 2 3 3
Here, you see the levels of the factor and a sequence of numbers. Each number represents a level, and the sequence records which level goes in which position. That is why we get “integer” when we ask for the type of data in the object.
> typeof(char_factor)
[1] "integer"
When we are dealing with categorical data, factors are the way to go.
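The two parts of a factor can be inspected separately. The following sketch rebuilds char_factor from the lesson and then uses levels(), as.integer(), and table(); the names counts and labels_back are introduced here only for illustration.

```r
# Rebuild the factor from the lesson so this block runs on its own.
char_vector <- c("a", "a", "b", "b", "c", "c")
char_factor <- as.factor(char_vector)

# The category names.
print(levels(char_factor))

# The integer codes that point to those names.
print(as.integer(char_factor))

# Count how many times each category occurs.
counts <- table(char_factor)
print(counts)

# Indexing the levels by the codes recovers the original labels.
labels_back <- levels(char_factor)[as.integer(char_factor)]
print(labels_back)
```

Counting occurrences per category with table() is one of the most common reasons to store categorical data as a factor.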
Exercise 1: Types and structures of data
Which type of data is present in each of the following vectors?
ej1 <- c("34","147","26+7i")
ej2 <- c(4L,12L,152L)
ej3 <- c(34,147,26+7i)
A) numerical, integer, numerical
B) character, numerical, numerical
C) numerical, complex, character
D) character, integer, complex
がんばって! (ganbatte; good luck)
Solution
The correct answer is: D) character, integer, complex.
If we use quotation marks ("") to define a value, R will read it as a character, even if we are typing numbers.
The capital L after a number tells R to store that number as an integer.
And a number with the letter i (for imaginary) tells R that it is a complex number. Because a vector holds a single type, one complex element makes the whole vector ej3 complex.
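One way to check this answer is to run typeof() on each vector, as in the short sketch below, which uses the three vectors from the exercise.

```r
# The three vectors from the exercise.
ej1 <- c("34", "147", "26+7i")
ej2 <- c(4L, 12L, 152L)
ej3 <- c(34, 147, 26+7i)

# typeof() reports one type per vector.
print(typeof(ej1))
print(typeof(ej2))
print(typeof(ej3))
```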
Key Points
R uses different types of data to store information.