Hi! As people interact more and more with the digital world, it is becoming vital for them (especially young people) to learn skills that give them an edge in a competitive digital environment.
People from non-technical backgrounds often find programming hard to take up, and handling data with software becomes a great challenge for them. Individuals with limited resources also find it hard to pay for licenses for data analysis software.
Hence, this online tutorial is a modest attempt to help you learn basic skills in the freely available software ‘R’.
R is an environment for data handling and storage with a range of programming facilities, and it can interoperate with other languages (e.g. Python, CSS, C/C++).
The code and functions that carry out data-handling tasks come in packages. About 25 packages are supplied with R, and they support powerful analysis and the handling of a variety of objects.
You can install R for your operating system from https://cloud.r-project.org.
R provides a ‘>’ prompt on its interface (terminal) where you can type your commands.
Typing A<-1 at the > prompt and pressing Enter assigns 1 to A. Here, A is the name of a storage location (called an object) holding the numeric value 1. Everything in R is an object. The assignment operator (<-) is written as ‘<’ followed immediately by ‘-’, with no space between them. Writing A<-1 is the same as writing 1->A, so you can choose your own style. You can also use ‘=’ in place of ‘<-’. I prefer the assignment operator because the equals sign also appears in the comparison operator (==) and in function arguments, so keeping the two uses distinct avoids confusion.
Let's check how A appears in R. Type A and press Enter, which gives the following.
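As a small sketch of the three assignment styles described above (the object names are only for illustration), you can check that they all store the same value and that ‘==’ compares rather than assigns:

```r
# Three ways to store the value 1 in an object
A <- 1      # leftward assignment
1 -> B      # rightward assignment
C = 1       # equals sign

# All three objects hold the same value
identical(A, B)
## [1] TRUE
identical(B, C)
## [1] TRUE

# '==' compares two values; it does not assign anything
A == 2
## [1] FALSE
```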
A<-1
A
## [1] 1
This tells us that there is one element in A having value 1. Likewise, you can assign more values to an object in R.
Let us do simple calculations and understand the process slowly.
A<-1
A
## [1] 1
B<-2
B
## [1] 2
A+B
## [1] 3
C<-A+B
C
## [1] 3
Let's try storing words.
#See if A<-hi works?
#Try this
A<-"hi"
A
## [1] "hi"
B<-"bye"
B
## [1] "bye"
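If you are curious what happens without the quotes, here is a hedged sketch. It assumes that no object called hi exists in your session, and it uses tryCatch() so that the error does not stop the rest of the code:

```r
# 'hi' without quotes is treated as the name of an object, not as text.
# tryCatch() catches the error so that the remaining lines still run.
# Assumption: no object called 'hi' exists in your session.
attempt <- tryCatch(
  {
    A <- hi
    "it worked"
  },
  error = function(e) "R could not find an object named hi"
)
attempt
## [1] "R could not find an object named hi"

# With quotes, the word is stored as text
A <- "hi"
class(A)
## [1] "character"
```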
Do you want your storage and commands side by side? Not enjoying the non-GUI interface with its limited options?
That is why RStudio is popular among people working with R.
The RStudio website can guide you through installing R and an up-to-date version of RStudio Desktop.
You can choose how your RStudio interface looks by exploring this page: https://support.rstudio.com/hc/en-us/articles/200549016-Customizing-the-RStudio-IDE
R is a case-sensitive expression language, so A and a are different objects. Almost all alphanumeric symbols are allowed in names, although the exact set depends on the country (locale) in which R is running. There are some essentials that everyone should know.
getwd() # know your current working directory
## [1] "/Users/muditsingh/Desktop/Class/learnR"
setwd("/Users/muditsingh/Desktop/Class/learnR") # set your working directory
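The path above exists only on my computer. As a sketch that should run on any machine, the lines below move to R's temporary folder (given by tempdir()) and then return to where you started. The object name old_dir is just an illustrative choice:

```r
old_dir <- getwd()     # remember the current working directory
setwd(tempdir())       # move to R's per-session temporary folder
getwd()                # now reports the temporary folder
setwd(old_dir)         # return to the original folder
getwd()                # back where we started
```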
The entities that R understands and uses to perform its operations are called objects. An object can be a letter or a combination of letters, a number, a stored value, a matrix, a file, a list of files, and so on. R performs its analysis based on the object type, so it is important to understand how R classifies objects. Every object has two intrinsic properties: a mode and a length.
A<-"hi!"
A
## [1] "hi!"
mode(A)
## [1] "character"
class(A)
## [1] "character"
length(A)
## [1] 1
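How do mode and class differ?
The mode describes the basic type of data stored in an object and is intrinsic to it, whereas the class is a label that tells R functions how to treat the object. For simple objects the two are often the same, as with the character string above. For simplicity, let's treat class and mode as specific properties of an object.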
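To see a case where mode and class give different answers, here is a short sketch with illustrative object names. It uses factor(), which appears in the examples that follow, and data.frame(), which this section does not cover:

```r
txt <- c("low", "high", "low")
mode(txt)
## [1] "character"
class(txt)
## [1] "character"

# A factor stores its levels as numbers internally
lev <- factor(txt)
mode(lev)
## [1] "numeric"
class(lev)
## [1] "factor"

# A data frame is stored as a list of columns
tab <- data.frame(score = c(1, 2))
mode(tab)
## [1] "list"
class(tab)
## [1] "data.frame"
```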
A<-5
A
## [1] 5
B<--4
B
## [1] -4
A<-c("Name","Gender","Place")
class(A)
## [1] "character"
B<-factor(A)#converting to factor with levels
B
## [1] Name Gender Place
## Levels: Gender Name Place
class(B)
## [1] "factor"
length(B)
## [1] 3
A<-1
B<-2
A<B#compare A and B
## [1] TRUE
Why the # sign? In R, # starts a comment: everything after it on the same line is a note for the reader and is ignored by R.
Besides the four classes seen so far (character, numeric, factor and logical), there are others such as list, matrix and complex. For now, we begin with these four for simplicity.
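As a quick check, you can ask R for the class of one small example of each type. The values below are arbitrary:

```r
class("hi")
## [1] "character"
class(5)
## [1] "numeric"
class(factor(c("yes", "no")))
## [1] "factor"
class(1 < 2)
## [1] "logical"
class(list(1, "a"))
## [1] "list"
class(2 + 3i)
## [1] "complex"
is.matrix(matrix(1:4, nrow = 2))
## [1] TRUE
```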
A<-c(10,12)
A
## [1] 10 12
The object ‘A’ looks like [1] 10 12
Now R tells us that there are two elements. Let's look at how the elements of A are stored when we assign more of them to the object.
A<-c(1,2,3,5,7,9)
A
## [1] 1 2 3 5 7 9
A[2] gives the second element of A.
A[1:5] lists the first to fifth elements stored in A.
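Here is the indexing described above as code, with two extra forms (picking specific positions and dropping one) added for illustration:

```r
A <- c(1, 2, 3, 5, 7, 9)
A[2]          # second element
## [1] 2
A[1:5]        # first to fifth elements
## [1] 1 2 3 5 7
A[c(1, 6)]    # first and sixth elements
## [1] 1 9
A[-1]         # everything except the first element
## [1] 2 3 5 7 9
length(A)     # number of elements
## [1] 6
```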
Similarly, let's try storing words.
Boy<-"hi!"
Boy
## [1] "hi!"
Girl<-"I want to learn R"
Girl
## [1] "I want to learn R"
We can practice assigning multiple numeric values and character (e.g. names) to different objects and check the outputs.
Let's practice some simple calculations.
A<-2
B<-4
C<-A+B#addition
D<-A/B#division
E<-A-B#subtraction
F<-A*B#multiplication
G<-c(C,D,E,F)
G
## [1] 6.0 0.5 -2.0 8.0
#Let's try with two elements in each
A<-c(2,4)
B<-c(4,6)
C<-A+B#addition
D<-A/B#division
E<-A-B#subtraction
F<-A*B#multiplication
H<-c(C,D,E,F)
H
## [1] 6.0000000 10.0000000 0.5000000 0.6666667 -2.0000000 -2.0000000 8.0000000
## [8] 24.0000000
prod(1,3,5)#product (multiplication)
## [1] 15
sum(1,2)#addition
## [1] 3
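The object H above has eight values because each operation works element by element and returns two results. A short sketch makes the pairing visible. Note that sum() and prod() also accept a whole object:

```r
A <- c(2, 4)
B <- c(4, 6)

# First element is paired with first, second with second
A + B
## [1]  6 10
A * B
## [1]  8 24

# sum() and prod() collapse all elements into a single value
sum(A)
## [1] 6
prod(B)
## [1] 24
sum(A * B)
## [1] 32
```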
Can we join the two objects ‘A’ and ‘B’ in a meaningful way?
Let's try combining the Boy and Girl statements from the example above.
statement<-paste(Boy,Girl)
statement
## [1] "hi! I want to learn R"
What is ‘paste’? It is a function that joins objects together as text. What are functions? A function is a set of code that performs a pre-defined operation on an object. One function for one task.
Let's try the following example.
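To make the idea of a function concrete, here is a minimal sketch of writing your own. The function greet and its arguments are made up for this illustration and are not part of R:

```r
# 'greet' is a made-up function for illustration; it is not part of R
greet <- function(name, greeting = "hi!") {
  paste(greeting, name)   # join the greeting and the name with a space
}

greet("Asha")
## [1] "hi! Asha"
greet("Asha", greeting = "bye")
## [1] "bye Asha"
```

The function does one task only: it joins a greeting and a name. The second argument has a default value, so you can leave it out.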
first10<-paste(1:10)
nth <- paste(1:7, c("st", "nd", "rd", rep("th", 4)))
paste(nth, collapse = ",")
## [1] "1 st,2 nd,3 rd,4 th,5 th,6 th,7 th"
#Not what you expected? Try paste0
first10<-paste0(1:10)
nth <- paste0(1:7, c("st", "nd", "rd", rep("th", 4)))
paste(nth, collapse = ",")
## [1] "1st,2nd,3rd,4th,5th,6th,7th"
So, how do paste and paste0 differ?
Both functions convert their inputs to character strings and join them. The difference is the separator: ‘paste’ puts a space between the pieces by default (sep = " "), which is why we got "1 st" instead of "1st", whereas ‘paste0’ joins the pieces with nothing in between. Let's see another example.
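A quick way to compare the two functions yourself is to look at the separator and at the class of the result. This is a small sketch with arbitrary inputs:

```r
paste(1, "st")             # default separator is a space
## [1] "1 st"
paste0(1, "st")            # no separator
## [1] "1st"
paste(1, "st", sep = "")   # paste with an empty separator
## [1] "1st"

# Both functions return text, even when given numbers
class(paste(1:3))
## [1] "character"
class(paste0(1:3))
## [1] "character"

# paste with sep = "" gives the same result as paste0
identical(paste(1:7, "th", sep = ""), paste0(1:7, "th"))
## [1] TRUE
```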
week<-c("Mon","Tue","Wed", "Thu", "Fri", "Sat", "Sun")
paste(week, nth, sep = ": ", collapse = "; ")
## [1] "Mon: 1st; Tue: 2nd; Wed: 3rd; Thu: 4th; Fri: 5th; Sat: 6th; Sun: 7th"
#Using 'sep'
paste("1st", "2nd", "3rd", sep = ", ")
## [1] "1st, 2nd, 3rd"
Quiz 1. Store your name in an object ‘name’.
Store your gender in an object ‘gender’.
Store your city/town/village name in an object ‘location’.
Combine the three and store the result in a new object (Hint: paste).
Print the object displaying your name, gender and location.
Can we assign an alphanumeric value to an object (e.g. storing ‘AB12’ in some object)?
You can now go to the next section of the tutorial on data handling.
Happy learning!
Please cite this tutorial as
Singh, M.K. (2023). Beginning with R (Rstudio). doi: 10.5281/zenodo.7792344
I acknowledge that this tutorial began with encouragement from colleagues at IIT Kanpur. I extend my thanks to colleagues at Duke University for training me in R.