R provides dedicated functions that make it easy to read CSV files into data frames.
CSV stands for comma-separated values. In a CSV file, each line is one record and the values in it are separated by commas, which makes it a simple plain-text way to store tabular data.
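To make the format concrete, here is a minimal sketch that writes a tiny CSV file and splits each line on commas. The student names and values are made up for illustration, and the file goes to a temporary location so nothing in your project is touched.

```r
# Hypothetical sample rows; names and values are invented for this sketch.
csv_lines <- c(
  "Name,ID,Department,Marks.Scored",
  "Asha,101,chemistry,82",
  "Ben,102,physics,91"
)

# Write the lines to a temporary file.
csv_path <- tempfile(fileext = ".csv")
writeLines(csv_lines, csv_path)

# Read the raw text back and split every line on the comma separator.
fields <- strsplit(readLines(csv_path), ",", fixed = TRUE)

print(fields[[1]])  # the header row, split into column names
print(fields[[2]])  # the first record, split into values
```

Splitting by hand like this is only meant to show what the file contains. For real work, read.csv() also handles quoting, data types, and missing values.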
Storing data in Excel sheets is common practice in many companies. Many firms also store data as comma-separated values (CSV), because a CSV file is simpler to create than a full spreadsheet. They can then use R's built-in functions to read and analyze the data.
R is a popular and powerful language for statistical analysis, and it offers specific functions to read data from a CSV file into an organized data frame.
In this short example, we will see how to read a CSV file into a data frame.
The first step is to get and set the working directory, which should be the folder that contains the CSV file.
You can check the current working directory with the getwd() function and change it with the setwd() function.
> getwd() # shows the current working directory
---> "C:/Users/Dell/Documents"
> setwd("C:/Users/Dell/Documents/R-test data") # sets the new working directory; use forward slashes (or doubled backslashes) in R paths
> getwd() #you can see the updated working directory
---> "C:/Users/Dell/Documents/R-test data"
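The same steps can be sketched in a way that runs on any machine. This example assumes nothing about your folders: it uses tempdir() as a stand-in for your own data directory, and the file name "testdata.csv" is only a placeholder. It also restores the original working directory at the end.

```r
# Remember the current working directory so it can be restored later.
old_dir <- getwd()

# tempdir() is a stand-in here; replace it with your own data folder.
new_dir <- tempdir()
setwd(new_dir)
print(getwd())

# file.path() joins path pieces with "/", which R accepts on Windows too,
# so there is no need to type backslashes by hand.
example_path <- file.path(new_dir, "testdata.csv")  # placeholder file name
print(example_path)

# Go back to where we started.
setwd(old_dir)
```

If you do want to type a Windows path by hand, write it either as "C:/Users/..." or as "C:\\Users\\...", because a single backslash starts an escape sequence inside an R string.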
After setting the working directory, import the data set from the CSV file as shown below.
> readfile <- read.csv("testdata.txt")
Run the above line of code in RStudio to create the data frame.
To check the class of the variable 'readfile', run the code below.
> class(readfile)
---> "data.frame"
The data frame includes the student names, their IDs, departments, gender, and marks.
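To get a feel for what such a data frame looks like, here is a minimal sketch that builds one from inline CSV text instead of a file. The rows are invented for illustration, and the column names are assumptions modelled on the ones used in this tutorial.

```r
# Hypothetical CSV content; the real file will have its own rows.
sample_csv <- "Name,ID,Department,Gender,Marks.Scored
Asha,101,chemistry,Female,82
Ben,102,physics,Male,91
Chen,103,chemistry,Male,77"

# read.csv() can parse a string directly through its 'text' argument.
students <- read.csv(text = sample_csv, stringsAsFactors = FALSE)

print(class(students))  # confirms the object is a data frame
print(dim(students))    # number of rows and columns
print(names(students))  # column names taken from the header line

# str() shows each column's type; head() shows the first rows.
str(students)
print(head(students, 2))
```

Checking str() right after import is a quick way to confirm that numeric columns such as the marks were read as numbers rather than text.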
After getting the data frame, you can analyse the data and extract particular information from it.
To extract the highest marks scored by students,
> marks <- max(readfile$Marks.Scored) # this gives you the highest marks
To extract the details of the student who scored the highest marks,
> data <- read.csv("traindata.csv")
> Marks <- max(data$Marks.Scored)
> retval <- subset(data, Marks.Scored == max(Marks.Scored)) # extracts the details of the student who secured the highest marks
> View(retval)
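The same lookup can also be written with plain indexing. The sketch below uses a small made-up data frame rather than the tutorial's file, and it adds na.rm = TRUE on the assumption that some marks could be missing.

```r
# Hypothetical data standing in for the imported CSV.
students <- data.frame(
  Name = c("Asha", "Ben", "Chen"),
  Marks.Scored = c(82, 91, 77),
  stringsAsFactors = FALSE
)

# Highest marks, ignoring any missing values.
top_marks <- max(students$Marks.Scored, na.rm = TRUE)

# which() returns the matching row numbers and silently drops NA comparisons,
# so every student tied for the top score is kept.
top_rows <- students[which(students$Marks.Scored == top_marks), ]

print(top_marks)
print(top_rows)
```

Printing the result works in any R session, whereas View() opens a viewer window and is mainly useful in an interactive tool such as RStudio.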
To extract the details of the students who are studying in the 'chemistry' department,
> readfile <- read.csv("traindata.csv")
> retval <- subset(readfile, Department == "chemistry") # extracts the details of students in the chemistry department
> View(retval)
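An exact comparison such as Department == "chemistry" misses rows that are spelled "Chemistry" or that carry stray spaces. As a hedged sketch, assuming the department column may be typed inconsistently, the values can be normalised before filtering. The data below is invented for illustration.

```r
# Hypothetical data with inconsistent spelling in the Department column.
students <- data.frame(
  Name = c("Asha", "Ben", "Chen"),
  Department = c("chemistry", "Physics", "Chemistry "),
  stringsAsFactors = FALSE
)

# Trim surrounding whitespace and lower-case the values before comparing.
dept <- tolower(trimws(students$Department))

# Keep only the rows whose normalised department is "chemistry".
chem <- students[dept == "chemistry", ]

print(chem)
print(nrow(chem))  # number of matching students
```

If your file is already consistent, the plain subset() call is enough and this extra step is not needed.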
This is how you can read CSV files in R with the read.csv() function. This tutorial covered how to import a CSV file, read it into a data frame, and extract specific information from that data frame.
I used RStudio for this project. RStudio offers useful features such as a console, an editor, and an environment pane. You are free to use other editors such as Tinn-R or Crimson Editor. I hope this tutorial helps you understand how to read CSV files in R and extract information from a data frame.
For more information, read: https://cran.r-project.org/manuals.html