Imagine a situation: we have a bunch of datasets (perhaps 100 or more) that we need to load into R for analysis.
So what should our approach be? Should we spend an hour loading them one by one, calling read.csv 100 times?
To turn this manual, time-consuming task into two minutes of work, we wrote a small piece of code that takes a folder path as input, reads every CSV file stored there, and loads each one into a data frame named after its file.
Here is the code:
path <- "C:\\wamp\\www\\exportdb\\download"  # folder that holds the CSV files
setwd(path)                                  # set the working directory
list.tables <- list.files(path)              # fetch the list of file names
for (i in 1:length(list.tables)) {
  name <- substr(list.tables[i], 1, nchar(list.tables[i]) - 4)  # remove ".csv" to create the data frame name
  temp <- read.csv(list.tables[i], header = TRUE)               # read one CSV file
  temp <- temp[, colSums(is.na(temp)) != nrow(temp)]            # remove columns that are all NA
  assign(name, temp)                                            # create a data frame with that name
}
This code uses R’s built-in “list.files” function to fetch the file names from the given path, reads each file with read.csv, and loads it into its own data frame. The data frame names are created by truncating the .csv extension from the file names with substr.
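The same idea can also be packaged as a function that collects all the tables in a named list instead of creating one variable per file with assign(), which keeps the workspace tidy. This is a variant sketch, not part of the original code; the name load_csv_folder is ours, and tools::file_path_sans_ext is just a built-in shortcut for the substr trick above.

```r
# Sketch: load every CSV in a folder into a named list of data frames.
# `load_csv_folder` is a hypothetical name, not from the original post.
load_csv_folder <- function(path) {
  files <- list.files(path, pattern = "\\.csv$", full.names = TRUE)
  tables <- lapply(files, read.csv, header = TRUE)
  # Drop columns that are entirely NA, mirroring the loop above
  tables <- lapply(tables, function(df) df[, colSums(is.na(df)) != nrow(df), drop = FALSE])
  # Name each list element after its file, minus the .csv extension
  names(tables) <- tools::file_path_sans_ext(basename(files))
  tables
}
```

With this variant an individual table is reached as tables$filename, and the whole collection can be passed around as one object.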
- The files must be in CSV format.
- The folder at “path” should contain no files other than the target CSV files.
- The first row of each input file must contain the column names.
- The code sets the working directory to the input path; if we want to reset it afterwards, we have to do so manually.
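The last two caveats can be relaxed: the pattern argument of list.files filters out non-CSV files, and on.exit() restores the working directory when the loading function returns. The sketch below is self-contained for demonstration, so it first writes sample files into a temporary folder; demo_path, load_all_csv, and the file names are illustrative, not part of the original code.

```r
# Create a throwaway folder with one CSV and one stray file for the demo
demo_path <- file.path(tempdir(), "csv_demo")
dir.create(demo_path, showWarnings = FALSE)
write.csv(head(mtcars), file.path(demo_path, "sales.csv"), row.names = FALSE)
writeLines("not a csv", file.path(demo_path, "notes.txt"))  # a non-CSV file

# Same loop as above, but tolerant of stray files and of the caller's
# working directory. `load_all_csv` is a hypothetical name.
load_all_csv <- function(path) {
  old_wd <- getwd()
  on.exit(setwd(old_wd))                        # restore the working directory on exit
  setwd(path)
  for (f in list.files(pattern = "\\.csv$")) {  # notes.txt is ignored
    name <- tools::file_path_sans_ext(f)
    temp <- read.csv(f, header = TRUE)
    temp <- temp[, colSums(is.na(temp)) != nrow(temp), drop = FALSE]
    assign(name, temp, envir = .GlobalEnv)      # assign() inside a function needs an explicit envir
  }
}
load_all_csv(demo_path)
```

After the call, a data frame named sales exists in the global environment and the working directory is back where it started.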