With the craze for “big” data, analytics tools have gained popularity. One of these tools is the programming language R. In this guide, I’ll show how to extract data from text files, CSV files, and databases. Then I’ll show how to send that data to a web server.
You may be wondering, Do I need to learn a new language all over again? The answer is no! All you need to know is a few commands.
Programmers from diverse backgrounds who work on web applications in a variety of programming languages can import the data into R and, after processing, export it in the format they require.
Note: If you’re not familiar with R, I recommend SitePoint’s article on how to install R and RStudio. It provides basic commands in R and a general introduction to the language. This post covers commands that can be run on the R terminal without the use of the RStudio IDE. However, handling large datasets on a terminal could turn out to be difficult for beginners, so I’d suggest using RStudio for an enriched experience. In RStudio, you can run the same commands in the Console box.
Handling Text Files
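A text file on your local machine can be read using the read.table command. Because it's designed for reading tables, it splits every line into fields: with the separator set to an empty string ("", which is also the default), fields are split on any run of whitespace. If you want each line back as a single string instead, use readLines or set sep = "\n". The following call reads a whitespace-separated text file: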
file_contents = read.table("<path_to_file>", sep = "")
Note: where you see angled brackets, such as in <path_to_file>, insert the necessary number, identifier, etc. without the brackets.
The path to the file may also be a relative path. If your rows have unequal lengths, you have to set fill = TRUE as well. The output of this command is a data frame in R.
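As a minimal sketch of the options described above, the following example writes a small sample file and then reads it back. The file contents and the names sample_path and raw_lines are made up for illustration; only read.table, fill, and sep come from the text above, and readLines is shown as one way to get the raw lines.

```r
# Create a small sample file so the example is self-contained (hypothetical data).
sample_path <- tempfile(fileext = ".txt")
writeLines(c("alice 30 london", "bob 25", "carol 41 paris"), sample_path)

# sep = "" splits fields on whitespace; fill = TRUE pads the short second row.
file_contents <- read.table(sample_path, sep = "", fill = TRUE,
                            stringsAsFactors = FALSE)
str(file_contents)  # inspect the resulting data frame

# readLines() returns one string per line instead of a data frame.
raw_lines <- readLines(sample_path)
```

Here file_contents is a data frame with one row per line of the file, while raw_lines is a plain character vector, which may be the better fit when the file isn't really tabular.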
If your file is too large to be read in one go, you can try reading it in steps using the skip and nrows options (the abbreviated nrow also works, because R matches argument names partially). For instance, to read lines 6–10 of your file, run the following commands:
connection <- file("<path_to_file>")
lines6_10 <- read.table(connection, skip = 5, nrows = 5) # lines 6-10
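To extend the same idea to a whole file, one possible approach is to keep a connection open and call read.table repeatedly, so that each call continues where the previous one stopped. The sketch below assumes a 12-line sample file created on the spot; chunk_size, chunks, and all_rows are illustrative names, not part of the original commands.

```r
# Create a 12-line sample file so the example is self-contained (hypothetical data).
sample_path <- tempfile(fileext = ".txt")
writeLines(paste(1:12, letters[1:12]), sample_path)

# Read lines 6-10 directly: skip the first 5 lines, then read 5 rows.
lines6_10 <- read.table(sample_path, skip = 5, nrows = 5,
                        stringsAsFactors = FALSE)

# Read the whole file in chunks through an open connection,
# which keeps its position between calls.
chunk_size <- 5
connection <- file(sample_path, open = "r")
chunks <- list()
repeat {
  chunk <- tryCatch(
    read.table(connection, nrows = chunk_size, stringsAsFactors = FALSE),
    error = function(e) NULL  # read.table raises an error once no lines are left
  )
  if (is.null(chunk)) break
  chunks[[length(chunks) + 1]] <- chunk
  if (nrow(chunk) < chunk_size) break  # a short chunk means the file is exhausted
}
close(connection)

# Combine the chunks; in practice you would process each chunk and discard it.
all_rows <- do.call(rbind, chunks)
```

Combining the chunks with rbind is only done here to show that nothing was lost. For a file that really is too large for memory, you would summarize or filter each chunk inside the loop instead.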
Continue reading How to Import Data and Export Results in R on SitePoint.