How to Read Stata Data in R Select Columns
In this post, we are going to learn how to read Stata (.dta) files in the R statistical environment. Specifically, we will learn i) how to read .dta files in R using Haven, and ii) how to write dataframes to .dta files.
Data Import in R: Reading Stata Files
Now, R is, as we all know, a superb statistical programming environment. When it comes to importing and storing data, we can store our data in the native .rda format. However, a collaborator may use other statistical software (e.g., Stata) and/or store their data in different formats (e.g., .dta files).
Now, this is when R shows us its brilliance; as R users we can load data from a range of file formats, e.g., SAS (.sas7bdat), Stata (.dta), Excel (e.g., .xlsx), and CSV (.csv). On this site there are other tutorials on how to import data from (some of) these formats:
- How to Import SAS files in R
- Reading and writing SPSS files in R
- How to read, and write, Excel (.xlsx) files in R – e.g., multiple sheets
Before we continue and learn how to read Stata files in R, we will answer two questions:
Can R Read Stata .dta Files?
The answer is yes, R can read Stata (.dta) files. This is easy to do with the Haven package. First, load the package: library(haven). Second, use the read_dta() function.
How do I open a Stata file in R
To open a Stata file in R you can use the read_dta() function from the package called haven. For example, study_df <- read_dta('study_data.dta') will open the Stata file called "study_data.dta" and create a data frame object.
How to Read a dta File in R
Now, we are ready to answer how to open a Stata file in R by using easy-to-follow examples. In R, there are many useful packages that make it possible for us to open .dta files. Here, in this tutorial, however, we are going to use the package Haven (which is part of the Tidyverse).
Install Haven:
First, the package needs to be installed. Haven can be installed separately or by installing the Tidyverse packages. If we want only haven, we open R (or RStudio) and type install.packages("haven"). If we, on the other hand, want to install all Tidyverse packages we change "haven" to "tidyverse": install.packages('tidyverse').
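If the installation step is part of a script that runs more than once, it can help to install only when the package is missing. The following is a minimal sketch of that pattern, assuming an internet connection and a configured CRAN mirror are available when the package is not yet installed:

```r
# Install haven only if it is not already available on this machine
if (!requireNamespace("haven", quietly = TRUE)) {
  install.packages("haven")
}

# Attach the package so read_dta() and write_dta() can be called directly
library(haven)
```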
- Learn how to add a column to a dataframe in R based on other columns.
The Syntax of read_dta()
In this section, before learning the steps to reading a .dta file, we will have a quick look at the syntax of the read_dta() function. In its simplest form, here's how to import data from a .dta file in R:
# import dta file in r
dataframe <- read_dta('PATH_OR_URL_TO_STATA_FILE')
Of course, the function comes packed with a couple of arguments. The first argument is the file, which should be the path, or URL, to the file. This is evident from the syntax above, as well. In this tutorial, we will take a look at the col_select argument.
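One way to check the arguments yourself is to ask R directly. The sketch below assumes haven is installed; the variable name arg_names is only for illustration. It lists the argument names of read_dta() and checks that the two discussed here are among them:

```r
library(haven)

# Collect the names of the arguments that read_dta() accepts
arg_names <- names(formals(read_dta))
print(arg_names)

# Check that the arguments discussed in this tutorial are present
"file" %in% arg_names
"col_select" %in% arg_names
```

Typing ?read_dta in the console opens the help page with a description of each argument.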
How to Read a dta File in R Step-By-Step
In this section, we are finally going to learn how to import .dta files in R. Here are the 3 simple steps to read a Stata file in R:
1) Load the haven Library:
First, we are going to load the Haven package: library(haven). Now that we have all the functions of the Haven package in the namespace, we can continue to step 2: finding the .dta file we want to read.
2) Find the .dta File
Second, before we can import the Stata file, we need to know where the file is located. In the next step we, therefore, create a character variable with the path to the file.
dtafile <- file.path(getwd(), "RScripts", "Data", "FifthDayData.dta")
Note, in the example above, the data file is located two subfolders below the working directory. To elaborate, we used the getwd() function to get the current working directory (e.g., "C:/Users/Erik/Documents"). Moreover, the .dta file is located in a subfolder ("Data") of the "RScripts" folder. Thus, the next two character vectors indicate where the file is and, finally, we have the file name.
3) Read the File using read_dta():
Now, we are ready to actually import the data from our .dta file into R. This is done using the read_dta() function, as previously mentioned. Here's how to read a dta file in R:
# import dta file in r
fifthD.df <- read_dta(dtafile)
head(fifthD.df)
That was it, you have now read the .dta file into a dataframe. Next, you may want to carry out simple data manipulation, e.g., add an empty column to a dataframe in R.
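The three steps can also be tried without the tutorial's data file. The following is a minimal, self-contained sketch: the data frame example_df and its columns are made up for illustration, and a temporary file stands in for FifthDayData.dta. It also checks that the file exists before reading it, which gives a clearer signal than a failed import when the path is wrong:

```r
library(haven)

# Hypothetical example data; not the FifthDayData.dta file used above
example_df <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))

# Write it to a temporary .dta file so the example is self-contained
tmp_dta <- tempfile(fileext = ".dta")
write_dta(example_df, tmp_dta)

# Check that the file exists before trying to read it
if (file.exists(tmp_dta)) {
  roundtrip_df <- read_dta(tmp_dta)
  print(head(roundtrip_df))
}
```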
How to Read a .dta File in R from a URL
In this section, we are going to learn how to import a Stata file (.dta) from a URL. This is, of course, as simple as loading the data from the hard drive. Naturally, however, we need to change the character variable so that it holds the URL, and pass that variable to read_dta(). Here's an example of how to read a dta file from a URL:
url <- "http://www.principlesofeconometrics.com/stata/broiler.dta"
data.df <- read_dta(url)
head(data.df)
If your data includes datetime, and you want to separate time from date, check the latest post:
- How to Extract Time from Datetime in R – with Examples
How to Read Specific Columns from a Stata (.dta) file in R
In this section of the read Stata files in R tutorial, we are going to learn how to use read_dta() to load specific columns. This may be useful when we plan to analyze some specific variables from very large datasets.
Reading One Column from a dta File in R
First, we are going to read only one column. In the code chunk below, we are reading the "pbeef" column. Thus, we are using the col_select argument with a character string that specifies the column we want to read:
data.df <- read_dta(url, col_select = "pbeef")
head(data.df)
Reading Multiple Columns from a dta File in R
Now, if we want to read many columns from the .dta file, we'll first create a character vector with the column names:
cols <- c("pbeef")
Finally, we are ready to read the columns. Note, here we use the all_of() function:
data.df <- read_dta(url, col_select = all_of(cols))
head(data.df)
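To see column selection with more than one name, here is a minimal, self-contained sketch. The data frame and its column names (year, price, qty) are hypothetical and are written to a temporary file first, so no download is needed. The all_of() helper is called with its tidyselect:: prefix so the example does not rely on the tidyverse being attached, and n_max is another read_dta() argument that limits how many rows are read:

```r
library(haven)

# Hypothetical data with made-up column names
example_df <- data.frame(
  year  = c(2001, 2002, 2003, 2004, 2005),
  price = c(1.5, 1.7, 1.6, 1.9, 2.1),
  qty   = c(10, 12, 11, 15, 14)
)
tmp_dta <- tempfile(fileext = ".dta")
write_dta(example_df, tmp_dta)

# Character vector with the names of the columns we want to keep
cols <- c("year", "qty")

# Read only the selected columns, and only the first two rows
subset_df <- read_dta(
  tmp_dta,
  col_select = tidyselect::all_of(cols),
  n_max = 2
)
print(subset_df)
```

Combining col_select with n_max is one way to preview a very large file before committing to a full import.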
We have now learned how to read a Stata file in R. The next step might be to inspect the dataframe, visualize the data, and, if we have categorical data, dummy code them. See the posts on how to create scatter plots in R with ggplot2 and how to create dummy variables in R.
How to Save a Stata file
In this section, we will learn how to write a dataframe to a Stata file. First, we will learn how to do some data manipulation on a .dta file we have loaded in R and save it as a new .dta file. Second, we are going to learn how to read CSV and Excel files in R and save them as Stata files.
Saving a dataframe as a Stata file using write_dta()
In the example below, we are first going to load a .dta file using read_dta(). Second, we are going to remove columns in R using dplyr. Finally, when we have deleted the columns we don't want, we are going to save the dataframe as a .dta file.
library(haven)
library(dplyr)
## Dta file:
dtafile <- file.path(getwd(), "RScripts", "Data", "FifthDayData.dta")
dta.df <- read_dta(dtafile)
In the code chunk above, we did not do anything new (for this post). Now, in the next code chunk, we are deleting two columns.
newdta.df <- select(dta.df, -c(index, Day))
Finally, we are ready to write the dataframe as a .dta file:
write_dta(newdta.df, file.path(getwd(), "RScripts", "Data", "NewFifthDayData.dta"))
Note, before saving your dta file you might want to use R to remove duplicate rows and columns from the data frame. This can be done using either of the functions duplicated() or unique().
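As a minimal sketch of removing duplicate rows before saving, the example below uses a made-up data frame (dup_df) and a temporary output file; neither comes from the tutorial's data:

```r
library(haven)

# Hypothetical data frame in which the second and third rows are identical
dup_df <- data.frame(id = c(1, 2, 2, 3), value = c(10, 20, 20, 30))

# duplicated() flags rows that repeat an earlier row; keep the rest
dedup_df <- dup_df[!duplicated(dup_df), ]

# Save the de-duplicated data frame as a .dta file
out_file <- tempfile(fileext = ".dta")
write_dta(dedup_df, out_file)

# Read the file back to confirm how many rows were saved
nrow(read_dta(out_file))
```

Calling unique(dup_df) would give the same rows here; duplicated() is convenient when the flags themselves are of interest.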
Save a CSV file as a Stata File
In this section, we are going to work with another R package from the tidyverse: readr. Now, we are going to use read_csv() to read data from a CSV file. After we have imported the CSV to a dataframe, we are going to save it as a .dta file using Haven's write_dta() function:
library(readr)
csvfile <- file.path(getwd(), "RScripts", "Data", "FirstDayData.csv")
data.df <- read_csv(csvfile)
View(data.df)
## Saving it as a dta
write_dta(data.df, file.path(getwd(), "RScripts", "Data", "FirstDayData.dta"))
Export an Excel file as a Stata File
In the last example, we are going to use read_excel() (from the readxl package) to import an .xlsx file in R. After we have done that, we will save this Excel file as a Stata file.
library(readxl)
xlfile <- file.path(getwd(), "RScripts", "Data", "example_concat.xlsx")
data.df <- read_excel(xlfile)
write_dta(data.df, file.path(getwd(), "RScripts", "Data", "STATADATA.dta"))
Note, all the files we have read using read_dta, read_stata, read_csv, and read_excel can be found here and a Jupyter notebook can be found here.
Summary: Read Stata Files using R
In this post, we have learned how to read Stata files in R. Specifically, we've learned how to load .dta files using the Haven package. Furthermore, we have learned how to write R dataframes to Stata files, as well as how to load data from Excel and CSV files and save them as .dta files.
Source: https://www.marsja.se/how-to-read-and-write-stata-dta-files-in-r-with-haven/