A user requested the following:
"I've never read a Stata dta in a loop before. Can you give an example of how this would work? Maybe a use-case as well? Thanks."
There are a number of ways to read sets of data files into Stata using a loop. Following on Ari's example (see previous post), let's say you have a file with a million lines that is too large for Stata, and you want to read in a thousand lines at a time, do some stuff to each chunk to make it smaller, then append the smaller data sets together to create your final single analytic file. Here is one way:
*** Loop will start at 1000, then increment by 1000
*** until it gets to one million
forvalues high = 1000(1000)1000000 {
    local low = `high' - 999 // first row of the current chunk
    * with an -in- range, the file name must follow -using-
    use in `low'/`high' using datafile.dta, clear
    * <insert code to cut down size of file>
    *** Now create temporary file
    if `high' == 1000 {
        save temp, replace // only first time through the loop
    }
    else {
        append using temp
        save temp, replace
    }
}
save finalfile.dta, replace
erase temp.dta // -erase- needs the file extension
*** You can also use a tempfile
*** and avoid the extra erase statement
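Here is a minimal sketch of that tempfile variant, assuming the same placeholder names as above (datafile.dta, finalfile.dta) and a file of exactly one million rows. The macro name `chunks` is just something I made up for illustration. Run it as a single do-file, since the tempfile and the local macros disappear when the do-file ends.

```stata
* Sketch only: same chunked read, but accumulating into a tempfile
tempfile chunks
forvalues high = 1000(1000)1000000 {
    local low = `high' - 999
    use in `low'/`high' using datafile.dta, clear
    * <insert code to cut down size of file>
    if `high' > 1000 {
        append using `chunks'   // add everything processed so far
    }
    save `chunks', replace      // tempfile now holds all chunks to date
}
save finalfile.dta, replace
* no -erase- needed: Stata removes the tempfile when the do-file finishes
```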
Another way is to use the `if` qualifier. Let's say you have a large database but only want to look at females in that dataset:
use if gender == "female" using datafile.dta, clear
You could also put this into a loop to get certain cuts of data. Again, the gender example:
local sex male female
foreach s of local sex {
    use if gender == "`s'" using datafile.dta, clear
    ** create two data files
    ** male_newfile.dta and then female_newfile.dta
    save `s'_newfile.dta, replace
}
Back to my bananas...
Sincerely,
primary data primate
Note that there is a user-written `touch` command for Stata that you can use with `capture` to avoid the if statements the first time you run a loop. Basically, `touch` creates a blank file (optionally with the variables you need in it) so that the subsequent `append` works even if the file didn't previously exist. The `capture` makes sure that when `touch` fails after the first iteration (because the file already exists), the error just gets ignored. This is also a great example of how solving your own problems can solve other people's, as a certain simian wrote `touch` about 5 years ago and still gets e-mail from people who find it useful.
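To illustrate just the `capture` half of that idea, here is a rough sketch that uses only built-in commands. It is not `touch` itself, whose syntax I am not reproducing here. It assumes the same placeholder files as in the post (datafile.dta, temp.dta) and exactly one million rows.

```stata
* Sketch only: built-in -capture- and -append-, not the -touch- command
capture erase temp.dta              // start clean; ignore the error if absent
forvalues high = 1000(1000)1000000 {
    local low = `high' - 999
    use in `low'/`high' using datafile.dta, clear
    * <insert code to cut down size of file>
    capture append using temp.dta   // first pass: no file yet, error ignored
    save temp.dta, replace
}
```

One caveat: `capture` swallows every error from `append`, not just "file not found", so a type mismatch between chunks would also pass silently.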
Great stuff Data Monkey. Thanks for the follow-up post.
I still like your way better though. A blog should be appreciated for its overall beauty