Stata and SAS Command Comparison (2)

2019-03-21 19:22

new; run;

Delete observations:

SAS:
    data new;
      set new;
      if var1 = 1 then delete;
    run;

Stata (drop observations):
    drop if var1 == 1

Loop over a variable list (varlist):

SAS:
    data new(drop= i);
      set new;
      array raymond {4} var1 var2 var3 var4;
      do i = 1 to 4;
        if raymond{i} = 99 then raymond{i} = . ;
      end;
    run;

Check out this array example in the SAS programming examples page.

Stata:
    foreach i of varlist var1 var2 var3 var4 {
      replace `i' = . if `i' == 99
    }

Note: The quote to the left of the local macro variable i is a left quote ( ` ). The left quote is located at the top of your keyboard next to the ( ! 1 ) key. In this example i is a local macro variable that exists only for the duration of the foreach command, so it does not need to be dropped like the variable i in the SAS code.

Create variable labels:

SAS:
    label age = "[...] in years"
          income = "[...] plus bonuses";

Stata:
    label var age "[...] in years"
    label var income "[...] plus bonuses"

Define a format:

SAS:
    proc format;
      value yesno 1 = "[...]" [...];
    run;

Stata (these are called "value labels"):
    label define yesno 1 "[...]" [...]

Assign the format to a variable:

SAS:
    data newer;
      set newer;
      format smokes yesno.;
    run;

Stata (assign the value label to a variable):
    label value smokes yesno

Remove formats from a variable:

SAS:
    data newer;
      set newer;
      ** just do not specify a format **;
      format smokes ;
    run;

Stata:
    label value smokes .

Assign formats defined by SAS to a variable:

SAS:
    format interview_date mmddyy8.;

Assign formats defined by Stata to a variable:

Stata:
    format interview_date %tdNN/DD/YY
    /* pre Stata 10 the format did not start
     * with the letter "t" and did not
     * need two letters for each part of the date: */
    format interview_date %dN/D/Y

Note: The letter N in %tdNN/DD/YY stands for the number of the month. Specifying Mon in %tdDDMonCCYY uses the three-letter abbreviation of the name of the month. So %tdNN/DD/YY displays as "11/06/45" and %tdDDMonCCYY displays as "06Nov1945".

SAS:
    title "Number of Companies That Got Acquired";

Stata: Since the Results window/log file is a mix of both the SAS log and the Output window, Stata doesn't need a title statement. Titling can be accomplished with a comment:
    /* Number of Companies That Got Acquired */

SAS:
    proc sort data = new out = newer;
      by id;
    run;

Stata:
    sort id

SAS:
    proc sort data= sashelp.shoes
        (keep= region product subsidiary stores sales inventory)
        out= work.shoes;
      by region subsidiary product;
    run;

    /* fix flaw in dataset
     * where the Copenhagen subsidiary
     * has 2 obs for product = "Shoe" **/
    proc summary nway data= work.shoes;
      /* the by statement fixes
       * the variable order in work.shoes **/
      by region subsidiary product;
      var stores sales inventory;
      output out= work.shoes (drop= _TYPE_ _FREQ_)
             sum=stores sales inventory;
    run;

    /* long to wide because:
     * there are repeats of by-variable values **/
    proc transpose data= work.shoes out= shoes_wide prefix=prodnum;
      by region subsidiary;
      var product;
    run;

Stata:
    keep region subsidiary product
    bysort region subsidiary (product) : gen prodnum = _n
    reshape wide product, ///
      i(region subsidiary) j(prodnum)

The xpose command is similar but only works with numeric data. It will turn string variables into missing values.

SAS:
    /* wide to long because:
     * there are no repeats of by-variable values **/
    proc transpose data= work.shoes_wide out= shoes_long name=prodnum;
      by region subsidiary;
      var prodnum: ;
    run;

Stata:
    // j(prodnum) just names the _j variable prodnum
    reshape long product, i(region subsidiary) j(prodnum)

Check out this reshape example in the Stata code examples page.

Using by-groups:

SAS:
    data newer;
      set newer;
      by id;
      if first.id = 1 then f_num = 1;
      if first.id = 1 and last.id = 1 then s_num = 1;
      if last.id = 1 then l_num = 1;
    run;

Stata:
    by id: gen f_num = 1 if _n == 1
    by id: gen s_num = 1 if _n == 1 & _N == 1
    by id: gen l_num = 1 if _n == _N

Stata's _n is equivalent to SAS's _n_ in that it is equal to the observation number; but inside a by command, _n is equal to 1 for the first observation of the by-group, 2 for the second observation of the by-group, etc. Stata's _N is equal to the number of observations in the dataset, except in a by command, where it is equal to the total number of observations in the by-group.
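The by-group behavior of _n and _N described above can be checked on a small toy dataset. This is only a sketch: the six observations are made-up values, and the helper variables obs_in_group and group_size are hypothetical names added for illustration.

```stata
* Toy data: three id groups of sizes 3, 1, and 2 (made-up values)
clear
input id age
1 10
1 12
1 15
2 20
3 7
3 9
end

* by requires the data to be sorted by the by-variable
sort id age

* Flags built the same way as in the by-group example above
by id: gen f_num = 1 if _n == 1              // first observation in group
by id: gen s_num = 1 if _n == 1 & _N == 1    // group has a single observation
by id: gen l_num = 1 if _n == _N             // last observation in group

* Hypothetical helpers that make the by-group meaning of _n and _N visible
by id: gen obs_in_group = _n
by id: gen group_size   = _N

list, sepby(id)
```

Only id 2 should end up with s_num set, since it is the only group with a single observation.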
Count the total number of observations within each ID group, and add that total to each observation:

SAS:
    proc summary data= new nway;
      class id;
      var age;
      output out= temp(drop= _type_ _freq_) n= totboys;
    run;
    proc sort data= temp;
      by id;
    run;
    proc sort data= new;
      by id;
    run;
    data newer;
      merge new temp;
      by id;
    run;

Stata:
    bysort id: egen totboys = count(age)

Note: in both SAS and Stata, the count will be the number of observations where the variable being counted has a non-missing value. Here we used the variable age.

Create a cumulative/running sum of boys within each ID group:

SAS:
    data new;
      set newer;
      by id;
      retain count 0;
      if first.id then count = 0;
      if gender = 1 and age <= 18 then count = count + 1;
    run;

Stata:
    bysort id: gen count = sum(gender == 1 & age <= 18)

SAS:
    data both;
      merge in.new(in = a) in.newer(in = b);
      by id;
      if a = 1 and b = 1;
    run;

Check out this merge example in the SAS programming examples page.

Stata:
    use "[...]"
    sort id
    /* Starting in Stata 11 you have to specify
     * what type of merge you are doing, but you do not
     * have to have your datasets sorted before the merge.
     * This is a one-to-one merge: */
    merge 1:1 id using "[...]"
    // in earlier versions of Stata, use this instead:
    merge id using "[...]"
    keep if _merge == 3

Stata automatically creates the variable _merge after a merge. Stata will not merge on another dataset if the variable _merge already exists in one of the datasets. The dataset in memory is the "master" dataset. The dataset that is being merged on is the "using" dataset. Unlike SAS, variables shared by the master dataset and the using dataset will not be updated (values overwritten) by the using dataset. Like SAS, the formats, labels, and informats of variables shared by the master dataset and the using dataset will be defined by the master dataset. Remember that the master always wins. Use the update option to overwrite missing data in the master file.
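The "master always wins" rule and the update option can be seen side by side in a small sketch. The datasets below are made-up toy values, and the tempfile name updates is a hypothetical choice; the example assumes Stata 11 or later because it uses the merge 1:1 syntax, and it should be run as one do-file so the tempfile macro stays defined.

```stata
* Build a small "using" dataset and save it to a temporary file
clear
input id income
1 999
2 250
4 400
end
tempfile updates            // hypothetical tempfile name
save `updates'

* Build the "master" dataset in memory
clear
input id income
1 100
2 .
3 300
end

* Plain merge: the shared variable income keeps the master's values,
* so id 2 stays missing even though the using data has 250
preserve
merge 1:1 id using `updates'
list, sepby(_merge)
restore

* Merge with update: only MISSING master values are filled from the using data;
* the nonmissing master value for id 1 is kept and flagged as a conflict
merge 1:1 id using `updates', update
list, sepby(_merge)
```

After the second merge, _merge takes the extra codes 4 (missing value updated) and 5 (nonmissing conflict), in addition to the usual 1, 2, and 3.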
Concatenate two datasets / add observations to a dataset:

SAS:
    data both;
      set in.new in.newer;
    run;

Stata:
    use "[...]"
    append using "[...]"
    /* Starting in Stata 11 you can use append without
     * having a dataset already in memory: */
    append using [...]

Sort datasets in order to prepare them for a merge:

SAS: Sort permanently stored datasets and create new, sorted copies in the WORK library:
    proc sort data = in.company out = work.company;
      by id;
    run;
    proc sort data = in.firm out = work.firm;
      by id;
    run;
    data temp2;
      merge firm(in = a) company(in = b);
      by id;
    run;

Stata: Sorting datasets in order to prepare them for a merge is only required if you are using a version of Stata prior to Stata 11.

Create a local macro variable to represent a filename for Stata to use in temporarily storing a data file on the computer's hard drive if requested to do so later:
    tempfile company
    use "[...]"

Save the dataset that's currently in memory to a temporary filename in Stata's temp directory. This file will be deleted when Stata is exited, just like a dataset in SAS's WORK library:
    save `company'
    use "[...]"
    // pre Stata 11 code:
    sort id
    merge id using "[...]"
    /* Starting in Stata 11 the data does not need to
     * be sorted but the type of merge needs to be
     * specified like in this one-to-one merge: */
    merge 1:1 id using "[...]"

SAS:
    proc surveymeans;
      cluster sampunit;
      strata stratum;
      var age income;
      weight sampwt;
    run;

Stata:
    svyset sampunit [pweight = sampwt], strata(stratum)
    svy: mean age income

Analyze a subpopulation by implementing the domain option:

SAS:
    proc surveymeans;
      cluster sampunit;
      strata stratum;
      domain female;
      var age income;
      weight sampwt;
    run;

SAS (starting in SAS 9):
    proc surveyfreq;
      cluster sampunit;
      strata stratum;
      tables females*var1*var2;
      weight sampwt;
    run;

When using proc surveyfreq, the domain/subpop variable needs to be included in the tables statement.

SAS:
    proc surveyreg;
      cluster sampunit;
      strata stratum;
      model depvar = indvar1 indvar2 indvar3;
      weight sampwt;
    run;

The surveyreg procedure does not have a way of dealing with subpopulations.
Using by or where will not suffice as they will compute incorrect standard errors.

SAS (starting in SAS 9):
    proc surveylogistic;
      cluster sampunit;
      strata stratum;
      model depvar = indvar1 indvar2 indvar3;
      weight sampwt;
    run;

The surveylogistic procedure does not have a way of dealing with subpopulations. Using by or where will not suffice as they will compute incorrect standard errors.

Analyze a subpopulation by implementing the subpop option:

Stata:
    svy: mean age income, subpop(females)

Note: options come after a comma ( , ).

Stata:
    svyset sampunit [pweight = sampwt], strata(stratum)
    svy: tab var1 var2, subpop(females)
    svy: tab var1 , subpop(females)

Stata:
    svyset sampunit [pweight = sampwt], strata(stratum)
    svy: regress depvar indvar1 indvar2 indvar3, ///
      subpop(females)

Stata:
    svyset sampunit [pweight = sampwt], strata(stratum)
    svy: logit depvar indvar1 indvar2 indvar3, ///
      subpop(females)
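The survey setup and subpopulation analysis above can be tried end to end on a tiny made-up sample. This is a sketch under stated assumptions: the eight observations are invented, each stratum is given two sampling units so that variances can be estimated, and the subpopulation is requested in the prefix form svy, subpop(): that recent Stata releases document, rather than as an option after the command as written above.

```stata
* Made-up survey data: 2 strata, 2 sampling units (PSUs) per stratum
clear
input sampunit stratum sampwt age income females
1 1 10 25 30000 1
1 1 10 40 52000 0
2 1 12 33 41000 1
2 1 12 58 47000 0
3 2  8 45 60000 1
3 2  8 29 35000 0
4 2  9 51 38000 1
4 2  9 37 44000 1
end

* Declare the design once; later svy commands reuse it
svyset sampunit [pweight = sampwt], strata(stratum)

* Full-sample means
svy: mean age income

* Subpopulation means: all observations stay in the design so the
* standard errors are computed correctly, unlike subsetting with if
svy, subpop(females): mean age income
```

Keeping the non-female observations in memory is the point of subpop(): dropping them first would change the design information used for the standard errors.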
