2010-5-8 10:56 | Personal category: Stata | System category: Research notes | Keywords: SAS, Stata

SAS: In SAS, operators can be symbols or mnemonic equivalents, such as & or and. In many situations in SAS the order of the symbols doesn't matter: <= can be written =< and >= can be written =>.

Stata: Most operators are the same in Stata as in SAS, but in Stata operators do not have mnemonic equivalents. For example, you have to use the ampersand ( & ) and not the word "and". This works:

    var_a >= 1 & var_b <= 10

whereas this does not:

    var_a >= 1 and var_b <= 10

These are the operators that are different in Stata:

    Symbol   Definition
    &        and
    |        or
    >=       greater than or equal to
    <=       less than or equal to
    ==       equality (for equality testing)
    !=       does not equal
    !        not
    ^        power

Note: Symbols have to be in the order shown: >= works, => does not.

SAS comments:

    /* this is a comment */
    * this is also a comment ;

Stata comments:

    /* this is a comment */
    * this is also a comment
    // this is a comment as well

To continue a Stata command to the next line (line continuation), end the line with three slashes:

    /// you can comment here as well

For example:

    list id state gender age income ///
         race income date

Range of values, SAS:

    if 1 <= var_a <= 10

or:

    if var_a in(1,2,3,4,5,6,7,8,9,10)

or a list of character values:

    if state in("...")

Range of values, Stata:

    if var_a >= 1 & var_a <= 10

or:

    if inrange(var_a,1,10)

or:

    if inlist(var_a,1,2,3,4,5,6,7,8,9,10)

or a list of string values:

    if inlist(state,"...")

Stata has a limit of 10 arguments to inlist() (which includes the string variable) when the arguments are strings. More than one variable can be specified.

Referencing multiple variables at a time, SAS: Say the following variables are in a data file in the order shown:

    var1 var2 var3 age var4 var5

Then you could code them as:

    var1--var5

To SAS, this means the variables that are positionally between var1 and var5, which includes age.

Referencing multiple variables at a time, Stata:

    var1-var5

To Stata, this means the variables that are positionally between var1 and var5. Notice that there is only one dash ( - ).
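The positional meaning of a Stata variable range can be checked on a toy dataset. The following is a minimal sketch, assuming a one-observation dataset built only for illustration; the macro name expanded is hypothetical, and unab is used here simply to expand the varlist into a local macro.

```stata
* Toy dataset with the variable order used in the text (illustrative only)
clear
set obs 1
foreach v in var1 var2 var3 age var4 var5 {
    generate `v' = 1
}

* unab expands a varlist and stores the result in a local macro
unab expanded : var1-var5
display "`expanded'"

* age is part of the range because it is stored between var1 and var5
assert "`expanded'" == "var1 var2 var3 age var4 var5"
```

If the variables were stored in a different order, the same range would expand to a different list, which is why the order command is often run first when a range is meant to cover a specific set of variables.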
Referencing multiple variables at a time, SAS (continued): In SAS,

    var1-var5

with a single dash is the same as:

    var1 var2 var3 var4 var5

no matter what positions the variables occupy in the observation. Using a colon selects variables that share the same prefix:

    var:

could represent:

    var1 var2 var10 variable varying var_1

Referencing multiple variables at a time, Stata (continued):

    var?

The question mark ( ? ) is a wild card that represents exactly one character in the variable name. It could be a number, a letter, or an underscore ( _ ).

    var*

The asterisk/star ( * ) is a wild card that represents zero or more characters in the variable name. They could be numbers, letters, or underscores. Thus var* could represent:

    var1 var2 var10 variable varying var_1

Saving output, Stata: To save the contents of the results window, start logging to a log file BEFORE you submit the commands that you want logged. Open a log file by clicking on the icon in the tool bar that looks like a scroll and a traffic light. A *.log file is a simple ASCII text file; a *.smcl file is a log in Stata's SMCL markup format, with formatting tags. You can also use the log command:

    log using "...", replace

Note: The replace option simply tells Stata to overwrite the log file if it already exists. This is helpful when you have to run a do-file over and over again.

Saving output, SAS: To save the contents of the Log window and/or Output window, go to that window and save it from the menu bar. In SAS batch mode these files are automatically generated for you.

Reading a dataset, SAS:

    libname in "...";
    data new;
      set in.mySASfile;
    run;

or, starting in SAS 8:

    data new;
      set "...";
    run;

Reading a dataset, Stata:

    use "..."

You can also click on the file icon and select your dataset.

Saving a dataset, SAS: Save the dataset newer to "...":

    libname in "...";
    data in.newer;
      set new;
    run;

Saving a dataset, Stata:

    save "..."

To overwrite the dataset newer if it already exists:

    save "...", replace

You can also click on the save icon in the tool bar.

Describing a dataset, SAS:

    proc contents; run;

On selected variables:

    proc contents data = in.newer (keep= id state gender age income);
    run;

Describing a dataset, Stata:

    describe

On selected variables:

    describe id state gender age income

Summary statistics, SAS:

    proc means; run;

On selected variables:

    proc means; var age income; run;

or:

    proc univariate; var age income; run;

Summary statistics, Stata:

    summarize

On selected variables:

    summarize age income

If you want variable labels and a proc univariate style output, try:

    summarize age income, detail

or:

    codebook age income

Frequency tables, SAS:

    proc freq; table var1; run;

A series of 1-way tables:

    proc freq; tables var1 var2; run;

A 2-way table:

    proc freq; tables var1*var2; run;

Frequency tables, Stata:

    tabulate var1

or, for just checking out your dataset, try the codebook command. A series of 1-way tables:

    tab1 var1 var2

A 2-way table:

    tab2 var1 var2

Printing observations, SAS:

    proc print; run;

On selected variables in this order:

    proc print; var id age income; run;

On selected variables and a limited range of observations:

    proc print data = new (firstobs = 1 obs = 20);
      var id age income;
    run;

Printing observations, Stata:

    list

On selected variables in this order:

    list id age income

On selected variables and a limited range of observations:

    list id age income in 1/20

Creating variables, SAS: Create a numeric variable with the default length of 8 bytes:

    var1 = 1234;

Create a numeric variable with the minimum allowable length (3 bytes):

    length var1 3;
    var1 = 1234;

Creating variables, Stata:

    generate var1 = 1234

Note: the default numeric data type is float, and the statement above relies on that default. It could have been written explicitly as:

    generate float var1 = 1234

float is a 4-byte floating-point type that can hold decimal values. You could more wisely save storage space by specifying:

    gen int var1 = 1234

int is a 2-byte integer type. Generate a string variable with a length of 3 bytes:

    gen str3 name = "..."

Changing the value of a numeric variable:

    replace var1 = 123456

Stata automatically promotes the storage type if necessary. To change the storage type of a variable manually, use the recast command.
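The automatic promotion of storage types can be observed directly. This is a small sketch, assuming a throwaway one-observation dataset; the macro name t is hypothetical, and the extended macro function type is used to read back the storage type of var1.

```stata
* One-observation toy dataset (illustrative only)
clear
set obs 1
generate int var1 = 1234

* 123456 does not fit in an int, so Stata promotes var1 on replace
replace var1 = 123456
local t : type var1
display "`t'"
assert "`t'" == "long"

* recast changes the storage type manually
recast double var1
local t : type var1
assert "`t'" == "double"
```

Running compress afterwards would let Stata pick the smallest type that still holds the data without loss, which is one way to undo an unnecessarily wide recast.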
Changing the value of a string variable, Stata:

    replace name = "..."

Stata automatically increases the length to 5. To change values conditionally, the condition follows the command:

    replace var2 = 1 if var1 == 123456

Notice that Stata requires two equals signs when testing equality.

Character variables and conditional changes, SAS: Create a character variable with a length of 3 bytes:

    name = "...";

Increase the variable length to allow for 5 characters:

    data new;
      length name $5;
      set new;
      * Change the values of numeric
      * and character variables:
      *;
      var1 = 123456;
      name = "...";
    run;

Example of an if-then statement:

    if var1 = 123456 then var2 = 1;

If-then-do, SAS: Example of an if-then-do block:

    if age <= 10 then do;
      child = 1;
      parent = 0;
    end;

If-then-do, Stata:

    replace child = 1 if age <= 10
    replace parent = 0 if age <= 10

Since each command is executed on all observations before the next command is executed, an if-then-do block is not an option. Stata does have excellent looping tools: foreach, forvalues, and while.

If-then-else, SAS:

    if 0 <= age <= 2 then agegp = 1;
    else if 2 < age <= 10 then agegp = 2;
    else if 10 < age <= 20 then agegp = 3;
    else if 20 < age <= 40 then agegp = 4;
    else agegp = . ;

If-then-else, Stata: For the same reason that if-then-do blocks (above) are not possible in Stata, neither is if-then-else. But here is a way of doing the same thing. In this example the missing(agegp) conditions simply highlight the fact that agegp has not yet been assigned a value, just as else does in if-then-else:

    gen agegp = .
    replace agegp = 1 if missing(agegp) ///
        & age >= 0 & age <= 2
    replace agegp = 2 if missing(agegp) ///
        & age > 2 & age <= 10
    replace agegp = 3 if missing(agegp) ///
        & age > 10 & age <= 20
    replace agegp = 4 if missing(agegp) ///
        & age > 20 & age <= 40

The cond() function can also be used:

    // nest cond() functions
    gen agegp = cond(missing(age),.,      /// else
        cond(age >= 0 & age <= 2 ,1,      /// else
        cond(age > 2 & age <= 10,2,       /// else
        cond(age > 10 & age <= 20,3,      /// else
        cond(age > 20 & age <= 40,4,.)))))

Check out this example of cond() in the Stata code examples page.
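To see that the two Stata approaches assign the same groups, they can be run side by side on a few hand-typed ages. This is only a sketch: the test ages and the second variable name agegp2 are made up for illustration, and both methods use age >= 0 as the lower bound of the first group, matching the SAS if-then-else example.

```stata
* Hand-typed ages covering each boundary, one out-of-range value, one missing
clear
input age
0
2
2.5
10
15
40
41
.
end

* Method 1: successive replace commands guarded by missing(agegp)
generate agegp = .
replace agegp = 1 if missing(agegp) & age >= 0 & age <= 2
replace agegp = 2 if missing(agegp) & age > 2  & age <= 10
replace agegp = 3 if missing(agegp) & age > 10 & age <= 20
replace agegp = 4 if missing(agegp) & age > 20 & age <= 40

* Method 2: nested cond() calls; agegp2 is a hypothetical name
generate agegp2 = cond(missing(age), .,           ///
                  cond(age >= 0 & age <= 2,  1,   ///
                  cond(age > 2  & age <= 10, 2,   ///
                  cond(age > 10 & age <= 20, 3,   ///
                  cond(age > 20 & age <= 40, 4, .)))))

* Both methods should agree on every row, including the unassigned ones
assert agegp == agegp2
assert agegp == 2 if age == 2.5
assert missing(agegp) if age == 41 | missing(age)
list age agegp agegp2
```

Stata treats a missing value as larger than any number, so a missing age fails every "age <= upper bound" test and stays unassigned in the first method without a separate check.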
The age grouping is better done in Stata with the recode command, which can also create value labels:

    recode age ( 0/2.9999  = 1 "0 to 2 year olds")   ///
               ( 3/10.9999 = 2 "3 to 10 year olds")  ///
               (11/20.9999 = 3 "11 to 20 year olds") ///
               (21/40.9999 = 4 "21 to 40 year olds") ///
               ( else = . ) , gen(agegp) test

The test option checks to see if the ranges overlap. Since recode's ranges are inclusive at both ends ( >= and <= ), adding .9999 to the upper bound ensures that fractional values are handled correctly.

Dropping and keeping, Stata: Drop variables var1, var2, and var3:

    drop var1 var2 var3

Keep variables var1, var2, and var3:

    keep var1 var2 var3

Keep observations:

    keep if var1 == 1

Dropping and keeping, SAS: Drop variables var1, var2, and var3:

    data new(drop= var1 var2 var3);
      set new;
    run;

Keep variables var1, var2, and var3:

    data new(keep= var1 var2 var3);
      set new;
    run;

Keep observations / subsetting if statement:

    data new;
      set new;
      if var1 = 1 then output;
    run;
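The Stata drop and keep commands can be tried on a toy dataset before being used on real data. The sketch below assumes three made-up observations; the variable id and the macro name remaining are hypothetical and exist only for this illustration.

```stata
* Three made-up observations (illustrative only)
clear
input id var1 var2 var3
1 1 10 100
2 0 20 200
3 1 30 300
end

* Keep only the observations where var1 equals 1
keep if var1 == 1
assert _N == 2

* Drop two variables, then confirm what is left
drop var2 var3
unab remaining : _all
display "`remaining'"
assert "`remaining'" == "id var1"
```

Because drop and keep change the dataset in memory immediately, wrapping such steps in preserve and restore is a common way to experiment without losing the original data.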
Comparison of Stata and SAS commands
2019-03-21 19:22