    data myout;
        set braw.accounts end=eof;
        fname = "BANK_ACCT";
        retain countmissing cm2 0;
        if missing(fname) = 0 then countmissing = countmissing+1;
        if missing(BANK_ACCT) = 0 then cm2 = cm2+1;
        if eof then output;
        keep fname countmissing cm2;
    run;
What I want to do is read the names of variables from the dictionary and then analyze each of those variables, counting how many values are missing.
The problem is that when I pass fname into missing(), it looks for a variable called ‘fname’, not ‘BANK_ACCT’. How can I tell it to resolve the name before passing it in?
There are a few ways to handle this:
1) Construct a macro variable that contains your variable names, then define an array. The previous question you posted would be a good place to start. Something like this:
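A minimal sketch of that idea, assuming the dataset is braw.accounts as in the question and that only the numeric variables are of interest (an array cannot mix numeric and character variables). The macro variable name varlist and the loop index _i are placeholders, not anything from your code:

```sas
/* Collect the numeric variable names into one space-separated macro variable. */
/* LIBNAME and MEMNAME are stored in upper case in DICTIONARY.COLUMNS.         */
proc sql noprint;
    select name into :varlist separated by ' '
    from dictionary.columns
    where libname = 'BRAW' and memname = 'ACCOUNTS' and type = 'num';
quit;

data myout;
    set braw.accounts end=eof;
    /* The macro variable resolves to real variable names at compile time. */
    array vars[*] &varlist;
    do _i = 1 to dim(vars);
        /* Sum statement: running total of missing values across all listed variables. */
        if missing(vars[_i]) then countmissing + 1;
    end;
    if eof then output;
    keep countmissing;
run;
```

This gives one total across all the listed variables; for a count per variable you would need one counter per array element.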
(This also introduces a new concept: a sum statement such as var+1; automatically does the retain var 0; bit for you.)
You might find it useful, by the way, to use CALL SYMPUT to put countmissing into a macro variable rather than outputting a row (depending on what you’re using it for).
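2) Similar to 1, but without the macro variable: you can create an array of all variables (_all_, which only works if they are all the same type), all numeric variables (_numeric_), all character variables (_character_), or all variables from one point in the dataset to another (firstvar--lastvar).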
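For example, a rough sketch using _numeric_, again assuming braw.accounts from the question (the index name _i is just a placeholder):

```sas
data myout;
    set braw.accounts end=eof;
    /* _numeric_ covers the numeric variables defined up to this point,   */
    /* i.e. the ones read by SET. countmissing and _i are created later,  */
    /* so they are not part of the array.                                 */
    array nums[*] _numeric_;
    do _i = 1 to dim(nums);
        if missing(nums[_i]) then countmissing + 1;
    end;
    if eof then output;
    keep countmissing;
run;
```

Swapping _numeric_ for _character_ or for a range like firstvar--lastvar (hypothetical names) works the same way, as long as everything in the array is one type.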
3) Operate on a ‘normalized’ dataset, that is, a dataset with only two variables (or a few): ‘varname’ and ‘value’. This requires some (significant) work to create, so it is only useful in certain circumstances.
4) Use some method other than a data step to figure this out. For example, PROC FREQ or PROC TABULATE can trivially give you the number of missing values of each variable (using the MISSING option).
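As a sketch of option 4, one common pattern is to collapse every value into ‘Missing’ or ‘Not missing’ with a format and let PROC FREQ do the counting. The format names nmissfmt and $cmissfmt are made up for this example, and the dataset is assumed to be braw.accounts as in the question:

```sas
/* Formats that group values into two buckets. */
proc format;
    value nmissfmt   .  = 'Missing' other = 'Not missing';
    value $cmissfmt ' ' = 'Missing' other = 'Not missing';
run;

proc freq data=braw.accounts;
    /* Apply the formats to every numeric and every character variable. */
    format _numeric_ nmissfmt. _character_ $cmissfmt.;
    /* MISSING keeps the missing level in each one-way table. */
    tables _all_ / missing nocum nopercent;
run;
```

This handles numeric and character variables in one pass and gives a count per variable, so it may be the least work if all you need is a report.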