Bancova News and Announcements

Pitfalls of SAS merge statement

 
 
Picture of John Zhang
Pitfalls of SAS merge statement
by John Zhang - Wednesday, 9 December 2015, 11:22 AM
 
Suppose you have two SAS datasets:

data ae;
input pt 1-2  ptcode $3-10  pterm $12-23;
cards;
1 10029354 Neutropenia
1 10029354 Neutropenia
1 10029354 Neutropenia
;
run;
proc sort data=ae; by ptcode;

and a nother dataset representing the category for each AE:
 
data cat;
input ptcode $1-8  category $10-28;
cards;
10029354 Special Interest 1
10029354 Special Interest 2
;
run;

you want to merge these datasets so that each AE ptdode will have its corresponding category of interest.   
You might want to try the following code:

data ae2;
	merge ae(in=in1) 
          cat(in=in2);
	by ptcode ;
	if in1 and in2;
 run;

Try to print out, you get the following:

                        Obs    pt     ptcode        pterm            category

                         1      1    10029354    Neutropenia    Special Interest 1
                         2      1    10029354    Neutropenia    Special Interest 2
                         3      1    10029354    Neutropenia    Special Interest 2


You are not happy with the result, then you might want to try the following:

proc sql noprint;
    create table ae3 as select a.*, b.category as category
    from  ae as a,cat as b
    where a.ptcode=b.ptcode;
quit; 

proc print;
run;

                    Obs    pt     ptcode        pterm            category

                         1      1    10029354    Neutropenia    Special Interest 1
                         2      1    10029354    Neutropenia    Special Interest 1
                         3      1    10029354    Neutropenia    Special Interest 1
                         4      1    10029354    Neutropenia    Special Interest 2
                         5      1    10029354    Neutropenia    Special Interest 2
                         6      1    10029354    Neutropenia    Special Interest 2

read the log files when you run the each program and see what you find.

NOTE: MERGE statement has more than one data set with repeats of BY values.

LESSON: revise your code when you see this in your SAS log file: MERGE statement has more than one data set with repeats of BY values.