I have two datasets need to merge. The first one is a big dataset

Question

Editorial Team

Asked: May 31, 2026

I have two datasets that I need to merge.

The first one is a big dataset with studyid and discharg (the date when the patient was discharged).

The second one has fewer observations than the first. It has two columns: studyid and call_mad (the date when a nurse called the patient after the discharge date). Not every discharge gets a call from a nurse.

The first table is

STUDYID   DISCHARG
10011     2008-10-29
10011     2008-11-7
10011     2008-11-18
10011     2009-10-17
10011     2010-1-2
10011     2010-1-22

The second table is

STUDYID   CALL_MAD
10011     2009-10-19
10011     2010-1-25

The final table I want is

STUDYID   DISCHARG      CALL_MAD
10011     2008-10-29
10011     2008-11-7
10011     2008-11-18
10011     2009-10-17    2009-10-19
10011     2010-1-2
10011     2010-1-22     2010-1-25

Hopefully, it is clear. Thanks in advance.

Jane

1 Answer

Editorial Team · Answer 1 · 2026-05-31T14:26:55+00:00

I had the same idea as thelatemail: for each CALL_MAD date, first extract the latest DISCHARG date that is < (or possibly <=) that call date, then merge the result back onto the original dataset. I think that is the best that can be done with the data structured as it is, although the logic can break down (e.g. if the nurse’s call didn’t relate to the latest discharge). Ideally you would add the DISCHARG date to the second table as a secondary key, so that you could join on STUDYID and DISCHARG without making any assumptions.

Anyway, here is the code I used.

data ds1;
input STUDYID DISCHARG :yymmdd10.;
format DISCHARG yymmdd10.;
datalines;
10011   2008-10-29
10011   2008-11-7
10011   2008-11-18
10011   2009-10-17
10011   2010-1-2
10011   2010-1-22
;
run;

data ds2;
input STUDYID CALL_MAD :yymmdd10.;
format CALL_MAD yymmdd10.;
datalines;
10011   2009-10-19
10011   2010-1-25
;
run;

proc sql;
/* Step 1: for each call, keep only the latest discharge that precedes it */
create table ds3 as select
ds1.*,
ds2.call_mad
from ds1 inner join ds2 on ds1.studyid=ds2.studyid and ds2.call_mad>ds1.discharg
group by ds1.studyid,ds2.call_mad
having ds1.discharg=max(ds1.discharg);

/* Step 2: attach the matched call dates back to every discharge row */
create table want as select
ds1.*,
ds3.call_mad
from ds1 left join ds3 on ds1.studyid=ds3.studyid and ds1.discharg=ds3.discharg;
quit;
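
For comparison, here is a minimal sketch of the "ideal" case mentioned above, where the second table already records which discharge each call relates to. The table ds2_keyed and the output name want_keyed are hypothetical (they are not part of the original data), and the DISCHARG values in it are simply the ones implied by the desired output in the question. It reuses ds1 from the code above.

```sas
/* Hypothetical table: the calls, each with the DISCHARG date it relates to */
data ds2_keyed;
input STUDYID DISCHARG :yymmdd10. CALL_MAD :yymmdd10.;
format DISCHARG CALL_MAD yymmdd10.;
datalines;
10011   2009-10-17   2009-10-19
10011   2010-1-22    2010-1-25
;
run;

proc sql;
/* Plain left join on the composite key; no date comparison is needed */
create table want_keyed as select
ds1.*,
k.call_mad
from ds1 left join ds2_keyed as k
on ds1.studyid=k.studyid and ds1.discharg=k.discharg
order by ds1.studyid, ds1.discharg;
quit;
```

With the key in place, a call is matched to exactly the discharge it was recorded against, so the "latest discharge before the call" assumption is not needed.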
