I have a very large table with an indexed datetime field. I want to

Asked: May 28, 2026 (2026-05-28T17:00:42+00:00)

I have a very large table with an indexed datetime field. I want to do BY-group processing on the dataset by month and output only the last observation in each month.

The problem is that it doesn’t contain a month field so I can’t use something like this:

if last.month then do;
  output;
end;

Is there a way I can achieve this kind of behaviour without having to add a month field in a previous datastep? The table is 50 gig compressed so I want to avoid any unnecessary steps.

Thanks

1 Answer

Editorial Team · Answer 1 · 2026-05-28T17:00:43+00:00

You can achieve this using ‘by groupformat’ against your original dataset, with the datetime field formatted as ‘dtmonyy5.’ As the name implies, the BY groups are then formed from the formatted values (month and year) instead of the original datetime values.

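data new1;
  set old;
  /* The format collapses each datetime to its month and year */
  format datetime dtmonyy5.;
  by groupformat datetime;
  /* Keep only the last observation in each month */
  if last.datetime;
run;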

Another method is to use Proc Summary, although this can be memory intensive, particularly against large datasets. Here is the code.

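proc summary data=old nway;
  class datetime;
  /* The format makes the class variable group by month and year */
  format datetime dtmonyy5.;
  /* For each month, take all variables from the row with the latest datetime */
  output out=new2 (drop=_:) maxid(datetime(_all_))=;
run;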

Just a quick note on the previous answer: the ‘month’ function works against date fields, not datetime fields, so you would need to add the datepart function to that line.

month = month(datepart(datetime));
