Input file Layout: 01 to 10 – 10 Digit Acct# 53 to 01 –

Question

0

Asked: May 24, 20262026-05-24T13:52:57+00:00 2026-05-24T13:52:57+00:00

Input file Layout: 01 to 10 – 10 Digit Acct# 53 to 01 –

0

Input file Layout:
01 to 10 – 10 Digit Acct#
53 to 01 – An indicator with values ‘Y’ or ‘N’
71 to 10 – Time stamp
(Rest of the fields are insignificant for this sort)

While sorting the input file by splitting and eliminating duplicates in two ways result in different results. I wanna know why?

Casei: Splitting and Eliminating duplicates in the same step.

SORT FIELDS=(01,10,CH,A,53,01,CH,A)
SUM FIELDS=NONE
OUTFIL FILES=01,                                             
INCLUDE=(53,01,CH,C'Y',AND,71,10,CH,GT,&DATE2(-)),                            
OUTFIL FILES=02,                                             
INCLUDE=(53,01,CH,C'N',AND,71,10,CH,GT,&DATE2(-)),

Caseii: Splitting and eliminating duplicates in two different steps:

STEP:01
SORT FIELDS=(01,10,CH,A,53,01,CH,A)
SUM FIELDS=NONE

STEP:02
SORT FIELDS=COPY
OUTFIL FILES=01,                                             
INCLUDE=(53,01,CH,C'Y',AND,71,10,CH,GT,&DATE2(-)),                            
OUTFIL FILES=02,                                             
INCLUDE=(53,01,CH,C'N',AND,71,10,CH,GT,&DATE2(-)),

These two steps are resulting different output. Do u see any difference between both cases? Please clarify.

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-24T13:52:58+00:00

You are asking to sort on an Account Number (10 characters ascending) then on an Indicator (1 character ascending).
These two fields alone determine the key of the record – Timestamp is not part of the sort key. Consequently if there
are two or more records with the same key they could be placed in any (random) order by the sort. No telling
what order the Timestamp values will appear.

Keeping the above in mind, consider what happens when you have two records with the same key but different
Timestamp values. One of these Timestamp values meets the given INCLUDE criteria and the other one doesn’t.
The SUM FIELDS=NONE parameter is asking to remove duplicates based on the key. It does this by grouping
all of the records with the same key together and then selecting the last one in the group. Since key
does not include the Timestamp the choosen record is essentially a random event. Consequently it is unpredictable
as to whether you get the record that meets the subsequent INCLUDE condition.

There are a couple of ways to fix this:

Add Timestamp to the sort key. This might not work because it may leave multiple records for the same Account Number / Inidcator, that is it may break your duplicate removal requirement
Request a stable sort.

A stable sort causes records having the same sort key to maintain their same relative positions after the sort.
This will preserve the original order of the Timestamp values in your file given the same key. When the removal of duplicates occurs DFSORT will choose the last record from the set of duplicates. This should bring the predicability to the duplicate elimination process you are looking for. Specify
a stable sort by adding an OPTIONS EQUALS control card before the SORT card.

EDIT Comment: …picks the VERY FIRST record

The book I based my original answer on clearly stated the last record in a group of records with the same
key would be selected when SUM=NONE is specified. However, it is always
best to consult the vendors own manuals. IBM’s DFSORT Application Programming Guide only states
that one record with each key will be selected. However,
it also has the following note:

The FIRST operand of ICETOOL’s SELECT operator can be used to perform the same
function as SUM FIELDS=NONE with OPTION EQUALS. Additionally, SELECT’s FIRSTDUP,
ALLDUPS, NODUPS, HIGHER(x), LOWER(y), EQUAL(v), LASTDUP, and LAST operands can be
used to select records based on other criteria related to duplicate and non-duplicate
keys. SELECT’s DISCARD(savedd) operand can be used to save the records discarded by
FIRST, FIRSTDUP, ALLDUPS, NODUPS, HIGHER(x), LOWER(y), EQUAL(v), LASTDUP, or
LAST. See SELECT Operator for complete details on the SELECT operator.

Based on this information I would suggest using ICETOOL’s SELECT operator to select the correct record.

Sorry for the misinformation.

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

Input file Layout: 01 to 10 – 10 Digit Acct# 53 to 01 –

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply