In my data processing scenario, some of my source data paths end in a two-digit “YY” year suffix:
/user/${user.name}/${YEAR}/${MONTH}/${DAY}/some dataset name_YY
I know it’s bad design to have ‘YY’ at the end when YEAR is already present in the path, but that is how it is in our current systems, and we can’t afford to change it for a while.
<datasets>
  <dataset name="hourlyds" frequency="${ds_frequency}"
           initial-instance="${ds_initial_instance}" timezone="${ds_timezone}">
    <uri-template>${baseFsURI}/${YEAR}/${MONTH}/${DAY}/alpha_${coord:formatTime(coord:actualTime(),'yy')}</uri-template>
    <!-- OR (tried one at a time) -->
    <uri-template>${baseFsURI}/${YEAR}/${MONTH}/${DAY}/alpha_${coord:formatTime(coord:nominalTime(),'yy')}</uri-template>
    <done-flag>${doneFlag}</done-flag>
  </dataset>
</datasets>
Neither of the two EL expressions worked. Even a single non-nested EL expression, without any dataset name prefix, failed:
<uri-template>${baseFsURI}/${YEAR}/${MONTH}/${DAY}/${coord:nominalTime()}</uri-template>
Every time, it throws the following error:
Error: E1004: Expression language evaluation error [Unable to evaluate :${baseFsURI}/${YEAR}/${MONTH}/${DAY}/${coord:nominalTime()}: ], java.lang.Exception: Unable to evaluate :${baseFsURI}/${YEAR}/${MONTH}/${DAY}/${coord:nominalTime()}:
How do I get this ‘YY’ format in datasets?
Is there any way other than EL to get it there?
thanks in advance,
rahul
You can’t use the coord: EL functions inside the dataset elements. They belong in the input-events or output-events, where they describe an instance relative to the current timestamp when the URL is formatted from your uri-template.
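As a rough, untested sketch of what that could look like: the dataset's uri-template stops at the ${DAY} directory, the instance is selected in input-events, and the coord: functions are only evaluated in the action configuration. The names yy-suffix-example, workflowAppPath, start, end, inputDir and yearSuffix are placeholders I made up for illustration, and I am assuming the done flag sits in the ${DAY} directory. The coord:formatTime call is the same one from your question, so it assumes your Oozie version provides it.

```xml
<coordinator-app name="yy-suffix-example" frequency="${ds_frequency}"
                 start="${start}" end="${end}" timezone="${ds_timezone}"
                 xmlns="uri:oozie:coordinator:0.4">
  <datasets>
    <!-- The uri-template stops at the day directory and uses only the
         YEAR/MONTH/DAY constants, no coord: functions. -->
    <dataset name="hourlyds" frequency="${ds_frequency}"
             initial-instance="${ds_initial_instance}" timezone="${ds_timezone}">
      <uri-template>${baseFsURI}/${YEAR}/${MONTH}/${DAY}</uri-template>
      <done-flag>${doneFlag}</done-flag>
    </dataset>
  </datasets>
  <input-events>
    <!-- coord:current(0) picks the dataset instance for the nominal time. -->
    <data-in name="input" dataset="hourlyds">
      <instance>${coord:current(0)}</instance>
    </data-in>
  </input-events>
  <action>
    <workflow>
      <!-- workflowAppPath is a hypothetical property. -->
      <app-path>${workflowAppPath}</app-path>
      <configuration>
        <!-- Resolved day directory, handed to the workflow. -->
        <property>
          <name>inputDir</name>
          <value>${coord:dataIn('input')}</value>
        </property>
        <!-- Two-digit year computed here, where coord: functions are allowed. -->
        <property>
          <name>yearSuffix</name>
          <value>${coord:formatTime(coord:nominalTime(), 'yy')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
```

The workflow could then assemble the full input path itself, for example as ${inputDir}/alpha_${yearSuffix}.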
Is the path
/user/${user.name}/${YEAR}/${MONTH}/${DAY}/some dataset name_YY
a file or a directory containing the files?
If it’s the files themselves, then amend your dataset to remove the
some dataset name_YY
part. Hadoop will interpret the ${DAY} folder input as a directory and use all the files in it as your input.
If they are directories, and you’re only processing one ${DAY} folder at a time AND the ${DAY} folder contains only one directory (some dataset YY), then you can use a wildcard in your action: