I have a log file (file.log) in which each id (e.g. 82244956) occurs multiple times.
file.log was created using the command:
gzip -cd /opt/log.gz | grep "JBOSS1-1" >> ~/file.log
Example:
2012-04-10 09:01:18,196 LOG (7ysdhsdjfhsdhjkwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244956
2012-04-10 09:02:18,196 LOG (24343sdjjkidgyuwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244956
2012-04-10 09:03:18,196 LOG (6744443jfhsdgyuwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244957
2012-04-10 09:04:18,196 LOG (7ysdhsd5677dgyuwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244957
Likewise, there are 10000 rows with different ids, each id repeating 2-3 times (in this example the top two rows share id 82244956 and the bottom two share id 82244957). We need a result set with one row per UNIQUE id (any one of the rows matching that id), i.e.:
2012-04-10 09:01:18,196 LOG (7ysdhsdjfhsdhjkwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244956
2012-04-10 09:03:18,196 LOG (6744443jfhsdgyuwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244957
I tried an awk program on Linux, but it did not work:
awk ' { arr[$1]=$0 } END { for ( key in arr ) { print arr[key] } } ' file.log >> final-report.log
Or would a better way be to create file.log with distinct ids only?
Please advise how I can modify it.
$1 is the first field, the date. The id is the last field, $NF in awk parlance, so the array should be keyed on $NF instead of $1. Keyed that way, the script keeps the last record with the given key. To keep the first record, you'd have to do a conditional assignment in the main processing part of the script.
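Both variants described above can be sketched as follows. This is a minimal sketch, not a definitive script: it assumes the id is always the last whitespace-separated field with nothing trailing it, it rebuilds a small file.log from the sample rows in the question so that it runs on its own, and the output names last-per-id.log and first-per-id.log are made up for illustration. The order of a for (key in arr) loop is unspecified in awk, so the output is piped through sort to put it back in timestamp order.

```shell
# Sample input: the four example rows from the question.
cat > file.log <<'EOF'
2012-04-10 09:01:18,196 LOG (7ysdhsdjfhsdhjkwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244956
2012-04-10 09:02:18,196 LOG (24343sdjjkidgyuwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244956
2012-04-10 09:03:18,196 LOG (6744443jfhsdgyuwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244957
2012-04-10 09:04:18,196 LOG (7ysdhsd5677dgyuwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244957
EOF

# Last record per id: key the array on $NF (the id); later rows overwrite earlier ones.
awk '{ arr[$NF] = $0 } END { for (key in arr) print arr[key] }' file.log \
  | sort > last-per-id.log

# First record per id: assign only if this id has not been stored yet.
awk '!($NF in arr) { arr[$NF] = $0 } END { for (key in arr) print arr[key] }' file.log \
  | sort > first-per-id.log
```

If only the first record per id is needed, the common idiom awk '!seen[$NF]++' file.log is a shorter alternative that prints each row the first time its id appears and preserves input order without sort. The same filter could presumably be appended to the original pipeline (gzip -cd /opt/log.gz | grep "JBOSS1-1" | awk '!seen[$NF]++' > ~/file.log) so that file.log is created with distinct ids from the start.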