I have a CSV file containing detailed data, say columns A, B, C, D, etc. Columns A and B are categories and C is a timestamp.
I am trying to create a summary file showing one row for each combination of A and B. It should pick the row from the original data where C is the most recent date.
Below is my attempt at solving the problem.
Import-CSV InputData.csv | `
Sort-Object -property @{Expression="ColumnA";Descending=$false}, `
@{Expression="ColumnB";Descending=$false}, `
@{Expression={[DateTime]::ParseExact($_.ColumnC,"dd-MM-yyyy HH:mm:ss",$null)};Descending=$true} | `
Sort-Object ColumnA, ColumnB -unique `
| Export-CSV OutputData.csv -NoTypeInformation
First the file is read, then everything is sorted by all three columns. The second Sort-Object call is then supposed to take the first row of each A/B combination. However, Sort-Object with the -Unique switch seems to pick a random row rather than the first one. So this does give one row for each A/B combination, but not the one with the most recent C.
Any suggestions for improvements? The data set is very large, so going through the file line by line is awkward, and I would prefer a PowerShell solution.
You should look into Group-Object. I didn't create a sample CSV (you should provide one 🙂), so I haven't tested this out, but I think grouping on ColumnA and ColumnB and then taking the newest row from each group should work. This returns the same columns as the input (the parsed datetime is excluded from the output).
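The grouping idea could be sketched roughly like this. It is only a sketch under assumptions: the column names and the `dd-MM-yyyy HH:mm:ss` format are taken from the question, while the sample rows and the `$rows` / `$latest` variable names are made up for illustration.

```powershell
# Hypothetical sample data standing in for InputData.csv (values are invented).
$rows = @'
ColumnA,ColumnB,ColumnC,ColumnD
x,1,01-02-2020 10:00:00,old
x,1,03-02-2020 09:30:00,new
y,2,15-01-2020 08:00:00,only
'@ | ConvertFrom-Csv

# One group per ColumnA/ColumnB combination; within each group, sort by the
# parsed timestamp (newest first) and keep only the first row.
$latest = $rows |
    Group-Object -Property ColumnA, ColumnB |
    ForEach-Object {
        $_.Group |
            Sort-Object -Descending -Property @{
                Expression = {
                    [DateTime]::ParseExact(
                        $_.ColumnC,
                        'dd-MM-yyyy HH:mm:ss',
                        [cultureinfo]::InvariantCulture)
                }
            } |
            Select-Object -First 1
    }

# The parsed date is used only as a sort key, so the output has the original columns.
$latest | Export-Csv OutputData.csv -NoTypeInformation
```

For the real file, the here-string would be replaced with `Import-Csv InputData.csv`. Note that Group-Object collects all rows before emitting groups, so for a very large file memory use may be worth watching.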