I have a file that has columns that look like this:
Column1,Column2,Column3,Column4,Column5,Column6
1,2,3,4,5,6
1,2,3,4,5,6
1,2,3,4,5,6
1,2,3,4,5,6
1,2,3,4,5,6
1,2,3,4,5,6
Column1,Column3,Column2,Column6,Column5,Column4
1,3,2,6,5,4
1,3,2,6,5,4
1,3,2,6,5,4
Column2,Column3,Column4,Column5,Column6,Column1
2,3,4,5,6,1
2,3,4,5,6,1
2,3,4,5,6,1
The columns are re-ordered at arbitrary points in the file, and the only way to know the current order is to look at the most recent header row above the data (Column1, Column2, etc.). (I’ve simplified the data so that it’s easier to picture. In real life there is no way to tell the values apart, as they are all large integers that could go into any column.)
Obviously this isn’t very SQL Server friendly when it comes to using BULK INSERT, so I need to find a way to arrange all of the columns in a consistent order that matches my table’s column order in my SQL database. What’s the best way to do this? I’ve heard Python is the language to use, but I have never worked with it. Any suggestions/sample scripts in any language are appreciated.
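A solution in Python:
I would read the file line by line and look for header rows. When I find a header, I use it to work out the current column order, for example by looking up the position of each of the table's columns in that header. Then I pass those positions to itemgetter (from the operator module), which will do the magic of reordering the elements of every data row that follows.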
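A minimal sketch of this approach follows. It assumes the file is comma-separated, that a header row is any row made up of exactly the known column names, and that the table's column order is Column1 through Column6. The names TABLE_ORDER and reorder_rows are illustrative, not from the original answer.

```python
import csv
from operator import itemgetter

# Assumption: the column order of the target SQL table.
TABLE_ORDER = ["Column1", "Column2", "Column3", "Column4", "Column5", "Column6"]


def reorder_rows(lines, table_order=TABLE_ORDER):
    """Yield each data row rearranged into table_order, skipping header rows.

    `lines` is any iterable of text lines, such as an open file.
    """
    expected = set(table_order)
    getter = None  # built from the most recent header row
    for row in csv.reader(lines):
        if not row:
            continue  # ignore blank lines
        if set(row) == expected:
            # Header row: record where each table column currently sits.
            # With two or more indices, itemgetter returns a tuple.
            getter = itemgetter(*(row.index(name) for name in table_order))
            continue
        if getter is None:
            raise ValueError("data row found before any header row")
        yield list(getter(row))
```

To produce a file suitable for BULK INSERT, the rows could then be written out with something like csv.writer(dst).writerows(reorder_rows(src)), where src and dst are file objects opened with newline="". If the table had only one column, itemgetter would return a bare value rather than a tuple, so this sketch assumes at least two columns.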