I receive a user details file at the end of every month.
The file has columns such as id, first name, last name, address, phone, business phone, hobbies, and books.
id is the unique key to identify an individual.
I need to maintain a database with information from this file.
Say in Jan the file had 100 users.
In Feb the file had 110 users, which means 10 new users.
So I will sort both files on id, which tells me who the 10 new users are, and then add them.
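The sort-and-compare step described above can be sketched roughly as follows. This is only an illustration: it assumes the ids are integers, that each list is already sorted ascending with no duplicates, and the class and method names (`NewIdFinder`, `findNewIds`) are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class NewIdFinder {

    // Returns the ids that appear in newIds but not in oldIds.
    // Assumption: both lists are sorted ascending and contain no duplicates.
    static List<Integer> findNewIds(List<Integer> oldIds, List<Integer> newIds) {
        List<Integer> added = new ArrayList<>();
        int i = 0; // cursor into oldIds, only ever moves forward
        for (int id : newIds) {
            // Skip old ids that are smaller than the current new id.
            while (i < oldIds.size() && oldIds.get(i) < id) {
                i++;
            }
            // If the old list is exhausted or holds a different id here, this id is new.
            if (i == oldIds.size() || oldIds.get(i) != id) {
                added.add(id);
            }
        }
        return added;
    }

    public static void main(String[] args) {
        List<Integer> jan = List.of(1, 2, 3);
        List<Integer> feb = List.of(1, 2, 3, 4, 5);
        System.out.println(findNewIds(jan, feb));
    }
}
```

Because both lists are sorted, each is walked only once, so the comparison takes time proportional to the combined size of the two files.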
The issue is that I also want to check for changes to the existing ids.
So if the address for id 3 was xyz in the Jan file and became pqr in the Feb file, I want to detect that and update the database accordingly.
So, what is the easiest and most efficient way to compare records in two fixed-format files to find out which columns have changed?
One way I could think of is to compute a checksum for each record in both files and compare the checksums to detect changes. But I want to know whether this is the correct way or whether there is a better approach.
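A minimal sketch of the checksum idea, under some assumptions that are not in the original description: each record is one line, the id occupies the first `ID_WIDTH` characters of the line, and SHA-256 is used as the checksum. The names `RecordChecksums`, `ID_WIDTH`, `checksumsById`, and `changedIds` are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class RecordChecksums {

    // Hypothetical layout: the id is the first 5 characters of every record.
    static final int ID_WIDTH = 5;

    // Computes a SHA-256 digest of one record and returns it as Base64 text.
    static String checksum(String record) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(record.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Maps each id to the checksum of its full record, ordered by id.
    // Assumption: every line is at least ID_WIDTH characters long.
    static Map<String, String> checksumsById(List<String> lines) {
        Map<String, String> result = new TreeMap<>();
        for (String line : lines) {
            String id = line.substring(0, ID_WIDTH).trim();
            result.put(id, checksum(line));
        }
        return result;
    }

    // Returns the ids present in both files whose record checksum differs.
    static List<String> changedIds(List<String> oldLines, List<String> newLines) {
        Map<String, String> oldSums = checksumsById(oldLines);
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, String> e : checksumsById(newLines).entrySet()) {
            String previous = oldSums.get(e.getKey());
            if (previous != null && !previous.equals(e.getValue())) {
                changed.add(e.getKey());
            }
        }
        return changed;
    }
}
```

A checksum only says that a record changed, not which column changed, so the changed records still need a field-by-field comparison afterwards. If both files are loaded at the same time anyway, comparing the raw lines with `equals` would be just as simple. The checksum is mainly useful when one small value per id is stored in the database and last month's file is not kept.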
Well, you have the FileUtils.contentEquals method (http://commons.apache.org/io/apidocs/org/apache/commons/io/FileUtils.html). This works well in cases where there are no time-based headers etc. and the contents can be compared directly. Note that it only tells you whether the two files are identical as a whole, not which records changed.