I have code that reads CSV files and stores their content in a DB. The code runs periodically, and it should only read newly added files. I thought of adding a flag to the first line of each file after reading it, but that would mean opening every file and checking its first line to decide which ones still need to be read.
Is there a better way to do this?
If you’re on a Windows file system (FAT, NTFS), there is a file attribute called “Archive” that exists for this kind of bookkeeping (backup tools use it the same way). Windows sets it whenever a file is created or changed, and you can clear it once you’ve added the file to your DB.
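Here is a rough Python sketch of the Archive-bit idea, in case it helps. It assumes Python 3.9+ on Windows; the function names are made up for illustration, and clearing the bit goes through the Win32 `SetFileAttributesW` call via `ctypes`, since the standard library has no direct wrapper for it as far as I know.

```python
import ctypes
import os
import stat
import sys


def archive_bit_set(attributes: int) -> bool:
    """Return True if the Archive bit is set in a Windows attribute mask."""
    return bool(attributes & stat.FILE_ATTRIBUTE_ARCHIVE)


def without_archive_bit(attributes: int) -> int:
    """Return the mask with Archive cleared (NORMAL if nothing else is left)."""
    cleared = attributes & ~stat.FILE_ATTRIBUTE_ARCHIVE
    return cleared or stat.FILE_ATTRIBUTE_NORMAL


def files_needing_import(directory: str) -> list[str]:
    """List CSV files in `directory` whose Archive bit is set (Windows only)."""
    pending = []
    for entry in os.scandir(directory):
        if not entry.is_file() or not entry.name.lower().endswith(".csv"):
            continue
        # st_file_attributes is only present in stat results on Windows.
        if archive_bit_set(entry.stat().st_file_attributes):
            pending.append(entry.path)
    return pending


def clear_archive_bit(path: str) -> None:
    """Clear the Archive bit after the file has been stored in the DB."""
    if sys.platform != "win32":
        raise OSError("the Archive attribute is only available on Windows")
    attributes = os.stat(path).st_file_attributes
    ok = ctypes.windll.kernel32.SetFileAttributesW(
        path, without_archive_bit(attributes)
    )
    if not ok:
        raise ctypes.WinError()
```

You would call `files_needing_import(...)`, load each returned file, and then call `clear_archive_bit(...)` on it. Keep in mind that backup software may also clear this bit, so it is only reliable if nothing else on the machine touches it.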
For a cross-platform solution, the best option is to record in the DB which files have already been processed (and optionally their last-modified dates). Then you only need to compare the directory listing against that record, without opening any file.
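A minimal sketch of that tracking approach in Python, using SQLite as a stand-in for whatever DB you actually use. The table name `processed_files` and the function names are my own assumptions, not anything from your code.

```python
import os
import sqlite3


def open_tracking_db(db_path: str) -> sqlite3.Connection:
    """Open the DB and make sure the (hypothetical) tracking table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_files ("
        " path TEXT PRIMARY KEY,"
        " mtime_ns INTEGER NOT NULL)"
    )
    return conn


def find_unprocessed(conn: sqlite3.Connection, directory: str):
    """Return sorted (path, mtime_ns) pairs for CSV files that are new or changed."""
    seen = dict(conn.execute("SELECT path, mtime_ns FROM processed_files"))
    pending = []
    for entry in os.scandir(directory):
        if not entry.is_file() or not entry.name.lower().endswith(".csv"):
            continue
        mtime_ns = entry.stat().st_mtime_ns
        # Unknown path, or a known path whose modification time differs.
        if seen.get(entry.path) != mtime_ns:
            pending.append((entry.path, mtime_ns))
    return sorted(pending)


def mark_processed(conn: sqlite3.Connection, path: str, mtime_ns: int) -> None:
    """Record that `path` was imported at the given modification time."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO processed_files (path, mtime_ns) VALUES (?, ?)",
            (path, mtime_ns),
        )
```

On each run you would loop over `find_unprocessed(conn, directory)`, import each file, and call `mark_processed` only after the import succeeds, so a failed file is picked up again next time. Only the directory listing and file metadata are read to make the decision; no file is opened unless it actually needs importing.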