I need to parse very large log files (between 1 GB and 5 GB). More precisely, I need to split the data into objects so I can store them in a DB. The log file is sequential (no line breaks), like:
TIMESTAMP=20090101000000;PARAM1=Value11;PARAM2=Value21;PARAM3=Value31;TIMESTAMP=20090101000100;PARAM1=Value11;PARAM2=Value21;PARAM3=Value31;TIMESTAMP=20090101000152;PARAM1=Value11;PARAM2=Value21;PARAM3=Value31;…
I need to split this into a table with these columns:
TIMESTAMP | PARAM1 | PARAM2 | PARAM3
The process needs to be as fast as possible. I’m considering using Perl, but any suggestions using C/C++ would be really welcome. Any ideas?
Best regards,
Arthur
Write a prototype in Perl and compare its throughput against how fast you can simply read the data from the storage medium. My guess is that you’ll be I/O bound, in which case rewriting it in C won’t give you a meaningful performance boost.
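Since you said C/C++ would be welcome, here is a rough C++ sketch of the same comparison, not a tuned implementation. It assumes the field names from your sample (TIMESTAMP, PARAM1 to PARAM3), that every record starts with TIMESTAMP, and that values never contain ';' or '='. The names Record, read_only, parse_log and parse_string are made up for illustration. The file is read in fixed-size chunks, so a field that straddles a chunk boundary is carried over to the next chunk.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One row of the target table (hypothetical type; fields mirror the sample).
struct Record {
    std::string timestamp, param1, param2, param3;
};

// Baseline: read the stream in chunks and do nothing else.
// Timing this gives the raw I/O cost to compare the parser against.
std::size_t read_only(std::istream& in, std::size_t chunk_size = 1 << 20) {
    assert(chunk_size > 0);
    std::vector<char> buf(chunk_size);
    std::size_t total = 0;
    while (in) {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        total += static_cast<std::size_t>(in.gcount());
    }
    return total;
}

// Reads the stream in chunks, splits it on ';', and calls `emit` once per
// record. Returns the number of records emitted.
std::size_t parse_log(std::istream& in,
                      const std::function<void(const Record&)>& emit,
                      std::size_t chunk_size = 1 << 20) {
    assert(chunk_size > 0);
    std::vector<char> buf(chunk_size);
    std::string pending;       // unterminated field left over from the last chunk
    Record current;
    bool have_record = false;  // true once a TIMESTAMP has been seen
    std::size_t count = 0;

    auto flush = [&] {
        if (have_record) { emit(current); ++count; }
    };
    auto handle_field = [&](const std::string& field) {
        const std::size_t eq = field.find('=');
        if (eq == std::string::npos) return;  // skip malformed fields
        const std::string key = field.substr(0, eq);
        const std::string value = field.substr(eq + 1);
        if (key == "TIMESTAMP") {             // assumed to start a new record
            flush();
            current = Record{};
            current.timestamp = value;
            have_record = true;
        } else if (key == "PARAM1") { current.param1 = value;
        } else if (key == "PARAM2") { current.param2 = value;
        } else if (key == "PARAM3") { current.param3 = value;
        }                                     // unknown keys are ignored
    };

    while (in) {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        pending.append(buf.data(), static_cast<std::size_t>(in.gcount()));
        std::size_t start = 0, pos;
        while ((pos = pending.find(';', start)) != std::string::npos) {
            handle_field(pending.substr(start, pos - start));
            start = pos + 1;
        }
        pending.erase(0, start);              // keep only the partial field
    }
    if (!pending.empty()) handle_field(pending);  // last field without ';'
    flush();
    return count;
}

// Convenience wrapper for trying the parser on a small in-memory sample.
std::vector<Record> parse_string(const std::string& text,
                                 std::size_t chunk_size = 1 << 20) {
    std::istringstream in(text);
    std::vector<Record> rows;
    parse_log(in, [&](const Record& r) { rows.push_back(r); }, chunk_size);
    return rows;
}
```

To test the I/O-bound guess, open the real file with std::ifstream in binary mode, time read_only on it, then time parse_log with a callback that does nothing. Use a fresh run or a different file for each measurement so the OS cache doesn't skew the numbers. If the two times are close, parsing is not your bottleneck and the language choice matters little. The DB inserts are a separate cost and are left out here.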