I am hoping someone can help me with this regex. I’ve only used it to collect single words in a string, so I’m not sure how to handle multiple lines and what look like ASCII characters.
Here is the text block:
Information - RETAILEAITRT00003 - Traitement - Processing - ---> Recovery from 05/09/2012 at 09:17:50 AM
Information - RETAILEAITRT00020 - Traitement - Processing - ---> Information recovery starts on 05/09/2012 at 09:17:50 AM
Information - RETAILEAITRT00021 - Traitement - Processing - ----> File processing: C:\Program Files (x86)\Prog\Prog RIT\Web Orders\live\Prog Import\Order_110039354.tab
Information - RETAILEAITRT00005 - Traitement - Processing - ---> End of information recovery on 05/09/2012 at 09:17:51 AM
Information - RETAILEAITRT00006 - Traitement - Processing - -> 6 records read
Information - RETAILEAITRT00008 - Traitement - Processing - -> 6 records processed
Information - RETAILEAITRT00010 - Traitement - Processing - -> 6 integrated records
Information - RETAILEAITRT00015 - Traitement - Processing - -> No integration errors
Information - RETAILEAITRT00020 - Traitement - Processing - ---> Information recovery starts on 05/09/2012 at 09:17:51 AM
Information - RETAILEAITRT00021 - Traitement - Processing - ----> File processing: C:\Program Files (x86)\Prog\Prog RIT\Web Orders\live\Prog Import\Order_110039355.tab
Third-party - : La raison sociale doit �tre renseign�e
Third-party - _SHIP : La raison sociale doit �tre renseign�e
Erreur - RETAILEAIDOC00008 - Document - Document - address The internal reference enables the recovery of a document. It is mandatory
Erreur - RETAILEAIDOC00008 - Document - Document - address The internal reference enables the recovery of a document. It is mandatory
Information - RETAILEAITRT00005 - Traitement - Processing - ---> End of information recovery on 05/09/2012 at 09:17:52 AM
Information - RETAILEAITRT00006 - Traitement - Processing - -> 4 records read
Information - RETAILEAITRT00008 - Traitement - Processing - -> 4 records processed
Information - RETAILEAITRT00012 - Traitement - Processing - -> No records integrated
Information - RETAILEAITRT00013 - Traitement - Processing - -> 4 records contain errors
Information - RETAILEAITRT00003 - Traitement - Processing - ---> Recovery from 05/09/2012 at 09:33:03 AM
Information - RETAILEAITRT00020 - Traitement - Processing - ---> Information recovery starts on 05/09/2012 at 09:33:03 AM
Information - RETAILEAITRT00021 - Traitement - Processing - ----> File processing: C:\Program Files (x86)\Prog\Prog RIT\Web Orders\live\Prog Import\Order_110039356.tab
Information - RETAILEAITRT00005 - Traitement - Processing - ---> End of information recovery on 05/09/2012 at 09:33:05 AM
Information - RETAILEAITRT00006 - Traitement - Processing - -> 6 records read
Information - RETAILEAITRT00008 - Traitement - Processing - -> 6 records processed
Information - RETAILEAITRT00010 - Traitement - Processing - -> 6 integrated records
Information - RETAILEAITRT00015 - Traitement - Processing - -> No integration errors
Information - RETAILEAITRT00020 - Traitement - Processing - ---> Information recovery starts on 05/09/2012 at 09:33:05 AM
Information - RETAILEAITRT00021 - Traitement - Processing - ----> File processing: C:\Program Files (x86)\Prog\Prog RIT\Web Orders\live\Prog Import\Order_110039357.tab
Information - RETAILEAITRT00005 - Traitement - Processing - ---> End of information recovery on 05/09/2012 at 09:33:06 AM
Information - RETAILEAITRT00006 - Traitement - Processing - -> 6 records read
Information - RETAILEAITRT00008 - Traitement - Processing - -> 6 records processed
Information - RETAILEAITRT00010 - Traitement - Processing - -> 6 integrated records
Information - RETAILEAITRT00015 - Traitement - Processing - -> No integration errors
However, I only want this segment:
Information - RETAILEAITRT00020 - Traitement - Processing - ---> Information recovery starts on 05/09/2012 at 09:17:51 AM
Information - RETAILEAITRT00021 - Traitement - Processing - ----> File processing: C:\Program Files (x86)\Prog\Prog RIT\Web Orders\live\Prog Import\Order_110039355.tab
Third-party - : La raison sociale doit �tre renseign�e
Third-party - _SHIP : La raison sociale doit �tre renseign�e
Erreur - RETAILEAIDOC00008 - Document - Document - address The internal reference enables the recovery of a document. It is mandatory
Erreur - RETAILEAIDOC00008 - Document - Document - address The internal reference enables the recovery of a document. It is mandatory
Information - RETAILEAITRT00005 - Traitement - Processing - ---> End of information recovery on 05/09/2012 at 09:17:52 AM
Information - RETAILEAITRT00006 - Traitement - Processing - -> 4 records read
Information - RETAILEAITRT00008 - Traitement - Processing - -> 4 records processed
Information - RETAILEAITRT00012 - Traitement - Processing - -> No records integrated
Information - RETAILEAITRT00013 - Traitement - Processing - -> 4 records contain errors
There are also some special characters in there that show up as weird question marks. I don’t really know where to start with this. I guess it will have to look for ^Erreur, then grab the lines above and below it until it finds ^ with white space…?
Thanks
I was able to get this to work with the following regular expression:
Note: this requires the “g” (global) flag (tested successfully in JavaScript). I’m not sure which language you’re using, but it should have an equivalent flag.
And yes, it’s extremely ugly :). Here’s basically what it’s looking for:
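If a single pattern gets too hard to maintain, a line-by-line approach may be easier to follow. The sketch below is only one possible way to do it, not the pattern mentioned above. It assumes each block starts at a RETAILEAITRT00020 line ("Information recovery starts") and ends just before the next RETAILEAITRT00020 or RETAILEAITRT00003 line, which is how the sample log appears to be laid out. The function name extractErrorBlocks and the shortened sample lines are made up for illustration.

```javascript
// Hypothetical helper: returns every block of log lines that contains an error.
// Assumption: a block starts at a "RETAILEAITRT00020" line and runs until the
// next "RETAILEAITRT00020" or "RETAILEAITRT00003" line (or the end of the text).
function extractErrorBlocks(logText) {
  const lines = logText.split(/\r?\n/);
  const blocks = [];
  let current = null; // lines of the block being collected, or null if none

  for (const line of lines) {
    const startsBlock = /^Information - RETAILEAITRT00020 /.test(line);
    const startsRun = /^Information - RETAILEAITRT00003 /.test(line);
    if (startsBlock || startsRun) {
      // Close the previous block, then open a new one only on a 00020 line.
      if (current) blocks.push(current);
      current = startsBlock ? [line] : null;
    } else if (current) {
      current.push(line);
    }
  }
  if (current) blocks.push(current);

  // Keep only the blocks in which some line begins with "Erreur".
  return blocks
    .filter((block) => block.some((l) => /^Erreur\b/.test(l)))
    .map((block) => block.join("\n"));
}

// Shortened sample in the same shape as the log in the question.
const sample = [
  "Information - RETAILEAITRT00003 - Traitement - Processing - ---> Recovery from 05/09/2012 at 09:17:50 AM",
  "Information - RETAILEAITRT00020 - Traitement - Processing - ---> Information recovery starts on 05/09/2012 at 09:17:50 AM",
  "Information - RETAILEAITRT00015 - Traitement - Processing - -> No integration errors",
  "Information - RETAILEAITRT00020 - Traitement - Processing - ---> Information recovery starts on 05/09/2012 at 09:17:51 AM",
  "Erreur - RETAILEAIDOC00008 - Document - Document - address The internal reference enables the recovery of a document. It is mandatory",
  "Information - RETAILEAITRT00013 - Traitement - Processing - -> 4 records contain errors",
  "Information - RETAILEAITRT00003 - Traitement - Processing - ---> Recovery from 05/09/2012 at 09:33:03 AM",
  "Information - RETAILEAITRT00020 - Traitement - Processing - ---> Information recovery starts on 05/09/2012 at 09:33:03 AM",
  "Information - RETAILEAITRT00015 - Traitement - Processing - -> No integration errors",
].join("\n");

const errorBlocks = extractErrorBlocks(sample);
console.log(errorBlocks.length);
console.log(errorBlocks[0]);
```

Because this works on whole lines, the odd characters in the "Third-party" lines should not matter: those lines are simply carried along with whatever block they fall in.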