I searched but have not found an answer to the question – maybe it is so obvious that no one else had to ask…
I am using UltraEdit 16.00 to run my regular expressions in Perl mode…
Situation:
I have a delimited string that can contain a variable number of repeating segments that must adhere to a very specific format. These segments occur randomly throughout the delimited string.
Example:
CLP*data*data*data~REF*data*data~N1*data*data*data~**CAS*OA*29*99.99**~AMT*I*99.99~SVC*data*data*data*data~**CAS*PR*99.99**~**CAS*CO**99.99**~DTM*150*date~AMT*B6*99.99~SVC*data*data*data*data~CAS*PR*N16*99.99~**CAS*CO* *99.99**...line continues from here.
Correct format – CAS*OA*29*99.99~
Incorrect format 1 – CAS*OA* *99.99~
Incorrect format 2 – CAS*OA**99.99~
Goal:
Identify only those strings where ALL of the CAS segments adhere to the format.
Things I’ve Tried:
(BTW: I know my Regular Expressions are not optimized, so please give me a break)
CAS Segment Missing value or containing one or more spaces
CAS\*(OA|PR|CR|CO)\*\*[-]?[\d]+\.?[\d]{0,2}~ matches the first instance it finds
CAS\*(OA|PR|CR|CO)\*[\s]+?\*[-]?[\d]+\.?[\d]{0,2}~ matches the first instance it finds
CAS segment NOT Missing value or containing space(s)
CAS\*(OA|PR|CR|CO)\*[^0-9A-Z]+?\*[-]?[\d]+\.?[\d]{0,2}~ Again, matches first instance
Negative Lookahead using combinations of the above (I am new to trying this approach)
^(?:(?!ab).)+$ – ab => one of the above regular expressions – never got it to work
Question:
How do I write the regular expression to enforce/validate the format of EVERY CAS instance no matter how often it occurs (there is a potential for 0 instances)?
To say that every CAS instance in your string is valid is to say that the string contains no invalid CAS instance. The negative lookahead approach you were getting at is the simplest way to express this. Here’s an example:
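A minimal sketch of a pattern with this shape, written in Python because its `re` module uses the same lookahead syntax as Perl-style engines; it has not been tested in UltraEdit. The field rules inside the inner lookahead (an uppercase alphanumeric reason code, an amount with up to two decimals) are assumptions based on the question, as is treating `~` or the start of the line as the start of a segment. The names `VALID_CAS_TAIL` and `all_cas_valid` are made up for illustration.

```python
import re

# Assumed shape of everything that must follow "CAS" in a valid segment:
# *qualifier*reason-code*amount~  (e.g. "*OA*29*99.99~")
VALID_CAS_TAIL = r"\*(?:OA|PR|CR|CO)\*[0-9A-Z]+\*-?\d+\.?\d{0,2}~"

# Outer negative lookahead: fail if, anywhere in the line, a segment starts
# with "CAS" ...
# Inner negative lookahead: ... and is NOT followed by a valid tail.
ALL_CAS_VALID = re.compile(
    r"^(?!.*(?:^|~)CAS(?!" + VALID_CAS_TAIL + r"))"
)

def all_cas_valid(line):
    """Return True when the line contains no invalid CAS segment."""
    return ALL_CAS_VALID.search(line) is not None

good = "CLP*data~CAS*OA*29*99.99~AMT*I*99.99~CAS*PR*N16*99.99~"
blank_code = "CLP*data~CAS*OA* *99.99~AMT*I*99.99~"
empty_code = "CLP*data~CAS*OA*29*99.99~CAS*CO**99.99~"
no_cas = "CLP*data~AMT*I*99.99~"

for sample in (good, blank_code, empty_code, no_cas):
    print(all_cas_valid(sample), sample)
```

In an editor the pattern matches zero characters at the start of each qualifying line; appending `.*` should make it select the whole line instead.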
Basically: “Make sure there does not exist in the string an instance of CAS that is not followed by whatever matches a valid CAS instance”. Replace the contents of the second negative lookahead with your own pattern for a valid CAS instance, and include whatever it is before ‘CAS’ that indicates the start of a CAS instance.
As you can see, you don’t need to match the string from start to finish to do what you want.
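If you ever need to see which segments are at fault rather than just a yes/no per line, the same "no invalid instance exists" idea can be written without lookaheads by splitting on the delimiter. This is a different technique from the regex above, sketched in Python under the same assumed field rules; `VALID_CAS` and `invalid_cas_segments` are hypothetical names.

```python
import re

# Assumed format of one complete CAS segment, without the trailing "~".
VALID_CAS = re.compile(r"CAS\*(?:OA|PR|CR|CO)\*[0-9A-Z]+\*-?\d+\.?\d{0,2}")

def invalid_cas_segments(line):
    """Return every CAS segment in the line that fails the assumed format."""
    segments = line.split("~")
    return [
        seg for seg in segments
        if seg.startswith("CAS") and not VALID_CAS.fullmatch(seg)
    ]

line = "CLP*data~CAS*OA*29*99.99~CAS*CO**99.99~AMT*B6*99.99~CAS*CO* *99.99~"
bad = invalid_cas_segments(line)
print(bad)  # an empty list would mean every CAS segment is valid
```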