I’ve searched but didn’t quite find something for my case. Basically, I’m trying to split the following line:
(CU!DIVD:WEXP:DIVD-:DIVD+:RWEXP:RDIVD:RECL:RLOSS:MISCDI:WEXP-:INT:RGAIN:DIVOP:RRGAIN:DIVOP-:RDIVOP:RRECL:RBRECL:INT -:RRLOSS:INT +:RINT:RDIVD-:RECL-:RWXPOR:WEXPOR:MISCRE:WEXP+:RWEXP-:RBWEXP:RECL+:RRECL-:RBDIVD)
You can read this as: CU is NOT DIVD or WEXP or DIVD- and so on. What I’d like to do is split this line, if it’s over 65 characters, into something more manageable like this:
(CU!DIVD:WEXP:DIVD-:DIVD+:RWEXP:RDIVD:RECL:RLOSS:MISCDI:WEXP-)
(CU!INT:RGAIN:DIVOP:RRGAIN:DIVOP-:RDIVOP:RRECL:RBRECL:INT -)
(CU!RRLOSS:INT +:RINT:RDIVD-:RECL-:RWXPOR:WEXPOR:MISCRE:WEXP+)
(CU!RWEXP-:RBWEXP:RECL+:RRECL-:RBDIVD)
They’re all less than 65 characters. This can be stored in a list and I can take care of the rest. I’m starting to work on this with RegEx but I’m having a bit of trouble.
Additionally, it can also have the following conditionals:
- !
- <
- >
- =
- !=
- !<
- !>
As of now, I have this:
def FilterParser(iteratorIn, headerIn):
    listOfStrings = []
    for eachItem in iteratorIn:
        if len(str(eachItem.text)) > 65:
            # Filters over the limit are currently logged and dropped
            exmlLogger.error('The length of filter ' + eachItem.text + ' exceeds the limit and will be dropped')
        else:
            listOfStrings.append(rightSpaceFill(headerIn + EXUTIL.intToString(eachItem), 80))
    return ''.join(listOfStrings)
Here is a solution using regex, edited to add the `CU!` prefix (or any other prefix) to the beginning of each new line.
First we need to grab the prefix. We do this using `re.search().group(0)`, which returns the entire match. Each of the final lines should be at most 65 characters, and the regex that we use to get these lines does not include the prefix or the closing parenthesis, which is why `maxlen` is `64 - len(prefix)`.
Now that we know the most characters we can match, the first part of the regex, `(.{1,<maxlen>})`, will match at most that many characters. The portion at the end, `(?:$|:)`, makes sure that we only split the string on colons or at the end of the string. Since there is only one capturing group, `regex.findall()` will return only that group, leaving off the trailing colon. For your sample string the prefix is `(CU!`, so `maxlen` is 60 and the regex is `(.{1,60})(?:$|:)`.
A list comprehension is used to construct a list of all of the lines by adding the prefix and the trailing `)` to each result. `s` is sliced so that the prefix and the trailing `)` are stripped off of the original string before `regex.findall()` runs. Hope this helps!
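For reference, here is a minimal sketch of those steps put together. The function name `split_filter`, the `limit` parameter, and the exact prefix pattern are assumptions of mine for illustration: the pattern takes the prefix to be an opening parenthesis, a word-character field name, and one of the operators listed in the question (`!`, `<`, `>`, `=`, `!=`, `!<`, `!>`), so adjust it if your field names look different.

```python
import re


def split_filter(s, limit=65):
    """Split '(CU!A:B:C)' into several '(CU!...)' lines of at most `limit` characters.

    Hypothetical helper; the prefix pattern below is an assumption.
    """
    # Prefix: '(' + field name + one of !=, !<, !>, !, <, >, =
    match = re.search(r'^\(\w+(?:![=<>]?|[<>=])', s)
    if match is None:
        raise ValueError('unrecognised filter: %r' % s)
    prefix = match.group(0)

    # Each output line is prefix + chunk + ')', hence limit - 1 - len(prefix).
    maxlen = limit - 1 - len(prefix)

    # Up to maxlen characters, followed by a colon or the end of the string.
    # The colon sits outside the capturing group, so findall() drops it.
    regex = re.compile(r'(.{1,%d})(?:$|:)' % maxlen)

    # Strip the prefix and the trailing ')' before matching, then re-add both.
    return [prefix + chunk + ')' for chunk in regex.findall(s[len(prefix):-1])]


s = ('(CU!DIVD:WEXP:DIVD-:DIVD+:RWEXP:RDIVD:RECL:RLOSS:MISCDI:WEXP-:INT:RGAIN:'
     'DIVOP:RRGAIN:DIVOP-:RDIVOP:RRECL:RBRECL:INT -:RRLOSS:INT +:RINT:RDIVD-:'
     'RECL-:RWXPOR:WEXPOR:MISCRE:WEXP+:RWEXP-:RBWEXP:RECL+:RRECL-:RBDIVD)')

lines = split_filter(s)
for line in lines:
    print(line)
```

For the sample string this gives four lines, each within the 65-character limit. One caveat: a single value longer than `maxlen` cannot be matched as a whole chunk, so `findall()` would skip part of it; it is worth checking the lengths of the results if that can happen in your data.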