I have a Mac running Lion and Python 2.7.1. I am noticing something very strange from the re module. If I run the following line:
print re.split(r'\s*,\s*', 'a, b,\nc, d, e, f, g, h, i, j, k,\nl, m, n, o, p, q, r')
I get this result:
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r']
But if I run it with the re.DOTALL flag like this:
print re.split(r'\s*,\s*', 'a, b,\nc, d, e, f, g, h, i, j, k,\nl, m, n, o, p, q, r', re.DOTALL)
Then I get this result:
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q, r']
Note that ‘q, r’ comes back as a single element instead of being split into two.
Why is this happening? I don’t see why the re.DOTALL flag would make a difference if I am not using dots in my pattern. Am I doing something wrong or is there some sort of bug?
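The problem is that you are passing re.DOTALL positionally, where it sets the maxsplit=0 argument, not the flags=0 argument. re.DOTALL happens to be the constant 16, so re.split stops after 16 splits and leaves the remainder ‘q, r’ unsplit as the final element. Pass the flag by keyword instead: flags=re.DOTALL.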
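To make the difference concrete, here is a minimal sketch using the same pattern and string as the question. The variable names (text, pattern, limited, full) are just for illustration, and the calls are written so they should run on both Python 2.7 and Python 3. The maxsplit is spelled out by keyword to show what the positional call ends up meaning:

```python
import re

text = 'a, b,\nc, d, e, f, g, h, i, j, k,\nl, m, n, o, p, q, r'
pattern = r'\s*,\s*'

# re.DOTALL is an integer-valued constant.
print(int(re.DOTALL))

# The positional call in the question behaves like maxsplit=16:
# at most 16 splits, so 17 items, with the tail left unsplit.
limited = re.split(pattern, text, maxsplit=int(re.DOTALL))
print(limited[-1])

# Passing the flag by keyword sets flags and leaves maxsplit at 0 (no limit).
full = re.split(pattern, text, flags=re.DOTALL)
print(len(full))
```

With the keyword form the string is split at every comma, which is the same result as the call without any flag, since the pattern contains no dot for re.DOTALL to affect.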