I have an application that I’ve been working on and found a troubling difference tonight – and I thought I would document it here and see if anyone can replicate it and/or explain it. The query is made up, but demonstrates the problem:
select
'123' ~ '^\d+$' as result_1,
'123' ~ '^[0-9]+$' as result_2
I have PostgreSQL v9.1 running on Windows 7, and when I run this query I get:
T, T
However, when I run the query on PostgreSQL v9.0 on Ubuntu 10.04 I get:
F, T
So, it would appear that either PostgreSQL changed its handling of “\d” between v9.0 and v9.1, or it has something to do with differences between the libs that got installed on Windows and Ubuntu.
Either way, I think folks should be aware that check constraints, etc. might not behave the same between the two (mine sure didn’t).
Note: unfortunately, I don’t have easy access to a Windows 7 box running 9.0 or I would test it there, also.
Can anyone explain this? If it’s well known, please forgive me. I didn’t see an answer when I googled for it. Obviously the safe thing to do is to just use [0-9] because it works in both locations. But, again, I would like to know why this is happening.
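You have an escaping problem. The fine 9.1 manual's section on string quoting explains that, as of 9.1, standard_conforming_strings defaults to on, so backslash escapes are recognized only in E'' escape strings; with it off (the default before 9.1), they are also recognized in ordinary quoted strings.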
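So 9.0, where standard_conforming_strings is off by default, sees '\d' the same way C does, so it just looks like 'd'. In 9.0 you’d want to escape your backslash and use the E'' “escape” string notation, which gives the same result whatever standard_conforming_strings is set to, for example E'^\\d+$'.
Or you could try dollar quoting, but that’s pretty ugly and difficult to read with a regex (especially a regex that uses $ to anchor the end).
Another option would be to use a POSIX character class such as [[:digit:]] instead of \d.
You should have been seeing warnings about '\d' in earlier versions as well; check your logs for warnings about nonstandard use of escapes in string literals (they are emitted when escape_string_warning is on, which is the default).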
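As a minimal sketch of the alternatives mentioned in this answer: the column aliases and the dollar-quote tag `re` are made up for illustration, and each comparison should come back true on both 9.0 and 9.1.

```sql
-- Show which string-literal behavior the session uses
-- (off by default before 9.1, on by default from 9.1).
SHOW standard_conforming_strings;

-- Escape-string syntax: the backslash is doubled in the literal,
-- so the regex engine receives \d regardless of the setting above.
SELECT '123' ~ E'^\\d+$' AS escape_string;

-- Dollar quoting: no escape processing, so \d passes through untouched.
-- A tag (here "re") is used because the regex itself ends with $,
-- which would otherwise run into a bare $$ closing delimiter.
SELECT '123' ~ $re$^\d+$$re$ AS dollar_quoted;

-- POSIX character class: no backslash involved at all.
SELECT '123' ~ '^[[:digit:]]+$' AS posix_class;
```

If the first statement reports off, a plain '\d' literal is reduced to 'd' before the regex engine sees it, which would explain the F result in the question.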