As any Python programmer knows, you should use == instead of is to compare two strings for equality. However, are there actually any cases where ( s is "" ) and ( s == "" ) will give different results in Python 2.6.2?
During a recent code review I came across code that used ( s is "" ), and while pointing out that this was incorrect I wanted to give an example of how it could fail. But try as I might, I can’t construct two empty strings with different identities. It seems that the Python implementation must special-case the empty string in lots of common operations. For example:
>>> import re
>>> a = ""
>>> b = "abc"[ 2:2 ]
>>> c = ''.join( [] )
>>> d = re.match( '()', 'abc' ).group( 1 )
>>> e = a + b + c + d
>>> a is b is c is d is e
True
However, this question suggests that there are cases where ( s is "" ) and ( s == "" ) can be different. Can anyone give me an example?
As everyone else has said, don’t rely on undefined behaviour. However, since you asked for a specific counterexample for Python 2.6, here it is:
The only time that Python 2.6 can end up with an empty string which is not the normal empty string is when it performs a string operation and doesn’t know in advance how long the result will be. For example, when you encode a string, the error handler can end up stripping characters, and the buffer size is only fixed up after encoding has completed. Of course that’s an oversight and could easily change in Python 2.7.
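The encode case described above can be sketched roughly as follows. This is only an illustration under assumptions: the input u"\xff" and the 'ignore' error handler are my choice of a case where every character gets stripped, and the names empty and s are made up for the example. Whether the identity check comes out True or False depends on the interpreter and version, so the sketch prints it rather than assuming a result.

```python
# Sketch only: the result of `is` here is an implementation detail and
# may differ between Python versions and implementations.

# The "normal" empty string (b"" is the same as "" on Python 2.6).
empty = b""

# Assumed example input: one non-ASCII character, encoded to ASCII with the
# 'ignore' error handler, so the character is dropped and nothing is left.
s = u"\xff".encode("ascii", "ignore")

# Equality compares contents, so this is reliable.
print("s == empty: %r" % (s == empty))

# Identity compares objects; do not rely on this being True.
print("s is empty: %r" % (s is empty))
```

Either way, the equality check is the one that is guaranteed to behave, which is why ( s == "" ), or simply ( not s ), is the safer test.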