I would like a regex pattern that matches the smileys “:)” and “:(”. It should also accept repeated smileys such as “:) :)” and “:) :(”, but reject invalid syntax such as “:( (”.
This is what I have so far, but it also matches “:( (”:
bool( re.match("(:\()",str) )
I may be missing something obvious here, and I’d like some help with this seemingly simple task.
I think it finally “clicked” exactly what you’re asking about here. Take a look at the below:
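A minimal sketch of the check, assuming Python’s re module as in the question. The helper name is_smileys is hypothetical and used only for illustration:

```python
import re

# One or more ":(" or ":)" tokens, and nothing else, from start to end.
SMILEYS = re.compile(r"^(:\(|:\))+$")

def is_smileys(text):
    """Return True if text consists solely of ':)' / ':(' smileys."""
    return bool(SMILEYS.match(text))

print(is_smileys(":)"))    # True
print(is_smileys(":(:)"))  # True
print(is_smileys(":( ("))  # False

# For comparison, the pattern from the question is unanchored, so a
# leading ":(" is enough for re.match to succeed.
print(bool(re.match(r"(:\()", ":( (")))  # True
```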
The problem with your original code is that your regex is wrong: (:\(). Let’s break it down.

The outside parentheses are a “grouping”. They’re what you’d reference if you were going to do a string replacement, and are used to apply regex operators on groups of characters at once. So, you’re really saying:

    (      begin a group
    :\(    … do regex stuff …
    )      end the group

The : isn’t a regex reserved character, so it’s just a colon. The \ is, and it means “the following character is literal, not a regex operator”. This is called an “escape sequence”. Fully parsed into English, your regex says:

    (      begin a group
    :      a colon character
    \(     a left parenthesis character
    )      end the group

Nothing in that pattern says where the match has to end, and re.match only anchors the start of the string, so a leading “:(” is enough for “:( (” to match.

The regex I used is slightly more complex, but not bad. Let’s break it down: ^(:\(|:\))+$

^ and $ mean “the beginning of the line” and “the end of the line” respectively. Now we have:

    ^             beginning of line
    (:\(|:\))+    … do regex stuff …
    $             end of line

… so it only matches things that comprise the entire line, not things that simply occur somewhere in the string.

We know that ( and ) denote a grouping. + means “one or more of these”. Now we have:

    ^          beginning of line
    (          start a group
    :\(|:\)    … do regex stuff …
    )          end the group
    +          match one or more of this
    $          end of line

Finally, there’s the | (pipe) operator. It means “or”. So, applying what we know from above about escaping characters, we’re ready to complete the translation:

    ^      beginning of line
    (      start a group
    :      a colon character
    \(     a left parenthesis character
    |      or
    :      a colon character
    \)     a right parenthesis character
    )      end the group
    +      match one or more of this
    $      end of line

I hope this helps. If not, let me know and I’ll be happy to edit my answer with a reply.
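One caveat worth checking: ^(:\(|:\))+$ accepts smileys only when they are written back to back, so “:) :)” with a space between them would not match. A hedged variant, assuming that whitespace between (and after) smileys should be allowed. The names SPACED_SMILEYS and is_spaced_smileys are hypothetical:

```python
import re

# Assumption: each smiley may be followed by optional whitespace,
# so ":) :)" is accepted while ":( (" is still rejected.
SPACED_SMILEYS = re.compile(r"^(?:(?::\(|:\))\s*)+$")

def is_spaced_smileys(text):
    """Return True if text is only smileys, optionally space-separated."""
    return bool(SPACED_SMILEYS.match(text))

for sample in [":)", ":) :)", ":) :(", ":( ("]:
    print(repr(sample), is_spaced_smileys(sample))
```

With these samples the first three should be accepted and the last rejected. If smileys must be separated by exactly one space, the \s* would need to be tightened accordingly.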