I am very weak at regular expressions. I’m debugging some code that searches strings with an expression like:
r"coding[:=]\s*([-\w.]+)"
What kind of string does it search for?
To me, it seems to match something like:
coding= xxxxx
but I don’t know the exact meaning of the mystery characters. Can anyone explain in a bit more detail?
Let’s break this down:
- coding: literal text match; only the word “coding” will do.
- [:=]: a character class; either a colon “:” or an equals sign “=” matches.
- \s*: zero or more whitespace characters, such as spaces and tabs; in Python, \s also matches newlines without any extra flag.
- (...): a capturing group; its contents will be available as a match group for further processing.
- [-\w.]+: one or more characters from the class, which matches a dash “-”, a dot “.” or any word character. \w is a character class usually matching the letters ‘a’ through ‘z’ (upper and lowercase), the digits ‘0’ through ‘9’ and the underscore “_”.

If you switch on Unicode support (on by default in Python 3), the \w class matches a lot more, though: any character classified as alphanumeric in the Unicode database would match.

Examples of matches:
- foobar
- 320_42
- spam_eggs
- something-or-other
- whatever.42
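
To see the pieces working together, here is a small sketch in Python. The sample lines, the helper name find_coding and the non-ASCII test string are made up for illustration and are not taken from the code in the question; the first two samples just resemble the editor-style encoding comments a pattern like this is often pointed at.

```python
import re

# The pattern from the question, compiled once.
CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)")

# Same pattern with re.ASCII, so \w is limited to [a-zA-Z0-9_].
ASCII_CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)", re.ASCII)


def find_coding(line):
    """Return the text captured after 'coding:' or 'coding=', or None."""
    match = CODING_RE.search(line)
    return match.group(1) if match else None


# Hypothetical sample lines, for illustration only.
samples = [
    "# -*- coding: utf-8 -*-",
    "# vim: set fileencoding=latin-1 :",
    "coding= xxxxx",
    "coding utf-8",  # no ':' or '=' after "coding", so no match
]

for line in samples:
    print(repr(line), "->", find_coding(line))

# Unicode-aware \w (the Python 3 default for str patterns) versus ASCII-only \w.
text = "coding: caf\u00e9"
print(CODING_RE.search(text).group(1))        # includes the accented letter
print(ASCII_CODING_RE.search(text).group(1))  # stops before the accented letter
```

Note that search() finds the pattern anywhere in the line, which is why “fileencoding=latin-1” matches too: the regex only requires the letters “coding” followed by “:” or “=”, not a whole word.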