I want to create a regular expression that finds the word tjuv ("thief" in Swedish), which can be combined with other words into compounds (see the examples below) and/or appear in different inflected forms.
Examples:
- cykeltjuv
- biltjuv
- tjuvarna
- inbrottstjuvs
The one below works for tjuv and tjuvs (a thief’s), but what about the other inflections and the combinations with other words?
/tjuv(?:s){0,1}/ig
Now that I’ve taught you a little Swedish, it’s only fair that you teach me some regular expressions 😉
EDIT: To be more specific, there’s actually no case I can think of where a word containing tjuv shouldn’t match.
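What I am doing is searching through phrases where the word tjuv exists, for example (English translations in parentheses):
1. När en familj kom hem från en utlandssemester upptäckte de att en inbrottstjuv hade varit i farten. (When a family came home from a holiday abroad, they discovered that a burglar had been at work.) <- MATCH
2. På juldagen hade en cykeltjuv varit framme och stulit en cykel. (On Christmas Day a bicycle thief had struck and stolen a bicycle.) <- MATCH
3. Violer är blå och rosor är röda (Violets are blue and roses are red) <- No 'tjuv' and therefore no match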
I think this is what you want, the word "tjuv" with optional letters before and/or after it:
See it here on Regexr
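The pattern itself did not survive in this copy of the answer, so here is a minimal sketch of what the description above suggests, assuming JavaScript (the `/…/ig` literal in the question hints at it). The exact pattern and the helper name `findTjuvWords` are assumptions for illustration, not necessarily what was originally posted.

```javascript
// Assumed pattern: optional ASCII letters, then "tjuv", then optional ASCII
// letters, anchored at word boundaries. `i` ignores case, `g` finds all hits.
const tjuvWord = /\b[a-z]*tjuv[a-z]*\b/gi;

// Hypothetical helper: returns every word in `text` that contains "tjuv",
// or an empty array when there is none.
function findTjuvWords(text) {
  return text.match(tjuvWord) || [];
}

const phrases = [
  "På juldagen hade en cykeltjuv varit framme och stulit en cykel.",
  "Tjuvarna och en inbrottstjuvs verktyg",
  "Violer är blå och rosor är röda",
];

for (const phrase of phrases) {
  console.log(findTjuvWords(phrase));
}
```

The first phrase yields `cykeltjuv`, the second yields `Tjuvarna` and `inbrottstjuvs`, and the third yields nothing.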
But [a-z] is a character class covering only the ASCII characters a to z (case-insensitive because of the i modifier), and I think Swedish also has some letters that are not included in that range. So either you add those letters to the character class, or, depending on your regex flavour, you can use \p{L} instead. \p{L} is a Unicode property that matches any letter in any language. It would then look like:
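As a hedged sketch of both options in JavaScript (an assumed flavour; the original patterns are not preserved here): the variable names and the sample sentence are made up for illustration. In JavaScript `\p{L}` only works with the `u` flag, and `\b` is based on ASCII `\w`, so this sketch leaves `\b` out and relies on the greedy letter runs to capture the whole word.

```javascript
// Option 1 (assumption): extend the class with the Swedish letters å, ä, ö.
const tjuvSwedish = /[a-zåäö]*tjuv[a-zåäö]*/gi;

// Option 2: any Unicode letter before or after "tjuv".
// The `u` flag is required for \p{L} in JavaScript.
const tjuvUnicode = /\p{L}*tjuv\p{L}*/giu;

// ASCII-only variant, shown for comparison: it cuts words at å, ä, ö.
const tjuvAscii = /[a-z]*tjuv[a-z]*/gi;

// Made-up test sentence with non-ASCII letters inside the matching words.
const sample = "En småtjuv och två tjuvjägare";

console.log(sample.match(tjuvSwedish));
console.log(sample.match(tjuvUnicode));
console.log(sample.match(tjuvAscii));
```

With this sample, the first two patterns return the complete words `småtjuv` and `tjuvjägare`, while the ASCII-only one returns the truncated pieces `tjuv` and `tjuvj`.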