A website I’m modding with a userscript has some text I want to modify. The text appears to have a unicode character in it. When I look at it on screen or even extract it to a variable with jQuery, it looks like this:
2 others
However, if I create my own variable with that same text and then do a comparison, they come up as false. So I copied/pasted the site’s text into vim and it looks like this:
2<200e> others
Best I can tell this is a unicode character for space (?). I want to be able to match this string with a regex such as:
^(\d+(?:,\d+)*)\s+(.*)
but on this string with the embedded unicode character it fails. (it works fine on my own typed text of ‘2 others’).
Is there some way I can strip this unicode out of the text? I tried the following, to no avail:
text.replace('\u200e\','')
text.replace('200e','')
text.replace('\%20','')
text.replace('\%u200e','')
Or, alternatively, can I adjust my regex to match either ‘2 others’ or the same text with the embedded 200e unicode char?
Try using an actual regex instead of a string as the first argument to replace.
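A minimal sketch of that idea, assuming the site's text is "2", then U+200E (the LEFT-TO-RIGHT MARK, an invisible formatting character rather than a space), then " others". The names stripLrm and siteText are made up for illustration. Note that replace returns a new string and leaves the original untouched, so the result has to be assigned; the g flag removes every occurrence rather than only the first.

```javascript
// Hypothetical helper: removes every U+200E (LEFT-TO-RIGHT MARK) from a string.
function stripLrm(text) {
  // replace() returns a new string; the g flag makes it replace all matches.
  return text.replace(/\u200e/g, '');
}

// Assumed sample of what the site returns: "2" + U+200E + " others".
const siteText = '2\u200e others';
const cleaned = stripLrm(siteText);

// The regex from the question now matches the cleaned text.
const match = /^(\d+(?:,\d+)*)\s+(.*)/.exec(cleaned);

console.log(cleaned === '2 others');
console.log(match && match[1], match && match[2]);
```

If other invisible marks show up later (U+200F, for example), the same pattern could be widened to a character class, but that is a guess about the site's markup.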
You could just change the \s in your regex to include U+200E as well, e.g.
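One way this might look, as a sketch: \s on its own does not cover U+200E in JavaScript, so the separator becomes a character class containing both. The variable name pattern and the sample strings are assumptions for illustration only.

```javascript
// The question's regex with \s+ widened to [\s\u200e]+ so the separator
// may contain whitespace, U+200E (LEFT-TO-RIGHT MARK), or both.
const pattern = /^(\d+(?:,\d+)*)[\s\u200e]+(.*)/;

// Assumed inputs: hand-typed text and text with the embedded mark.
const typed = pattern.exec('2 others');
const fromSite = pattern.exec('2\u200e others');

console.log(typed && typed[1], typed && typed[2]);
console.log(fromSite && fromSite[1], fromSite && fromSite[2]);
```

This leaves the original text as it is and only makes the match tolerant, which may be preferable if the string is compared or written back to the page elsewhere.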