I have some XML files with figure spaces in them, and I need to remove those with PHP.
The UTF-8 byte sequence for these is e2 80 a9. If I'm not mistaken, PHP does not seem to like 3-byte UTF-8 characters; so far, at least, I have been unable to find a way to delete the figure spaces with functions like preg_replace.
Does anybody have any tips or, even better, a solution to this problem?
Have you tried preg_replace('/\x{2007}/u', '', $stringWithFigureSpaces);?
U+2007 is the Unicode code point for FIGURE SPACE.
Please see my answer on a similar Unicode-regex topic with PHP, which includes information about the \x{FFFF} syntax.
Regarding your comment that this does not work, the following works perfectly on my machine:
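A minimal sketch of such a check, assuming a hypothetical test string built from the UTF-8 bytes of U+2007 (e2 80 87); it illustrates the idea and is not necessarily the exact code that was run:

```php
<?php
// Hypothetical test string: "12", a FIGURE SPACE (U+2007, bytes e2 80 87), then "34".
$stringWithFigureSpaces = "12\xe2\x80\x8734";

// The /u modifier makes PCRE treat pattern and subject as UTF-8,
// so \x{2007} matches the whole three-byte sequence as one character.
$cleaned = preg_replace('/\x{2007}/u', '', $stringWithFigureSpaces);

// True if the figure space was removed and nothing else changed.
var_dump($cleaned === '1234');
```

Note the single quotes around the pattern: they pass \x{2007} through to PCRE untouched instead of letting PHP interpret the escape.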
What's your PHP version? Are you sure the character is a FIGURE SPACE at all? Can you run the following snippet on your string?
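One way to inspect the string, sketched under a few assumptions: describeChars is a hypothetical helper name, the sample input is made up, and mb_ord requires PHP 7.2 or later with the mbstring extension.

```php
<?php
// Hypothetical helper: lists each UTF-8 character as "<hex bytes> U+<code point>".
function describeChars(string $s): array
{
    // Split into UTF-8 characters; preg_split returns false on invalid UTF-8.
    $chars = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return [];
    }
    $lines = [];
    foreach ($chars as $char) {
        $lines[] = sprintf('%s U+%04X', bin2hex($char), mb_ord($char, 'UTF-8'));
    }
    return $lines;
}

// Made-up sample; replace with the string read from the XML file.
$string = "a\xe2\x80\xa9b";
echo implode("\n", describeChars($string)), "\n";
```

Looking up the reported code point in the Unicode charts tells you which character you are actually dealing with.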
On my test string this outputs
EDIT after OP comment:
\xe2\x80\xa9 is a PARAGRAPH SEPARATOR, which is Unicode code point U+2029, so your code should be preg_replace('/\x{2029}/u', '', $stringWithUglyCharacter);
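Put together, this might look like the sketch below. The input string is hypothetical, and the str_replace variant is an alternative I am adding for comparison, in case you would rather match the raw bytes than use a regex:

```php
<?php
// Hypothetical input containing a PARAGRAPH SEPARATOR (U+2029, bytes e2 80 a9).
$stringWithUglyCharacter = "first\xe2\x80\xa9second";

// Regex variant: /u enables UTF-8 mode so \x{2029} matches the code point.
$viaRegex = preg_replace('/\x{2029}/u', '', $stringWithUglyCharacter);

// Byte-level variant: inside double quotes PHP itself expands the \x escapes.
$viaBytes = str_replace("\xe2\x80\xa9", '', $stringWithUglyCharacter);

// Both should be true if the separator was removed.
var_dump($viaRegex === 'firstsecond', $viaBytes === 'firstsecond');
```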