I import some text from an XML file and I trim it and replace

Asked: June 10, 2026 (2026-06-10T08:40:25+00:00)

I import some text from an XML file and I trim it and replace multiple white spaces.

$var = $myxmltext;
$var = trim($var);
$var = preg_replace('/\s+/',' ',$var);

For some reason I get “raw html” like this when I echo it:

quot; or IÂ’ve instead of I've

Any ideas why?

Here is my trim function:

function mytrim($mytrim){
    $mytrim = utf8_decode($mytrim); 
    $mytrim = trim($mytrim);
    $rule1 = array(
        ",",    // comma
        ".",    // period
        "~",    // tilde
        "_",    // underscore
        "-",    // hyphen
        ")",    // closing parenthesis
        ":",    // colon
        ">",    // greater than
        "<",    // less than
        "!",
        "?",
        "*",
        "&"
    );
    $rule2 = array(
        ", ",   // comma
        ". ",   // period
        " ~ ",  // tilde
        " ",    // underscore
        " - ",  // hyphen
        ") ",   // closing parenthesis
        ": ",   // colon
        " > ",  // greater than
        " < ",  // less than
        "! ",
        "? ",   
        "* ",
        " & "
    );
    $mytrim = str_replace($rule1, $rule2, $mytrim);
    $rule3 = array(
        " .",   // period
        " ,",   // comma
        " ?",   // question mark
        " !",
        " *",
        " )"
    );
    $rule4 = array(
        ".",    // period
        ",",    // comma
        "?",    // question mark
        "!",
        "*",
        ")"
    );
    $mytrim = str_replace($rule3, $rule4, $mytrim);
    $mytrim = preg_replace('/\s+/',' ',$mytrim);
    return $mytrim;
}

1 Answer

Editorial Team · Answer 1 · 2026-06-10T08:40:27+00:00

Try this regex before you do your stuff:

$text = preg_replace('/(&)\s+(\w+;)/', '$1$2', $text);

Then do your business, and let's see if the HTML entities come out right.

This should solve your main problem with the HTML entities by changing every:

& quot;

to:

&quot;

Be aware: This might not work exactly as expected so please test.
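
A quick way to test the pattern is to run it on a few hand-made strings. The sketch below uses hypothetical sample strings (they are not from the question) and also shows one false positive, which is a good reason to test before relying on it:

```php
<?php
// Hypothetical samples; the real input would be the imported XML text.
$samples = [
    'She said & quot;hi& quot;',   // entity split by a space
    'Tom & Jerry',                 // plain ampersand, no trailing ";"
    'Tom & Jerry; the end',        // plain ampersand followed by "word;"
];

$results = [];
foreach ($samples as $text) {
    // Drop the whitespace between "&" and a following "name;" sequence.
    $results[] = preg_replace('/(&)\s+(\w+;)/', '$1$2', $text);
}

print_r($results);
// [0] => She said &quot;hi&quot;
// [1] => Tom & Jerry
// [2] => Tom &Jerry; the end   (false positive: not an entity)
```

The third sample shows that any word followed by a semicolon after "& " gets glued to the ampersand, so text with semicolons needs a closer look.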

Of course, as others have said, you can also use utf8_decode/utf8_encode to get rid of those stray accented characters (the Â in IÂ’ve).
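
One thing worth checking first is whether the decode step itself loses characters. This is only a sketch under assumptions: it supposes the imported text is valid UTF-8 and contains a typographic apostrophe, and it uses mb_convert_encoding (from the mbstring extension) to perform the same UTF-8 to ISO-8859-1 conversion that utf8_decode does. The sample string and variable names are made up for illustration:

```php
<?php
// Assumption: the imported text is valid UTF-8 and contains a typographic
// apostrophe (U+2019), which has no ISO-8859-1 equivalent.
$var = "I\u{2019}ve  got   it";

// Converting to ISO-8859-1 cannot represent U+2019, so the character is lost.
$latin1 = mb_convert_encoding($var, 'ISO-8859-1', 'UTF-8');

// Alternative: stay in UTF-8, collapse whitespace, escape once on output.
$clean = preg_replace('/\s+/', ' ', trim($var));
$html  = htmlspecialchars($clean, ENT_QUOTES, 'UTF-8');

var_dump($latin1 === $var);  // false: the apostrophe did not survive
echo $html, PHP_EOL;         // I’ve got it
```

If the page that echoes the text is also served as UTF-8, the apostrophe should display as typed. Whether that matches your actual setup is something only a test with your real XML can tell.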

Edit

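To solve the ampersand problem, try:

$text = preg_replace('/&(?!\w+;)/', ' & ', $text);

This replaces every & that is not the start of an entity such as &quot; with an & padded by a space on either side. It can stand in for the plain "&" rule in mytrim(), which pads every ampersand and so splits entities apart.

Same as before, test it first.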
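
Here is a small test of the lookahead pattern on a hypothetical string (not from the question). Note that \w does not match "#", so numeric entities like &#39; still get padded; the second pattern is an assumed variant, not part of the original suggestion, that skips those too:

```php
<?php
// Hypothetical sample string, not taken from the question.
$text = 'Tom&Jerry said &quot;hi&quot; and it&#39;s fine';

// Pattern from the answer: pad "&" unless it starts a named entity.
$padded = preg_replace('/&(?!\w+;)/', ' & ', $text);

// Assumed variant: also leave numeric entities (&#39;, &#x27;) alone.
$paddedNumeric = preg_replace('/&(?!#?\w+;)/', ' & ', $text);

echo $padded, PHP_EOL;
// Tom & Jerry said &quot;hi&quot; and it & #39;s fine
echo $paddedNumeric, PHP_EOL;
// Tom & Jerry said &quot;hi&quot; and it&#39;s fine
```

If the XML never contains numeric entities, the original pattern is enough.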
