Editorial Team
Asked: May 27, 2026

Cutting a string containing HTML tags with wordwrap, excluding the tags

Here is what I want to do: I have a string containing HTML tags, and I want to cut it using the wordwrap function, excluding the HTML tags.

I’m stuck:

public function textWrap($string, $width)
{
    $dom = new DOMDocument();
    $dom->loadHTML($string);
    foreach ($dom->getElementsByTagName('*') as $elem)
    {
        foreach ($elem->childNodes as $node)
        {
            if ($node->nodeType === XML_TEXT_NODE)
            {
                $text = trim($node->nodeValue);
                $length = mb_strlen($text);
                $width -= $length;
                if($width <= 0)
                { 
                    // Here, I would like to delete all next nodes
                    // and cut the current nodeValue and finally return the string 
                }
            }
        }
    }
}

I’m not sure I’m doing it the right way at the moment. I hope it’s clear…

EDIT:

Here is an example. I have this text:

    <p>
        <span class="Underline"><span class="Bold">Test to be cut</span></span>
   </p><p>Some text</p>

Let’s say I want to cut it at the 6th character. I would like to get this back:

<p>
    <span class="Underline"><span class="Bold">Test to</span></span>
</p>
1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 11:30 am

    As I wrote in a comment, you first need to find the textual offset at which to make the cut.

    First, I set up a DOMDocument containing the HTML fragment and select the body element, which represents the fragment in the DOM:

    $htmlFragment = <<<HTML
    <p>
            <span class="Underline"><span class="Bold">Test to be cut</span></span>
       </p><p>Some text </p>
    HTML;
    
    $dom = new DOMDocument();
    $dom->loadHTML($htmlFragment);
    $parent = $dom->getElementsByTagName('body')->item(0);
    if (!$parent)
    {
        throw new Exception('Parent element not found.');
    }
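
    The TextRange class itself is not shown in this answer. As a rough sketch of the idea behind it, and assuming it exposes the concatenated text of all text nodes below an element (the XPath expression and variable names here are illustrative, not taken from TextRange), the textual representation of the body could be built like this:

```php
<?php
// Sketch only: join the values of all text nodes below <body>, in document
// order, into one string. This is an assumption about what TextRange
// provides when it is cast to a string, not its actual implementation.
$html = '<p><span class="Underline"><span class="Bold">Test to be cut</span></span></p><p>Some text</p>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$body = $dom->getElementsByTagName('body')->item(0);

$xpath = new DOMXPath($dom);
$text = '';
foreach ($xpath->query('.//text()', $body) as $textNode) {
    $text .= $textNode->nodeValue;
}

echo $text, "\n"; // Test to be cutSome text
```

    An offset found in this string can then be mapped back to a specific text node by walking the same node list and subtracting each node's length.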
    

    Then I use my TextRange class to find the place where the cut needs to be made. The same TextRange then performs the cut and locates the DOMNode that should become the last node of the fragment:

    $range = new TextRange($parent);
    
    // find the position where to cut the HTML's textual representation
    // by looking for a word boundary, or at least matching whitespace,
    // with a regular expression.
    $width = 17;
    $pattern = sprintf('~^.{0,%d}(?<=\S)(?=\s)|^.{0,%1$d}(?=\s)~su', $width);
    $r = preg_match($pattern, $range, $matches);
    if (FALSE === $r)
    {
        throw new Exception('Wordcut regex failed.');
    }
    if (!$r)
    {
        throw new Exception(sprintf('Text "%s" is not cut-able (should not happen).', $range));
    }
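
    To see what the pattern does on its own, here is a small sketch that applies it to a plain string instead of a TextRange. The helper name findCutText is hypothetical and only serves this illustration:

```php
<?php
/**
 * Hypothetical helper: returns the longest prefix of at most $width
 * characters that ends on a non-whitespace character directly followed by
 * whitespace. If no such prefix exists, it falls back to the longest prefix
 * followed by whitespace, and returns null when nothing matches.
 */
function findCutText(string $text, int $width): ?string
{
    $pattern = sprintf('~^.{0,%1$d}(?<=\S)(?=\s)|^.{0,%1$d}(?=\s)~su', $width);
    $result = preg_match($pattern, $text, $matches);
    if ($result === false) {
        throw new RuntimeException('Wordcut regex failed.');
    }
    return $result === 1 ? $matches[0] : null;
}

var_dump(findCutText('Test to be cut', 7)); // string(7) "Test to"
var_dump(findCutText('Test to be cut', 6)); // string(4) "Test"
var_dump(findCutText('Uncuttable', 5));     // NULL
```

    Note that a width falling inside a word backs off to the end of the previous word, and that a text without any whitespace is not cut-able with this pattern.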
    

    This regular expression finds the offset where to cut in the textual representation made available by $range. The regex pattern is inspired by another answer, which discusses it in more detail, and has been slightly modified to fit this answer’s needs.

    // chop-off the textnodes to make a cut in DOM possible
    $range->split($matches[0]);
    $nodes = $range->getNodes();
    $cutPosition = end($nodes);
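
    How TextRange::split() works internally is not shown here. One plausible building block, sketched below as an assumption rather than as the class's actual code, is the DOM's own DOMText::splitText(), which chops a single text node in two at a character offset:

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<p>Test to be cut</p>');
$textNode = $dom->getElementsByTagName('p')->item(0)->firstChild;

// splitText() keeps the first 7 characters in the original node and
// returns a new sibling text node holding the remainder.
$remainder = $textNode->splitText(7);

echo $textNode->nodeValue, "\n";         // Test to
echo '[', $remainder->nodeValue, "]\n";  // [ be cut]
```

    After the split, the original node is the last node to keep and the remainder is simply one more node that follows it.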
    

    Because the range may end up containing no nodes at all (in which case the body will become empty), I need to deal with that special case. Otherwise, as noted in the comment, all following nodes need to be removed:

    // obtain list of elements to remove with xpath
    if (FALSE === $cutPosition)
    {
        // if there is no node, delete all parent children
        $cutPosition = $parent;
        $xpath = 'child::node()';
    }
    else
    {
        $xpath = 'following::node()';
    }
    

    The rest is straightforward: query the XPath, remove the nodes and output the result:

    // execute xpath
    $xp = new DOMXPath($dom);
    $remove = $xp->query($xpath, $cutPosition);
    if (!$remove)
    {
        throw new Exception('XPath query failed to obtain elements to remove');
    }
    
    // remove nodes
    foreach($remove as $node)
    {
        $node->parentNode->removeChild($node);
    }
    
    // inner HTML (PHP >= 5.3.6)
    foreach($parent->childNodes as $node)
    {
        echo $dom->saveHTML($node);
    }
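
    The removal step can also be tried without TextRange. The following sketch uses a simplified fragment and picks the cut position by hand with splitText(), which is an assumption made for illustration only. It then removes everything matched by following::node() and prints the inner HTML of the body:

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<p><b>Test to be cut</b></p><p>Some text</p>');
$body = $dom->getElementsByTagName('body')->item(0);

// Split the first text node so that "Test to" becomes the last node to keep.
$cutPosition = $body->getElementsByTagName('b')->item(0)->firstChild;
$cutPosition->splitText(7);

// following::node() selects every node after the context node in document
// order, excluding the context node's descendants and ancestors, so the
// enclosing <b> and <p> elements survive.
$xp = new DOMXPath($dom);
foreach ($xp->query('following::node()', $cutPosition) as $node) {
    $node->parentNode->removeChild($node);
}

// inner HTML of the body
$result = '';
foreach ($body->childNodes as $node) {
    $result .= $dom->saveHTML($node);
}
echo $result, "\n"; // <p><b>Test to</b></p>
```

    Because the ancestors of the cut position are never part of the following axis, the markup around the kept text stays well-formed.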
    

    The full code example is available on viper codepad, including the TextRange class. The codepad has a bug, so its result is not correct (related: XPath query result order). The actual output is the following:

    <p>
            <span class="Underline"><span class="Bold">Test to</span></span></p>
    

    So take care that you have a current libxml version (normally the case). Also note that the output foreach at the end uses the PHP function saveHTML with a node parameter, which is available since PHP 5.3.6. If you don’t have that PHP version, use an alternative like the one outlined in How to get the xml content of a node as a string? or a similar question.

    If you look closely at my example code, you might notice that the cut length is quite large ($width = 17;). That is because there are many whitespace characters in front of the text. This could be tweaked by making the regular expression drop any amount of whitespace in front of it and/or by trimming the TextRange first. The second option needs more functionality, so I wrote something quick that can be used after creating the initial range:

    ...
    $range = new TextRange($parent);
    $trimmer = new TextRangeTrimmer($range);
    $trimmer->trim();
    ...
    

    That would remove the needless whitespace on left and right inside your HTML fragment. The TextRangeTrimmer code is the following:

    class TextRangeTrimmer
    {
        /**
         * @var TextRange
         */
        private $range;
    
        /**
         * @var array
         */
        private $charlist;
    
        public function __construct(TextRange $range, Array $charlist = NULL)
        {
            $this->range = $range;
            $this->setCharlist($charlist);      
        }
        /**
         * @param array $charlist list of UTF-8 encoded characters
         * @throws InvalidArgumentException
         */
        public function setCharlist(Array $charlist = NULL)
        {
            if (NULL === $charlist)
            {
                $charlist = str_split(" \t\n\r\0\x0B");
            }
    
            $list = array();
    
            foreach($charlist as $char)
            {
                if (!is_string($char))
                {
                    throw new InvalidArgumentException('Not an Array of strings.');
                }
                if (strlen($char))
                {
                    $list[] = $char; 
                }
            }
    
            $this->charlist = array_flip($list);
        }
        /**
         * @return array characters
         */
        public function getCharlist()
        {
            return array_keys($this->charlist);
        }
        public function trim()
        {
            if (!$this->charlist) return;
            $this->ltrim();
            $this->rtrim();
        }
        /**
         * number of consecutive characters of $charlist from $start in $direction
         * 
         * @param array $charlist
         * @param int $start offset
         * @param int $direction 1: forward, -1: backward
         * @throws InvalidArgumentException
         */
        private function lengthOfCharacterSequence(Array $charlist, $start, $direction = 1)
        {
            $start = (int) $start;              
            $direction = max(-1, min(1, $direction));
            if (!$direction) throw new InvalidArgumentException('Direction must be 1 or -1.');
    
            $count = 0;
            for(;$char = $this->range->getCharacter($start), $char !== ''; $start += $direction, $count++)
                if (!isset($charlist[$char])) break;
    
            return $count;
        }
        public function ltrim()
        {
            $count = $this->lengthOfCharacterSequence($this->charlist, 0);
    
            if ($count)
            {
                $remainder = $this->range->split($count);
                foreach($this->range->getNodes() as $textNode)
                {
                    $textNode->parentNode->removeChild($textNode);
                }
                $this->range->setNodes($remainder->getNodes());
            }
    
        }
        public function rtrim()
        {
            $count = $this->lengthOfCharacterSequence($this->charlist, -1, -1);
    
            if ($count)
            {
                $chop = $this->range->split(-$count);
                foreach($chop->getNodes() as $textNode)
                {
                    $textNode->parentNode->removeChild($textNode);
                }
            }
        }
    }
    

    Hope this is helpful.
