Editorial Team
Asked: May 31, 2026

I would like to run a str_replace or preg_replace which looks for certain words


I would like to run a str_replace or preg_replace which looks for certain words (found in $glossary_terms) in my $content and replaces them with links (like <a href="/glossary/initial/term">term</a>).

However, the $content is full HTML and my links/images are being affected too, which isn’t what I’m after.

An example of $content is:

<div id="attachment_542" class="wp-caption alignleft" style="width: 135px"><a href="http://www.seriouslyfish.com/dev/wp-content/uploads/2011/12/Amazonas-English-1.jpg"><img class="size-thumbnail wp-image-542" title="Amazonas English" src="http://www.seriouslyfish.com/dev/wp-content/uploads/2011/12/Amazonas-English-1-288x381.jpg" alt="Amazonas English" width="125" height="165" /></a><p class="wp-caption-text">Amazonas Magazine - now in English!</p></div>
<p>Edited by Hans-Georg Evers, the magazine &#8216;Amazonas&#8217; has been widely-regarded as among the finest regular publications in the hobby since its launch in 2005, an impressive achievment considering it&#8217;s only been published in German to date. The long-awaited English version is just about to launch, and we think a subscription should be top of any serious fishkeeper&#8217;s Xmas list&#8230;</p>
<p>The magazine is published in a bi-monthly basis and the English version launches with the January/February 2012 issue with distributors already organised in the United States, Canada, the United Kingdom, South Africa, Australia, and New Zealand. There are also mobile apps availablen which allow digital subscribers to read on portable devices.</p>
<p>It&#8217;s fair to say that there currently exists no better publication for dedicated hobbyists with each issue featuring cutting-edge articles on fishes, invertebrates, aquatic plants, field trips to tropical destinations plus the latest in husbandry and breeding breakthroughs by expert aquarists, all accompanied by excellent photography throughout.</p>
<p>U.S. residents can subscribe to the printed edition for just $29 USD per year, which also includes a free digital subscription, with the same offer available to Canadian readers for $41 USD or overseas subscribers for $49 USD. Please see the <a href="http://www.amazonasmagazine.com/">Amazonas website</a> for further information and a sample digital issue!</p>
<p>Alternatively, subscribe directly to the print version <a href="https://www.amazonascustomerservice.com/subscribe/index2.php">here</a> or digital version <a href="https://www.amazonascustomerservice.com/subscribe/digital.php">here</a>. Just gonna add this to the end of the post so I can do some testing.</p>

I came across this link, but I wasn’t sure if such a method would work with nested HTML.

Is there any way I can str_replace or preg_replace content within <p> tags only; excluding any nested <a>, <img> or <h1/2/3/4/5> tags?

Thanks in advance,

1 Answer

    Editorial Team
    Added an answer on May 31, 2026 at 1:11 am

    A by-the-book solution would look like this:

    <?php

    $html = "<your HTML string>";
    $glossary_terms = array('fishes', 'invertebrates', 'aquatic plants');

    $dom = new DOMDocument;
    // note: loadHTML() wraps a fragment in <html><body> and assumes
    // ISO-8859-1 unless the markup declares another encoding
    $dom->loadHTML($html);

    dom_link_glossary($dom, $glossary_terms);

    echo $dom->saveHTML();

    // wraps all occurrences of the glossary terms in links
    function dom_link_glossary(DOMDocument $document, array $glossary) {
      $xpath   = new DOMXPath($document);
      $urls    = array();
      $pattern = array();

      // build a normalized lookup (case-insensitive, whitespace-agnostic)
      foreach ($glossary as $term) {
        $term_norm = preg_replace('/\s+/', ' ', strtoupper(trim($term)));
        // quote the term for use between "/" delimiters, then let any run
        // of whitespace stand in for each single space
        $pattern[] = str_replace(' ', '\s+', preg_quote($term_norm, '/'));
        $urls[$term_norm] = '/glossary/initial/' . rawurlencode($term);
      }

      $pattern  = '/\b(' . implode('|', $pattern) . ')\b/i';
      // all text nodes, at any depth, that are not already inside a link
      $text_nodes = $xpath->query('//text()[not(ancestor::a)]');

      foreach ($text_nodes as $original_node) {
        $text     = $original_node->nodeValue;
        $hitcount = preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE);

        if (!$hitcount) continue;  // no match (0) or regex error (false)

        $offset   = 0;
        $parent   = $original_node->parentNode;
        $refnode  = $original_node->nextSibling;  // null means "append at the end"

        $parent->removeChild($original_node);

        foreach ($matches[0] as $i => $match) {
          $term_txt = $match[0];
          $term_pos = $match[1];  // byte offset, consistent with substr()
          $term_norm = preg_replace('/\s+/', ' ', strtoupper($term_txt));

          // insert any text before the term instance
          $prefix = substr($text, $offset, $term_pos - $offset);
          $parent->insertBefore($document->createTextNode($prefix), $refnode);

          // insert the actual term instance as a link; the text goes in via
          // a text node so characters such as "&" are escaped correctly
          $link = $document->createElement("a");
          $link->appendChild($document->createTextNode($term_txt));
          $link->setAttribute("href", $urls[$term_norm]);
          $parent->insertBefore($link, $refnode);

          $offset = $term_pos + strlen($term_txt);

          if ($i == $hitcount - 1) {  // last match, append remaining text
            $suffix = substr($text, $offset);
            $parent->insertBefore($document->createTextNode($suffix), $refnode);
          }
        }
      }
    }
    ?>
    

    Here is how dom_link_glossary() works:

    • It normalizes the glossary terms (trim, uppercase, white-space) and builds a lookup array and a regex pattern that matches all terms.
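    • It uses XPath to find all text nodes that are not already part of a link. Text nodes are returned irrespective of their nesting depth (i.e. no recursion necessary on our part). Attribute values such as alt or title are not text nodes, so images are left alone. Note that the query as written only skips links; text inside headings is still processed. I use \b to prevent partial matches.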
    • For each text node that contains terms:
      • The original text node is deleted ($parent->removeChild())
      • Now new nodes are created and inserted into the DOM: text nodes for anything before (or after) a glossary term, element nodes (<a>) for the actual glossary terms.
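
The XPath step can be narrowed to what the question asks for. The following is a minimal sketch, assuming the goal is "text inside <p> only, skipping links and headings"; glossary_text_nodes() is a hypothetical helper name and is not part of the code above.

```php
<?php
// Hypothetical helper: text nodes inside <p>, skipping links and headings.
function glossary_text_nodes(DOMDocument $document): array {
    $xpath = new DOMXPath($document);
    $query = '//p//text()[not(ancestor::a or ancestor::h1 or ancestor::h2'
           . ' or ancestor::h3 or ancestor::h4 or ancestor::h5)]';
    // copy the result into a plain array before the DOM gets modified
    return iterator_to_array($xpath->query($query));
}

$dom = new DOMDocument;
$dom->loadHTML('<h2>Aquatic plants</h2>'
    . '<p>Fishes and <a href="/x">aquatic plants</a> in <em>aquatic plants</em>.</p>');

$found = array();
foreach (glossary_text_nodes($dom) as $node) {
    $found[] = $node->nodeValue;
}
print_r($found);
```

The heading text and the existing link text are not returned, while the text inside <em> still is, because it is nested in the paragraph. Swapping this query into dom_link_glossary() should be enough to restrict the replacement accordingly.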

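    The solution preserves original case and white space in the link text, while the URL is always built from the term as it is spelled in the glossary. Assuming the glossary contains "term" and "foo bar":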

    • term will become <a href="/glossary/initial/term">term</a>
    • Term will become <a href="/glossary/initial/term">Term</a>
    • Foo Bar will become <a href="/glossary/initial/foo%20bar">Foo Bar</a>. Surplus whitespace or line breaks in the HTML will not break the mechanism.
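
A small sketch of the normalization, pulled out of the function for illustration. normalize_term() is a hypothetical name; the logic mirrors the preg_replace/strtoupper calls above, and the glossary entry "foo bar" is an assumed example.

```php
<?php
// Hypothetical standalone version of the normalization used for lookup keys.
function normalize_term(string $text): string {
    // collapse every run of whitespace to one space, then uppercase
    return preg_replace('/\s+/', ' ', strtoupper(trim($text)));
}

$term    = 'foo bar';  // as spelled in the glossary (assumed example)
$pattern = '/\b('
         . str_replace(' ', '\s+', preg_quote(normalize_term($term), '/'))
         . ')\b/i';

$text = "See Foo\n   Bar for details.";
preg_match($pattern, $text, $m);

$matched = $m[1];                     // text exactly as it appears in the source
$key     = normalize_term($matched);  // same key as the glossary entry
$url     = '/glossary/initial/' . rawurlencode($term);
```

The matched text keeps its capital letters and its line break, so the link text is unchanged, while the normalized key is what ties the match back to the glossary entry and its URL.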

    Note that it is perfectly all right to use regex on the plain text node values. It is not okay to use regex on full HTML, because a pattern cannot tell element content apart from tag names and attribute values.
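
To illustrate the difference, here is a short sketch with a made-up snippet (the markup and the link are assumptions chosen for the demonstration). A plain str_replace() over the raw HTML also rewrites the alt attribute, whereas looking only at text nodes finds the single occurrence that should be linked.

```php
<?php
$html = '<p><img src="a.jpg" alt="aquatic plants"> Tall aquatic plants.</p>';
$link = '<a href="/glossary/initial/aquatic%20plants">aquatic plants</a>';

// naive replacement over the raw markup: the alt attribute is rewritten too,
// which leaves a tag inside an attribute value
$broken      = str_replace('aquatic plants', $link, $html);
$naive_links = substr_count($broken, '<a href=');

// counting occurrences in text nodes only: attributes are never visited
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$text_hits = 0;
foreach ($xpath->query('//text()') as $node) {
    $text_hits += substr_count($node->nodeValue, 'aquatic plants');
}
```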

    I would recommend pairing the glossary terms with their respective URLs in an array, instead of calculating the URLs in the function. That way you can make multiple terms point to the same URL.
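
One way this pairing could look, as a sketch: the map below and the build_glossary_lookup() helper are hypothetical, and the URLs are placeholders. The function returns the same two pieces dom_link_glossary() builds internally, a combined pattern and a normalized term-to-URL lookup.

```php
<?php
// Hypothetical glossary map: several terms may share one URL.
$glossary = array(
    'fish'           => '/glossary/f/fish',
    'fishes'         => '/glossary/f/fish',
    'aquatic plants' => '/glossary/a/aquatic-plants',
);

// Returns array($pattern, $urls) built from a term => URL map.
function build_glossary_lookup(array $glossary): array {
    $urls = array();
    $alternatives = array();
    foreach ($glossary as $term => $url) {
        $norm = preg_replace('/\s+/', ' ', strtoupper(trim($term)));
        $urls[$norm] = $url;
        $alternatives[] = str_replace(' ', '\s+', preg_quote($norm, '/'));
    }
    return array('/\b(' . implode('|', $alternatives) . ')\b/i', $urls);
}

list($pattern, $urls) = build_glossary_lookup($glossary);

preg_match_all($pattern, 'Fishes need aquatic  plants; one fish is lonely.', $m);

$targets = array();
foreach ($m[0] as $hit) {
    // normalize the hit the same way as the keys, then look up its URL
    $targets[] = $urls[preg_replace('/\s+/', ' ', strtoupper($hit))];
}
print_r($targets);
```

Here "Fishes" and "fish" resolve to the same URL, which is not possible when the URL is derived from the term itself.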
