The Archive Base

Editorial Team
Asked: June 12, 2026

Possible Duplicate:
How to parse and process HTML with PHP?

I need to parse blocks of HTML, replacing some hrefs with the link description based on whether the description meets certain criteria.

The regex I’m using to identify specific strings is used elsewhere in my application:

$regex  = "/\b[FfGg][\.][\s][0-9]{1,4}\b/";
preg_match_all($regex, $html, $matches, PREG_SET_ORDER);

I’m using the following SO question as a starting point for extracting href descriptions:

Replacing html link tags with a text description

The idea is to convert any link having a “FfGg.xxxx” type identifier to plain text, and leave the rest intact (i.e., the Google link).

What I have so far is:

    $html = 'Ten reports <a href="http://google.com">Google!</a> on 14 mice with ABCD 
show that low plasma BCAA, particularly ABC and to a lesser extent DEF, can result in 
severe but reversible epithelial damage to the skin, eye and gastrointestinal tract.
</li><li>Symptoms were reported in conjunction with low plasma ABC levels in 9 case 
reports. In two case reports, ABC levels were between 1.9 and 48 µmol/L (<a 
href="/docpage.php?obscure==100" target="F.100">F.100</a>, <a 
href="/docpage.php?obscure==68" target="F.68">F.68</a>, <a href="/docpage.php?obscure==67" 
target="F.67">F.67</a>, <a href="/docpage.php?obscure==71" target="F.71">F.71</a>, <a 
href="/docpage.php?obscure==122" target="F.122">F.122</a>, <a 
href="/docpage.php?obscure==92" target="F.92">F.92</a>, <a href="/docpage.php?obscure==96" 
target="F.96">F.96</a>);';

This converts all links, including the Google link:

$html = preg_replace("/<a.*?href=\"(.*?)\".*?>(.*?)<\/a>/i", "$2", $html);

This returns a blank HTML string:

$html = preg_replace("/<a.*?href=\"(.*?)\".*?>[FfGg][\.][\s][0-9]{1,4}<\/a>/i", "$2", $html);

I believe the problem is in how I’m embedding this regex in the second (non-working) example above:

[FfGg][\.][\s][0-9]{1,4}

What is the correct way of embedding the FfGg expression in HTML found in my preg_replace example above?

1 Answer

  1. Editorial Team
    Added an answer on June 12, 2026 at 7:44 am

    Here is the DOM (correct) way to do it:

    EDIT: Improved regex

    <?php
    
        $html = 'Ten reports <a href="http://google.com">Google!</a> on 14 mice with ABCD show that low plasma BCAA, particularly ABC and to a lesser extent DEF, can result in severe but reversible epithelial damage to the skin, eye and gastrointestinal tract.</li><li>Symptoms were reported in conjunction with low plasma ABC levels in 9 case reports. In two case reports, ABC levels were between 1.9 and 48 µmol/L (<a href="/docpage.php?obscure==100" target="F.100">F.100</a>, <a href="/docpage.php?obscure==68" target="F.68">F.68</a>, <a href="/docpage.php?obscure==67" target="F.67">F.67</a>, <a href="/docpage.php?obscure==71" target="F.71">F.71</a>, <a href="/docpage.php?obscure==122" target="F.122">F.122</a>, <a href="/docpage.php?obscure==92" target="F.92">F.92</a>, <a href="/docpage.php?obscure==96" target="F.96">F.96</a>);';
    
        // Create a new DOMDocument and load the HTML string
        $dom = new DOMDocument('1.0');
        $dom->loadHTML($html);
    
        // Create an XPath object for this DOMDocument
        $xpath = new DOMXPath($dom);
    
        // Loop over all <a> elements in the document
        // Ideally we would combine the regex into the XPath query, but XPath 1.0
        // doesn't support it
        foreach ($xpath->query('//a') as $anchor) {
            // See if the link matches the pattern
            if (preg_match('/^\s*[gf]\s*\.\s*\d{1,4}\s*$/i', $anchor->nodeValue)) {
                // If it does, convert it to a text node (effectively, un-linkify it)
                $textNode = $dom->createTextNode($anchor->nodeValue);
                $anchor->parentNode->replaceChild($textNode, $anchor);
            }
        }
    
        // Because you are working with partial HTML string, I extract just that
        // string. If you are actually working with a full document, you can
        // replace all the code below this comment with simply:
        // $result = $dom->saveHTML();
    
        // A string to hold the result
        $result = '';
    
        // Iterate all elements that are a direct child of the <body> and convert
        // them to strings
        foreach ($xpath->query('/html/body/*') as $node) {
            $result .= $node->C14N();
        }
    
        // $result now contains the modified HTML string
    

    Note: the error message you see when running this is because the HTML string you supplied is not valid.
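
    On the literal question of how to embed the identifier pattern: two things in the question's second preg_replace stand out. The inner pattern uses [\s], which requires a whitespace character between the dot and the digits, and "F.100" has none. The replacement also uses $2, but that pattern defines only one capture group. The following is a minimal regex-only sketch, not a substitute for the DOM approach. It assumes that the link text is exactly the identifier and that no attribute value contains a ">" character. The shortened $html sample and the variable names are illustrative.

```php
<?php

// Illustrative sample: one ordinary link and two identifier links.
$html = '<a href="http://google.com">Google!</a> and '
      . '<a href="/docpage.php?obscure==100" target="F.100">F.100</a>, '
      . '<a href="/docpage.php?obscure==68" target="F.68">F.68</a>';

// The question's inner pattern demands whitespace after the dot,
// so it cannot match "F.100"; making the whitespace optional fixes that.
$strict  = preg_match('/^[FfGg][\.][\s][0-9]{1,4}$/', 'F.100'); // 0
$relaxed = preg_match('/^[FfGg]\.\s*[0-9]{1,4}$/', 'F.100');    // 1

// [^>]* keeps the match inside a single opening tag, and the identifier
// sits in its own capture group so that $1 refers to the link text.
$pattern = '/<a\b[^>]*>\s*([FfGg]\.\s*[0-9]{1,4})\s*<\/a>/';
$result  = preg_replace($pattern, '$1', $html);

echo $result, PHP_EOL;
// prints: <a href="http://google.com">Google!</a> and F.100, F.68
```

    The Google link survives because its text does not match the identifier pattern. Once the markup is less regular than this sample (nested tags inside the link, unusual attribute quoting), the DOM version above is the safer choice.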

© 2021 The Archive Base. All Rights Reserved