The Archive Base

Editorial Team
Asked: May 18, 2026 (2026-05-18T23:59:18+00:00)

I need help doing a few things with XPath in PHP.

With any given HTML, I need to:

  • Remove all tables and their contents
  • Remove everything after the first h1 tag
  • Keep only paragraphs (INCLUDING their inner HTML (links, lists, etc))

With regex, I got everything working perfectly. When I encountered nested tables, however, I decided that it is indeed foolish to parse HTML with regex.

Thanks so much!


1 Answer

  1. Editorial Team
    Added an answer on May 18, 2026 at 11:59 pm

    With any given HTML, I need to:

    • Remove all tables and their contents
    • Remove everything after the first h1 tag
    • Keep only paragraphs (INCLUDING their inner HTML (links, lists, etc))

    This can be done very easily with XSLT:

    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:h="http://www.w3.org/1999/xhtml" >
     <xsl:output omit-xml-declaration="yes" indent="yes"/>
     <xsl:strip-space elements="*"/>
    
     <!-- Copy every node except when overridden
          by another template -->
     <xsl:template match="node()|@*">
      <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
      </xsl:copy>
     </xsl:template>
    
     <!-- Remove all tables and their contents.
          The explicit priority keeps this rule ahead of the
          predicate-based rules below (default priority 0.5),
          so paragraphs inside table cells are dropped too -->
     <xsl:template match="h:table" priority="3"/>
    
     <!-- Remove everything after the first h1 -->
     <xsl:template match="node()[preceding::h:h1]" priority="2"/>
    
     <!-- Keep only paragraphs (INCLUDING
          their inner HTML (links, lists, etc))
      -->
     <xsl:template priority="1" match=
     "node()[not(self::h:p) and not(ancestor::h:p)]">
      <xsl:apply-templates/>
     </xsl:template>
    </xsl:stylesheet>
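
Since the question is about PHP, here is a minimal sketch of how a stylesheet like this could be applied there. It assumes the `dom` and `xsl` extensions are enabled and that the input is well-formed XHTML. The constant `PARAGRAPH_XSL`, the function name `keepParagraphsXslt`, and the sample markup are made up for illustration. The stylesheet is inlined in compact form, with explicit `priority` attributes added as an assumption so that the table rule wins however the processor resolves template conflicts.

```php
<?php
// Compact copy of the stylesheet, kept in a nowdoc so nothing is interpolated.
const PARAGRAPH_XSL = <<<'XSL'
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:h="http://www.w3.org/1999/xhtml">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>
 <xsl:template match="node()|@*">
  <xsl:copy><xsl:apply-templates select="node()|@*"/></xsl:copy>
 </xsl:template>
 <xsl:template match="h:table" priority="3"/>
 <xsl:template match="node()[preceding::h:h1]" priority="2"/>
 <xsl:template match="node()[not(self::h:p) and not(ancestor::h:p)]" priority="1">
  <xsl:apply-templates/>
 </xsl:template>
</xsl:stylesheet>
XSL;

// Hypothetical helper: runs the stylesheet over an XHTML string.
function keepParagraphsXslt(string $xhtml): string
{
    $source = new DOMDocument();
    if (!$source->loadXML($xhtml)) {
        throw new InvalidArgumentException('Input is not well-formed XML');
    }

    $style = new DOMDocument();
    $style->loadXML(PARAGRAPH_XSL);

    $processor = new XSLTProcessor();
    $processor->importStylesheet($style);

    // transformToXml() returns the serialized result, or false/null on failure.
    $result = $processor->transformToXml($source);
    if (!is_string($result)) {
        throw new RuntimeException('Transformation failed');
    }
    return $result;
}

// Sample input (an assumption, not taken from the question).
$sample = '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
        . '<p>Intro with a <a href="#x">link</a>.</p>'
        . '<table><tr><td><p>cell</p></td></tr></table>'
        . '<h1>Title</h1>'
        . '<p>After the heading.</p>'
        . '</body></html>';

echo keepParagraphsXslt($sample);
```

With this sample, only the first paragraph (including its link) should survive. The paragraph inside the table cell and the one after the h1 are dropped.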
    

    If your element names are not in the XHTML namespace, simply delete every occurrence of the h: prefix in the code above.
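
The no-namespace case is what you typically get in PHP when the markup is loaded with `DOMDocument::loadHTML()`, and there the same three rules can also be approximated with plain XPath, which is what the question asked for. This is only a sketch, not the XSLT answer's method. The function name `keepParagraphsXPath` and the sample markup are hypothetical, and it assumes input that the HTML parser reads correctly with its default encoding.

```php
<?php
// Hypothetical helper: returns the paragraphs that survive the three rules.
function keepParagraphsXPath(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings for tag soup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Rule 1: remove all tables and their contents (nested tables included).
    foreach (iterator_to_array($xpath->query('//table')) as $table) {
        if ($table->parentNode !== null) {
            $table->parentNode->removeChild($table);
        }
    }

    // Rules 2 and 3: keep only paragraphs with no h1 before them in
    // document order, serialized together with their inner markup.
    $out = '';
    foreach ($xpath->query('//p[not(preceding::h1)]') as $p) {
        $out .= $doc->saveHTML($p) . "\n";
    }
    return $out;
}

// Sample input (an assumption, not taken from the question).
$sample = '<div><p>Intro with a <a href="#x">link</a>.</p>'
        . '<table><tr><td><table><tr><td><p>nested cell</p></td></tr></table></td></tr></table>'
        . '<h1>Title</h1>'
        . '<p>After the heading.</p></div>';

echo keepParagraphsXPath($sample);
```

One difference from the stylesheet: the tables are removed before the h1 test runs, so an h1 inside a table no longer counts as "the first h1" here.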

