The Archive Base

Editorial Team
Asked: May 16, 2026

This is driving me nuts! A little piece of code that I can’t seem


This is driving me nuts! A little piece of code that I can’t seem to debug 🙁 Basically, I have an HTML file in a string and I want to capture everything from X up to the next X (same value) if there is another one; if there isn’t, I want to capture from X to the end of the file.

The code that doesn’t work:

$contents = '<div id="main" class="clearfix">    <div id="col-1"><div id="content"><div id="p19601634"><h1><span id="ppt19601634">';
$regex = '!<div id="content">(.*?)(?:<div id="content">)!s';
preg_match_all($regex, $contents, $matches);

Please note that the real HTML also contains new lines and tabs (for example, there is a line break right after the first DIV), so the match has to work across them.

Right now, my code works if it finds several occurrences of my search string, and it returns the text between them. But if only one occurrence is found, it doesn’t return anything.

Does anyone know how to fix this?

Thanks a bunch


1 Answer

  1. Editorial Team
    Added an answer on May 16, 2026 at 10:26 am

    Regular expressions are not, and never will be, the right tool for this job, and the premise “I have to use regular expressions” is not true. Computer science theory explains why: regular expressions can only match regular languages, while HTML (or XML) lets elements nest to arbitrary depth, which puts it outside that class.
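
To make the nesting problem concrete, here is a minimal sketch. The sample markup and variable names are made up for illustration and are not taken from the question. A non-greedy pattern stops at the first closing tag it meets, so it cuts a nested element in half:

```php
<?php
// Hypothetical sample: one outer div that contains a nested div.
$html = '<div id="content"><div class="inner">a</div>b</div>';

// Non-greedy match from the opening tag to the nearest closing </div>.
preg_match('!<div id="content">(.*?)</div>!s', $html, $m);

// The capture ends at the inner </div>, not at the one that
// actually closes <div id="content">.
echo $m[1], "\n"; // prints: <div class="inner">a
```

The pattern has no way to count how many divs are open, which is exactly what a parser keeps track of.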

    Besides the DOM approach mentioned in @meder’s answer, another option is XSLTProcessor. Like regular expressions, XSLT is a declarative pattern-matching language, but unlike them it can match the hierarchical structure of XHTML or XML.
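
As a rough sketch of the DOM route (not necessarily what the other answer proposes), PHP's DOMDocument and DOMXPath can select the element by its id and serialize its children. The sample string reuses the ids from the question, but the closing tags and the "Title" text are assumptions added so the fragment is complete:

```php
<?php
// Assumed sample input: the question's markup with closing tags
// and placeholder text added for illustration.
$html = '<div id="main" class="clearfix"><div id="col-1"><div id="content">'
      . '<div id="p19601634"><h1><span id="ppt19601634">Title</span></h1></div>'
      . '</div></div></div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

// Select every <div id="content">, however many there are.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//div[@id="content"]');

$fragments = array();
foreach ($nodes as $node) {
    // Serialize the children to get the element's inner HTML.
    $inner = '';
    foreach ($node->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    $fragments[] = $inner;
}

print_r($fragments);
```

Because the query returns one node per matching element, the "only one occurrence" case needs no special handling: the list simply has one entry.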

    See the answers to “Simple XML parsing on PHP” for more solutions, including an example of XSLTProcessor in my answer.
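
For orientation, a minimal XSLTProcessor sketch might look like the following. It is not the example from the linked answer. It assumes the input is well-formed XHTML/XML and that PHP's xsl extension is enabled, and the sample document and stylesheet are hypothetical:

```php
<?php
// Hypothetical well-formed input; real-world HTML may need cleaning first.
$xml = '<html><body><div id="main"><div id="content"><h1>Title</h1></div></div></body></html>';

// Stylesheet that copies the children of <div id="content"> to the output.
$xsl = <<<'XSL'
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>
  <xsl:template match="/">
    <xsl:copy-of select="//div[@id='content']/node()"/>
  </xsl:template>
</xsl:stylesheet>
XSL;

$source = new DOMDocument();
$source->loadXML($xml);

$stylesheet = new DOMDocument();
$stylesheet->loadXML($xsl);

$processor = new XSLTProcessor();
$processor->importStylesheet($stylesheet);

// transformToXml() returns the result as a string.
$result = $processor->transformToXml($source);
echo $result;
```

The selection logic lives in the `select` expression, so matching a different element only means changing the XPath, not the PHP code.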

    If you want to learn all about HTML scraping techniques in PHP, there’s a book on the subject by Matthew Turland, titled php|architect’s Guide to Web Scraping with PHP. It’s available in digital form now, and should be in print soon.

    If you can pry yourself away from PHP for a moment, try a package called Beautiful Soup. This package has one huge advantage: unlike DOM/XSLT parsers, Beautiful Soup doesn’t choke if you direct it to parse an HTML page that has some bad markup. Since most web sites you will be scraping probably contain some mistakes, this is a pretty important advantage.

© 2021 The Archive Base. All Rights Reserved