The Archive Base

Editorial Team
Asked: May 26, 2026

It’s my first post on the site, so bear with me.

OK, so I’m a complete beginner with PHP and I have a specific need for it in my project. I’m hoping some of you could help!

Basically, I want to scrape a webpage and access a certain HTML table and its information. I need to parse out this info and format it into a desired result.

So, where to begin... Here’s the PHP I have written so far:

<?php

$url = "http://www.goldenplec.com/festivals/oxegen-2/oxegen-2011";
$raw = file_get_contents($url);

$newlines = array("\t","\n","\r","\x20\x20","\0","\x0B");
$content = str_replace($newlines, "", html_entity_decode($raw));

$start = strpos($content,'<table style="background: #FFF; font-size: 13px;"');
$end = strpos($content,'</table>',$start) + 8;

$table = substr($content,$start,$end-$start);

echo $table;


/* Regex here to echo the desired result */


?>

That URL contains the table I need. My code will simply echo that exact table.

However, and here’s my problem: I’m by no means a regex expert, and I need to display the data from the table in a certain format. I want to echo an XML file containing a number of SQL INSERT statements, as follows:

$xml_output .= "<statement>INSERT INTO timetable VALUES(1,'Black Eyed Peas','Main Stage','Friday', '23:15')</statement>";
$xml_output .= "<statement>INSERT INTO timetable VALUES(2,'Swedish House Mafia','Vodafone Stage','Friday', '23:30')</statement>";
$xml_output .= "<statement>INSERT INTO timetable VALUES(3,'Foo Fighters','Main Stage','Saturday', '23:25')</statement>";
$xml_output .= "<statement>INSERT INTO timetable VALUES(4,'Deadmau5','Vodafone Stage','Saturday', '23:05')</statement>";
$xml_output .= "<statement>INSERT INTO timetable VALUES(5,'Coldplay','Main Stage','Sunday', '22:25')</statement>";
$xml_output .= "<statement>INSERT INTO timetable VALUES(6,'Pendalum','Vodafone Stage','Sunday', '22:15')</statement>";

I hope I have provided enough info and I would greatly appreciate any help from you kind folk.

Thanks in advance.

1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 3:17 pm

    You’re much better off using something like XPath when scraping. I fetch all <td> elements and rely on the fact that the venue is always UPPERCASE to pick out the venue cells. The cell list also contains the day names and some blank cells, so I skip over those. I identify the start of the acts section by checking for ":", which denotes a time. Given that the event lasts 3 days and the data interleaves the acts for each day, I just increment the day and wrap it around once it passes the last day of the event.

    There may be some character-encoding issues going on here, but I didn’t feel like meddling with that too much. There are possibly more elegant solutions out there.
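
On the encoding point, one commonly used workaround is to tell libxml that the input is UTF-8 before parsing, because `DOMDocument::loadHTML` typically falls back to ISO-8859-1 when the markup carries no charset declaration. This is a sketch under assumptions, not something the answer above tested: the helper name `loadUtf8Html` and the sample markup are hypothetical, and it assumes the page really is UTF-8.

```php
<?php

// Hypothetical helper: parse an HTML string while hinting that it is UTF-8.
function loadUtf8Html(string $html): DOMDocument
{
    // Collect libxml warnings instead of printing them.
    libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    // The XML declaration prefix tells libxml which encoding to assume.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);

    libxml_clear_errors();
    return $dom;
}

// Made-up sample markup with a non-ASCII act name.
$sample = '<table><tr><td>22:15</td><td>Beyoncé</td></tr></table>';

$dom = loadUtf8Html($sample);
$xpath = new DOMXPath($dom);
$cells = $xpath->query('//table//td');

// Print the second cell (the act name) to check that it survives parsing.
echo $cells->item(1)->nodeValue, "\n";
```

If the page turns out to use a different encoding, converting the raw string to UTF-8 first (for example with `mb_convert_encoding`) would be the step to add before calling the helper.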

    Edit: I just noticed that not all acts are exactly interleaved across the 3 days, so getting the day of the event is more difficult. The code below will not give accurate days for every act, mainly “Little Green Cars” and “Touchwood”.

    Edit2: The code is now updated and should parse all acts properly with the correct day. The offending days that have nothing scheduled are represented by two empty strings (""). We can detect these and increment our $day counter.

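    <?php

    // Collect warnings that libxml raises for malformed real-world HTML.
    libxml_use_internal_errors(true);

    // Local copy of the lineup page; the live URL can be used instead.
    $url = "lineup2011.html";
    $rawHTML = file_get_contents($url);

    $dom = new DOMDocument();
    $dom->loadHTML($rawHTML);

    $xpath = new DOMXPath($dom);

    // Every table cell, in document order.
    $nodeList = $xpath->query("//table//td");

    $nodeCount = 0;
    $venue = "";
    $day = 0;
    $acts = array();

    while ($nodeCount < $nodeList->length) {
        $value = $nodeList->item($nodeCount)->nodeValue;

        // Guard against reading past the final cell.
        $next = $nodeList->item($nodeCount + 1);
        $nextValue = $next ? $next->nodeValue : "";

        // An uppercase, non-empty cell without ":" is a venue heading.
        if (isUpper($value) && strpos($value, ":") === false && $value != "") {
            $venue = $value;
            // Skip the venue cell plus the day names and blank cells after it.
            $nodeCount += 7;
            $day = 0;
            continue;
        }

        // Two empty cells in a row: nothing is scheduled for that day.
        if ($value == "" && $nextValue == "") {
            $day++;
            $nodeCount += 2;
            continue;
        }

        // Otherwise the pair of cells is (time, act name).
        $act = array();
        $act['time'] = $value;
        $act['name'] = $nextValue;
        $act['venue'] = $venue;
        $act['day'] = $day % 3; // 0, 1 or 2: index of the festival day

        $day++;

        $acts[] = $act;
        $nodeCount += 2;
    }

    print_r($acts);

    // True when the string contains no lowercase letters.
    function isUpper($str) {
        return (strtoupper($str) == $str);
    }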
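
To get from `$acts` to the output the question asks for, something like the following sketch could work. The helper name `buildStatements`, the day-name mapping (0 = Friday, 1 = Saturday, 2 = Sunday, assumed from the three-day layout), the title-casing of the uppercase venue, and the `<statements>` wrapper element are all assumptions. The sample acts are copied from the question's expected output, not scraped.

```php
<?php

// Hypothetical helper: turn an array of acts into <statement> elements.
function buildStatements(array $acts): string
{
    // Assumed mapping from the 0-based day index to day names.
    $dayNames = array('Friday', 'Saturday', 'Sunday');

    $xml = "<statements>\n";
    $id = 1;

    foreach ($acts as $act) {
        // Double single quotes so names containing an apostrophe stay valid SQL.
        $name = str_replace("'", "''", trim($act['name']));
        // Venues are scraped in UPPERCASE; title-case them as in the question.
        $venue = str_replace("'", "''", ucwords(strtolower(trim($act['venue']))));

        $sql = sprintf(
            "INSERT INTO timetable VALUES(%d,'%s','%s','%s', '%s')",
            $id++,
            $name,
            $venue,
            $dayNames[$act['day']],
            trim($act['time'])
        );

        // Escape &, < and > for XML; quotes are left alone in element text.
        $xml .= '<statement>'
            . htmlspecialchars($sql, ENT_XML1 | ENT_NOQUOTES, 'UTF-8')
            . "</statement>\n";
    }

    return $xml . "</statements>\n";
}

// Sample data in the same shape as the $acts array built above.
$sampleActs = array(
    array('time' => '23:15', 'name' => 'Black Eyed Peas', 'venue' => 'MAIN STAGE', 'day' => 0),
    array('time' => '23:25', 'name' => 'Foo Fighters', 'venue' => 'MAIN STAGE', 'day' => 1),
);

echo buildStatements($sampleActs);
```

Generating SQL as text is what the question asks for, but if the rows are ever inserted directly from PHP, prepared statements would be the safer route than string-built queries.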
    