The Archive Base

Editorial Team
Asked: May 27, 2026

I am using a combination of XMLReader and SimpleXML to parse the posts in a WordPress export file. I realize this is a little out of the norm, but it's more of a backup project, so that we can easily pull up one of these articles if we need it in the future. The WP site that they were on needs to come down.

The issue I am having is that some of the nodes in the XML file are empty or contain useless values (i.e., not full posts). I need to add some string length conditions, but I’m not sure how to check each one.

<?php

$path_to_xml_file = 'compress.zlib://wordpress.2011.xml.gz';

$reader = new XMLReader();
$reader->open($path_to_xml_file);
while ($reader->read()) {
    if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'item') {
        $doc = new DOMDocument('1.0', 'UTF-8');
        $xml = simplexml_import_dom($doc->importNode($reader->expand(), true));
        //echo $xml->title; //or whatever

        // Take care of the articles
        $newcontent = $xml->children('http://purl.org/rss/1.0/modules/content/');
        $contentString = $newcontent->encoded;
        $titleString = $xml->title;

        echo '
    <div class="article-container" id="article-' .  $xml->title . '">
    <a href="#top" class="top-link">Back to the Top</a>
        <h2>' .  $xml->title . '</h2>
        <div class="articles">' . $newcontent->encoded . '</div>
    </div>';
    }
}

?>

I was able to check this successfully with SimpleXML alone, but it was too much of a memory hog by itself. This was my SimpleXML code:

<?php

$url = 'wordpress.2011.xml.gz';
$xml = new SimpleXMLElement("compress.zlib://$url", NULL, TRUE);

foreach ($xml->item as $item) {
    $newcontent = $item->children('http://purl.org/rss/1.0/modules/content/');
    $contentString = $newcontent->encoded;
    $titleString = $item->title;

    // Skip items whose content or title is too short to be a full post
    if ((strlen($contentString) < 13) || (strlen($titleString) < 5)) {
        echo '';
    } else {
        echo '
    <div class="article-container" id="article-' .  $item->title . '">
    <a href="#top" class="top-link">Back to the Top</a>
        <h2>' .  $item->title . '</h2>
        <div class="articles">' . $newcontent->encoded . '</div>
    </div>';
    }
}

?>

UPDATE

With Francis’ help, it is working now. Here is the code:

<?php 

$path_to_xml_file = 'compress.zlib://wordpress.2011.xml.gz';

$reader = new XMLReader();
$reader->open($path_to_xml_file);
$contentNS = 'http://purl.org/rss/1.0/modules/content/';
while($reader->read()) {
    if($reader->nodeType == XMLReader::ELEMENT and $reader->name == 'item') {
        $doc = new DOMDocument('1.0','UTF-8');
        $xml = simplexml_import_dom($doc->importNode($reader->expand(), true));
        $titleString = (string) $xml->title;
        $contentString = (string) $xml->children($contentNS)->encoded;
        if (strlen($contentString) > 12 and strlen($titleString) > 4)  {
            // Be careful with your output escaping!
            // This below looks like it might be wrong:
            // - $titleString for an ID (use slug)
            // - $titleString not escaped
            // - $contentString should be escaped? not sure here.
            // Have you considered using XMLWriter()?
            echo '
<div class="article-container" id="article-' .  $titleString . '">
    <a href="#top" class="top-link">Back to the Top</a>
    <h2>' .  $titleString . '</h2>
    <div class="articles">' . $contentString . '</div>
</div>';
        }

        $reader->next(); //skip the subtrees, go to next item sibling
        // we already expand()ed this so we don't need to walk it.
    }
}

?>
1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 1:25 pm

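    When you write $contentString = $newcontent->encoded, the type of $contentString is not string but SimpleXMLElement. strlen() is then handed an object rather than a string: it relies on PHP converting the object implicitly, and under declare(strict_types=1) it throws a TypeError.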

    You need to explicitly cast SimpleXMLElements to string to get the text value of the element:

    $contentString = (string) $newcontent->encoded;
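
To see the difference the cast makes, here is a minimal sketch. The one-item feed is made up and inlined purely for illustration; only the namespace URI and the element names come from your code.

```php
<?php
// Hypothetical one-item feed, inlined so the example is self-contained.
$feed = <<<'XML'
<item xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <title>Hello world</title>
  <content:encoded><![CDATA[<p>A full post body.</p>]]></content:encoded>
</item>
XML;

$item = simplexml_load_string($feed);
$encoded = $item->children('http://purl.org/rss/1.0/modules/content/')->encoded;

// Without a cast, the property access yields an object, not a string.
$isObject = $encoded instanceof SimpleXMLElement;

// With a cast, we get the element's text, and strlen() measures that text.
$contentString = (string) $encoded;
$titleString   = (string) $item->title;

var_dump($isObject);
var_dump($contentString);
var_dump(strlen($titleString));
```

Casting once, right where the value is read, means every later strlen() call or comparison works on a plain string.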
    

    As an aside, you can simplify your DOM expansion and conversion to SimpleXMLElement by using the optional argument to XMLReader::expand():

    $sxe = simplexml_import_dom($reader->expand(new DOMDocument('1.0','UTF-8')));
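
As a rough sketch of how that one-liner can sit inside a streaming loop: the helper name eachItem() and the inline feed below are my own inventions for illustration, not part of your code. I have also swapped in XMLReader::next('item') to move between items, since mixing read() and next() in one loop could skip an item when two <item> elements follow each other with no whitespace in between.

```php
<?php
// Hypothetical helper: yields each <item> element as a SimpleXMLElement.
function eachItem(XMLReader $reader): Generator
{
    // Advance to the first <item> element, if there is one.
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item') {
            break;
        }
    }

    while ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item') {
        // Passing a DOMDocument makes expand() create the node inside it,
        // so no separate importNode() call is needed.
        $node = $reader->expand(new DOMDocument('1.0', 'UTF-8'));
        yield simplexml_import_dom($node);

        // Skip the subtree we just expanded and move to the next <item>.
        if (!$reader->next('item')) {
            break;
        }
    }
}

// Made-up feed with two adjacent items and no whitespace between them.
$xml = '<rss><channel><item><title>First post</title></item>'
     . '<item><title>Second post</title></item></channel></rss>';

$reader = new XMLReader();
$reader->XML($xml);

$titles = [];
foreach (eachItem($reader) as $item) {
    $titles[] = (string) $item->title;
}
$reader->close();

print_r($titles);
```

For the real export you would call $reader->open($path_to_xml_file) instead of $reader->XML(...); the rest should stay the same, though I have not run it against a WordPress file.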
    

    EDIT: here is a complete example of your first code block, rewritten to do what you want (I think?). As you can see, all I did was take the inner loop from your second code example and put it in the inner loop of your first code example.

    $reader = new XMLReader();
    $reader->open($path_to_xml_file);
    $contentNS = 'http://purl.org/rss/1.0/modules/content/';
    while($reader->read()) {
        if($reader->nodeType == XMLReader::ELEMENT and $reader->name == 'item') {
            $xml = simplexml_import_dom($reader->expand(new DOMDocument('1.0', 'UTF-8')));
            $titleString = (string) $xml->title;
            $contentString = (string) $xml->children($contentNS)->encoded;
            if (strlen($contentString) > 12 and strlen($titleString) > 4)  {
                // Be careful with your output escaping!
                // This below looks like it might be wrong:
                // - $titleString for an ID (use slug)
                // - $titleString not escaped
                // - $contentString should be escaped? not sure here.
                // Have you considered using XMLWriter()?
                echo '
    <div class="article-container" id="article-' .  $titleString . '">
        <a href="#top" class="top-link">Back to the Top</a>
        <h2>' .  $titleString . '</h2>
        <div class="articles">' . $contentString . '</div>
    </div>';
            }
            $reader->next(); //skip the subtrees, go to next item sibling
            // we already expand()ed this so we don't need to walk it.
        }
    }
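
The escaping concerns in the comments above could be handled roughly like this. Both slugify() and renderArticle() are hypothetical helpers I made up for the sketch, and leaving $contentString unescaped assumes the export comes from a site you trust.

```php
<?php
// Hypothetical helper: turns a title into a value usable as an HTML id.
function slugify(string $title): string
{
    $slug = strtolower(trim($title));
    // Collapse every run of non-alphanumeric characters into one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    return trim($slug, '-');
}

// Hypothetical helper: builds one article block with an escaped title.
function renderArticle(string $titleString, string $contentString): string
{
    $safeTitle = htmlspecialchars($titleString, ENT_QUOTES, 'UTF-8');

    // Assumption: content:encoded already holds ready-made HTML from a
    // trusted export, so it is inserted as-is rather than escaped.
    return '<div class="article-container" id="article-' . slugify($titleString) . '">' . "\n"
         . '    <a href="#top" class="top-link">Back to the Top</a>' . "\n"
         . '    <h2>' . $safeTitle . '</h2>' . "\n"
         . '    <div class="articles">' . $contentString . '</div>' . "\n"
         . '</div>';
}

echo renderArticle('Tom & Jerry: "Q&A"', '<p>Body</p>');
```

Note that this slug keeps only ASCII letters and digits, so two different titles can end up with the same id, and a title with no ASCII letters or digits gives an empty slug. If that matters, appending a counter or the post ID would be one way around it.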
    
© 2021 The Archive Base. All Rights Reserved