I have some RSS that looks like this:
<item>
<guid isPermaLink="false">2284767032</guid>
<title>title goes here...</title>
<description> Description </description>
<author>author name</author>
<dcterms:valid>start=2012-09-28T17:06:00Z;scheme=W3C-DTF</dcterms:valid>
<media:category scheme="" label="">cat1</media:category>
<media:category scheme="" label="">cat2</media:category>
<media:category scheme="" label="">cat3</media:category>
<media:copyright>Big Company</media:copyright>
<media:keywords>some;keywords;</media:keywords>
<media:group>
<media:content bitrate="643.386" medium="video" duration="72.144" expression="full" fileSize="5802051" framerate="29.97" type="video/x-flv" height="360" url="..." width="640"/>
<media:content bitrate="1242.571" medium="video" duration="72.144" expression="full" fileSize="11205501" framerate="29.97" type="video/x-flv" height="480" url="..." width="854"/>
</media:group>
<link>a234dfasf4f</link>
<plmedia:defaultThumbnailUrl>
http://url.jpg
</plmedia:defaultThumbnailUrl>
</item>
I’m using the following code to parse it:
$feed = simplexml_load_file('http://feedurl.com');
echo "<pre>";
print_r($feed);
echo "</pre>";
The problem is that I’m getting all the tags like guid, title, and description, but none of the media:category or media:group or something:anything show up – they are just stripped out.
How can I parse this feed without losing them?
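You need to find where the namespaces are declared (usually as xmlns:prefix attributes on the root element of the feed) and note the URI that each prefix maps to. For example, if the media namespace maps to http://example.com/something, passing that URI to children() on an item returns its media:* elements.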
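A minimal sketch of that lookup, assuming a cut-down inline copy of the feed and the placeholder URI http://example.com/something (the real feed declares its own URI on the root element, so read it from there rather than copying this one). The variable names are illustrative only:

```php
<?php
// Inline sample standing in for the real feed. The namespace URI is the
// placeholder used in the text, not necessarily what the real feed declares.
$xml = <<<'XML'
<rss version="2.0" xmlns:media="http://example.com/something">
  <channel>
    <item>
      <title>title goes here...</title>
      <media:category scheme="" label="">cat1</media:category>
      <media:category scheme="" label="">cat2</media:category>
      <media:copyright>Big Company</media:copyright>
    </item>
  </channel>
</rss>
XML;

$feed = simplexml_load_string($xml);

// Prefix => URI map of the namespaces declared on the root element.
$namespaces = $feed->getDocNamespaces();
$mediaNs = $namespaces['media'];

$categories = [];
$copyright = '';
foreach ($feed->channel->item as $item) {
    // children() with a namespace URI returns only the elements in that namespace.
    $media = $item->children($mediaNs);
    foreach ($media->category as $category) {
        $categories[] = (string) $category;
    }
    $copyright = (string) $media->copyright;
}

echo implode(', ', $categories), "\n";
echo $copyright, "\n";
```

Looking the URI up with getDocNamespaces() avoids hard-coding it, but passing the URI string directly to children() works just as well once you know it.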
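The output of print_r() on a SimpleXML object does not always show the full structure, but the elements are there. To get at the nested elements, call children() with the namespace URI, walk down from media:group to each media:content, and read the attributes through attributes().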
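As a rough illustration of reaching the nested media:content elements, here is a sketch that again assumes the placeholder namespace URI http://example.com/something and a trimmed inline copy of one item. Note that the unprefixed attributes (width, height, url) are not in the media namespace, so they are read through attributes() with no argument:

```php
<?php
// Trimmed inline copy of one item; the namespace URI is a placeholder.
$xml = <<<'XML'
<item xmlns:media="http://example.com/something">
  <media:group>
    <media:content medium="video" height="360" url="..." width="640"/>
    <media:content medium="video" height="480" url="..." width="854"/>
  </media:group>
</item>
XML;

$item = simplexml_load_string($xml);
$mediaNs = 'http://example.com/something';

$sizes = [];
// Elements reached through children($mediaNs) keep resolving child names
// in the same namespace, so ->group->content finds media:content.
foreach ($item->children($mediaNs)->group->content as $content) {
    // Unprefixed attributes belong to no namespace; attributes() with no
    // argument returns exactly those.
    $attrs = $content->attributes();
    $sizes[] = (string) $attrs->width . 'x' . (string) $attrs->height;
}

echo implode(', ', $sizes), "\n";
```

Casting to string matters here: each attribute is itself a SimpleXMLElement, and without the cast you end up storing objects rather than values.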