I have an xml file that I download from a url. I would then

Question

Asked: June 7, 2026 (2026-06-07T05:56:35+00:00)

I have an xml file that I download from a url. I would then like to iterate through the xml to find the link to a file with a specific file extension.

My xml looks something like this:

<Foo>
    <bar>
        <file url="http://foo.txt"/>
        <file url="http://bar.doc"/>
    </bar>
</Foo>

I’ve written code to get the xml file like this:

import urllib2, re
from xml.dom.minidom import parseString

file = urllib2.urlopen('http://foobar.xml')
data = file.read()
file.close()
dom = parseString(data)
xmlTag = dom.getElementsByTagName('file')

And then I’d ‘like’ to get something like this to work:

i = 0
url = ''
while i < len(xmlTag):
    if re.search('*.txt', xmlTag[i].toxml()) is not None:
        url = xmlTag[i].toxml()
    i = i + 1

** Some code that parses out the url **

But that throws an error. Anyone have tips on a better approach?

Thanks!


1 Answer

Editorial Team · Answer 1 · 2026-06-07T05:56:36+00:00

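Your last bit of code is, frankly, disgusting. The error itself comes from the pattern: '*.txt' is a shell glob, not a regular expression, and a regex cannot begin with *, so re.search fails with "nothing to repeat". You don't need a regex here anyway. dom.getElementsByTagName('file') gives you a list of all <file> elements in the tree, so just iterate over it and read the url attribute directly.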

urls = []
for file_node in dom.getElementsByTagName('file'):
    url = file_node.getAttribute('url')
    if url.endswith('.txt'):
        urls.append(url)
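
For anyone who wants to try this without downloading anything, here is a minimal self-contained sketch of the same loop. It parses the sample XML from the question as a string instead of fetching it; the names SAMPLE_XML and find_urls_by_extension are made up for illustration and are not part of minidom.

```python
from xml.dom.minidom import parseString

# Sample document copied from the question; in real use this string
# would be the body downloaded from the URL.
SAMPLE_XML = """<Foo>
    <bar>
        <file url="http://foo.txt"/>
        <file url="http://bar.doc"/>
    </bar>
</Foo>"""


def find_urls_by_extension(xml_text, extension):
    """Return the url attribute of every <file> element ending in extension.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    dom = parseString(xml_text)
    return [
        node.getAttribute('url')
        for node in dom.getElementsByTagName('file')
        if node.getAttribute('url').endswith(extension)
    ]


txt_urls = find_urls_by_extension(SAMPLE_XML, '.txt')
```

Note that getAttribute returns an empty string when the attribute is missing, so a <file> element without a url is simply skipped by the endswith check.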

As an aside, you should NEVER have to do indexing manually with Python. Even in the rare instance you need the index number, just use enumerate:

mylist = ['a', 'b', 'c']
for i, value in enumerate(mylist):
    print i, value
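
Applied to the <file> nodes from the question, the same idea might look like the sketch below. It assumes the sample XML inlined as a string, and indexed_urls is a name invented for this example.

```python
from xml.dom.minidom import parseString

# Sample XML from the question, inlined so the snippet runs on its own.
dom = parseString(
    '<Foo><bar>'
    '<file url="http://foo.txt"/>'
    '<file url="http://bar.doc"/>'
    '</bar></Foo>'
)

# enumerate yields (index, item) pairs, so no manual counter is needed.
indexed_urls = [
    (i, node.getAttribute('url'))
    for i, node in enumerate(dom.getElementsByTagName('file'))
]

for i, url in indexed_urls:
    print("%d %s" % (i, url))
```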
