I’m a newbie to Python and programming in general so please excuse me if

Question

Asked: May 29, 2026 (2026-05-29T23:47:37+00:00)

I’m a newbie to Python and programming in general so please excuse me if the question is very dumb.

I’ve been following this tutorial on RSS scraping step by step but I am getting a “list index out of range” error from Python when trying to gather the corresponding links to the titles of the articles being gathered.

Here is my code:

from urllib import urlopen
from BeautifulSoup import BeautifulSoup
import re

source  = urlopen('http://feeds.huffingtonpost.com/huffingtonpost/raw_feed').read()

title = re.compile('<title>(.*)</title>')
link = re.compile('<link>(.*)</link>')

find_title = re.findall(title, source)
find_link = re.findall(link, source)

literate = []
literate[:] = range(1, 16)

for i in literate:
    print find_title[i]
    print find_link[i]

It executes fine when I only tell it to retrieve titles, but immediately throws an index error when I would like to retrieve titles and their corresponding links.

Any assistance will be greatly appreciated.


1 Answer

Editorial Team · Answer 1 · 2026-05-29T23:47:39+00:00

I think you are using the wrong regex to extract the links from the page.

>>> link = re.compile('<link rel="alternate" type="text/html" href=(.*)')
>>> find_link = re.findall(link, source)
>>> find_link[1].strip()
'"http://www.huffingtonpost.com/andrew-brandt/the-peyton-predicament-pa_b_1271834.html" />'
>>> len(find_link)
15
>>>

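Take a look at the source of the feed and you will find that the links are not enclosed in a
<link></link> pair.

The actual pattern is <link rel="alternate" type="text/html" href="URL" />, with the URL inside the href attribute.

That is why your regex does not match the article links: find_link comes back shorter than find_title, and indexing into it raises "list index out of range".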
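To make the mismatch concrete, here is a minimal Python 3 sketch (so print is a function, unlike in the question's Python 2 code). It runs on a small made-up feed rather than the real download: SAMPLE_FEED, the example.com URLs, and the names TITLE_RE, OLD_LINK_RE and NEW_LINK_RE are all assumptions for illustration, and the real feed may be laid out differently. The new pattern captures only the text between the quotes, so the trailing " /> seen in the session above does not end up in the result.

```python
import re

# Hypothetical miniature feed standing in for the real download.
SAMPLE_FEED = """<feed>
<title>Example Feed</title>
<link rel="alternate" type="text/html" href="http://example.com/" />
<entry>
<title>First post</title>
<link rel="alternate" type="text/html" href="http://example.com/first.html" />
</entry>
<entry>
<title>Second post</title>
<link rel="alternate" type="text/html" href="http://example.com/second.html" />
</entry>
</feed>"""

TITLE_RE = re.compile(r'<title>(.*?)</title>')
# The pattern from the question: expects <link>...</link>, which this feed never uses.
OLD_LINK_RE = re.compile(r'<link>(.*?)</link>')
# Matches the self-closing form and captures only the URL inside the quotes.
NEW_LINK_RE = re.compile(r'<link rel="alternate" type="text/html" href="([^"]*)"')

titles = TITLE_RE.findall(SAMPLE_FEED)
old_links = OLD_LINK_RE.findall(SAMPLE_FEED)
new_links = NEW_LINK_RE.findall(SAMPLE_FEED)

# The old pattern finds nothing, so any index into old_links is out of range.
print(len(titles), len(old_links), len(new_links))  # 3 0 3

# zip pairs items positionally and stops at the shorter list,
# so no manual index can run past the end.
pairs = list(zip(titles, new_links))
for title, link in pairs:
    print(title)
    print(link)
```

Note also that a list of 15 matches has indexes 0 through 14, so the loop over range(1, 16) in the question would still run past the end at index 15 even with the corrected regex. Iterating with zip, or over range(len(find_link)), avoids that. Pairing by position assumes the titles and links appear in the same order and in equal numbers, which is worth checking against the actual feed.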
