I have built a webscraper with a for-loop. I don’t know why, but it

Asked: June 9, 2026 (2026-06-09T09:07:40+00:00)

I have built a web scraper with a for-loop. I don't know why, but it returns a URL (which is what I want), and then, before fetching the next URL in the list, it returns a NoneType object. Apart from making the script slower, that wouldn't be a big deal, except that I can't get it to print more than the first URL.

from BeautifulSoup import BeautifulSoup
from mechanize import Browser
br = Browser()
page = br.open("https://bdkv2.borger.dk/foa/Sider/default.aspx?fk=22&foaid=11541520")
html = page.read()
soup = BeautifulSoup(html)
link = soup.findAll('a')
kommunelink = link[21:116]
for kommune in kommunelink:
    kommuneside = br.open(kommune['href'])
    html2 = kommuneside.read()
    soup2 = BeautifulSoup(html2)
    hjemmesidelink = soup2.find('a', id='_uscAncHomesite')
    print hjemmesidelink['href']

This way my output is like this:

http://www.albertslund.dk

Traceback (most recent call last):
  File "C:\Users\kba\Desktop\kommuneskraber.py", line 14, in <module>
    print hjemmesidelink['href']
TypeError: 'NoneType' object has no attribute '__getitem__'

I've tried checking the type before printing (if the variable is of a specific class, then print), but that doesn't work. Examples:

If hjemmesidelink['href'] == <class 'BeautifulSoup.Tag'>:
    print hjemmesidelink['href']

if hjemmesidelink.class == BeautifulSoup.Tag:
    print hjemmesidelink['href']

Any idea how this check should be written? Or, even better, any idea where/why my script fetches a 'NoneType' object every second time it iterates through the loop? Thanks a bunch.

1 Answer

Editorial Team · Answer 1 · 2026-06-09T09:07:41+00:00

This is not a complete answer (see the comments); it covers only the part about avoiding the error.

At this line of the code:

print hjemmesidelink['href']

Replace it with:

if hjemmesidelink:
    print hjemmesidelink['href']

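The if hjemmesidelink: check tests whether hjemmesidelink holds a value. Because soup2.find returns None when no matching tag exists, and None is falsy, the href is printed only when the link was found; otherwise the loop moves on to the next page.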

My results:

>>> 
http://www.albertslund.dk
http://www.alleroed.dk
http://www.assens.dk
http://www.ballerup.dk
http://www.billund.dk
http://www.brk.dk
http://www.brondby.dk
http://www.broenderslev.dk
http://www.dragoer.dk

...and counting.
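The None check can also be tried offline. The following is only a minimal sketch, not the original script: it uses Python 3 and the standard library's html.parser as a stand-in for soup2.find, so it runs without BeautifulSoup or mechanize. The names AnchorFinder and find_anchor and the two in-memory pages are hypothetical, made up for illustration. It additionally records which pages lacked the link, which may help in tracking down where the None values come from.

```python
from html.parser import HTMLParser


class AnchorFinder(HTMLParser):
    """Stdlib stand-in for soup.find('a', id=...): remembers the first match."""

    def __init__(self, wanted_id):
        super().__init__()
        self.wanted_id = wanted_id
        self.match = None  # stays None when no matching <a> is seen

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if self.match is None and tag == "a" and attributes.get("id") == self.wanted_id:
            self.match = attributes


def find_anchor(html, wanted_id):
    """Return the attributes of the first matching <a>, or None if absent."""
    finder = AnchorFinder(wanted_id)
    finder.feed(html)
    return finder.match


# Hypothetical pages: the second one has no homepage anchor.
pages = {
    "page-with-link": '<a id="_uscAncHomesite" href="http://www.albertslund.dk">Site</a>',
    "page-without-link": "<p>No homepage link here</p>",
}

found, missing = [], []
for name, html in pages.items():
    hjemmesidelink = find_anchor(html, "_uscAncHomesite")
    if hjemmesidelink is not None:  # explicit spelling of the None guard
        found.append(hjemmesidelink["href"])
    else:
        missing.append(name)  # remember which page lacked the link

print(found)
print(missing)
```

Writing the guard as "is not None" states the intent explicitly; for a value that is either a found tag or None it behaves the same as the shorter if hjemmesidelink: form used above.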
