I am scraping some data from a website and I am able to do

Asked: June 16, 2026

I am scraping some data from a website, and I am able to do so using the code below:

import csv
import urllib2
import sys
import time
from bs4 import BeautifulSoup
from itertools import islice
page = urllib2.urlopen('http://shop.o2.co.uk/mobile_phones/Pay_Monthly/smartphone/all_brands').read()
soup = BeautifulSoup(page)
soup.prettify()
with open('O2_2012-12-21.csv', 'wb') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',')
    spamwriter.writerow(["Date","Month","Day of Week","OEM","Device Name","Price"])
    oems = soup.findAll('span', {"class": "wwFix_h2"},text=True)
    items = soup.findAll('div',{"class":"title"})
    prices = soup.findAll('span', {"class": "handset"})
    for oem, item, price in zip(oems, items, prices):
        textcontent = u' '.join(islice(item.stripped_strings, 1, 2, 1))
        if textcontent:
            spamwriter.writerow([
                time.strftime("%Y-%m-%d"),
                time.strftime("%B"),
                time.strftime("%A"),
                unicode(oem.string).encode('utf8').strip(),
                textcontent,
                unicode(price.string).encode('utf8').strip(),
            ])

Now, the issue is that 2 of the price values I am scraping have a different HTML structure than the rest. Because of this, my output CSV shows “None” for those values. The normal HTML structure for a price on the webpage is
 FREE to £79.99

For those 2 values, the structure is
 Up to 7 days delivery "FREE on all tariffs"

The output I am getting right now shows None for the second HTML structure instead of FREE on all tariffs. Also, the price value FREE on all tariffs is enclosed in double quotes in the second structure, while it is outside any quotes in the first structure.

Please help me solve this issue. Pardon my ignorance, as I am new to programming.

1 Answer

Editorial Team · Answer 1 · 2026-06-16T17:45:36+00:00

Added an answer on June 16, 2026 at 5:45 pm

price.string is None whenever a tag contains more than one child node, which is the case for those 2 items. Just detect them with an additional if statement:

if price.string is None:
    price_text = u' '.join(price.stripped_strings).replace('"', '').encode('utf8')
else:
    price_text = unicode(price.string).strip().encode('utf8')

Then use price_text in place of the price expression when writing the CSV row. Note that I removed the " quotes with a simple replace call.
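The same check can be wrapped in a small helper so the loop body stays short. This is only a sketch under a few assumptions: it is written for Python 3 (where str replaces unicode and the csv module accepts text, so the encode call is dropped), the function name extract_price_text is made up for illustration, and StubTag is a hypothetical stand-in that mimics only the two Beautiful Soup attributes used here (string and stripped_strings) so the example runs without fetching the page. The text fragments in the stub are guesses based on the text quoted in the question, not the site's real markup.

```python
def extract_price_text(tag):
    """Return the visible text of a tag, even when tag.string is None."""
    if tag.string is None:
        # More than one child node: join every text fragment
        # and drop the literal double quotes.
        return u' '.join(tag.stripped_strings).replace('"', '')
    # Exactly one text child: .string holds it directly.
    return tag.string.strip()


class StubTag(object):
    """Hypothetical stand-in for a bs4 Tag, for illustration only."""

    def __init__(self, string, stripped_strings):
        self.string = string
        self.stripped_strings = stripped_strings


# First structure: a single text node, so .string is set.
simple = StubTag(u' FREE to \u00a379.99 ', [u'FREE to \u00a379.99'])

# Second structure (assumed): several fragments, so .string is None.
mixed = StubTag(None, [u'Up to 7 days delivery', u'"FREE on all tariffs"'])

print(extract_price_text(simple))
print(extract_price_text(mixed))
```

In the scraping loop, extract_price_text(price) would replace the price expression passed to spamwriter.writerow. Under Python 2, as in the question, the result would still need .encode('utf8') before being written.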
