I am scraping 2 sets of data from a website using beautiful soup and

Asked: June 16, 2026 (2026-06-16T03:47:51+00:00)

I am scraping two sets of data from a website using Beautiful Soup, and I want to write them to a CSV file in two columns side by side. I am using spamwriter.writerow([x, y]) for this, but I think an error in my loop structure is giving me the wrong output in the CSV file. Here is the code:

import csv
import urllib2
import sys  
from bs4 import BeautifulSoup
page = urllib2.urlopen('http://www.att.com/shop/wireless/devices/smartphones.html').read()
soup = BeautifulSoup(page)
soup.prettify()
with open('Smartphones_20decv2.0.csv', 'wb') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',')        
    for anchor in soup.findAll('a', {"class": "clickStreamSingleItem"},text=True):
        if anchor.string:
            print unicode(anchor.string).encode('utf8').strip()         

    for anchor1 in soup.findAll('div', {"class": "listGrid-price"}):
        textcontent = u' '.join(anchor1.stripped_strings)
        if textcontent:
            print textcontent
            spamwriter.writerow([unicode(anchor.string).encode('utf8').strip(),textcontent])

Output which I am getting in csv is:

Samsung Focus® 2 (Refurbished) $99.99
Samsung Focus® 2 (Refurbished) $99.99 to $199.99 8 to 16 GB
Samsung Focus® 2 (Refurbished) $0.99
Samsung Focus® 2 (Refurbished) $0.99
Samsung Focus® 2 (Refurbished) $149.99 to $349.99 16 to 64 GB

The problem is that I get only one device name in column 1 instead of all of them, while the prices come through for all devices.
Please pardon my ignorance, as I am new to programming.

1 Answer

Editorial Team · Answer 1 · 2026-06-16T03:47:52+00:00

You are using anchor.string in the second loop instead of anchor1. By that point, anchor is just the last item left over from the previous loop, not the item for the current iteration, which is why every row shows the same device name.

Clearer variable names would help avoid this kind of confusion; singleitem and gridprice, perhaps?
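
The leftover loop variable can be reproduced without any scraping. Below is a minimal sketch in Python 3 (the question's code is Python 2); the names and prices lists are made-up stand-ins for the two findAll() results, not real data from the site:

```python
# Hypothetical stand-in data; in the question these come from soup.findAll(...).
names = ["Phone A", "Phone B", "Phone C"]
prices = ["$99.99", "$0.99", "$149.99"]

# First loop: only looks at each name, as the question's first loop does.
for name in names:
    pass

# Second loop: `name` is not updated here, so it still holds the last
# value assigned by the first loop ("Phone C").
rows = []
for price in prices:
    rows.append([name, price])

for row in rows:
    print(row)
```

Every row ends up with the same first column, which matches the symptom in the question's CSV output.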

It could be that I misunderstood, though, and you want to pair each anchor1 with its corresponding anchor. In that case you'll have to loop over them together, perhaps using zip():

items = soup.findAll('a', {"class": "clickStreamSingleItem"},text=True)
prices = soup.findAll('div', {"class": "listGrid-price"})
for item, price in zip(items, prices):
    textcontent = u' '.join(price.stripped_strings)
    if textcontent:
        print textcontent
        spamwriter.writerow([unicode(item.string).encode('utf8').strip(),textcontent])
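
For reference, here is roughly how the same pairing might look as a self-contained Python 3 sketch. This is an illustration under assumptions, not a drop-in replacement: the two lists are hypothetical stand-ins for the scraped names and prices, and an in-memory buffer is used in place of the real output file.

```python
import csv
import io

# Hypothetical stand-in data; in the question these come from soup.findAll(...).
names = ["Phone A ", "Phone B"]
prices = ["$99.99", "$0.99 to $199.99"]

# In-memory stand-in for the output file. With a real file in Python 3
# you would use open(path, "w", newline="") rather than "wb".
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=",")

# zip() yields one (name, price) pair per iteration, so each row gets
# the name that belongs to that price.
for name, price in zip(names, prices):
    writer.writerow([name.strip(), price])

csv_text = buffer.getvalue()
print(csv_text)
```

In Python 3 the csv module works with text, so the unicode(...).encode('utf8') step from the Python 2 code is not needed there.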

Normally it should be easier to loop over the parent table row instead, then find the cells within that row within a loop. But the zip() should work too, provided the clickStreamSingleItem cells line up with the listGrid-price matches.
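
One caveat worth guarding against: zip() stops silently at the shorter input, so if the two result lists do not line up, rows are dropped without any error. A minimal Python 3 sketch of a length check follows; pair_checked is a hypothetical helper name, not part of Beautiful Soup or the csv module.

```python
def pair_checked(names, prices):
    """Pair names with prices, refusing to continue if the counts differ."""
    if len(names) != len(prices):
        raise ValueError(
            "found %d names but %d prices; the lists do not line up"
            % (len(names), len(prices))
        )
    return list(zip(names, prices))


# Matching lengths: pairs come back in order.
print(pair_checked(["Phone A", "Phone B"], ["$99.99", "$0.99"]))
```

A matching count does not prove that each name sits next to the right price, so spot-checking a few rows against the page is still worthwhile.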
