I am parsing an XML feed from Google using beautifulstonesoup and python, and it


Asked: May 29, 2026 (2026-05-29T08:22:31+00:00)


I am parsing an XML feed from Google using BeautifulStoneSoup and Python, and it works great. I am also creating a CSV file and uploading it to Google Docs, which works fine as well. The problem is that when I come across an empty text attribute in the XML, the parser just stops. It is not a problem right now, because all of the attributes have data, but the first time one of them doesn't, it will break.

The code:

import atom
import gdata.auth
import gdata.contacts
import gdata.contacts.client
import gdata.docs.service
import gdata.docs.data
from BeautifulSoup import BeautifulStoneSoup as Soup
import csv

email = 'admin@domain.com'
password = 'password'
domain = 'domain.com'

ms_client = gdata.docs.service.DocsService()
gd_client = gdata.contacts.client.ContactsClient(domain=domain)
gd_client.ClientLogin(email, password, 'profileFeedAPI')
ms_client.ClientLogin(email, password, 'peopleCSVupload')

profiles_feed = gd_client.GetProfilesFeed('https://www.google.com/m8/feeds/profiles/domain/domain.com/full?max-results=300')

soup = Soup(str(profiles_feed), selfClosingTags=['ns0:category','ns3:status', 'ns0:link','ns1:email'])

a = soup.findAll('ns0:entry')
f = open('C:\\people.csv', 'wb')

writer = csv.writer(f, quoting=csv.QUOTE_NONE, escapechar =' ')

for entry in a:
    writer.writerow([entry.find('ns1:familyname').text + ',' + entry.find('ns1:givenname').text + ',' + entry.find('ns1:fullname').text + ',' + entry.find('ns1:orgtitle').text + ',' + entry.find('ns1:orgdepartment').text + ',' + entry.find('ns1:orgname').text + ',' + entry.find('ns1:email',primary=True)['address']])

f.close()

ms = gdata.data.MediaSource(file_path="C:\\people.csv", content_type=gdata.docs.service.SUPPORTED_FILETYPES['CSV'])
csv_entry = ms_client.Upload(ms, "People File")

I know I could do this:

for entry in a:
    if entry.find('ns1:orgtitle') != None:
        print entry.find('ns1:orgtitle').text
    elif entry.find('ns1:orgtitle') == None:
        print('')
    if entry.find('ns1:familyname') != None:
        print entry.find('ns1:familyname').text
    elif entry.find('ns1:familyname') == None:
        print('')
        etc...

But that is very long, and I don't know how to concatenate the data so that it appears on one row. Any help is much appreciated.


1 Answer

Editorial Team · Answer 1 · 2026-05-29T08:22:32+00:00


You can wrap the find call like this:

def findnonempty(entry, arg):
    # entry.find() returns None when the tag is not present
    result = entry.find(arg)
    if result is not None:
        return result.text
    else:
        return ""

Then you can either make the 7 calls one after another, or you can use map(), like this:

tags = ['ns1:familyname', 'ns1:givenname', ... ] # your tags
s = map(lambda tag: findnonempty(entry, tag), tags)
",".join(s) # join with commas, as in the comma-separated row built in the question

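To show the whole row being built in one go, here is a minimal, self-contained sketch. It is not the original script: it assumes Python 3 and uses the standard library's xml.etree.ElementTree in place of BeautifulStoneSoup so that it runs without the Google client libraries. The sample feed, the un-namespaced tag names, and the helper names (text_or_empty, build_rows, rows_to_csv) are all hypothetical. It also passes each row to csv.writer as a list of fields, so the csv module inserts the commas itself.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample feed: the second entry has an empty title
# and no department element at all.
SAMPLE = """<feed>
  <entry>
    <familyname>Smith</familyname>
    <givenname>Ann</givenname>
    <orgtitle>Engineer</orgtitle>
    <orgdepartment>R&amp;D</orgdepartment>
  </entry>
  <entry>
    <familyname>Jones</familyname>
    <givenname>Bob</givenname>
    <orgtitle/>
  </entry>
</feed>"""

# Assumed column order; substitute your own tag names.
TAGS = ["familyname", "givenname", "orgtitle", "orgdepartment"]


def text_or_empty(entry, tag):
    """Return the text of the first matching child, or "" if it is missing or empty."""
    node = entry.find(tag)
    if node is None or node.text is None:
        return ""
    return node.text


def build_rows(xml_text, tags=TAGS):
    """Build one list of field values per <entry> element."""
    root = ET.fromstring(xml_text)
    return [[text_or_empty(entry, tag) for tag in tags]
            for entry in root.findall("entry")]


def rows_to_csv(rows):
    """Serialize the rows with csv.writer, which adds separators and quoting."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


rows = build_rows(SAMPLE)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Handing csv.writer a list per row, rather than one pre-joined string, should also remove the need for the QUOTE_NONE and escapechar settings in the question, since the writer then quotes any field that happens to contain a comma.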