The Archive Base

Editorial Team
Asked: May 24, 2026

Dive into Python: XML Processing

Here I am referring to a portion of the kgp.py program:

def getDefaultSource(self):
  xrefs = {}
  for xref in self.grammar.getElementsByTagName("xref"):
    xrefs[xref.attributes["id"].value] = 1
  xrefs = xrefs.keys()
  standaloneXrefs = [e for e in self.refs.keys() if e not in xrefs]
  if not standaloneXrefs:
    raise NoSourceError, "can't guess source, and no source specified"
  return '<xref id="%s"/>' % random.choice(standaloneXrefs)

self.grammar: the parsed XML representation (using xml.dom.minidom) of:

<?xml version="1.0" ?>
<grammar>
<ref id="bit">
  <p>0</p>
  <p>1</p>
</ref>
<ref id="byte">
  <p><xref id="bit"/><xref id="bit"/><xref id="bit"/><xref id="bit"/>\
<xref id="bit"/><xref id="bit"/><xref id="bit"/><xref id="bit"/></p>
</ref>
</grammar>

self.refs: a cache of all the <ref> elements in the above XML, keyed by their id


I have two doubts with this code:

Doubt 1:

  for xref in self.grammar.getElementsByTagName("xref"):
    xrefs[xref.attributes["id"].value] = 1
  xrefs = xrefs.keys()

Eventually xrefs holds the id values in a list. Couldn’t we have done this simply with:

  xrefs = [xref.attributes["id"].value 
           for xref in self.grammar.getElementsByTagName("xref")]

Doubt 2:

  standaloneXrefs = [e for e in self.refs.keys() if e not in xrefs]
  ...
  return '<xref id="%s"/>' % random.choice(standaloneXrefs)

Here, we are collecting the refs from self.refs that do NOT appear in our computed xrefs. But then, instead of creating a <ref> element, we create an <xref> with the same ID. This takes us one step backward, since later we are going to look up the cross-reference for this computed <xref> anyway and eventually reach the <ref>. We could have just started with this <ref> in the first place.


Disclaimer

I am in no way trying to make a remark on the book. I am not even qualified for that.

I am loving every moment of reading this book. I realize a few chapters have become outdated, but I love Mark Pilgrim’s writing style and I cannot stop reading.

1 Answer

    Editorial Team
    Added an answer on May 24, 2026 at 5:35 am

    Dive Into Python was published in 2004 and doesn’t always contain the most modern code, so you need to go easy on it. Dive Into Python 3 might be a better bet.

    Your suggestion for doubt 1 changes the meaning of the code: putting the ids into the keys of a dictionary and then getting them out again eliminates duplicates, whereas your list comprehension includes duplicates. The modern approach would be to use a set comprehension:

     xrefs = {xref.attributes["id"].value 
              for xref in self.grammar.getElementsByTagName("xref")}
    

    but this wasn’t available in 2004.
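To make the difference concrete, here is a small sketch that runs all three approaches against the grammar from the question. The variable names (`as_list`, `as_dict`, `as_set`) are made up for illustration, and the line-continuation backslash from the book's listing is left out of the XML.

```python
from xml.dom import minidom

# The sample grammar quoted in the question.
GRAMMAR = """<?xml version="1.0" ?>
<grammar>
<ref id="bit">
  <p>0</p>
  <p>1</p>
</ref>
<ref id="byte">
  <p><xref id="bit"/><xref id="bit"/><xref id="bit"/><xref id="bit"/>
<xref id="bit"/><xref id="bit"/><xref id="bit"/><xref id="bit"/></p>
</ref>
</grammar>"""

grammar = minidom.parseString(GRAMMAR)
xref_nodes = grammar.getElementsByTagName("xref")

# List comprehension: one entry per <xref> element, duplicates included.
as_list = [x.attributes["id"].value for x in xref_nodes]

# The book's approach: dictionary keys are unique, so duplicates collapse.
as_dict = {}
for x in xref_nodes:
    as_dict[x.attributes["id"].value] = 1

# Set comprehension: the same de-duplication in a single expression.
as_set = {x.attributes["id"].value for x in xref_nodes}

print(len(as_list))          # 8
print(list(as_dict.keys()))  # ['bit']
print(as_set)                # {'bit'}
```

The list version still gives the right answer for the later `e not in xrefs` test, since membership does not care about duplicates; it just carries eight copies of the same id around.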

    On your doubt 2, I’m not entirely sure I see the problem. Yes, in some sense this is a waste, but on the other hand the code already has a handler for the xref case, so it makes sense to re-use that handler rather than add an extra special case.
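The re-use argument can be sketched with a stripped-down generator. This is not the book's KantGenerator: `TinyGenerator`, its method names, and the use of `ValueError` in place of `NoSourceError` are all assumptions made for illustration. The point it tries to show is that a default source expressed as an `<xref>` string goes through exactly the same path as a source supplied by the caller.

```python
import random
from xml.dom import minidom

GRAMMAR = (
    '<grammar>'
    '<ref id="bit"><p>0</p><p>1</p></ref>'
    '<ref id="byte"><p>' + '<xref id="bit"/>' * 8 + '</p></ref>'
    '</grammar>'
)

class TinyGenerator:
    """Hypothetical, simplified stand-in for the generator in kgp.py."""

    def __init__(self, grammar_xml):
        self.grammar = minidom.parseString(grammar_xml)
        # Cache every <ref> element by its id, like self.refs in the question.
        self.refs = {ref.attributes["id"].value: ref
                     for ref in self.grammar.getElementsByTagName("ref")}

    def get_default_source(self):
        referenced = {x.attributes["id"].value
                      for x in self.grammar.getElementsByTagName("xref")}
        standalone = sorted(set(self.refs) - referenced)
        if not standalone:
            raise ValueError("can't guess source, and no source specified")
        # Returning an <xref> string means no separate <ref> code path is needed.
        return '<xref id="%s"/>' % random.choice(standalone)

    def generate(self, source=None):
        # Caller-supplied and default sources are parsed the same way.
        doc = minidom.parseString(source or self.get_default_source())
        return self.parse(doc.documentElement)

    def parse(self, node):
        if node.nodeType == node.TEXT_NODE:
            return node.data.strip()
        # Dispatch on the tag name: parse_xref, parse_p, ...
        return getattr(self, "parse_" + node.tagName)(node)

    def parse_xref(self, node):
        ref = self.refs[node.attributes["id"].value]
        choices = [c for c in ref.childNodes if c.nodeType == c.ELEMENT_NODE]
        return self.parse(random.choice(choices))

    def parse_p(self, node):
        return "".join(self.parse(child) for child in node.childNodes)

generator = TinyGenerator(GRAMMAR)
print(generator.generate())                    # eight random bits
print(generator.generate('<xref id="bit"/>'))  # a single random bit
```

Starting from the `<ref>` directly would need an extra handler (or a special case in `generate`) that does what `parse_xref` already does, which is the trade-off described above.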

    There are several other bits of code in that example that could be modernized. For example,

    source and source or self.getDefaultSource()
    

    would now be source or self.getDefaultSource(). And the line

    standaloneXrefs = [e for e in self.refs.keys() if e not in xrefs]
    

    would be better expressed as a set difference operation, something like:

    standaloneXrefs = set(self.refs) - set(xrefs)
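A small sketch of the set-difference version, using hypothetical stand-ins for `self.refs` and `xrefs`. One thing to watch: the result is a set, and `random.choice` expects a sequence, so the set needs converting (here with `sorted`) before the final `random.choice` call.

```python
import random

# Hypothetical stand-ins for self.refs (id -> element) and the collected xref ids.
refs = {"bit": "<ref id='bit'>", "byte": "<ref id='byte'>"}
xrefs = ["bit", "bit", "bit"]

# Original style: filter the keys one at a time.
filtered = [e for e in refs.keys() if e not in xrefs]

# Set difference: ids defined by a <ref> but never pointed at by an <xref>.
standalone = set(refs) - set(xrefs)

# random.choice indexes into its argument, which a set does not support,
# so turn the set into a list first.
chosen = random.choice(sorted(standalone))

print(filtered)    # ['byte']
print(standalone)  # {'byte'}
print(chosen)      # byte
```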
    

    But that’s what happens as languages become more expressive: old code starts to look rather inelegant.
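As a quick check on the `and`/`or` rewrite mentioned above: `a and b or c` was a common way to emulate a conditional before Python 2.5 added conditional expressions, and when `a` and `b` are the same expression it reduces to `a or c`. The sketch below uses a hypothetical `get_default_source` placeholder in place of `self.getDefaultSource()`.

```python
def get_default_source():
    # Hypothetical placeholder for self.getDefaultSource().
    return '<xref id="byte"/>'

def old_style(source):
    # The idiom as written in the 2004 code.
    return source and source or get_default_source()

def new_style(source):
    # Equivalent, shorter form.
    return source or get_default_source()

# Both forms agree for missing, empty, and explicit sources.
for candidate in (None, "", '<xref id="bit"/>'):
    assert old_style(candidate) == new_style(candidate)

print(new_style(None))                # <xref id="byte"/>
print(new_style('<xref id="bit"/>'))  # <xref id="bit"/>
```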
