The Archive Base

Editorial Team
Asked: May 17, 2026 (2026-05-17T15:25:25+00:00)

I have a tool producing NewsML type XML files and I want to validate


I have a tool that produces NewsML-type XML files, and I want to validate the files after they are produced.
I’m receiving this error:

Attempt to load network entity http://www.w3.org/TR/ruby/xhtml-ruby-1.mod

The Python call is:

from lxml import etree

# Load the DTD named in the DOCTYPE and validate against it while parsing.
parser = etree.XMLParser(load_dtd=True, dtd_validation=True)
treeObject = etree.parse(f, parser)

First, I’m not sure whether I need both “load_dtd=True” and “dtd_validation=True”, but I’m using them anyway.
Second, the error seems to come from an imported nitf-3-4.dtd, which contains this declaration:

<!ENTITY % xhtml-ruby.mod PUBLIC 
    "-//W3C//ELEMENTS XHTML Ruby 1.0//EN" "http://www.w3.org/TR/ruby/xhtml-ruby-1.mod">
%xhtml-ruby.mod;

Will lxml go out and retrieve this xhtml-ruby-1.mod, or do I need to have all the DTD files locally?

1 Answer

  1. Editorial Team
    Added an answer on May 17, 2026 at 3:25 pm (2026-05-17T15:25:26+00:00)

    Try constructing the parser with no_network=False. As stated in the documentation:

    no_network – prevent network access when looking up external documents (on by default)
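
A minimal sketch of that suggestion, assuming lxml is installed. The helper name validate_file is made up for illustration, and whether the remote module can actually be fetched also depends on the libxml2 build and on the remote server, which this sketch does not check.

```python
from lxml import etree

# no_network is True by default; False allows libxml2 to fetch external
# resources (such as DTD modules) that the document or its DTD references.
parser = etree.XMLParser(load_dtd=True, dtd_validation=True, no_network=False)


def validate_file(source):
    """Parse a path or file object with DTD validation.

    Returns (True, []) when the document is valid, otherwise
    (False, [error message]).
    """
    try:
        etree.parse(source, parser)
    except etree.XMLSyntaxError as exc:
        # With dtd_validation=True, validity errors surface here as well.
        return False, [str(exc)]
    return True, []
```

Since dtd_validation=True is set, load_dtd=True is probably redundant here, but keeping both mirrors the call in the question.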

    lxml should retrieve imported DTD modules, but it cannot do so when network access is not allowed. This restriction applies only to externally referenced resources, not to the document itself. In fact, I would expect you to get errors loading the DTD itself, so I assume the document refers to a locally available copy of that DTD, and that it is only the DTD itself that references a remote resource?

    You could also use a catalog to point to locally available copies. This not only circumvents the problem, it is also faster and friendlier towards the W3C servers ;-). Libxml2 (used by lxml) checks for a catalog in /etc/xml/catalog and in the files listed in the XML_CATALOG_FILES environment variable (see the Libxml2 docs).
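
As a rough sketch of the catalog route: the code below writes an OASIS XML catalog that maps the remote system identifier to a local copy and points XML_CATALOG_FILES at it. The function names and the local file path are hypothetical, and it assumes the variable is set before libxml2 performs its first catalog lookup in the process.

```python
import os
from xml.sax.saxutils import quoteattr

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"


def build_catalog(system_map):
    """Return catalog XML mapping each system ID to a local URI."""
    lines = ['<?xml version="1.0"?>', f'<catalog xmlns="{CATALOG_NS}">']
    for system_id, local_uri in system_map.items():
        lines.append(
            f"  <system systemId={quoteattr(system_id)} uri={quoteattr(local_uri)}/>"
        )
    lines.append("</catalog>")
    return "\n".join(lines) + "\n"


def install_catalog(catalog_path, system_map):
    """Write the catalog file and expose it through XML_CATALOG_FILES."""
    with open(catalog_path, "w", encoding="utf-8") as fh:
        fh.write(build_catalog(system_map))
    # Assumption: nothing in this process has triggered a catalog lookup yet.
    os.environ["XML_CATALOG_FILES"] = catalog_path
```

A call might look like install_catalog("catalog.xml", {"http://www.w3.org/TR/ruby/xhtml-ruby-1.mod": "file:///path/to/dtds/xhtml-ruby-1.mod"}), where the file:// path is a placeholder for wherever the local copy lives.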

    (it is also possible to write your own resolvers for lxml to intercept and handle requests, but that would probably be overkill in this case)
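
For completeness, a custom resolver could look roughly like this. It is a sketch that assumes you keep local copies of the referenced modules; the class name, the make_parser helper, and the mapping are illustrative, not part of lxml.

```python
from lxml import etree


class LocalDTDResolver(etree.Resolver):
    """Serve selected system URLs from local files instead of the network."""

    def __init__(self, local_copies):
        super().__init__()
        # Mapping of system URL -> local file path (supplied by the caller).
        self.local_copies = local_copies

    def resolve(self, system_url, public_id, context):
        path = self.local_copies.get(system_url)
        if path is None:
            return None  # let lxml fall back to its default lookup
        return self.resolve_filename(path, context)


def make_parser(local_copies):
    """Build a validating parser that consults the resolver first."""
    parser = etree.XMLParser(load_dtd=True, dtd_validation=True)
    parser.resolvers.add(LocalDTDResolver(local_copies))
    return parser
```

Returning None from resolve leaves every unmapped URL to the default behaviour, so only the URLs you list are intercepted.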

    Note that there is also another option besides parse time validation: use the DTD class to load the dtd separately, and use that as a validator.

    This will validate the parsed document against the provided DTD regardless of which DTD (if any) is referenced by the doctype declaration. That can be handy: not every valid XML file is necessarily valid according to the DTD you want.

    Because the DTD only has to be retrieved and parsed once, this should be faster if you’re validating a lot of documents, and (if I’m not mistaken) you won’t run into the no_network problem.
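
A sketch of the separate-validator approach, assuming the DTD is available as a local file or file object. The helper name make_dtd_validator is invented here; etree.DTD, its validate method, and error_log are lxml's own. I have not checked how external modules referenced from inside the DTD are resolved in this mode.

```python
from lxml import etree


def make_dtd_validator(dtd_source):
    """Load the DTD once and return a function that validates documents."""
    dtd = etree.DTD(dtd_source)        # filename or file-like object
    plain_parser = etree.XMLParser()   # no load_dtd, no dtd_validation

    def validate(xml_source):
        tree = etree.parse(xml_source, plain_parser)
        ok = dtd.validate(tree)
        # Collect messages only on failure.
        errors = [] if ok else [entry.message for entry in dtd.error_log]
        return ok, errors

    return validate
```

The returned function can be called once per produced file, so the DTD is parsed a single time however many documents are checked.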

    Another bonus of this approach: you can even validate your elements/element trees before you’ve serialized them (if your producing tool uses lxml, that is).
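
To illustrate validating before serializing, here is a small sketch with a toy DTD standing in for the real NewsML/NITF one. The element names and the serialize_if_valid helper are made up for the example.

```python
from io import StringIO

from lxml import etree

# Toy DTD for illustration only; a real setup would load the actual DTD file.
TOY_DTD = etree.DTD(StringIO("<!ELEMENT item (title)><!ELEMENT title (#PCDATA)>"))


def serialize_if_valid(root, dtd=TOY_DTD):
    """Serialize an in-memory element only if it validates against the DTD."""
    if not dtd.validate(root):
        messages = "; ".join(entry.message for entry in dtd.error_log)
        raise ValueError(f"element tree is not valid: {messages}")
    return etree.tostring(root)


# Build the tree in memory, validate it, then serialize it.
item = etree.Element("item")
etree.SubElement(item, "title").text = "Headline"
xml_bytes = serialize_if_valid(item)
```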

    A final note: some documents can only be parsed if you have access to the dtd at parse time (unresolvable entities…). Avoid this if you can. (and, although not everyone would agree: avoid doctype declarations altogether if possible).


© 2021 The Archive Base. All Rights Reserved