The Archive Base

Editorial Team
Asked: May 26, 2026

Imagine that you have Nokogiri nodes representing the <a> elements in the following two documents:

<r xmlns:x="foo"><a foo="bar" jim="jam" x:oh="no"><x:b>Hello</x:b></a></r>
<r xmlns:i="foo"><a jim="jam" i:oh="no" foo="bar"><i:b>Hello</i:b></a></r>

The two are equivalent from a DOM standpoint. I’d like to detect this efficiently, but Nokogiri::XML::Node#== just checks object equality. Since Nokogiri 1.5.0 does not yet have support for canonicalization, I can’t just serialize the nodes and compare the strings.

What’s the fastest way to compare two nodes to ensure that their names, attributes, and contents are canonically equivalent?

Answers may rely on features only available in Ruby 1.9.2+, if desired.

Test Cases

ORIG1 = "<a>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b foo='bar' jim='jam'/><a2c/></a2>
  <a3><a3a/><a3b/><a3c>foo</a3c></a3>
</a>"
ORIG2 = "<a>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b jim='jam' foo='bar'/><a2c/></a2>
  <a3><a3a/><a3b/><a3c>foo</a3c></a3>
</a>"
NOTEXT = "<a>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b foo='bar' jim='jam'/><a2c/></a2>
  <a3><a3a/><a3b/><a3c/></a3>
</a>"
EXTRATEXT1 = "<a>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b foo='bar' jim='jam'/><a2c/></a2>
  <a3><a3a/><a3b/><a3c>foobar</a3c></a3>
</a>"
EXTRATEXT2 = "<a>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b foo='bar' jim='jam'/><a2c/></a2>
  <a3><a3a/><a3b>hi</a3b><a3c>foo</a3c></a3>
</a>"
MISSINGNODE = "<a>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b foo='bar' jim='jam'/><a2c/></a2>
  <a3><a3a/><a3b/></a3>
</a>"
EXTRANODE = "<a>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b foo='bar' jim='jam'/><a2c/></a2>
  <a3><a3a/><a3b/><a3c>foo</a3c><a3d/></a3>
</a>"
SWAPNODE = "<a>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b foo='bar' jim='jam'/><a2c/></a2>
  <a3><a3x/><a3b/><a3c>foo</a3c></a3>
</a>"
MISSINGATTRIB = "<a>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b jim='jam'/><a2c/></a2>
  <a3><a3a/><a3b/><a3c>foo</a3c></a3>
</a>"
EXTRAATTRIB1 = "<a>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b foo='bar' jim='jam' kits='meow'/><a2c/></a2>
  <a3><a3a/><a3b/><a3c>foo</a3c></a3>
</a>"
EXTRAATTRIB2 = "<a>
  <a1><a1a/><a1b/><a1c kits='meow'/></a1>
  <a2><a2a/><a2b foo='bar' jim='jam'/><a2c/></a2>
  <a3><a3a/><a3b/><a3c>foo</a3c></a3>
</a>"
SWAPATTRIB1 = "<a>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b foo='bar' jim='zzz'/><a2c/></a2>
  <a3><a3a/><a3b/><a3c>foo</a3c></a3>
</a>"
SWAPATTRIB2 = "<a>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b foo='bar' zzz='jam'/><a2c/></a2>
  <a3><a3a/><a3b/><a3c>foo</a3c></a3>
</a>"
NAMESPACE1 = "<a xmlns:x='foo'>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b foo='bar' jim='jam'/><a2c/></a2>
  <a3><x:a3a/><a3b/><a3c>foo</a3c></a3>
</a>"
NAMESPACE1B = "<a xmlns:z='foo'>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b foo='bar' jim='jam'/><a2c/></a2>
  <a3><z:a3a/><a3b/><a3c>foo</a3c></a3>
</a>"
NAMESPACE1C = "<a xmlns:x='bar'>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b foo='bar' jim='jam'/><a2c/></a2>
  <a3><x:a3a/><a3b/><a3c>foo</a3c></a3>
</a>"
NAMESPACE2 = "<a xmlns:x='foo'>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b foo='bar' x:jim='jam'/><a2c/></a2>
  <a3><a3a/><a3b/><a3c>foo</a3c></a3>
</a>"
NAMESPACE2B = "<a xmlns:z='foo'>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b foo='bar' z:jim='jam'/><a2c/></a2>
  <a3><a3a/><a3b/><a3c>foo</a3c></a3>
</a>"
NAMESPACE2C = "<a xmlns:x='bar'>
  <a1><a1a/><a1b/><a1c/></a1>
  <a2><a2a/><a2b foo='bar' x:jim='jam'/><a2c/></a2>
  <a3><a3a/><a3b/><a3c>foo</a3c></a3>
</a>"


require 'nokogiri'
require 'minitest/autorun'
class NodeEquivalence < MiniTest::Unit::TestCase
  def setup
    @o1 = Nokogiri::XML(ORIG1,&:noblanks).root
  end
  def test_equivalence
    o2 = Nokogiri::XML(ORIG2,&:noblanks).root
    assert @o1 =~ o2, "Equivalent nodes should be equivalent"
    assert o2 =~ @o1, "Equivalent nodes should be equivalent"
  end
  def test_textnodes
    no_text = Nokogiri::XML(NOTEXT,&:noblanks).root
    extra1  = Nokogiri::XML(EXTRATEXT1,&:noblanks).root
    extra2  = Nokogiri::XML(EXTRATEXT2,&:noblanks).root
    refute @o1 =~ no_text, "Notice missing text node child"
    refute no_text =~ @o1, "Notice missing text node child"
    refute @o1 =~ extra1,  "Notice different text in text node"
    refute extra1 =~ @o1,  "Notice different text in text node"
    refute @o1 =~ extra2,  "Notice extra text node"
    refute extra2 =~ @o1,  "Notice extra text node"
  end
  def test_nodes
    missing = Nokogiri::XML(MISSINGNODE,&:noblanks).root
    extra   = Nokogiri::XML(EXTRANODE,&:noblanks).root
    changed = Nokogiri::XML(SWAPNODE,&:noblanks).root
    refute @o1 =~ missing, "Notice missing node"
    refute missing =~ @o1, "Notice missing node"
    refute @o1 =~ extra,   "Notice extra node"
    refute extra =~ @o1,   "Notice extra node"
    refute @o1 =~ changed, "Notice renamed node"
    refute changed =~ @o1, "Notice renamed node"
  end
  def test_attributes
    missing = Nokogiri::XML(MISSINGATTRIB,&:noblanks).root
    extra1  = Nokogiri::XML(EXTRAATTRIB1,&:noblanks).root
    extra2  = Nokogiri::XML(EXTRAATTRIB2,&:noblanks).root
    swap1   = Nokogiri::XML(SWAPATTRIB1,&:noblanks).root
    swap2   = Nokogiri::XML(SWAPATTRIB2,&:noblanks).root
    refute @o1 =~ missing, "Notice missing attribute"
    refute missing =~ @o1, "Notice missing attribute"
    refute @o1 =~ extra1,  "Notice extra attribute"
    refute extra1 =~ @o1,  "Notice extra attribute"
    refute @o1 =~ extra2,  "Notice new attribute"
    refute extra2 =~ @o1,  "Notice new attribute"
    refute @o1 =~ swap1,   "Notice changed attribute value"
    refute swap1 =~ @o1,   "Notice changed attribute value"
    refute @o1 =~ swap2,   "Notice changed attribute name"
    refute swap2 =~ @o1,   "Notice changed attribute name"
  end
  def test_namespaces
    ns1  = Nokogiri::XML(NAMESPACE1,&:noblanks).root
    ns2  = Nokogiri::XML(NAMESPACE2,&:noblanks).root
    ns1b = Nokogiri::XML(NAMESPACE1B,&:noblanks).root
    ns2b = Nokogiri::XML(NAMESPACE2B,&:noblanks).root
    ns1c = Nokogiri::XML(NAMESPACE1C,&:noblanks).root
    ns2c = Nokogiri::XML(NAMESPACE2C,&:noblanks).root
    refute @o1 =~ ns1,  "Notice added node namespace"
    refute ns1 =~ @o1,  "Notice removed node namespace"
    refute @o1 =~ ns2,  "Notice added attribute namespace"
    refute ns2 =~ @o1,  "Notice removed attribute namespace"
    assert ns1 =~ ns1b, "Different namespace names on nodes don't matter"
    assert ns2 =~ ns2b, "Different namespace names on attributes don't matter"
    refute ns1 =~ ns1c, "Notice different namespace hrefs on nodes"
    refute ns2 =~ ns2c, "Notice different namespace hrefs on attributes"
  end
end

1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 6:29 am

    Here’s my current implementation. It is now namespace-aware:

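    class Nokogiri::XML::Node
      # Return true if this node is content-equivalent to other, false otherwise.
      # Namespace prefixes are ignored; namespace hrefs are compared instead.
      def =~(other)
        return true if self == other  # identical node, nothing more to check
        return false unless name == other.name
        stype = node_type; otype = other.node_type
        return false unless stype == otype

        # Attributes: same count, then the same [name, value, namespace href]
        # triples once sorted by name, so source order does not matter.
        sa = attributes; oa = other.attributes
        return false unless sa.length == oa.length
        sa = sa.sort.map{ |n,a| [n,a.value,a.namespace && a.namespace.href] }
        oa = oa.sort.map{ |n,a| [n,a.value,a.namespace && a.namespace.href] }
        return false unless sa == oa

        # Children: same count; text nodes must also carry the same content.
        skids = children; okids = other.children
        return false unless skids.length == okids.length
        return false if stype == TEXT_NODE && (content != other.content)

        # Namespace: both present or both absent, and the hrefs must match.
        sns = namespace; ons = other.namespace
        return false if !sns ^ !ons
        return false if sns && (sns.href != ons.href)

        # Finally, compare the children pairwise in document order.
        skids.to_enum.with_index.all?{ |ski,i| ski =~ okids[i] }
      end
    end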
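
To make the comparison rules easier to follow outside of Nokogiri, here is a minimal sketch of the same idea on plain Ruby stand-ins. `Attr`, `Elem`, `attr_key`, and `equivalent?` are hypothetical names invented for this illustration and are not part of Nokogiri; text nodes are modelled as plain strings. The sketch keys attributes on namespace href as well as name, and the sample data mirrors the two documents from the question.

```ruby
# Hypothetical stand-ins for parsed XML nodes; these are NOT Nokogiri classes.
# ns_href is the namespace URI, or nil when there is no namespace.
Attr = Struct.new(:name, :ns_href, :value)
Elem = Struct.new(:name, :ns_href, :attrs, :children)

# Order-independent key for an element's attributes.
# to_s keeps nil hrefs sortable alongside string hrefs.
def attr_key(elem)
  elem.attrs.map { |a| [a.ns_href.to_s, a.name, a.value] }.sort
end

# Recursively compare two stand-in trees; strings represent text nodes.
def equivalent?(a, b)
  return a == b if a.is_a?(String) || b.is_a?(String)
  return false unless a.name == b.name && a.ns_href == b.ns_href
  return false unless attr_key(a) == attr_key(b)
  return false unless a.children.length == b.children.length
  a.children.zip(b.children).all? { |x, y| equivalent?(x, y) }
end

# The <a> elements from the question: the prefixes (x: vs i:) are gone,
# only the shared href "foo" remains, and the attribute order differs.
doc1 = Elem.new("a", nil,
  [Attr.new("foo", nil, "bar"), Attr.new("jim", nil, "jam"), Attr.new("oh", "foo", "no")],
  [Elem.new("b", "foo", [], ["Hello"])])
doc2 = Elem.new("a", nil,
  [Attr.new("jim", nil, "jam"), Attr.new("oh", "foo", "no"), Attr.new("foo", nil, "bar")],
  [Elem.new("b", "foo", [], ["Hello"])])

puts equivalent?(doc1, doc2)  # prints true
```

Because the key includes the href, two attributes that share a local name but live in different namespaces stay distinct in this sketch.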
    

    Here’s my benchmark code (using the constants from the test cases above):

    require 'benchmark'
    Benchmark.bm(10) do |x|
      N = 1000
      NODES = [
        ORIG1, ORIG2, NOTEXT, EXTRATEXT1, EXTRATEXT2,
        MISSINGNODE, EXTRANODE, SWAPNODE,
        MISSINGATTRIB, EXTRAATTRIB1, EXTRAATTRIB2, SWAPATTRIB1, SWAPATTRIB2,
        NAMESPACE1, NAMESPACE2
      ].map{ |xml| Nokogiri::XML(xml,&:noblanks).root }
      MAIN = NODES.shift
      x.report("Phrogz"){ N.times{
        NODES.each{ |other| MAIN =~ other }
      }}
    end
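
The benchmark compares one node against many others. For that pattern, an alternative that is not part of the implementation above is to reduce each node to a nested-array signature once and then compare signatures with `==`. The sketch below shows the idea on plain Ruby stand-ins; `NodeStub`, `AttrStub`, and `signature` are hypothetical names, and the sample data is modelled on the `a2b` element from the test cases.

```ruby
# Hypothetical stand-ins for parsed XML nodes; these are NOT Nokogiri classes.
NodeStub = Struct.new(:name, :ns_href, :attrs, :children)
AttrStub = Struct.new(:name, :ns_href, :value)

# Reduce a node to nested arrays: namespace href, name, sorted attribute
# triples, then the signatures of the children in document order.
# Strings stand in for text nodes and are returned unchanged.
def signature(node)
  return node if node.is_a?(String)
  [
    node.ns_href.to_s,
    node.name,
    node.attrs.map { |a| [a.ns_href.to_s, a.name, a.value] }.sort,
    node.children.map { |child| signature(child) }
  ]
end

main      = NodeStub.new("a2b", nil,
              [AttrStub.new("foo", nil, "bar"), AttrStub.new("jim", nil, "jam")], [])
reordered = NodeStub.new("a2b", nil,
              [AttrStub.new("jim", nil, "jam"), AttrStub.new("foo", nil, "bar")], [])
changed   = NodeStub.new("a2b", nil,
              [AttrStub.new("foo", nil, "bar"), AttrStub.new("jim", nil, "zzz")], [])

main_sig = signature(main)  # computed once, reused for every comparison
results  = [reordered, changed].map { |other| signature(other) == main_sig }
p results  # prints [true, false]
```

The trade-off is that building a signature always walks the whole tree, whereas a pairwise comparison can stop at the first difference, so this is only likely to help when each node takes part in many comparisons.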
    