I have a document A and want to build a new one B using

Editorial Team

Asked: May 27, 2026 (2026-05-27T08:51:11+00:00)

I have a document A and want to build a new one, B, using A's node values.

Given A looks like this…

<html>
  <head></head>
  <body>
    <div id="section0">
      <h1>Section 0</h1>
      <div>
        <p>Some <b>important</b> info here</p>
        <div>Some unimportant info here</div>
      </div>
    </div>
    <div id="section1">
      <h1>Section 1</h1>
      <div>
        <p>Some <i>important</i> info here</p>
        <div>Some unimportant info here</div>
      </div>
    </div>
  </body>
</html>

When building document B, I use a.at_css("#section#{n} h1").text to grab the data from A's h1 tags, like this:

require 'nokogiri'

a = Nokogiri::HTML(html)

Nokogiri::HTML::Builder.new do |doc|
  ...
  doc.h1 a.at_css("#section#{n} h1").text
  ...
end

So there are three questions:

1. How do I grab the content of <p> tags while preserving the tags inside them?

   Currently, a.at_css("#section#{n} p").text returns plain text, which is not what I need. If I call .to_html or .inner_html instead of .text, the HTML appears escaped, so I get, for example, &lt;p&gt; instead of <p>.

2. Is there a proper way to assign nodes at the document-building stage, so that I don't have to dance with the text method at all? That is, how do I give the doc.h1 node the value of the a.at_css("#section#{n} h1") node while building?

3. What is the benefit of the Nokogiri::Builder.with(...) method? I wonder whether I can make use of it…

1 Answer

Editorial Team · Answer 1 · 2026-05-27T08:51:12+00:00

How do I grab the content of <p> tags preserving tags inside?

Use .inner_html. The entities are not escaped when you read it. They are escaped only when you pass the markup as content to a builder method, as in builder.node_name raw_html. Append the markup to the parent node instead:

require 'nokogiri'
para = Nokogiri.HTML( '<p id="foo">Hello <b>World</b>!</p>' ).at('#foo')

doc = Nokogiri::HTML::Builder.new do |d|
  d.body do
    d.div(id:'content') do
      d.parent << para.inner_html
    end
  end
end

puts doc.to_html
#=> <body><div id="content">Hello <b>World</b>!</div></body>

Is there any known true way of assigning nodes at the document building stage?

Similar to the above, one way is:
```
puts Nokogiri::HTML::Builder.new{ |d| d.body{ d.parent << para } }.to_html
#=> <body><p id="foo">Hello <b>World</b>!</p></body>
```
Voila! The node has moved from one document to the other.

What's the profit of Nokogiri::Builder.with(...) method?

That’s rather unrelated to the rest of your question. As the documentation says:

Create a builder with an existing root object. This is for use when you have an existing document that you would like to augment with builder methods. The builder context created will start with the given root node.

I don’t think it would be useful to you here.

In general, I find the Builder convenient when writing a large number of custom nodes from scratch with a known hierarchy. When not doing that, you may find it simpler to just create a new document and use DOM methods to add nodes as appropriate. It's hard to tell how much of your document will be hard-coded nodes and hierarchy versus procedurally created.

One other suggestion: perhaps you should create a template XML document and then augment that with details from the other, scraped HTML?
