Editorial Team
Asked: June 14, 2026, 11:55 UTC

Ran into some strange behavior with both Loofah and Sanitize while trying to clean


I ran into some strange behavior with both Loofah and Sanitize while cleaning up some HTML fragments: times like “6:30pm” were turning into “30pm”.

Did some investigation and found the following:

Loofah.scrub_fragment("<span>asdfasdf 6:30 pm</span>", :strip).to_html
#=> "<span>asdfasdf 30 pm</span>"
Loofah.scrub_fragment("6:30 pm", :strip).to_html
#=> "6:30 pm"
Loofah.scrub_fragment("<foo>asdfasdf 6&#58;30 pm</foo>", :strip).to_html
#=> "asdfasdf 6:30 pm"
Loofah.scrub_fragment("bar:30 pm", :strip).to_html
#=> "bar:30 pm"
Loofah.scrub_fragment("<span>bar:30 pm</span>", :strip).to_html
#=> "<span>30 pm</span>"
Loofah.scrub_fragment("<span>bar: asdfasdfadsf pm</span>", :strip).to_html
#=> "<span>bar: asdfasdfadsf pm</span>"

This happens with all the Loofah scrubber variants (:prune, etc.) and with Sanitize, so I’m assuming the cause is in code common to both of them. Is there anything special I need to do to escape colons in the input before sanitizing?
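One possible stopgap follows from the third example above, where the colon written as `&#58;` came through intact. The sketch below rewrites colons only in the text between tags. `escape_text_colons` and `COLON_ENTITY` are hypothetical names, not part of Loofah, Sanitize, or Nokogiri, and the approach has not been verified against the affected parser. It relies only on the observation in this question.

```ruby
# Hypothetical helper, not part of Loofah, Sanitize, or Nokogiri.
# Assumption: the parser keeps "&#58;" intact, as the <foo> example suggests.
COLON_ENTITY = "&#58;"

def escape_text_colons(html)
  # split with a capture group keeps the tags; they land on odd indices,
  # so attribute values such as href="http://..." are left untouched.
  html.split(/(<[^>]*>)/).each_with_index.map do |piece, index|
    index.odd? ? piece : piece.gsub(":", COLON_ENTITY)
  end.join
end

puts escape_text_colons("<span>asdfasdf 6:30 pm</span>")
# prints: <span>asdfasdf 6&#58;30 pm</span>
```

The tag regex is deliberately naive: it does not handle `>` inside attribute values, comments, or raw-text elements such as script and style, so it is only suitable for simple fragments like the ones shown here.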

Edit 1
I realize I neglected to mention that I’m using JRuby (jruby 1.7.0 (1.9.3p203)). I’m trying to work out whether the issue is in Nokogiri, which I believe underlies both of these gems.

Edit 2
With some further digging, it looks like this might be an issue in Nokogiri on JRuby (I’m on version 1.5.5 of Nokogiri, for what that’s worth). I checked Nokogiri’s fragment parser on JRuby and on Ruby 1.9.3:

JRuby 1.7.0: Unexpected results

doc = Nokogiri::HTML.fragment("<span>3:30pm</span>")
=> #(DocumentFragment:0x5fbc {
  name = "#document-fragment",
  children = [
    #(Element:0x5fc0 { name = "span", children = [ #(Text "30pm")] })]
  })

Ruby 1.9.3: Expected results

doc = Nokogiri::HTML.fragment("<span>3:30pm</span>")
=> #(DocumentFragment:0x3fc4b102055c {
  name = "#document-fragment",
  children = [
    #(Element:0x3fc4b101fff8 {
      name = "span",
      children = [ #(Text "3:30pm")]
      })]
  })

I will keep digging, but any suggestions are welcome.
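The comparison above can be wrapped in a small probe and run under each interpreter. This is a sketch, not a definitive test: `drops_text_before_colon?` is a hypothetical helper that accepts any callable returning the parsed text, and the live check is skipped when the nokogiri gem is not installed.

```ruby
# Hypothetical helper: true when the parser loses the "3:" from "3:30pm".
# `parser` is any callable that takes an HTML string and returns its text.
def drops_text_before_colon?(parser)
  parser.call("<span>3:30pm</span>") != "3:30pm"
end

begin
  require "nokogiri"
  # Parse the fragment and read back its text content.
  nokogiri_text = ->(html) { Nokogiri::HTML.fragment(html).text }
  puts "engine:   #{RUBY_ENGINE} #{RUBY_VERSION}"
  puts "nokogiri: #{Nokogiri::VERSION}"
  puts "affected: #{drops_text_before_colon?(nokogiri_text)}"
rescue LoadError
  warn "nokogiri is not installed; skipping the live check"
end
```

Running the same file under JRuby and under MRI with the same Nokogiri version would show whether the difference comes from the interpreter-specific backend.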

1 Answer

  1. Editorial Team
    Added an answer on June 14, 2026 at 11:55 am

    I believe this is a regression in Nokogiri. I was able to reproduce your problem and tried it with several versions of Nokogiri.

    It works properly in 1.5.0:

    jruby-1.6.7.2 :002 > gem 'nokogiri', '=1.5.0'
     => true 
    jruby-1.6.7.2 :003 > require 'nokogiri'
     => true 
    jruby-1.6.7.2 :004 > doc = Nokogiri::HTML.fragment("<span>3:30pm</span>")
     => #<Nokogiri::HTML::DocumentFragment:0x7d4 name="#document-fragment" children=[#<Nokogiri::XML::Element:0x7d2 name="span" children=[#<Nokogiri::XML::Text:0x7d0 "3:30pm">]>]> 
    

    It fails in 1.5.1:

    jruby-1.6.7.2 :002 > gem 'nokogiri', '=1.5.1'
     => true 
    jruby-1.6.7.2 :003 > require 'nokogiri'
     => true 
    jruby-1.6.7.2 :004 > doc = Nokogiri::HTML.fragment("<span>3:30pm</span>")
     => #<Nokogiri::HTML::DocumentFragment:0x7d4 name="#document-fragment" children=[#<Nokogiri::XML::Element:0x7d2 name="span" children=[#<Nokogiri::XML::Text:0x7d0 "30pm">]>]> 
    

    Edit:
    It’s important to note that Nokogiri was built around the awesome libxml2 C library, which is really unmatched in features, speed, and ability to handle malformed markup. The JRuby implementation is an attempt to match it using Xerces and NekoHTML. I think they have done a wonderful job making the JRuby implementation almost completely match the functionality (if not the speed) of its MRI counterpart, papering over the differences between two vastly different implementations. That being said, edge cases still crop up from time to time.

    I went ahead and filed a bug report on Nokogiri.
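The version results reported in this thread can be recorded as a small lookup, for example to warn at startup when a JRuby app loads a release that was seen to fail. This is a sketch: `thread_status` and the constants are hypothetical names, and the data covers only the releases mentioned here (1.5.0 worked; 1.5.1 and the asker's 1.5.5 dropped the text). It says nothing about any other release.

```ruby
require "rubygems"

# Results reported in this thread, on JRuby only.
KNOWN_GOOD = Gem::Version.new("1.5.0")
KNOWN_BAD  = %w[1.5.1 1.5.5].map { |v| Gem::Version.new(v) }

# Hypothetical helper: classifies a Nokogiri version string using only
# the data above; anything not mentioned in the thread is :unknown.
def thread_status(version_string)
  version = Gem::Version.new(version_string)
  return :reported_good if version == KNOWN_GOOD
  return :reported_bad  if KNOWN_BAD.include?(version)
  :unknown
end

puts thread_status("1.5.1")
# prints: reported_bad
```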
