The Archive Base


Editorial Team
Asked: May 23, 2026


I have an XML file with malformed HTML in its content.
Since an XML parser cannot handle unclosed HTML tags like <br>, I have wrapped the content in CDATA for saving and parsing.

While parsing, I used documentBuilder.setCoalescing(true); to recover the data from <![CDATA[<br>test<br>data<br>]]> without the CDATA wrapper.

But in the output, < and > are replaced by &lt; and &gt; respectively.

I am expecting this string in the parsed result:

<br>test<br>data<br>

How can I do this? Any ideas?
Thanks in advance!

UPDATE: I have two follow-up questions.

1. Is there any way to convert malformed HTML (e.g. <br>) into parsable XML (e.g. <br/>) via code? If so, will it also handle &nbsp;?

2. Is there any way to convert HTML text to plain text in Java (e.g. <div>test&nbsp;text</div> to test text)?


1 Answer

    Editorial Team
    Added an answer on May 23, 2026 at 2:27 am

    Coalescing is an operation where the contents of CDATA sections (nodes) are converted to text nodes and merged with any adjacent text nodes. Converting a CDATA section to a text node imposes the restriction that the resulting text node consist of valid XML character data. This preserves the structure of the original document: characters that look like markup stay part of the text and do not turn into new nodes.
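To make the merging concrete, here is a minimal sketch using only the JDK's built-in parser. The class name CoalescingDemo and the sample document are made up for illustration; note that in the JDK, setCoalescing is a method of DocumentBuilderFactory, so the documentBuilder variable in the question is assumed to be a factory instance.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CoalescingDemo {
    // Hypothetical sample: plain text followed by a CDATA section.
    static final String XML = "<root>before <![CDATA[<br>test<br>data<br>]]></root>";

    // Parses the sample with coalescing switched on or off.
    static Document parse(boolean coalescing) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setCoalescing(coalescing);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Returns the node names of the root element's children, comma-separated.
    static String childNames(boolean coalescing) {
        NodeList children = parse(coalescing).getDocumentElement().getChildNodes();
        StringBuilder names = new StringBuilder();
        for (int i = 0; i < children.getLength(); i++) {
            if (i > 0) {
                names.append(',');
            }
            names.append(children.item(i).getNodeName());
        }
        return names.toString();
    }

    public static void main(String[] args) {
        System.out.println(childNames(false)); // a text node plus a separate CDATA node
        System.out.println(childNames(true));  // a single merged text node
    }
}
```

Without coalescing the root element has two children (a text node and a CDATA section node); with coalescing there is one text node holding the combined content.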

    As a result, of the five predefined entities (<, >, &, " and '), the first three will be escaped as &lt;, &gt; and &amp;, because leaving them unaltered would change the document structure.
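The sketch below shows where the escaped form appears, under the assumption that the output in the question was produced by writing the DOM back out as XML. It uses the JDK's identity Transformer; the class name EscapeOnSerialize and the sample content are hypothetical. In this sketch the text node itself still holds the raw characters, and the entity references show up in the serialized string.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EscapeOnSerialize {
    // Hypothetical sample: CDATA content containing '<', '>' and '&'.
    static final String XML = "<root><![CDATA[<br>a & b]]></root>";

    // Parses the sample with coalescing enabled, so the CDATA becomes a text node.
    static Document parse() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setCoalescing(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // The character data as stored in the DOM text node.
    static String textContent() {
        return parse().getDocumentElement().getTextContent();
    }

    // The same document written back out as XML text.
    static String serialized() {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(parse()), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(textContent()); // raw characters from the text node
        System.out.println(serialized());  // markup characters written as entity references
    }
}
```

If the raw string is all that is needed, reading it with getTextContent() before any serialization step may be enough.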

    In short, you cannot get the raw markup straight out of the escaped output. If the string you end up with contains &lt; and &gt;, decode it into what you want after parsing the document. Apache Commons Lang has a utility class, StringEscapeUtils, whose unescapeXml method does this.
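If adding Commons Lang is not an option, the decoding step can be approximated with plain JDK string operations. This is a minimal stand-in, not the library's implementation: the class XmlUnescape is hypothetical, and it handles only the five predefined entities, not numeric character references such as &#60;.

```java
class XmlUnescape {
    // Replaces the five predefined XML entities with their literal characters.
    static String unescapeXml(String escaped) {
        return escaped
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                // "&amp;" goes last so that "&amp;lt;" decodes to "&lt;", not "<".
                .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        // prints: <br>test<br>data<br>
        System.out.println(unescapeXml("&lt;br&gt;test&lt;br&gt;data&lt;br&gt;"));
    }
}
```

The ordering matters: decoding &amp; first would turn an escaped ampersand followed by "lt;" into a new entity and decode it a second time.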
