Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 7940137
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 3, 20262026-06-03T23:18:01+00:00 2026-06-03T23:18:01+00:00

I need to convert an XML file in the IOB format. The XML file

  • 0

I need to convert an XML file in the IOB format.

The XML file represents the structure of a Latex-written paper, i.e. with sections and subsections. In this representation, sections are encoded as BODY, then I have a HEADER and then paragraphs or subsections.

Example:

<DIV DEPTH="1"> 
<HEADER ID="H-8"> Practical Results </HEADER>
<P TYPE="TXT"> 
<S ID="S-56" TYPE="TXT"> To assess its performance , <REF REFID="R-12" ID="C-36">Grover et al. 1993</REF> tried various methods . </S> 
<S ID="S-57" TYPE="TXT"> The grammar is defined in metagrammatical formalism which is compiled into a unification-based ` object grammar ' -- a syntactic variant of the Definite Clause Grammar formalism <REF REFID="R-21" ID="C-37">Pereira and Warren 1980</REF> -- containing 84 features and 782 phrase structure rules . </S> 
<DIV DEPTH="2"> 
<HEADER ID="H-9"> Comparing the Parsers </HEADER> 
<P TYPE="TXT"> 
<S ID="S-61" TYPE="TXT"> In the first experiment , the ANLT grammar was loaded and a set of sentences was input to each of the three parsers . </S> 
</P>
<IMAGE ID="I-0"/>
</DIV>

What I want to do is to keep all the text, but convert it to a different format, i.e. I want to remove the BODY structure, and just tag the HEADERs and the text part like this:

Practical/B-Header Results/I-Header ./O 
To/B-Text assess/I-Text its/I-Text performance/I-Text ,/I-Text Grover/I-Text et/I-Text al./I-Text tried/I-Text various/I-Text methods/I-Text ./O 
The/B-Text grammar/I-Text ... ./O

And so on.

I know some DOM parsing in Java (for example, I have been using jdom2 for a little while) but I do not know how to keep the order of the text: for example, I want to fetch the content of the REF tag (which is inside S, look at the example), but the text from its parent extends before and after the REF tag.

Any pointers? Should be fairly simple, but searches like “strip XML tags after certain depth” did not help me 🙁

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-03T23:18:03+00:00Added an answer on June 3, 2026 at 11:18 pm

    I would use an event based xml parser like sTax, sax etc. Then you can keep track of levels, order and other things as you process each tag.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

need to convert an xml file having a tag(dynamicVariable) that has an attribute(name).This xml
I am in a need of programatically convert an Word-XML file into a RTF
I need to convert text file to XML file through C# application. Can any
I have a project where I need to convert a XML file to CSV
I do not need to edit any XML-file or anything, this is only for
I need to convert my existing routes.ini file to an XML file (my host
I have an xml file that is like this: <xml>I need this text<notthistext>blahblah<notthistext></xml> I
I have xml file and I need to convert the text that I get
We are using XSL to convert a XML file into a pipe-delimited format. <?xml
I need to load an XML file and convert the contents into an object-oriented

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.