Asked: May 14, 2026 (2026-05-14T07:06:45+00:00)

I’m working with docx docs, and I need to parse a document into sections

I’m working with docx docs, and I need to parse a document into sections on the basis of headings styled with the “heading 1” style. So if I had a doc like this (markup is pseudocode):

<doc>
<title style>Doc Title</title style>
<heading1>First Section</heading1>
...
<heading1>Second Section</heading1>
...
<heading1>Third Section</heading1>
...
</doc>

I’d want to break this into a doc with four sections, the first being the content that precedes the first section. I figure that this is probably pretty simple once you’re familiar with Open XML, but I am not.

TIA.

1 Answer

Editorial Team · Answer 1 · 2026-05-14T07:06:46+00:00

Wow…not even any views on this question all day. Well, I figured it out and thought I’d share the wealth. I can’t share the code directly, but it’s just three nested loops: one over the paragraphs, then over each paragraph’s properties element, then over the style elements inside it. The XPath for each of those is:

.//w:p
./w:pPr
./w:pStyle

Once you find a paragraph whose properties carry the style you want, you pop back up to the paragraph itself and get its first run (w:r), which will contain the styled text. From there on, it’s just Comp Sci 101 stuff. I think the real breakthrough was to not even try to mess with the Open XML SDK (aside from the IO Packaging stuff), and go straight to XML manipulation.
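
Since the original code can't be posted, here is a rough sketch of the same idea, not the actual implementation. It assumes Python's standard library (zipfile and ElementTree) in place of the .NET packaging classes, that the main document part lives at word/document.xml, and that the heading style ID is "Heading1"; both can differ from file to file. The function names are made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the w: prefix in the XPath expressions.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def read_document_xml(path):
    """Read the main document part from a .docx (assumed at word/document.xml)."""
    with zipfile.ZipFile(path) as package:
        return package.read("word/document.xml")


def paragraph_style(p):
    """Return the style ID from ./w:pPr/w:pStyle, or None if there is none."""
    style = p.find("./w:pPr/w:pStyle", NS)
    return style.get(f"{{{W}}}val") if style is not None else None


def paragraph_text(p):
    """Concatenate the text of every w:t element under the paragraph."""
    return "".join(t.text or "" for t in p.iterfind(".//w:t", NS))


def split_by_heading(document_xml, heading_style="Heading1"):
    """Group paragraph texts into sections, starting a new one at each heading.

    The first section holds whatever precedes the first heading and may be empty.
    """
    root = ET.fromstring(document_xml)
    sections = [[]]
    for p in root.iterfind(".//w:p", NS):
        if paragraph_style(p) == heading_style:
            sections.append([])
        sections[-1].append(paragraph_text(p))
    return sections
```

Calling split_by_heading(read_document_xml("some.docx")) would return one list of paragraph texts per section. Note that .//w:p also matches paragraphs nested inside tables and text boxes, so a real document may need a narrower path.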
