Editorial Team
Asked: May 28, 2026

I am using Word Interop and C# to build a program at work, and one of its features is getting a word count.

This can’t be Word’s own word count, as I need to emulate the word count of a CAT tool used at work.

One of the issues I found is that the CAT tool uses text formatting to split up words. If I have the word “1st” with the “st” superscripted, Word counts one word (as there is nothing separating the two parts), while the CAT tool counts two words because of the change in text format.

The CAT tool keeps track of format changes, and that information breaks the word.

So I could go word by word, character by character, and check every formatting property (font, bold, italic, etc.), but that would be really slow when working with multiple documents, each with thousands of words.

Does anyone know a better solution?

1 Answer

  1. Editorial Team
    Added an answer on May 28, 2026 at 4:39 am

    Cindy from the MSDN forums gave me the answer to this one:

    http://social.msdn.microsoft.com/Forums/en-US/worddev/thread/16fc1fb9-4713-45e5-ae00-76bbaafe0a56

    The approach I’d look at would be to use Document.Content.WordOpenXML to extract the content into a string. The content will be in the Office Open XML “flat package” (Flat OPC) format, meaning it should contain everything.

    You should then be able to “parse” the string to get the information you need.

    If you look at such a string, you should see that all the text is in <w:t> elements. If there’s formatting, then it will break the text into parts, one <w:t> element for each formatting change. So, in addition to extracting all the <w:t> elements, all you’d need to do is check for the punctuation and spaces that otherwise delineate “words” in the text.
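
The parsing step described above could be sketched roughly as follows. This is a minimal illustration, not the CAT tool's actual algorithm: the class name FormatAwareWordCounter and the method CountWords are hypothetical, and it assumes that a "word" is an unbroken sequence of letters or digits. Because Word can also split text into separate runs for reasons unrelated to formatting (revision IDs, proofing marks), the sketch only treats a run boundary as a word break when the run properties (<w:rPr>) of the two neighbouring runs differ.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: counts words in a WordprocessingML string,
// treating a change of run formatting as a word break.
public static class FormatAwareWordCounter
{
    // WordprocessingML main namespace (w:p, w:r, w:rPr, w:t).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static int CountWords(string wordOpenXml)
    {
        XDocument doc = XDocument.Parse(wordOpenXml);
        int total = 0;

        foreach (XElement paragraph in doc.Descendants(W + "body").Descendants(W + "p"))
        {
            XElement previousFormat = null;
            bool firstRun = true;
            bool insideWord = false; // a word never spans paragraphs

            foreach (XElement run in paragraph.Descendants(W + "r"))
            {
                // Skip runs that belong to a nested paragraph (e.g. a text box);
                // the outer loop visits that paragraph on its own.
                if (run.Ancestors(W + "p").First() != paragraph) continue;

                // A difference in run properties counts as a word break.
                XElement format = run.Element(W + "rPr");
                if (!firstRun && !XNode.DeepEquals(previousFormat, format))
                    insideWord = false;
                firstRun = false;
                previousFormat = format;

                foreach (XElement child in run.Elements())
                {
                    if (child.Name == W + "t")
                    {
                        foreach (char c in child.Value)
                        {
                            if (char.IsLetterOrDigit(c))
                            {
                                if (!insideWord) total++; // a new word starts here
                                insideWord = true;
                            }
                            else
                            {
                                insideWord = false; // space or punctuation
                            }
                        }
                    }
                    else if (child.Name == W + "tab" || child.Name == W + "br"
                             || child.Name == W + "cr")
                    {
                        insideWord = false; // tabs and line breaks separate words
                    }
                }
            }
        }
        return total;
    }
}
```

With Word Interop, the input would presumably be the string returned by document.Content.WordOpenXML, where document is an open Microsoft.Office.Interop.Word.Document, so the whole document is fetched in a single COM call and parsed in memory rather than inspected character by character. The word-boundary rules (apostrophes, hyphens, numbers) would still need adjusting to match whatever the CAT tool actually does.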
