I want to read an MS Word document and identify headers, bold words, underlined words, and so on. Is there a way to do this programmatically? I would prefer a suggestion in Java, PHP, or Ruby if possible; otherwise, if some metadata is available, please let me know about that too.
There is a Java API that can do that. I suggest you look at the Apache POI library.
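On the metadata side of the question: a .docx file is a ZIP archive, and the formatting lives as XML in the entry `word/document.xml`. That is the data a library such as Apache POI reads for you. The following is only a rough, dependency-free sketch of reading that metadata directly with the JDK, not POI's API. The class and method names (`DocxFormatScan`, `scan`, `scanFile`) are made up for illustration. It assumes heading paragraphs use style IDs starting with `Heading` (the usual default in English templates, but not guaranteed), and it only sees formatting set directly on a run, not bold or underline inherited from a style. It does not handle the older binary .doc format.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class: scans WordprocessingML for headings, bold and underlined runs.
class DocxFormatScan {
    // Namespace of the w: prefix used in word/document.xml.
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Opens a .docx (a ZIP archive) and scans its main document part.
    static List<String> scanFile(String docxPath) {
        try (ZipFile zip = new ZipFile(docxPath)) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) throw new IllegalStateException("not a .docx: " + docxPath);
            try (InputStream in = zip.getInputStream(entry)) {
                return scan(in);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns entries such as "heading(Heading1): text", "bold: text", "underline: text".
    static List<String> scan(InputStream documentXml) {
        List<String> found = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder().parse(documentXml).getDocumentElement();
            NodeList paragraphs = root.getElementsByTagNameNS(W, "p");
            for (int i = 0; i < paragraphs.getLength(); i++) {
                Element p = (Element) paragraphs.item(i);
                // <w:pStyle w:val="Heading1"/> marks the paragraph style (assumed naming).
                String style = attr(first(p, "pStyle"), "val");
                if (style != null && style.startsWith("Heading")) {
                    found.add("heading(" + style + "): " + text(p));
                }
                NodeList runs = p.getElementsByTagNameNS(W, "r");
                for (int j = 0; j < runs.getLength(); j++) {
                    Element r = (Element) runs.item(j);
                    if (isOn(first(r, "b"))) found.add("bold: " + text(r));
                    Element u = first(r, "u");
                    if (u != null && !"none".equals(attr(u, "val"))) found.add("underline: " + text(r));
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return found;
    }

    // First descendant element with the given local name in the w: namespace, or null.
    static Element first(Element parent, String local) {
        NodeList n = parent.getElementsByTagNameNS(W, local);
        return n.getLength() > 0 ? (Element) n.item(0) : null;
    }

    // Value of a w:-namespaced attribute, or null when the element or attribute is absent.
    static String attr(Element e, String local) {
        if (e == null) return null;
        String v = e.getAttributeNS(W, local);
        return v.isEmpty() ? null : v;
    }

    // A toggle such as <w:b/> is on unless w:val is "false" or "0".
    static boolean isOn(Element toggle) {
        if (toggle == null) return false;
        String v = attr(toggle, "val");
        return v == null || !(v.equals("false") || v.equals("0"));
    }

    // Concatenates the text of all <w:t> elements below the given element.
    static String text(Element e) {
        StringBuilder sb = new StringBuilder();
        NodeList ts = e.getElementsByTagNameNS(W, "t");
        for (int i = 0; i < ts.getLength(); i++) sb.append(ts.item(i).getTextContent());
        return sb.toString();
    }

    public static void main(String[] args) {
        scanFile(args[0]).forEach(System.out::println);
    }
}
```

Running it as `java DocxFormatScan.java report.docx` (the file name is just an example) prints one line per heading, bold run, or underlined run. For anything beyond a quick check, a library like Apache POI is the safer route, since it also resolves styles and supports the old .doc format.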