I have a C# module that extracts information from an HTML file, but my input is an MHT file. How do I go about extracting just the HTML portion of the MHT file?
I tried several tools and libraries that reportedly could extract the contents of an MHT, but almost all of them failed (I found that the provider of the MHT files did not encode some types correctly). I eventually discovered Total Commander, which let me unpack the MHT and extract just the HTML portion. It was a hack, but it got the job done.
It would seem that there are many tools for creating MHTs and few for unpacking them.
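For anyone who would rather stay inside C#, here is a minimal sketch of doing the unpacking by hand. It is not what I used. It assumes the MHT is a single-level MIME multipart document (the layout described in RFC 2557) whose page is the first part labelled `Content-Type: text/html`. `MhtHtmlExtractor` and `ExtractHtml` are hypothetical names, not part of any library, and nested multiparts are not handled. It uses only the base class library and is deliberately lenient: malformed quoted-printable sequences are passed through as-is rather than aborting the extraction.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

public static class MhtHtmlExtractor
{
    // ISO-8859-1 maps bytes 0-255 to chars one-to-one, so raw bytes survive a round trip.
    static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Hypothetical helper: returns the decoded text of the first text/html part, or null.
    public static string ExtractHtml(string mht)
    {
        // Normalize line endings so header/body splitting only has to handle "\n".
        mht = mht.Replace("\r\n", "\n");

        // The boundary is declared in the top-level Content-Type header.
        Match boundary = Regex.Match(mht, "boundary=\"?([^\";\\s]+)", RegexOptions.IgnoreCase);
        if (!boundary.Success) return null;

        string[] parts = mht.Split(new[] { "--" + boundary.Groups[1].Value }, StringSplitOptions.None);
        foreach (string raw in parts)
        {
            // Each part is "headers, blank line, body".
            string part = raw.TrimStart('\n');
            int blank = part.IndexOf("\n\n", StringComparison.Ordinal);
            if (blank < 0) continue;

            string headers = part.Substring(0, blank);
            string body = part.Substring(blank + 2);
            RegexOptions opts = RegexOptions.IgnoreCase | RegexOptions.Multiline;
            if (!Regex.IsMatch(headers, @"^Content-Type:\s*text/html", opts)) continue;

            string transfer = Regex.Match(headers, @"^Content-Transfer-Encoding:\s*(\S+)", opts)
                                   .Groups[1].Value.ToLowerInvariant();
            byte[] bytes;
            if (transfer == "base64") bytes = Convert.FromBase64String(body);
            else if (transfer == "quoted-printable") bytes = DecodeQuotedPrintable(body);
            else bytes = Latin1.GetBytes(body); // 7bit, 8bit, binary, or missing header

            return CharsetOf(headers).GetString(bytes);
        }
        return null;
    }

    static byte[] DecodeQuotedPrintable(string s)
    {
        s = s.Replace("=\n", ""); // "=" at end of line is a soft line break
        var output = new MemoryStream();
        for (int i = 0; i < s.Length; i++)
        {
            if (s[i] == '=' && i + 2 < s.Length && Uri.IsHexDigit(s[i + 1]) && Uri.IsHexDigit(s[i + 2]))
            {
                output.WriteByte(Convert.ToByte(s.Substring(i + 1, 2), 16));
                i += 2;
            }
            else
            {
                output.WriteByte((byte)s[i]); // lenient: copy anything else unchanged
            }
        }
        return output.ToArray();
    }

    static Encoding CharsetOf(string headers)
    {
        Match m = Regex.Match(headers, "charset=\"?([^\";\\s]+)", RegexOptions.IgnoreCase);
        if (!m.Success) return Encoding.UTF8;
        try { return Encoding.GetEncoding(m.Groups[1].Value); }
        catch (ArgumentException) { return Encoding.UTF8; } // unknown or unsupported charset name
    }
}
```

To use it, read the file with a byte-preserving encoding, for example `File.ReadAllText(path, Encoding.GetEncoding("ISO-8859-1"))`, pass the result to `MhtHtmlExtractor.ExtractHtml`, and hand the returned string to the existing HTML module. Falling back to UTF-8 when the declared charset is missing or unsupported is a guess on my part; adjust it if your files use something else.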