Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 371415
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 12, 20262026-05-12T14:06:44+00:00 2026-05-12T14:06:44+00:00

I’ve got a legacy app just starting to misbehave, for whatever reason I’m not

  • 0

I’ve got a legacy app just starting to misbehave, for whatever reason I’m not sure. It generates a bunch of HTML that gets turned into PDF reports by ActivePDF.

The process works like this:

  1. Pull an HTML template from a DB with tokens in it to be replaced (e.g. “~CompanyName~”, “~CustomerName~”, etc.)
  2. Replace the tokens with real data
  3. Tidy the HTML with a simple regex function that property formats HTML tag attribute values (ensures quotation marks, etc, since ActivePDF’s rendering engine hates anything but single quotes around attribute values)
  4. Send off the HTML to a web service that creates the PDF.

Somewhere in that mess, the non-breaking spaces from the HTML template (the  s) are encoding as ISO-8859-1 so that they show up incorrectly as an “” character when viewing the document in a browser (FireFox). ActivePDF pukes on these non-UTF8 characters.

My question: since I don’t know where the problem stems from and don’t have time to investigate it, is there an easy way to re-encode or find-and-replace the bad characters? I’ve tried sending it through this little function I threw together, but it turns it all into gobbledegook doesn’t change anything.

Private Shared Function ConvertToUTF8(ByVal html As String) As String
    Dim isoEncoding As Encoding = Encoding.GetEncoding("iso-8859-1")
    Dim source As Byte() = isoEncoding.GetBytes(html)
    Return Encoding.UTF8.GetString(Encoding.Convert(isoEncoding, Encoding.UTF8, source))
End Function

Any ideas?

EDIT:

I’m getting by with this for now, though it hardly seems like a good solution:

Private Shared Function ReplaceNonASCIIChars(ByVal html As String) As String
    Return Regex.Replace(html, "[^\u0000-\u007F]", " ")
End Function
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-12T14:06:44+00:00Added an answer on May 12, 2026 at 2:06 pm

    Somewhere in that mess, the non-breaking spaces from the HTML template (the  s) are encoding as ISO-8859-1 so that they show up incorrectly as an “” character

    That’d be encoding to UTF-8 then, not ISO-8859-1. The non-breaking space character is byte 0xA0 in ISO-8859-1; when encoded to UTF-8 it’d be 0xC2,0xA0, which, if you (incorrectly) view it as ISO-8859-1 comes out as " ". That includes a trailing nbsp which you might not be noticing; if that byte isn’t there, then something else has mauled your document and we need to see further up to find out what.

    What’s the regexp, how does the templating work? There would seem to be a proper HTML parser involved somewhere if your   strings are (correctly) being turned into U+00A0 NON-BREAKING SPACE characters. If so, you could just process your template natively in the DOM, and ask it to serialise using the ASCII encoding to keep non-ASCII characters as character references. That would also stop you having to do regex post-processing on the HTML itself, which is always a highly dodgy business.

    Well anyway, for now you can add one of the following to your document’s <head> and see if that makes it look right in the browser:

    • for HTML4: <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
    • for HTML5: <meta charset="utf-8">

    If you’ve done that, then any remaining problem is ActivePDF’s fault.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

link Im having trouble converting the html entites into html characters, (&# 8217;) i
I've got a string that has curly quotes in it. I'd like to replace
I'm parsing an RSS feed that has an &#8217; in it. SimpleXML turns this
i got an object with contents of html markup in it, for example: string
I need a function that will clean a strings' special characters. I do NOT
I'm working with an upstream system that sometimes sends me text destined for HTML/XML
Is it possible to replace javascript w/ HTML if JavaScript is not enabled on
I have just tried to save a simple *.rtf file with some websites and
That's pretty much it. I'm using Nokogiri to scrape a web page what has
For some reason, after submitting a string like this Jack’s Spindle from a text

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.