Editorial Team
Asked: June 13, 2026

Sorry to double post. But my earlier post was based on Flex:

Flex TextArea – copy/paste from Word – Invalid unicode characters on xml parsing

But now I’m posting this on the Java side.

The issue is:

Our application has an email feature: it builds an XML string and puts it on a queue. Another application picks it up, parses the XML, and sends out the emails.

We get an XML parser exception when the email text (<BODY>....</BODY>) is copy/pasted from Word:

Invalid character in attribute value BODY (Unicode: 0x1A)

As we use Java as well, I’m trying to remove the invalid characters from the String using:

body = body.replaceAll("‘", "");
body = body.replaceAll("’", "");

//Strip invalid characters

public String stripNonValidXMLCharacters(String in) {
    if (in == null || in.isEmpty()) {
        return ""; // Nothing to strip.
    }
    StringBuilder out = new StringBuilder(in.length()); // Holds the output.
    // Iterate by code point, not by char: a char never exceeds 0xFFFF, so the
    // 0x10000-0x10FFFF test could never match when comparing single chars.
    for (int i = 0; i < in.length(); ) {
        int current = in.codePointAt(i);
        i += Character.charCount(current);
        if ((current == 0x9)
                || (current == 0xA)
                || (current == 0xD)
                || ((current >= 0x20) && (current <= 0xD7FF))
                || ((current >= 0xE000) && (current <= 0xFFFD))
                || ((current >= 0x10000) && (current <= 0x10FFFF))) {
            out.appendCodePoint(current);
        }
    }
    return out.toString();
}

//Strip once more

private String stripNonValidXMLCharacter(String in) {
    if (in == null || ("".equals(in))) {
        return null;
    }
    StringBuffer out = new StringBuffer(in);
    for (int i = 0; i < out.length(); i++) {
        if (out.charAt(i) == 0x1a) {
            out.setCharAt(i, '-');
        }
    }
    return out.toString();
}

//Replace the special characters if any

emailText = emailText.replaceAll("[\\u0000-\\u0008\\u000B\\u000C"
        + "\\u000E-\\u001F"
        + "\\uD800-\\uDFFF\\uFFFE\\uFFFF\\u00C5\\u00D4\\u00EC"
        + "\\u00A8\\u00F4\\u00B4\\u00CC\\u2211]", " ");
emailText = emailText.replaceAll("[\\x00-\\x1F]", "");
emailText = emailText.replaceAll("[\\x00-\\x08\\x0b\\x0c\\x0e-\\x1f]", "");
emailText = emailText.replaceAll("\\p{C}", "");

But they still do not work. Also the XML string starts with:

 <?xml version="1.0" encoding="UTF-8"?>  
                    <EMAILS xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNameSpaceSchemaLocation=".\\SMTPSchema.xsd\">

I think the issue occurs when there are multiple tabs in the Word document. For example:

Text......text
<newLine>
<tab><tab><tab> text...text
<newLine>

The resulting xml string is:

<?xml version="1.0" encoding="UTF-8"?> <EMAILS xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNameSpaceSchemaLocation=".\SMTPSchema.xsd"> <EMAIL SOURCE="t@t.com" DEST="t@t.com" CC="" BCC="t@t.com" SUBJECT="test 61" BODY="As such there was no mechanism constructed to migrate the enrollment user base to Data Collection or to keep security attributes for common users in sync between the two systems.  The purpose of this document is to outline two strategies for bring the user base between the two applications into sync.?  It still is the same.  ** Please note: This e-mail message was sent from a notification-only address that cannot accept incoming e-mail. Please do not reply to this message."/> </EMAILS>

Please note that the “?” is where the multiple tabs are in the Word document. I hope my question is clear and that someone can help resolve the issue.

Thanks
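Since the filters above target U+001A (the ASCII SUB control character) and the parser still reports it, one way to narrow things down might be to log which suspicious code points are actually present right before the XML string is put on the queue. The following is only a minimal sketch: the class name `CodePointDump`, the method `describeSuspicious`, and the sample string are hypothetical and not part of the original application.

```java
import java.util.Locale;

final class CodePointDump {

    // Returns one line per code point that XML 1.0 does not allow:
    // control characters other than tab/LF/CR, lone surrogates, U+FFFE, U+FFFF.
    static String describeSuspicious(String text) {
        StringBuilder report = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            boolean allowedControl = cp == 0x9 || cp == 0xA || cp == 0xD;
            boolean suspicious = (cp < 0x20 && !allowedControl)
                    || (cp >= 0xD800 && cp <= 0xDFFF) // unpaired surrogate
                    || cp == 0xFFFE || cp == 0xFFFF;
            if (suspicious) {
                report.append(String.format(Locale.ROOT, "index %d: U+%04X", i, cp))
                      .append('\n');
            }
            i += Character.charCount(cp);
        }
        return report.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample: a line break, three tabs, then a SUB character.
        String body = "text\n\t\t\t" + (char) 0x1A + " text";
        System.out.print(describeSuspicious(body));
    }
}
```

If the report is empty on the sending side but the receiving parser still fails, the character is probably being introduced somewhere after this point (for example during an encoding conversion), which would explain why string-level filters appear to have no effect.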


1 Answer

  1. Editorial Team
    Added an answer on June 13, 2026 at 6:31 pm

    The invalid (hidden) character came from the UI (the Flex TextArea), so it had to be handled there to keep it from being passed on to Java. I removed it by restricting the allowed characters in the TextArea's changingHandler.
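The fix described here lives in the Flex UI. If a defensive check on the Java side is also wanted, a rough sketch could look like the one below. It drops code points outside the XML 1.0 character range and escapes the rest for use inside a double-quoted attribute such as BODY. Tabs and line breaks are written as character references because an XML parser normalizes literal whitespace in attribute values to spaces. The class name `XmlAttributeEscaper` and its methods are hypothetical, not taken from the original application.

```java
final class XmlAttributeEscaper {

    // True if the code point is allowed by the XML 1.0 Char production.
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Drops characters XML 1.0 cannot carry and escapes the rest so the
    // result can be placed between double quotes in an attribute.
    static String escapeAttribute(String value) {
        if (value == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); ) {
            int cp = value.codePointAt(i);
            i += Character.charCount(cp);
            if (!isXmlChar(cp)) {
                continue; // e.g. 0x1A is silently removed
            }
            switch (cp) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\t': out.append("&#9;");   break;
                case '\n': out.append("&#10;");  break;
                case '\r': out.append("&#13;");  break;
                default:   out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }
}
```

A call such as `"BODY=\"" + XmlAttributeEscaper.escapeAttribute(body) + "\""` would then replace plain string concatenation when the XML string is assembled. Building the document with an XML writer API instead of concatenation would be another option.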
