Question

Asked: May 11, 2026

I’ve been experimenting with various bits of Java code trying to come up with something that will encode a string containing quotes, spaces and ‘exotic’ Unicode characters and produce output that’s identical to JavaScript’s encodeURIComponent function.

My torture test string is: "A" B ± "

If I enter the following JavaScript statement in Firebug:

encodeURIComponent('"A" B ± "');

Then I get:

'%22A%22%20B%20%C2%B1%20%22'

Here’s my little test Java program:

    import java.io.UnsupportedEncodingException;
    import java.net.URLEncoder;

    public class EncodingTest
    {
      public static void main(String[] args) throws UnsupportedEncodingException
      {
        String s = "\"A\" B ± \"";

        // URLEncoder percent-encodes the UTF-8 bytes but turns spaces into '+'
        System.out.println("URLEncoder.encode returns "
          + URLEncoder.encode(s, "UTF-8"));

        // Reinterprets the raw UTF-8 bytes as ISO-8859-1 characters (no percent-encoding)
        System.out.println("getBytes returns "
          + new String(s.getBytes("UTF-8"), "ISO-8859-1"));
      }
    }

This program outputs:

URLEncoder.encode returns %22A%22+B+%C2%B1+%22
getBytes returns "A" B Â± "

Close, but no cigar! What is the best way of encoding a UTF-8 string using Java so that it produces the same output as JavaScript’s encodeURIComponent?

EDIT: I’m using Java 1.4 moving to Java 5 shortly.

1 Answer

score 0 · Answer 1 · 2026-05-11T10:04:29+00:00

Looking at the implementation differences, I see that:

MDC on encodeURIComponent():

literal characters (regex representation): [-a-zA-Z0-9._*~'()!]

Java 1.5.0 documentation on URLEncoder:

literal characters (regex representation): [-a-zA-Z0-9._*]
the space character ' ' is converted into a plus sign '+'.

So basically, to get the desired result, use URLEncoder.encode(s, "UTF-8") and then do some post-processing:

replace all occurrences of '+' with '%20'
replace all occurrences of '%xx' representing any of [~'()!] back to their literal counterparts
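
The two post-processing steps above could be sketched roughly as follows. This is only a minimal illustration, not a tested library routine: the class name UriComponentEncoder and the method name encodeURIComponent are made up for this example, and it assumes URLEncoder emits upper-case hex digits (%7E rather than %7e), which is what the standard implementation does. It sticks to calls that are available from Java 1.4 onward.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; the class and method names are illustrative only.
final class UriComponentEncoder {

    private UriComponentEncoder() {}

    // Encodes s as UTF-8 with URLEncoder, then undoes the differences
    // between URLEncoder and JavaScript's encodeURIComponent.
    static String encodeURIComponent(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8")
                    // URLEncoder writes a space as '+'; a literal '+' in the
                    // input is already "%2B" here, so this only hits spaces.
                    .replaceAll("\\+", "%20")
                    // Restore the characters encodeURIComponent leaves literal.
                    .replaceAll("%21", "!")
                    .replaceAll("%27", "'")
                    .replaceAll("%28", "(")
                    .replaceAll("%29", ")")
                    .replaceAll("%7E", "~");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 not supported");
        }
    }

    public static void main(String[] args) {
        // The torture test string from the question; \u00b1 is the plus-minus sign.
        System.out.println(encodeURIComponent("\"A\" B \u00b1 \""));
        // prints %22A%22%20B%20%C2%B1%20%22
    }
}
```

Replacing after encoding is safe because a literal '%' in the input has already become "%25", so none of the patterns can match text that came from the original string.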
