answers: 1 / hits: 147046 / Tuesday, March 3, 2009

I've been experimenting with various bits of Java code trying to come up with something that will encode a string containing quotes, spaces and exotic Unicode characters and produce output that's identical to JavaScript's encodeURIComponent function.

My torture test string is: "A" B ± "

If I enter the following JavaScript statement in Firebug:

encodeURIComponent('"A" B ± "');

Then I get:

%22A%22%20B%20%C2%B1%20%22

Here's my little test Java program:


import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncodingTest
{
  public static void main(String[] args) throws UnsupportedEncodingException
  {
    String s = "\"A\" B ± \"";
    System.out.println("URLEncoder.encode returns "
      + URLEncoder.encode(s, "UTF-8"));

    System.out.println("getBytes returns "
      + new String(s.getBytes("UTF-8"), "ISO-8859-1"));
  }
}

This program outputs:

URLEncoder.encode returns %22A%22+B+%C2%B1+%22
getBytes returns "A" B Â± "

Close, but no cigar! What is the best way of encoding a UTF-8 string using Java so that it produces the same output as JavaScript's encodeURIComponent?

EDIT: I'm using Java 1.4, moving to Java 5 shortly.



Looking at the implementation differences, I see that:

MDC on encodeURIComponent():

  • literal characters (regex representation): [-a-zA-Z0-9._*~'()!]

Java 1.5.0 documentation on URLEncoder:

  • literal characters (regex representation): [-a-zA-Z0-9._*]

  • the space character is converted into a plus sign +.

So basically, to get the desired result, use URLEncoder.encode(s, "UTF-8") and then do some post-processing:

  • replace all occurrences of + with %20

  • replace all occurrences of %xx representing any of [~'()!] back to their literal counterparts
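
The two post-processing steps above can be sketched roughly as follows. This is a minimal sketch, not a tested drop-in: the class name UriComponentEncoder and its encode method are made up for illustration, and it sticks to String.replaceAll so it should compile on Java 1.4 as well as Java 5. It assumes well-formed input and does not try to mimic how encodeURIComponent reacts to unpaired surrogates.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class; the name is illustrative, not a library API.
final class UriComponentEncoder {

    private UriComponentEncoder() {}

    /** Encodes s roughly the way JavaScript's encodeURIComponent would. */
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8")
                    .replaceAll("\\+", "%20")  // URLEncoder emits '+' for a space
                    .replaceAll("%21", "!")    // restore the characters that
                    .replaceAll("%27", "'")    // encodeURIComponent leaves literal
                    .replaceAll("%28", "(")
                    .replaceAll("%29", ")")
                    .replaceAll("%7E", "~");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 not supported: " + e.getMessage());
        }
    }
}
```

With the torture test string from the question, UriComponentEncoder.encode("\"A\" B \u00B1 \"") should return %22A%22%20B%20%C2%B1%20%22, matching the Firebug result. The blanket replacements are safe because URLEncoder escapes a literal + as %2B and a literal % as %25, so every + in its output came from a space and every %21, %27, %28, %29 or %7E is a real escape.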

[#99895] Thursday, February 26, 2009