URL encoding in Java.

Here is how I encode URLs in Java.

  1. Split the URL into its structural parts. java.net.URL can do this.
  2. Encode each part according to its own rules.
  3. Use IDN.toASCII(putDomainNameHere) to Punycode-encode the host name.
  4. Use java.net.URI.toASCIIString() to percent-encode the rest. It normalizes non-ASCII characters to Unicode NFC before encoding them (NFKC would be better). For more info see: How to encode properly this URL
    import java.net.IDN;
    import java.net.URI;
    import java.net.URL;

    URL url = new URL("http://search.barnesandnoble.com/booksearch/first book.pdf");
    URI uri = new URI(url.getProtocol(), url.getUserInfo(), IDN.toASCII(url.getHost()), url.getPort(), url.getPath(), url.getQuery(), url.getRef());
    String correctEncodedURL = uri.toASCIIString();
    System.out.println(correctEncodedURL);

    Prints:

    http://search.barnesandnoble.com/booksearch/first%20book.pdf
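
To try the steps above as a complete program, here is a minimal sketch that wraps them in one method. The class name UrlEncodingSketch, the method name encodeUrl, and the second sample URL (bücher.example) are my own placeholders, not part of any library. One caveat I am sure of: the multi-argument URI constructors always quote the percent character, so this approach is meant for raw, not-yet-encoded input. An already encoded URL would be encoded twice.

```java
import java.net.IDN;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

public class UrlEncodingSketch {

    // Hypothetical helper: splits a raw URL, Punycode-encodes the host,
    // and lets java.net.URI percent-encode the remaining parts.
    @SuppressWarnings("deprecation") // new URL(String) is deprecated in recent JDKs
    static String encodeUrl(String rawUrl) throws MalformedURLException, URISyntaxException {
        // Step 1: split the URL into its structural parts.
        URL url = new URL(rawUrl);

        // Step 3: Punycode-encode the host name (skipped if there is no host).
        String host = url.getHost();
        String asciiHost = host.isEmpty() ? host : IDN.toASCII(host);

        // Steps 2 and 4: the multi-argument constructor quotes characters that
        // are illegal in each part; toASCIIString() then percent-encodes any
        // remaining non-ASCII characters as UTF-8 bytes.
        URI uri = new URI(url.getProtocol(), url.getUserInfo(), asciiHost,
                url.getPort(), url.getPath(), url.getQuery(), url.getRef());
        return uri.toASCIIString();
    }

    public static void main(String[] args) throws Exception {
        // Same input as above; prints
        // http://search.barnesandnoble.com/booksearch/first%20book.pdf
        System.out.println(encodeUrl("http://search.barnesandnoble.com/booksearch/first book.pdf"));

        // Made-up URL with a non-ASCII host and path (written with Unicode
        // escapes so the source file encoding does not matter).
        System.out.println(encodeUrl("http://b\u00fccher.example/stra\u00dfe.pdf?q=a b"));
    }
}
```

In the second call the host should come back in its xn-- Punycode form, while the path and query are percent-encoded, which shows why the host has to be handled separately from the other parts.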

 
