Here is how I encode URLs in Java.
- Split URL into structural parts. Use java.net.URL for it.
- Encode each part properly
- Use `IDN.toASCII(putDomainNameHere)` to Punycode-encode the host name!
- Use `java.net.URI.toASCIIString()` to percent-encode the rest. It normalizes Unicode to NFC before encoding (NFKC would be better!). For more info, see: "How to encode properly this URL".

For example:

    URL url = new URL("http://search.barnesandnoble.com/booksearch/first book.pdf");
    URI uri = new URI(url.getProtocol(), url.getUserInfo(), IDN.toASCII(url.getHost()),
            url.getPort(), url.getPath(), url.getQuery(), url.getRef());
    String correctEncodedURL = uri.toASCIIString();
    System.out.println(correctEncodedURL);
Prints
http://search.barnesandnoble.com/booksearch/first%20book.pdf
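
To show the Punycode step doing something, here is a minimal, self-contained sketch of the same recipe with a non-ASCII host and path. The class name `UrlEncodingSketch`, the helper `encode`, and the `bücher.example` URL are my own illustrations, not part of any library. Checked exceptions are wrapped only to keep the example short.

```java
import java.net.IDN;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class UrlEncodingSketch {

    // Hypothetical helper: split the URL with java.net.URL, Punycode-encode
    // the host, then let java.net.URI percent-encode the remaining parts.
    static String encode(String rawUrl) {
        try {
            // java.net.URL parses leniently, so spaces and non-ASCII are accepted here.
            URL url = new URL(rawUrl);
            URI uri = new URI(
                    url.getProtocol(),
                    url.getUserInfo(),
                    IDN.toASCII(url.getHost()), // host name -> Punycode
                    url.getPort(),              // -1 means "no explicit port"
                    url.getPath(),
                    url.getQuery(),
                    url.getRef());
            // Normalizes non-ASCII characters to NFC, then percent-encodes them as UTF-8.
            return uri.toASCIIString();
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException("Cannot encode URL: " + rawUrl, e);
        }
    }

    public static void main(String[] args) {
        // prints http://search.barnesandnoble.com/booksearch/first%20book.pdf
        System.out.println(encode("http://search.barnesandnoble.com/booksearch/first book.pdf"));
        // "\u00FC" is u-umlaut and "\u00E9" is e-acute, escaped to keep the source ASCII-only.
        // prints http://xn--bcher-kva.example/caf%C3%A9%20menu.pdf
        System.out.println(encode("http://b\u00FCcher.example/caf\u00E9 menu.pdf"));
    }
}
```

One caveat: the multi-argument `URI` constructors always quote the `%` character, so input that is already percent-encoded would be encoded a second time. This approach fits raw, unencoded URLs.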