Yihang Ho

It is one of those things that we use frequently, yet we don't know/care how it works.

Background

Characters like / have special meanings in a URI. So what happens if, somehow, my query string contains such characters? This is where percent-encoding comes in. Percent-encoding is a process that converts reserved characters into escape sequences like %2F (the encoding of /). Most programming languages and libraries can handle percent-encoding (and decoding) directly. For example, in JavaScript, we have encodeURIComponent; in Ruby on Rails this is handled automagically by the link_to helper.
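As one more illustration, here is a small sketch using Python's standard library (urllib.parse.quote and unquote). The sample value is made up for the example, and safe="" is passed because quote leaves / unencoded by default.

```python
from urllib.parse import quote, unquote

# A made-up query value containing several reserved characters.
value = "a/b?c=d&e"

# safe="" asks quote() to encode "/" as well (it is left alone by default).
encoded = quote(value, safe="")
print(encoded)           # a%2Fb%3Fc%3Dd%26e

# unquote() reverses the encoding.
print(unquote(encoded))  # a/b?c=d&e
```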

How it is actually done

Percent-encoding is a simple two-step process:

  1. Convert each reserved character to % followed by its ASCII value as two hexadecimal digits. For example, ! becomes %21 and : becomes %3A. The % character itself must be encoded too (as %25), since it introduces every escape. The following is the list of all reserved characters:

    : / ? # [ ] @ ! $ & ' ( ) * + , ; =
    
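  2. Convert each space character to +. (This is the convention for form-encoded query strings, application/x-www-form-urlencoded; RFC 3986 itself encodes a space as %20.)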

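It is not difficult to see that a percent-encoded string can be unambiguously decoded to its original string: a literal + in the input is itself encoded (as %2B), so every + in the output stands for a space, and, provided a literal % is also encoded (as %25), every % in the output starts a two-digit escape.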
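The two steps and their inverse can be sketched in Python as follows. This is a minimal illustration, not a replacement for a library routine: the names percent_encode and percent_decode are hypothetical, only ASCII input is handled, and the sketch also escapes % itself (as %25), which the round trip needs.

```python
# Reserved characters, as listed above.
RESERVED = set(":/?#[]@!$&'()*+,;=")


def percent_encode(text: str) -> str:
    """Encode ASCII text: reserved characters and % become %XX, spaces become +."""
    out = []
    for ch in text:
        if ch == " ":
            out.append("+")
        elif ch in RESERVED or ch == "%":
            # Two uppercase hex digits of the character's ASCII value.
            out.append(f"%{ord(ch):02X}")
        else:
            out.append(ch)
    return "".join(out)


def percent_decode(text: str) -> str:
    """Invert percent_encode (assumes well-formed input)."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "+":
            out.append(" ")
            i += 1
        elif ch == "%":
            # The next two characters are the hex value of the original character.
            out.append(chr(int(text[i + 1:i + 3], 16)))
            i += 3
        else:
            out.append(ch)
            i += 1
    return "".join(out)


original = "a+b = 100%: ok?"
encoded = percent_encode(original)
print(encoded)                  # a%2Bb+%3D+100%25%3A+ok%3F
print(percent_decode(encoded))  # a+b = 100%: ok?
```

Because the literal + in the input becomes %2B, the + characters left in the output can only mean spaces, which is what makes the decoding unambiguous.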

Reference

  1. RFC 3986: Uniform Resource Identifier (URI): Generic Syntax