What Is URL (Uniform Resource Locator) Encoding?
Definition
URL encoding is a mechanism for representing arbitrary data inside a Uniform Resource Identifier (a URI; typically a URL) while using only a narrow set of US-ASCII characters. The encoding exists because URLs and HTTP request parameters often contain characters (or other data) that cannot be represented with that limited set of US-ASCII characters (e.g. control characters or non-ASCII text).
Reserved and unreserved characters
In general, a URI can contain characters that are either reserved or unreserved. Unreserved characters are characters that have no special meaning; they can be displayed as-is and require no special handling. These include uppercase and lowercase letters (A-Z, a-z), decimal digits (0-9), hyphen (-), period (.), underscore (_), and tilde (~).
Reserved characters, on the other hand, are characters that may delimit the URI into sub-components: characters such as / # & and others. The following is the list of all reserved characters: ! # $ & ' ( ) * + , / : ; = ? @ [ ].
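To make the two groups concrete, here is a minimal sketch that classifies a single character using the lists given above. The helper name classifyChar is hypothetical and exists only for this illustration.

```javascript
// Character sets exactly as listed in the text above.
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;
const RESERVED = new Set("!#$&'()*+,/:;=?@[]");

// classifyChar is a hypothetical helper, not part of any standard API.
function classifyChar(ch) {
  if (UNRESERVED.test(ch)) return "unreserved";
  if (RESERVED.has(ch)) return "reserved";
  return "other"; // e.g. space, "%", control or non-ASCII characters
}

console.log(classifyChar("a")); // unreserved
console.log(classifyChar("~")); // unreserved
console.log(classifyChar("#")); // reserved
console.log(classifyChar(" ")); // other
```

Anything that falls into the "other" group always has to be percent-encoded before it can appear in a URI.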
We cannot use reserved characters as-is when they represent data, because this would create ambiguous URIs. For instance, consider URL http://example.com/foo#bar. Does this URL point to an anchor #bar inside resource /foo, or does it point to a resource /foo#bar, that is, a resource whose name contains character #? Without URL encoding it would be impossible to tell.
We resolve such ambiguities by encoding reserved characters when they are used as data. When they are used as delimiters, we leave them as-is.
Percent encoding
To encode reserved characters, we use the percent-encoding scheme. In percent-encoding, each byte is encoded as a character triplet that consists of the percent character % followed by the two hexadecimal digits that represent the byte's numeric value. For instance, %23 is the percent-encoding for the binary octet 00100011, which in US-ASCII corresponds to the character #. Strictly speaking, the percent character (%) isn't reserved, but it serves as the indicator for percent-encoded bytes and therefore requires special handling. Simply put: when it appears as data, it must also be percent-encoded (as %25).
So with percent-encoding, we know that URL http://example.com/foo#bar points to an anchor bar inside resource /foo while http://example.com/foo%23bar points to resource /foo#bar where character # is encoded as %23.
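The following sketch illustrates the triplet rule and the foo#bar example. The helper percentEncodeByte is a hypothetical name introduced here; encodeURIComponent, decodeURIComponent and URL are standard JavaScript built-ins.

```javascript
// percentEncodeByte is a hypothetical helper: one byte (0-255) -> "%XX".
function percentEncodeByte(byte) {
  return "%" + byte.toString(16).toUpperCase().padStart(2, "0");
}

console.log(percentEncodeByte(0b00100011)); // %23 (the character #)
console.log(percentEncodeByte("%".charCodeAt(0))); // %25

// The built-in encoder does the same when "#" is part of the data.
console.log(encodeURIComponent("foo#bar")); // foo%23bar

// A literal "#" is treated as the delimiter that starts the anchor.
const withAnchor = new URL("http://example.com/foo#bar");
console.log(withAnchor.pathname); // /foo
console.log(withAnchor.hash); // #bar

// An encoded "#" stays inside the resource path.
const withEncodedHash = new URL("http://example.com/foo%23bar");
console.log(decodeURIComponent(withEncodedHash.pathname)); // /foo#bar
```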

Other characters
Percent encoding is also used to represent other characters that are neither reserved nor unreserved. As an example, imagine a GET request containing a non-ASCII string parameter, such as the search query zajec in jež, which is Slovenian for a rabbit and a hedgehog.
In such cases, we have to first encode non-ASCII characters as UTF-8 and then encode each byte of the new string with percent-encoding. So if we send a GET request to the DuckDuckGo search engine containing search query zajec in jež, we generate the following URL: https://duckduckgo.com/?q=zajec%20in%20je%C5%BE
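The two steps (UTF-8 first, then percent-encoding of each byte) can be sketched as follows. The function percentEncodeUtf8 is a hypothetical helper written for this illustration; it uses the standard TextEncoder to obtain the UTF-8 bytes and leaves only unreserved characters untouched.

```javascript
// percentEncodeUtf8 is a hypothetical helper, shown only to spell out the steps.
function percentEncodeUtf8(text) {
  const unreserved = /^[A-Za-z0-9\-._~]$/;
  const utf8 = new TextEncoder();
  let out = "";
  for (const ch of text) {
    if (unreserved.test(ch)) {
      out += ch; // unreserved characters are copied as-is
      continue;
    }
    // Step 1: encode the character as UTF-8 bytes.
    // Step 2: percent-encode every byte.
    for (const byte of utf8.encode(ch)) {
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncodeUtf8("zajec in jež")); // zajec%20in%20je%C5%BE
console.log(encodeURIComponent("zajec in jež")); // zajec%20in%20je%C5%BE
```

In practice the built-in encodeURIComponent is the better choice. Note that it leaves a few reserved characters (! ' ( ) *) unescaped, whereas the sketch above encodes everything that is not unreserved.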
Encoding the space character
You may have seen cases where the space character was encoded as character +. However, the percent-encoding suggests it should be encoded as %20 (in US-ASCII, the space character is 20 hexadecimal or 32 decimal). So what is going on?
Such encodings are typically created by HTML forms. When a user submits an HTML form, the data is encoded as application/x-www-form-urlencoded, a format based on an early version of the URI percent-encoding rules with a number of modifications, such as replacing spaces with +.
Note, however, that using + instead of %20 is valid only when encoding application/x-www-form-urlencoded content, such as the query part of a URL. To make this clearer, consider the following cases.
http://www.example.com/search+script.php?search+query=search+term
In this URL, the resource being requested is search+script.php (the plus character (+) is part of the filename), while the parameter name is search query and its value is search term. In the name of the query parameter and in its value, the + sign is converted to a space, while in the name of the resource, search+script.php, the + sign remains.
http://www.example.com/search+script.php?search%20query=search%20term
This case is identical to the example above. The difference, using %20 instead of the + sign in the parameter name and value, is only superficial. Both URLs point to the same resource, search+script.php, and they contain the same parameters.
http://www.example.com/search%20script.php?search%20query=search%20term
This example, however, is different. Here the resource name contains the actual space character, so the name of the requested resource is search script.php. The request parameter names and values remain the same as above. Consequently this URL is different from those above.
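The three cases can be checked with the standard URL and URLSearchParams APIs, which apply the form-encoding rules to the query part only. This is a small sketch; the variable names are chosen just for this illustration.

```javascript
// URLSearchParams serializes with application/x-www-form-urlencoded rules.
const params = new URLSearchParams({ "search query": "search term" });
console.log(params.toString()); // search+query=search+term

// encodeURIComponent follows plain percent-encoding instead.
console.log(encodeURIComponent("search term")); // search%20term

const first = new URL("http://www.example.com/search+script.php?search+query=search+term");
const second = new URL("http://www.example.com/search+script.php?search%20query=search%20term");
const third = new URL("http://www.example.com/search%20script.php?search%20query=search%20term");

for (const url of [first, second, third]) {
  // In the path, "+" is kept literally; in the query, "+" and "%20" both mean a space.
  console.log(decodeURIComponent(url.pathname), "|", url.searchParams.get("search query"));
}
// /search+script.php | search term
// /search+script.php | search term
// /search script.php | search term
```

One consequence worth keeping in mind: decodeURIComponent does not turn + into a space, so a form-encoded query string should be decoded with URLSearchParams rather than with decodeURIComponent alone.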
A URL encoder
The application below performs URL encoding and decoding on arbitrary strings. Feel free to test it out (HTML).
Input <br>
<input type="text" name="input" id="input"><br><br>
Output <br>
<input type="text" name="encoded" id="encoded">
<script>
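// References to the two text fields; assigned once the DOM is ready.
let input = null;
let encoded = null;
document.addEventListener("DOMContentLoaded", () => {
  input = document.querySelector("#input");
  encoded = document.querySelector("#encoded");
  // Typing in either field updates the other one.
  input.onkeyup = encode;
  encoded.onkeyup = decode;
});
// Percent-encode the plain text and show it in the output field.
function encode() {
  encoded.value = encodeURIComponent(input.value);
}
// Decode the percent-encoded text back into the input field.
function decode() {
  try {
    input.value = decodeURIComponent(encoded.value);
  } catch (error) {
    // decodeURIComponent throws a URIError on malformed sequences such as "%E0%A4%A".
    input.value = "Invalid URI string";
  }
}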
</script>