URL Encoding: What Gets Percent-Encoded and Why
A URL can only safely contain a limited set of characters. Everything else — spaces, accented letters, &, ?, #, and friends — has to be percent-encoded: replaced with a % followed by the byte's two-digit hex value. Get the rules wrong and your query parameters silently break, your & splits a value into two, or a + turns into a space where you didn't want one.
What percent-encoding actually does
Each unsafe character is converted to its UTF-8 bytes, and each byte becomes %XX in hexadecimal:
space → %20
& → %26
= → %3D
? → %3F
# → %23
é → %C3%A9 (two bytes in UTF-8)
That last one matters: non-ASCII characters can become multiple %XX pairs, because they're more than one byte in UTF-8. é isn't %E9 — it's %C3%A9.
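To make the byte-level mechanics concrete, here is a minimal sketch of a percent-encoder. The function name percentEncode is hypothetical (it is not a built-in), and it assumes the strictest policy: encode everything except the RFC 3986 unreserved characters. Built-in functions such as encodeURIComponent leave a few extra characters alone, so treat this as an illustration rather than a replacement.

```javascript
// Hypothetical helper for illustration: percent-encode every character
// except the RFC 3986 unreserved set (A-Z a-z 0-9 - _ . ~).
function percentEncode(text) {
  const unreserved = /[A-Za-z0-9\-_.~]/;
  const encoder = new TextEncoder(); // TextEncoder always emits UTF-8
  let out = "";
  for (const ch of text) { // for...of iterates by code point
    if (unreserved.test(ch)) {
      out += ch;
      continue;
    }
    // One %XX pair per UTF-8 byte of the character.
    for (const byte of encoder.encode(ch)) {
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode("a b"));    // "a%20b"
console.log(percentEncode("\u00e9")); // "%C3%A9" (é is two bytes in UTF-8)
console.log(percentEncode("?x=1#y")); // "%3Fx%3D1%23y"
```

The inner loop is the whole idea: a character that takes two bytes in UTF-8 produces two %XX pairs.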
Reserved vs unreserved characters
The URL spec (RFC 3986) splits characters into groups:
- Unreserved: the characters A–Z a–z 0–9 - _ . ~ are always safe and never need encoding.
- Reserved: the characters : / ? # [ ] @ ! $ & ' ( ) * + , ; = have structural meaning in a URL. They are fine when they're doing their job (the / between path segments, the ? before the query) but must be encoded when they appear inside a value.
The whole trick to URL encoding is that last point. A & between two query parameters is structure. A & inside a parameter's value (say, a company name "Tom & Jerry") must become %26, or the parser will think a new parameter started.
?company=Tom%20%26%20Jerry ✓ value is "Tom & Jerry"
?company=Tom & Jerry ✗ parsed as company="Tom ", then a stray " Jerry" parameter with no value
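The same comparison can be reproduced with the standard URLSearchParams parser. This is a small sketch; the variable names are made up for the example, and note that encodeURIComponent also encodes the spaces around the & as %20.

```javascript
const company = "Tom & Jerry";

// Encoding the value keeps the & from being read as a separator.
const encodedQuery = "company=" + encodeURIComponent(company);
console.log(encodedQuery); // "company=Tom%20%26%20Jerry"

const parsedGood = new URLSearchParams(encodedQuery);
console.log(parsedGood.get("company")); // "Tom & Jerry"

// Dropping the raw value in lets the parser split on the &.
const parsedBad = new URLSearchParams("company=" + company);
console.log(JSON.stringify(parsedBad.get("company"))); // "Tom " (trailing space)
console.log([...parsedBad.keys()]); // [ 'company', ' Jerry' ]
```

Nothing throws in the broken case. The value is silently truncated, which is why this bug tends to surface late.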
Why spaces are sometimes %20 and sometimes +
This confuses everyone. There are two encoding contexts:
- In the path and most of a URL, a space is %20.
- In a query string using application/x-www-form-urlencoded (the classic HTML form format), a space is +, and a literal + must be encoded as %2B.
So %20 and + can both mean "space," depending on where you are. If you build a query string by hand and your spaces come out as +, that's why — and it's correct for form-encoded data. The flip side: a real + (like in a phone number +1...) must be %2B, or it'll be read as a space.
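A short sketch of both contexts side by side, using the standard URLSearchParams (which follows the form-encoded rules) and encodeURIComponent (which does not). The sample values and variable names are invented for the example.

```javascript
const value = "a b+c";

// Form-encoded serialization: space becomes +, a literal + becomes %2B.
const formEncoded = new URLSearchParams({ q: value }).toString();
console.log(formEncoded); // "q=a+b%2Bc"

// encodeURIComponent always uses %20 for a space.
const componentEncoded = encodeURIComponent(value);
console.log(componentEncoded); // "a%20b%2Bc"

// An unencoded + in a query string is read back as a space.
const phoneBroken = new URLSearchParams("phone=+15550100").get("phone");
const phoneOk = new URLSearchParams("phone=%2B15550100").get("phone");
console.log(JSON.stringify(phoneBroken)); // " 15550100"
console.log(phoneOk);                     // "+15550100"

// decodeURIComponent does not apply the form rule: + stays +.
console.log(decodeURIComponent("a+b")); // "a+b"
```

The last line is a common source of mismatches: if one side serializes with form rules and the other decodes with decodeURIComponent, spaces come back as literal plus signs.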
The encodeURI vs encodeURIComponent trap
JavaScript gives you two functions, and picking the wrong one is the most common URL bug:
- encodeURIComponent encodes a single value: it escapes & = ? / and the rest. Use this for each query parameter value or path segment.
- encodeURI encodes a whole URL: it deliberately leaves & = ? / : alone because they're structural.
encodeURIComponent("a&b=c"); // "a%26b%3Dc" ← right for a value
encodeURI("a&b=c"); // "a&b=c" ← leaves & = alone
Rule of thumb: if you're assembling a parameter value, you almost always want encodeURIComponent. Reach for encodeURI only when you have a complete URL you want to make safe without breaking its structure.
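Here is a sketch of how the wrong choice plays out when assembling a full URL. The host example.com and the search term are placeholders chosen for the example, not anything from a real API.

```javascript
const searchBase = "https://example.com/search";
const searchTerm = "rock & roll/jazz?";

// Encode the value on its own, then attach it to the structure.
const safeUrl = `${searchBase}?q=${encodeURIComponent(searchTerm)}`;
console.log(safeUrl);
// "https://example.com/search?q=rock%20%26%20roll%2Fjazz%3F"

// Running encodeURI over the assembled URL leaves & / ? untouched,
// so the & inside the value still splits the query.
const brokenUrl = encodeURI(`${searchBase}?q=${searchTerm}`);
console.log(brokenUrl);
// "https://example.com/search?q=rock%20&%20roll/jazz?"

console.log(new URL(safeUrl).searchParams.get("q"));   // "rock & roll/jazz?"
console.log(JSON.stringify(new URL(brokenUrl).searchParams.get("q"))); // "rock "
```

An alternative that avoids manual string assembly is to create a URL object and call searchParams.set, which encodes the value using the form rules.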
Decoding and double-encoding
Decoding reverses the process: %26 → &. Watch for double-encoding — if a % itself got encoded to %25, then %2520 is really an encoded %20, which decodes to the literal text "%20", not a space. When a value comes back looking like %2520 or Tom%2520Jerry, something encoded it twice.
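Double-encoding is easy to reproduce, which also makes it easy to recognize. In the sketch below, looksDoubleEncoded is a hypothetical helper and only a heuristic: a value that legitimately contains the literal text "%20" also encodes to %2520, so a match is a hint, not proof.

```javascript
const once = encodeURIComponent("Tom Jerry");
const twice = encodeURIComponent(once); // the % itself becomes %25
console.log(once);  // "Tom%20Jerry"
console.log(twice); // "Tom%2520Jerry"

// One decode only peels off one layer.
console.log(decodeURIComponent(twice));                     // "Tom%20Jerry"
console.log(decodeURIComponent(decodeURIComponent(twice))); // "Tom Jerry"

// Hypothetical heuristic: %25 followed by two hex digits suggests
// that an already-encoded %XX sequence was encoded again.
function looksDoubleEncoded(text) {
  return /%25[0-9A-Fa-f]{2}/.test(text);
}

console.log(looksDoubleEncoded(twice)); // true
console.log(looksDoubleEncoded(once));  // false
```

The durable fix is usually to find the layer that encodes a second time and remove it, rather than decoding twice on the receiving end.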
For a one-off — inspecting a messy redirect URL, decoding a parameter from a log, or encoding a value to drop into a query string — paste it into the URL encoder/decoder. It runs locally in your browser and handles the %XX math for you.
Takeaways
- Unreserved characters (A–Z a–z 0–9 - _ . ~) never need encoding; reserved characters need it inside values.
- Non-ASCII becomes multiple %XX bytes via UTF-8.
- Space is %20 in paths, + in form-encoded query strings; a literal + is %2B.
- Use encodeURIComponent for values, encodeURI for whole URLs.
- If you see %25 showing up unexpectedly, suspect double-encoding.