# URL Encoding: What Gets Percent-Encoded and Why

A URL can only safely contain a limited set of characters. Anything outside that set (spaces, accented letters), plus any reserved character such as `&`, `?`, or `#` that appears inside a value, has to be **percent-encoded**: replaced with a `%` followed by the two-digit hex value of each of its bytes. Get the rules wrong and your query parameters silently break: an `&` splits a value in two, or a `+` turns into a space where you didn't want one.

## What percent-encoding actually does

Each unsafe character is converted to its UTF-8 bytes, and each byte becomes `%XX` in hexadecimal:

```
space  → %20
&      → %26
=      → %3D
?      → %3F
#      → %23
é      → %C3%A9   (two bytes in UTF-8)
```

That last one matters: non-ASCII characters can become **multiple** `%XX` pairs, because they're more than one byte in UTF-8. `é` isn't `%E9` — it's `%C3%A9`.
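
To make the byte-by-byte rule concrete, here is a minimal sketch that encodes *every* byte of a string, safe or not. The helper name `percentEncodeAll` is made up for illustration; real code would normally call `encodeURIComponent`, which skips the characters that don't need encoding.

```js
// Hypothetical helper: percent-encode every UTF-8 byte of a string.
function percentEncodeAll(text) {
  const bytes = new TextEncoder().encode(text); // TextEncoder always produces UTF-8
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

const space = percentEncodeAll(" ");        // "%20"
const eAcute = percentEncodeAll("\u00E9");  // "%C3%A9" (é is two bytes in UTF-8)
const builtin = encodeURIComponent("\u00E9"); // "%C3%A9" (the built-in agrees)

console.log(space, eAcute, builtin);
```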

## Reserved vs unreserved characters

RFC 3986, the URI syntax spec that URLs follow, splits characters into two groups:

- **Unreserved** — `A–Z a–z 0–9 - _ . ~`. These are always safe and never need encoding.
- **Reserved** — characters with *structural meaning* in a URL: `: / ? # [ ] @ ! $ & ' ( ) * + , ; =`. These are fine when they're doing their job (the `/` between path segments, the `?` before the query) but **must be encoded when they appear inside a value**.

The whole trick to URL encoding is that last point. A `&` between two query parameters is structure. A `&` *inside* a parameter's value (say, a company name "Tom & Jerry") must become `%26`, or the parser will think a new parameter started.

```
?company=Tom%20%26%20Jerry     ✓ value is "Tom & Jerry"
?company=Tom%20&%20Jerry       ✗ parsed as company="Tom ", then a stray parameter named " Jerry"
```
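
You can watch a parser make this mistake. The sketch below feeds both forms to the standard `URLSearchParams` class; the variable names are just for illustration.

```js
// Raw "&" inside the value: the parser sees two parameters.
const broken = new URLSearchParams("company=Tom%20&%20Jerry");
const brokenValue = broken.get("company"); // "Tom " (note the trailing space)
const brokenKeys = [...broken.keys()];     // ["company", " Jerry"]

// "&" encoded as %26: one parameter, value intact.
const intact = new URLSearchParams("company=Tom%20%26%20Jerry");
const intactValue = intact.get("company"); // "Tom & Jerry"

console.log(brokenValue, brokenKeys, intactValue);
```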

## Why spaces are sometimes %20 and sometimes +

This confuses everyone. There are two encoding contexts:

- In the **path and most of a URL**, a space is `%20`.
- In a **query string using `application/x-www-form-urlencoded`** (the classic HTML form format), a space is `+`, and a literal `+` must be encoded as `%2B`.

So `%20` and `+` can *both* mean "space," depending on where you are. If a form submission or a form-encoding API hands you a query string with `+` where the spaces were, that's why, and it's correct for form-encoded data. The flip side: a real `+` (like in a phone number `+1...`) **must** be `%2B` in that context, or it'll be read as a space.
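
In JavaScript the two contexts map onto different APIs. As a rough sketch (the parameter names and sample values are invented), `URLSearchParams` speaks the form-encoded dialect, while `encodeURIComponent` and `decodeURIComponent` only know about `%XX`:

```js
// Form-encoded serialization: space becomes "+", a literal "+" becomes "%2B".
const params = new URLSearchParams({ q: "rock roll", phone: "+1 555 0100" });
const formEncoded = params.toString(); // "q=rock+roll&phone=%2B1+555+0100"

// encodeURIComponent never emits "+" for a space; it uses %20.
const componentEncoded = encodeURIComponent("+1 555 0100"); // "%2B1%20555%200100"

// Parsing an unencoded "+" as form data silently turns it into a space.
const parsedPlus = new URLSearchParams("phone=+15550100").get("phone"); // " 15550100"

// decodeURIComponent does NOT treat "+" as a space.
const rawDecode = decodeURIComponent("rock+roll"); // "rock+roll"

console.log(formEncoded, componentEncoded, parsedPlus, rawDecode);
```

Mixing the two is a common source of stray `+` signs: a value serialized as form data but decoded with `decodeURIComponent` keeps its `+` characters.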

## The encodeURI vs encodeURIComponent trap

JavaScript gives you two functions, and picking the wrong one is the most common URL bug:

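- **`encodeURIComponent`** encodes a *single value*: it escapes `& = ? / : # +` and every other reserved character except `! ' ( ) *`. Use this for each query parameter value or path segment.
- **`encodeURI`** encodes a *whole URL*: it deliberately leaves `& = ? / : #` and most other reserved characters alone because they're structural, which also means it won't protect an `&` or `=` that sits inside a value.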

```js
encodeURIComponent("a&b=c");  // "a%26b%3Dc"   ← right for a value
encodeURI("a&b=c");           // "a&b=c"        ← leaves & = alone
```

Rule of thumb: if you're assembling a parameter value, you almost always want **`encodeURIComponent`**. Reach for `encodeURI` only when you have a complete URL you want to make safe without breaking its structure.
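
Here is how the difference plays out when assembling a real URL. This is a sketch under assumptions: the `example.com` address is a placeholder, and `encodeStrict` is a hypothetical helper for the cases where you also want `! ' ( ) *` escaped, which `encodeURIComponent` leaves as-is.

```js
const base = "https://example.com/search"; // placeholder URL
const term = "Tom & Jerry";

// Encode the value, then splice it into the URL: the "&" is protected.
const good = `${base}?q=${encodeURIComponent(term)}`;
// "https://example.com/search?q=Tom%20%26%20Jerry"

// Encoding the assembled URL instead leaves the "&" as structure.
const bad = encodeURI(`${base}?q=${term}`);
// "https://example.com/search?q=Tom%20&%20Jerry"

// Hypothetical helper: also escape the five reserved characters
// that encodeURIComponent does not touch.
function encodeStrict(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

const strict = encodeStrict("it's (ok)!"); // "it%27s%20%28ok%29%21"

console.log(good, bad, strict);
```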

## Decoding and double-encoding

Decoding reverses the process: `%26` → `&`. Watch for **double-encoding** — if a `%` itself got encoded to `%25`, then `%2520` is really an *encoded* `%20`, which decodes to the literal text "`%20`", not a space. When a value comes back looking like `%2520` or `Tom%2520Jerry`, something encoded it twice.
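
Double-encoding is easy to reproduce, which helps when you are trying to recognize it in the wild. The sketch below also includes a hypothetical `tryDecode` helper, since `decodeURIComponent` throws a `URIError` on malformed input such as a lone `%`.

```js
const once = encodeURIComponent("Tom Jerry"); // "Tom%20Jerry"
const twice = encodeURIComponent(once);       // "Tom%2520Jerry" (the "%" became "%25")

const decodedOnce = decodeURIComponent(twice);        // "Tom%20Jerry" (still encoded)
const decodedTwice = decodeURIComponent(decodedOnce); // "Tom Jerry"

// Hypothetical helper: decode, or return null when the input is malformed.
function tryDecode(value) {
  try {
    return decodeURIComponent(value);
  } catch (err) {
    if (err instanceof URIError) return null;
    throw err;
  }
}

const malformed = tryDecode("100%"); // null: "%" is not followed by two hex digits

console.log(once, twice, decodedOnce, decodedTwice, malformed);
```

Decoding in a loop until the text stops changing is tempting but risky: a value that legitimately contains the text `%20` would be corrupted. Fixing the layer that encodes twice is usually the safer route.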

For a one-off — inspecting a messy redirect URL, decoding a parameter from a log, or encoding a value to drop into a query string — paste it into the [URL encoder/decoder](/tools/encoding/url-encode). It runs locally in your browser and handles the `%XX` math for you.

## Takeaways

- Unreserved characters (`A–Z a–z 0–9 - _ . ~`) never need encoding; reserved characters need it **inside values**.
- Non-ASCII characters become several `%XX` pairs, one per UTF-8 byte.
- Space is `%20` in paths, `+` in form-encoded query strings; a literal `+` is `%2B`.
- Use **`encodeURIComponent`** for values, **`encodeURI`** for whole URLs.
- If you see `%25` showing up unexpectedly, suspect double-encoding.
