URL Encoding Explained: Percent Encoding, Query Strings, and Special Characters
You've seen it: a URL with strange %20, %3D, or %E2%9C%93 sequences scattered through it. That's URL encoding — also known as percent encoding — and it's how the web safely carries spaces, accented characters, and arbitrary bytes inside URLs that were designed for plain ASCII.
Get URL encoding wrong and you'll spend hours debugging "missing parameter" bugs, double-encoded redirects, and search forms that mangle every + a user types. Get it right once and the entire web suddenly makes sense.
Why URL Encoding Exists
The current URI spec (RFC 3986) defines a small set of "unreserved" characters that never need encoding:
A-Z a-z 0-9 - _ . ~
Plus a handful of "reserved" characters that have special structural meaning: /, ?, #, &, =, :, +, and a few others. Everything else (spaces, accented letters, emoji, raw binary) must be percent-encoded so that proxies, web servers, and URL parsers don't misinterpret it. Reserved characters need the same treatment whenever they appear as data rather than as URL syntax.
How Percent Encoding Works
The rule is simple: take a character's UTF-8 byte representation, and write each byte as % followed by two hex digits.
| Character | UTF-8 bytes | Percent encoded |
|---|---|---|
| space | 0x20 | %20 |
| ! | 0x21 | %21 |
| ? | 0x3F | %3F |
| = | 0x3D | %3D |
| & | 0x26 | %26 |
| é | 0xC3 0xA9 | %C3%A9 |
| ✓ (check mark) | 0xE2 0x9C 0x93 | %E2%9C%93 |
| 🔥 (fire emoji) | 0xF0 0x9F 0x94 0xA5 | %F0%9F%94%A5 |
Multi-byte UTF-8 characters become multiple %XX sequences. That's why a single emoji can take up 12 characters in a URL.
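To make the rule concrete, here is a minimal sketch of a percent-encoder written by hand. The function name percentEncode is made up for illustration, and it assumes a runtime with a global TextEncoder (modern browsers and Node.js). In real code you'd reach for the built-in functions instead.

```javascript
// Hypothetical helper: encodes everything outside RFC 3986's unreserved set.
const UNRESERVED = /^[A-Za-z0-9\-_.~]$/;

function percentEncode(text) {
  const encoder = new TextEncoder(); // TextEncoder always emits UTF-8
  let out = "";
  for (const ch of text) {           // for...of walks code points, so emoji stay whole
    if (UNRESERVED.test(ch)) {
      out += ch;                     // unreserved characters pass through untouched
      continue;
    }
    for (const byte of encoder.encode(ch)) {
      // each byte becomes % plus two uppercase hex digits
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

percentEncode("hello world"); // → "hello%20world"
percentEncode("é");           // → "%C3%A9"
percentEncode("🔥");          // → "%F0%9F%94%A5"
```

Note that this sketch is stricter than JavaScript's encodeURIComponent, which also leaves ! * ' ( ) unescaped.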
The Critical Distinction: encodeURI vs encodeURIComponent
JavaScript ships two URL-encoding functions, and confusing them is one of the most frequent sources of URL bugs:
| Function | Use for | Leaves untouched |
|---|---|---|
| encodeURI() | An entire URL | A-Z a-z 0-9 - _ . ! ~ * ' ( ) plus the reserved set ; / ? : @ & = + $ , # |
| encodeURIComponent() | A single URL part (param value, path segment) | Only A-Z a-z 0-9 - _ . ! ~ * ' ( ) |
Concretely:
const url = "https://example.com/search?q=hello world";
encodeURI(url);
// → "https://example.com/search?q=hello%20world"
// (preserves : / ? = because they're URL structure)
encodeURIComponent(url);
// → "https%3A%2F%2Fexample.com%2Fsearch%3Fq%3Dhello%20world"
// (encodes EVERYTHING including colons and slashes)
Rule of thumb: encodeURI is almost never what you want. Use encodeURIComponent on each individual query parameter value, then assemble the URL. Better still: use URLSearchParams (below) and never call either function manually.
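If you do assemble a query string by hand, the rule of thumb looks roughly like this. It's a sketch, not a library: buildSearchUrl is a hypothetical helper and the example.com URL and parameter names are placeholders.

```javascript
// Hypothetical helper: encode each key and value separately, then join.
function buildSearchUrl(base, params) {
  const query = Object.entries(params)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return query ? `${base}?${query}` : base;
}

const searchUrl = buildSearchUrl("https://example.com/search", {
  q: "hello world",
  filter: "name=John&active=true",
});
// → "https://example.com/search?q=hello%20world&filter=name%3DJohn%26active%3Dtrue"

// For contrast: encodeURI leaves & and = alone, so the filter value
// silently splits into two parameters (filter and active).
const brokenUrl = encodeURI("https://example.com/search?filter=name=John&active=true");
// → "https://example.com/search?filter=name=John&active=true"
```

The contrast at the end is the whole point: encodeURI can't know which & is structure and which is data, so it leaves all of them untouched.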
The Modern Way: URLSearchParams
Since browsers and Node.js shipped URLSearchParams, manually escaping query strings is obsolete:
const params = new URLSearchParams({
  q: "hello world",
  filter: "name=John&active=true",
  emoji: "🔥"
});
`https://example.com/search?${params}`
// → "https://example.com/search?q=hello+world&filter=name%3DJohn%26active%3Dtrue&emoji=%F0%9F%94%A5"
Notice URLSearchParams encodes spaces as +, not %20. That's correct for query strings (this is sometimes called "form encoding") but wrong for path segments. Which brings us to…
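One way to keep the two conventions apart is to encode path segments with encodeURIComponent and leave the query to URLSearchParams. The sketch below assumes the global URL class; buildUrl and the sample path segments are invented for illustration.

```javascript
// Hypothetical helper: path segments get %20, query values get form encoding.
function buildUrl(origin, pathSegments, queryParams) {
  // Encode each segment on its own so a "/" inside a segment becomes %2F
  const path = pathSegments.map((segment) => encodeURIComponent(segment)).join("/");
  const url = new URL(`${origin}/${path}`);
  for (const [key, value] of Object.entries(queryParams)) {
    url.searchParams.set(key, value); // URLSearchParams handles the query side
  }
  return url.toString();
}

buildUrl("https://example.com", ["My Documents", "a/b.txt"], { q: "hello world" });
// → "https://example.com/My%20Documents/a%2Fb.txt?q=hello+world"
```

The space shows up as %20 in the path and as + in the query, each correct for its position.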
+ vs %20: The Space Encoding Trap
Spaces have two valid encodings depending on where they appear:
- %20: standard percent encoding. Valid everywhere in a URL.
- +: only valid in the query string portion (after the ?). This convention dates back to HTML form submission (application/x-www-form-urlencoded).
// Both valid, both mean the same query
https://example.com/search?q=hello+world
https://example.com/search?q=hello%20world
// In the path, only %20 is valid
https://example.com/My%20Documents/file.txt ✓
https://example.com/My+Documents/file.txt ✗ (literal "My+Documents")
If a literal plus sign in a value turns into a space at the destination (e.g., a Slack username or email address containing + that arrives with a space in its place), it's because the + got decoded as a space. The fix: encode the + itself as %2B when it's a literal value.
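You can reproduce the trap in a couple of lines. This sketch uses a made-up value, "paul+test", and relies only on the standard URLSearchParams, encodeURIComponent, and decodeURIComponent.

```javascript
// A raw + in a query string is read back as a space by form decoding
const naiveParams = new URLSearchParams("user=paul+test");
const naiveUser = naiveParams.get("user"); // → "paul test"

// Encoding the value first turns + into %2B, which survives the round trip
const safeParams = new URLSearchParams("user=" + encodeURIComponent("paul+test"));
const safeUser = safeParams.get("user"); // → "paul+test"

// decodeURIComponent does NOT apply the form rule: + stays a plus sign
const plainDecode = decodeURIComponent("paul+test"); // → "paul+test"
```

That last line explains a related mismatch: a server that form-decodes and a client that uses decodeURIComponent will disagree about what + means.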
Encoding in Other Languages
Python
from urllib.parse import quote, quote_plus, urlencode
quote("hello world") # → "hello%20world"
quote_plus("hello world") # → "hello+world"
urlencode({"q": "hello world", "lang": "en"})
# → "q=hello+world&lang=en"
Go
import "net/url"
url.QueryEscape("hello world") // → "hello+world"
url.PathEscape("hello world") // → "hello%20world"
values := url.Values{}
values.Add("q", "hello world")
values.Encode() // → "q=hello+world"
Swift
"hello world".addingPercentEncoding(
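    withAllowedCharacters: .urlQueryAllowed
)
// → "hello%20world" (returned as an Optional)
// Caution: .urlQueryAllowed leaves & = + unescaped, so it is not safe for individual values.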
// Or, the safe way:
var components = URLComponents(string: "https://example.com/search")!
components.queryItems = [URLQueryItem(name: "q", value: "hello world")]
components.url
// → https://example.com/search?q=hello%20world
Bash / curl
# curl will percent-encode the data automatically
curl -G "https://example.com/search" \
--data-urlencode "q=hello world"
# jq can do it too (-R reads the line as a raw string, -r prints without quotes)
echo 'hello world' | jq -rR '@uri'
# → hello%20world
Double Encoding: The Most Common Bug
Double encoding happens when you encode a value that's already encoded:
Original: "hello world"
Once-encoded: "hello%20world"
Twice-encoded: "hello%2520world" ← BROKEN
└─ %25 is the literal % sign
Three-times: "hello%252520world" ← Even worse
Symptoms: parameters that arrive at the server as literal %20 instead of spaces, redirects that don't go where you expect, or filenames downloaded as my%20document.pdf.
Where Double Encoding Comes From
- Encoding twice in code. Calling encodeURIComponent on a value already returned by URLSearchParams.toString().
- Server-side frameworks that encode automatically on top of client-side encoding.
- Reverse proxies that re-encode the URL when forwarding to upstream servers.
- Concatenating an encoded URL into another URL as a query parameter without re-encoding it (most common in OAuth redirect_uri flows).
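The nested-URL case deserves a sketch, because correct nesting looks a lot like double encoding at first glance. The hosts, the next parameter, and the report path below are all invented; the only assumption about your stack is the global URL class.

```javascript
// Inner URL: the callback carries its own query parameter
const callbackUrl = new URL("https://app.example.com/callback");
callbackUrl.searchParams.set("next", "/reports?year=2024&q=a b");

// Outer URL: the whole callback URL is embedded as ONE parameter value
const authorizeUrl = new URL("https://auth.example.com/authorize");
authorizeUrl.searchParams.set("redirect_uri", callbackUrl.toString());

// The inner %2F now appears as %252F in the outer URL. That is expected:
// each nesting level adds exactly one layer of encoding.
const outerString = authorizeUrl.toString();

// Each level is decoded exactly once on the way back out
const recoveredCallback = new URL(outerString).searchParams.get("redirect_uri");
const recoveredNext = new URL(recoveredCallback).searchParams.get("next");
// recoveredNext → "/reports?year=2024&q=a b"
```

The bug appears when the number of encode steps and decode steps for a layer don't match, not merely when %25 shows up in a nested URL.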
Debugging tip: If you see %25 in a URL where you don't expect it, you're looking at a double-encoded value. Decode it once and check whether the result still looks encoded. If yes, somewhere in your pipeline the value is being escaped twice.
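That "decode once and look again" check is easy to automate. The helper below, decodeFully, is a hypothetical debugging aid: it counts how many decode passes a value takes before it stops changing. Treat it as a heuristic only, since a value that legitimately contains the text %20 would be over-decoded.

```javascript
// Hypothetical debugging helper: decode until the value stops changing.
function decodeFully(value, maxRounds = 5) {
  let current = value;
  let rounds = 0;
  while (rounds < maxRounds) {
    let decoded;
    try {
      decoded = decodeURIComponent(current);
    } catch {
      break; // malformed escape such as a lone "%": stop and keep what we have
    }
    if (decoded === current) break; // nothing left to decode
    current = decoded;
    rounds++;
  }
  return { value: current, rounds };
}

decodeFully("hello%20world");     // → { value: "hello world", rounds: 1 }
decodeFully("hello%252520world"); // → { value: "hello world", rounds: 3 }
```

A rounds count above 1 is a strong hint that something in the pipeline encoded the value more than once.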
Reserved Characters Cheat Sheet
These characters have structural meaning in a URL. If they appear in a value rather than as URL syntax, you must percent-encode them:
| Character | Encoded | Meaning when unencoded |
|---|---|---|
| : | %3A | Scheme separator (https:) or port (:8080) |
| / | %2F | Path separator |
| ? | %3F | Start of query string |
| # | %23 | Start of fragment |
| & | %26 | Query parameter separator |
| = | %3D | Query key/value separator |
| + | %2B | Space (in query strings) |
| % | %25 | Start of an escape sequence |
| @ | %40 | Userinfo separator |
| space | %20 or + | Not valid raw in URLs |
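If you'd rather not memorize the table, you can regenerate it. This small sketch simply runs each character through encodeURIComponent; the variable names are arbitrary.

```javascript
// Build the cheat sheet from the characters themselves
const reservedChars = [":", "/", "?", "#", "&", "=", "+", "%", "@", " "];
const cheatSheet = Object.fromEntries(
  reservedChars.map((ch) => [ch, encodeURIComponent(ch)])
);
// cheatSheet["&"] → "%26", cheatSheet["+"] → "%2B", cheatSheet[" "] → "%20"
```

encodeURIComponent always emits %20 for a space; the + form only comes from form-style encoders such as URLSearchParams.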
URL Encoding vs Base64 vs Other Encodings
Don't confuse URL encoding with sibling encodings — they solve different problems:
| Encoding | Purpose | Charset |
|---|---|---|
| URL / Percent | Make any byte safe inside a URL | Unreserved + %XX |
| Base64 | Make binary safe inside text (email, JSON) | A-Z a-z 0-9 + / = |
| Base64URL | Base64 that's also URL-safe (JWTs) | A-Z a-z 0-9 - _ |
| HTML entities | Make characters safe inside HTML | &name; or &#NN; |
| Punycode | Encode Unicode in domain names | xn--… |
Each encoding targets a different "safe character set" for a different transport. They're often combined — a JWT (Base64URL) embedded in a URL parameter (URL-encoded if needed) inside an HTML page (HTML-escaped). Each layer is responsible for its own concerns.
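To see why Base64URL exists at all, compare what percent encoding does to plain Base64 output. The sketch below assumes a runtime with a global btoa (browsers and recent Node.js versions); toBase64 and toBase64Url are hypothetical helper names.

```javascript
// Hypothetical helpers operating on raw bytes (Uint8Array)
function toBase64(bytes) {
  let binary = "";
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return btoa(binary);
}

function toBase64Url(bytes) {
  // Swap the two URL-unsafe symbols and drop the = padding
  return toBase64(bytes).replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

const sampleBytes = new Uint8Array([0xfb, 0xff]);
toBase64(sampleBytes);                     // → "+/8="
encodeURIComponent(toBase64(sampleBytes)); // → "%2B%2F8%3D"
toBase64Url(sampleBytes);                  // → "-_8"
```

Because Base64URL output contains only unreserved characters, encodeURIComponent leaves it unchanged, which is why JWTs can usually sit in a URL without an extra encoding layer.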