UTF-8 Encoder/Decoder
Free online UTF-8 encoder and decoder. Convert text to UTF-8 hex bytes or decode UTF-8 sequences to readable characters. No upload required - perfect for internationalization and debugging encoding issues.
Understanding UTF-8
What is UTF-8 Encoding?
UTF-8 (8-bit Unicode Transformation Format) is a variable-width character encoding capable of encoding all valid Unicode code points. It uses one to four bytes per character, so ASCII text stays compact while every other Unicode character remains representable. UTF-8's backward compatibility with ASCII and its ability to represent any character in the Unicode standard have made it the most widely used encoding on the web.
Why Use UTF-8 Encoder/Decoder?
A UTF-8 encoder/decoder is useful for internationalization (i18n) work, handling multilingual content, debugging character encoding issues, working with APIs that require a specific encoding, and checking that text survives intact when it is transmitted across different platforms and systems.
How UTF-8 Encoding Works
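UTF-8 is a variable-length encoding. ASCII characters (U+0000 to U+007F) use 1 byte. Code points from U+0080 to U+07FF, which cover accented Latin letters as well as Greek, Cyrillic, Hebrew, and Arabic, use 2 bytes. The rest of the Basic Multilingual Plane (U+0800 to U+FFFF), including most common CJK characters, uses 3 bytes. Code points from U+10000 to U+10FFFF, including most emoji and historic scripts, use 4 bytes. This makes UTF-8 space-efficient for ASCII-heavy content while still supporting all Unicode characters.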
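The byte counts described above can be checked with a few lines of Python. This is a minimal sketch that relies only on the built-in `str.encode` and `bytes.decode`; the helper names `to_utf8_hex` and `from_utf8_hex` are made up for this illustration and are not part of any library or of this tool.

```python
def to_utf8_hex(text: str) -> str:
    """Encode text as UTF-8 and return the bytes as space-separated hex."""
    return " ".join(f"{byte:02X}" for byte in text.encode("utf-8"))


def from_utf8_hex(hex_string: str) -> str:
    """Decode a hex string (spaces allowed) as UTF-8 text."""
    return bytes.fromhex(hex_string).decode("utf-8")


# One sample character for each encoded length (1 to 4 bytes).
for ch in ["A", "é", "中", "😀"]:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} {ch}: {len(encoded)} byte(s) -> {to_utf8_hex(ch)}")
```

Decoding runs the same steps in reverse: `from_utf8_hex("E4 B8 AD")` returns `中`. An invalid byte sequence makes `bytes.decode` raise `UnicodeDecodeError`, which is often the first clue when debugging an encoding problem.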
UTF-8 Byte Encoding Reference
| Bytes | Unicode Range | Characters |
|---|---|---|
| 1 | U+0000 - U+007F | ASCII (A-Z, a-z, 0-9) |
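| 2 | U+0080 - U+07FF | Accented Latin (é, ñ, ü), Greek, Cyrillic, Hebrew, Arabic |
| 3 | U+0800 - U+FFFF | Rest of the Basic Multilingual Plane, including most CJK (中文, 日本語, 한국어) |
| 4 | U+10000 - U+10FFFF | Most emoji, historic scripts, rare CJK |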
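Each row of the table corresponds to a fixed bit layout: a lead byte whose high bits signal the sequence length, followed by continuation bytes of the form `10xxxxxx`. The sketch below encodes a single code point by hand to make that layout visible. The function name `encode_code_point` is hypothetical, and in practice Python's built-in `str.encode("utf-8")` should be used instead.

```python
def encode_code_point(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 by hand (illustrative only)."""
    if cp < 0 or cp > 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")

    if cp <= 0x7F:
        # 1 byte: 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


# Cross-check the hand-rolled encoder against Python's built-in codec.
for ch in ["A", "é", "中", "😀"]:
    assert encode_code_point(ord(ch)) == ch.encode("utf-8")
```

The surrogate check reflects a gap in the 3-byte row: U+D800 to U+DFFF fall inside U+0800 to U+FFFF but are reserved for UTF-16 and have no valid UTF-8 encoding.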