Character encoding (charset) tells the browser how to turn the bytes of a page into text. If it is declared incorrectly, accented or non-Latin characters may display incorrectly.
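A character encoding is a standardized mapping between characters and the byte values used to store them. Unicode assigns a number (a code point) to every character, and UTF-8 stores each code point as a sequence of one to four bytes.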
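To make those numbers concrete, here is a minimal Python sketch that prints the Unicode code point and the UTF-8 bytes for a few sample characters. The helper name `describe` and the sample characters are chosen purely for illustration.

```python
def describe(ch: str) -> str:
    """Return the code point and UTF-8 byte sequence of a single character."""
    code_point = ord(ch)                # the number Unicode assigns to the character
    utf8_bytes = ch.encode("utf-8")     # the bytes UTF-8 uses to store that number
    hex_bytes = " ".join(f"{b:02x}" for b in utf8_bytes)
    return f"{ch}: U+{code_point:04X} -> {hex_bytes}"


for sample in ["A", "é", "日"]:
    print(describe(sample))
# A: U+0041 -> 41
# é: U+00E9 -> c3 a9
# 日: U+65E5 -> e6 97 a5
```

ASCII characters such as "A" take a single byte, while accented and non-Latin characters take two or more.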
UTF-8 is the most widely used character encoding on the web because it can represent every Unicode character, and therefore text in virtually any language, while remaining compatible with ASCII. Its use is highly recommended in all HTML documents.
html
<head>
<meta charset="UTF-8">
</head>
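ISO-8859-1 (Latin-1) used to be popular for Western European languages, but UTF-8 has largely replaced it. As a single-byte encoding limited to 256 values, it cannot represent text such as Japanese or Russian.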
html
<meta charset="ISO-8859-1">
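When the declared encoding does not match the bytes actually stored in the file, text displays improperly: characters turn into question marks, the replacement character (�), or sequences of strange symbols such as "Ã©" in place of "é".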
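The effect can be reproduced outside the browser. The Python sketch below encodes a word with one encoding and deliberately decodes it with another. The function names are hypothetical, and a browser's behavior may differ in detail (for instance, browsers treat the ISO-8859-1 label as windows-1252).

```python
def misread_utf8_as_latin1(text: str) -> str:
    """Store text as UTF-8, then read the bytes back as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def misread_latin1_as_utf8(text: str) -> str:
    """Store text as Latin-1, then read the bytes back as UTF-8.

    Invalid byte sequences are replaced with U+FFFD, the replacement character.
    """
    return text.encode("latin-1").decode("utf-8", errors="replace")


word = "tükör"
print(misread_utf8_as_latin1(word))   # tÃ¼kÃ¶r
print(misread_latin1_as_utf8(word))   # t�k�r
```

In the first case each two-byte UTF-8 sequence is shown as two separate Latin-1 characters. In the second, the single Latin-1 bytes are not valid UTF-8, so the decoder substitutes the replacement character.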
The following example displays text in several languages within a UTF-8 encoded HTML document.
html
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Character Encoding Example</title>
</head>
<body>
<p>Árvíztűrő tükörfúrógép — 日本語 — русский</p>
</body>
</html>
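Always include <meta charset="UTF-8"> at the start of your <head>, before the <title> and any other content. The HTML standard requires the declaration to fit within the first 1024 bytes of the document, so placing it first ensures international characters display correctly.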
Many developers forget to declare the encoding or place the meta tag too late. This is especially problematic if the file is saved in an encoding other than the one declared: the meta tag only labels the bytes, it does not convert them.
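A quick way to catch a late declaration is to look for it only in the first 1024 bytes of the file, the window the HTML standard requires it to fit in. The Python sketch below is a rough approximation, not the browser's actual prescan algorithm. The names `PRESCAN_LIMIT`, `META_CHARSET` and `declared_charset` are made up for this example, and the regular expression ignores comments and other edge cases.

```python
import re

PRESCAN_LIMIT = 1024  # the declaration must fit within this many bytes

# Simplified pattern: matches <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=..."> form.
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9_\-]+)", re.IGNORECASE
)


def declared_charset(html_bytes: bytes):
    """Return the charset declared near the start of the document, or None."""
    match = META_CHARSET.search(html_bytes[:PRESCAN_LIMIT])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()


early = b'<!DOCTYPE html><html><head><meta charset="UTF-8"></head>'
late = (
    b"<!DOCTYPE html><html><head><!--" + b"x" * 2000
    + b'--><meta charset="UTF-8"></head>'
)

print(declared_charset(early))  # utf-8
print(declared_charset(late))   # None
```

A result of `None` for a page that does contain a meta tag suggests the tag sits too far down, for example after a long comment or inline script.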
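The BOM (Byte Order Mark) is an invisible character (U+FEFF) at the beginning of a file that can indicate the character encoding; in UTF-8 it is stored as the bytes EF BB BF. For UTF-8, it’s not necessary and may even cause display issues, for example when a tool that does not expect it shows those bytes as "ï»¿" at the top of the page.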
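Because the BOM is invisible in most editors, it is easier to detect by inspecting the raw bytes. The Python sketch below uses the standard library's `codecs.BOM_UTF8` constant and the `utf-8-sig` codec. The helper names and the sample content are illustrative only.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Check whether the data starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


with_bom = codecs.BOM_UTF8 + b"<!DOCTYPE html>"

print(has_utf8_bom(with_bom))         # True
print(strip_utf8_bom(with_bom))       # b'<!DOCTYPE html>'
print(with_bom.decode("utf-8-sig"))   # <!DOCTYPE html>
```

The `utf-8-sig` codec decodes UTF-8 and silently drops a leading BOM, which is convenient when reading files from sources you do not control.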
You can check a file's encoding in your text editor settings or using your browser's developer tools. If issues arise, make sure the file is saved in UTF-8 format.
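If you prefer a scripted check, one simple approach is to try decoding the file's bytes as UTF-8. The Python sketch below assumes you have already read the file in binary mode; the function name `is_valid_utf8` is hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8 without errors."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


utf8_sample = "русский".encode("utf-8")
latin1_sample = "tükör".encode("latin-1")

print(is_valid_utf8(utf8_sample))    # True
print(is_valid_utf8(latin1_sample))  # False
```

A `False` result shows the file is definitely not UTF-8. A `True` result is weaker evidence, since plain ASCII text is valid in many encodings, but for text with accented or non-Latin characters it is a strong hint.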
© 2025 ReadyTools. All rights reserved.