Character encoding

Character encoding is essential for machine-to-machine and human-to-machine communication. Over the years, encodings have been improved, and today several standards are in use.

Character encoding overview

Parameter | ASCII | ISO Latin 1 | ANSI | UTF-8
Bits per character | 7 | 8 | 8 | 8–32
Number of printable characters | 95 | 191 | xx | xx
Range (code values) | 0–127 | 0–255 | 0–255 | 0–1114111

Types of encoding

ASCII

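The ASCII (American Standard Code for Information Interchange) encoding uses 7 bits per character, giving the code range 0–127, of which 95 codes are printable characters.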

ISO Latin 1 (ISO 8859-1)

Xxxxx

ANSI (Windows-1252)

Xxxxx

UTF-8

The UTF-8 (8-bit Unicode Transformation Format) encoding is commonly used in web pages and in XML data. The format is backwards compatible with ASCII, and it keeps the ISO Latin 1 characters at the same code numbers as in the ISO standard, while also enabling the use of characters for, in principle, all languages.

The encoding uses a variable number of 8-bit blocks, or octets, to represent a character. From one to four octets can be used. ASCII characters are preserved as a single octet with an unchanged value, whereas ISO Latin 1 characters above 127 keep their code number but require two octets.
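The variable octet count can be illustrated with a minimal Python sketch. The helper name utf8_octets and the sample characters are illustrative choices, not part of any standard; the sketch relies only on Python's built-in str.encode.

```python
# Minimal sketch: list the UTF-8 octets of a few sample characters.
# The samples are assumptions chosen to cover one to four octets.
samples = [
    "A",           # U+0041, an ASCII letter
    "\u00e9",      # U+00E9, e with acute accent (in ISO Latin 1)
    "\u20ac",      # U+20AC, euro sign
    "\U0001F600",  # U+1F600, an emoji outside the 16-bit range
]


def utf8_octets(ch):
    """Return the UTF-8 octets of the given text as a list of integers."""
    return list(ch.encode("utf-8"))


for ch in samples:
    octets = utf8_octets(ch)
    shown = " ".join(f"{b:02X}" for b in octets)
    print(f"U+{ord(ch):04X}: {len(octets)} octet(s): {shown}")
```

For an ASCII character the single octet equals the ASCII code, which is why a pure ASCII file is also a valid UTF-8 file.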

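There is also a variant named UTF-16 that uses 16-bit blocks. It is rarely used for web pages or data interchange, although it is common as an internal string format, for example in Windows, Java and JavaScript.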
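As a rough comparison, the following Python sketch counts the 16-bit blocks UTF-16 needs for a character. The helper name utf16_units is hypothetical; the "utf-16-be" codec is used so that no byte order mark is added to the output.

```python
# Minimal sketch: count the 16-bit blocks (code units) UTF-16 uses.
def utf16_units(text):
    """Return the number of 16-bit code units in the UTF-16 form of text."""
    # Each code unit is two octets; "utf-16-be" emits no byte order mark.
    return len(text.encode("utf-16-be")) // 2


for ch in ["A", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(ch):04X}: {utf16_units(ch)} block(s) of 16 bits")
```

Characters above U+FFFF need two 16-bit blocks, so UTF-16 is also a variable-length encoding.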