When dealing with East Asian text, there is the concept of an inherent width of a character. This width takes on either of two values: narrow or wide. For traditional mixed-width East Asian legacy character sets, this classification into narrow and wide corresponds, with few exceptions, directly to the storage size for each character: a few narrow characters use a single byte per character, and all other characters (usually wide) use two or more bytes.

Layout and line breaking (to cite only two examples) in East Asian context show systematic variations depending on the value of this East_Asian_Width property. Wide characters behave like ideographs; they tend to allow line breaks after each character and remain upright in vertical text layout. Narrow characters are kept together in words or runs that are rotated sideways in vertical text layout.

For a traditional East Asian fixed-pitch font, this width translates to a display width of either one half or a whole unit width. A common name for this unit width is “Em”. While an Em is customarily the height of the letter “M”, it is the same as the unit width in East Asian fonts, because in these fonts the standard character cell is square. In contrast, the character width for a fixed-pitch Latin font like Courier is generally 3/5 of an Em.

In modern practice, most alphabetic characters are rendered by variable-width fonts using narrow characters, even if their encoding in common legacy sets uses multiple bytes. In contrast, emoji characters were first developed through the use of extensions of legacy East Asian encodings, such as Shift-JIS, and in such a context they were treated as wide characters. While these extensions have been added to Unicode or mapped to standardized variation sequences, their treatment as wide characters has been retained, and extended for consistency with emoji characters that lack a legacy encoding.

Except for a few characters, which are explicitly called out as fullwidth or halfwidth in the Unicode Standard, characters are not duplicated based on distinction in width. Some characters, such as the ideographs, are always wide; others are always narrow; and some can be narrow or wide, depending on the context. The Unicode character property East_Asian_Width provides a default classification of characters, which an implementation can use to decide at runtime whether to treat a character as narrow or wide.

The East_Asian_Width property does not preserve canonical equivalence, because the base characters of canonical decompositions almost always have a different East_Asian_Width than the precomposed characters. Decomposing a character and applying East_Asian_Width to the base character and combining marks separately does not yield the expected values. For examples, see Section 4, Definitions.

East_Asian_Width is a normative property and provides a useful concept for implementations that have to interwork with East Asian legacy character encodings, support both East Asian and Western typography and line layout, or need to associate fonts with unmarked text runs containing East Asian characters.
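As a rough illustration of deciding at runtime whether to treat a character as narrow or wide, here is a minimal Python sketch built on the standard library's unicodedata.east_asian_width, which returns the property value as a short code (W, F, Na, H, A or N). The helper names char_width and display_width, the ambiguous_wide flag, and the choice to give combining marks zero columns are assumptions made for this example; they follow a common fixed-pitch terminal convention and are not something the text above prescribes.

```python
import unicodedata


def char_width(ch: str, ambiguous_wide: bool = False) -> int:
    """Return a column count (0, 1 or 2) for one code point.

    Hypothetical helper: the mapping from property values to columns
    is an assumption for fixed-pitch output, not part of the property.
    """
    # Assumption: a combining mark shares the cell of its base character.
    # Only marks with a non-zero canonical combining class are caught here.
    if unicodedata.combining(ch):
        return 0
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):  # Wide and Fullwidth take a whole unit width
        return 2
    if eaw == "A":  # Ambiguous: narrow or wide depending on context
        return 2 if ambiguous_wide else 1
    return 1  # Na (Narrow), H (Halfwidth), N (Neutral)


def display_width(text: str, ambiguous_wide: bool = False) -> int:
    """Sum the per-character column counts for a string."""
    return sum(char_width(ch, ambiguous_wide) for ch in text)


if __name__ == "__main__":
    for sample in ("abc", "中文", "ＡＢ", "ｱ"):
        codes = [unicodedata.east_asian_width(ch) for ch in sample]
        print(sample, codes, display_width(sample))

    # Canonical equivalence is not preserved: compare the property value
    # of a precomposed character with the values of its decomposition.
    precomposed = "\u00e9"
    decomposed = unicodedata.normalize("NFD", precomposed)
    print([unicodedata.east_asian_width(ch) for ch in precomposed])
    print([unicodedata.east_asian_width(ch) for ch in decomposed])
```

The ambiguous_wide flag stands in for the context (for example, an East Asian legacy environment) that decides how ambiguous characters are treated. The sketch ignores control characters and multi-code-point sequences such as emoji with variation selectors, so it is a starting point rather than a complete width calculation.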