HS Password (10FFF0-10FFFF) (End for UTF-16)

by G64(2)

Download disabled

The designer of this FontStruction has chosen not to make it available for download from this website by choosing an “All Rights Reserved” license.

Please respect their decision and desist from requesting license changes in the comments.

If you would like to use the FontStruction for a specific project, you may be able to contact the designer directly about obtaining a license.

  • Info:
    Created on 22nd February 2022. Last edited on 22nd February 2022.

2 Comments

Speaking of UTF-16: a draft of ISO 10646, written in 1989 and published in 1990, defined 128 groups of 256 planes of 256 rows of 256 cells. This nominally allowed for 2,147,483,648 characters, but only 679,477,248 could actually have been encoded, because the draft forbade the byte values of control characters (0x00 to 0x1F and 0x80 to 0x9F) in any of the four bytes for group, plane, row, and cell. For example, the Latin capital A was defined in group 0x20, plane 0x20, row 0x20, and cell 0x41.
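The two totals above can be checked with a few lines of Python. This is only a sketch of the arithmetic: it assumes the group byte is limited to 0x00 to 0x7F (128 groups) and that the forbidden values are exactly the two ranges quoted above. The names `is_control`, `nominal`, and `usable` are made up for this illustration.

```python
def is_control(b):
    """True for the byte values the draft forbade: 0x00-0x1F and 0x80-0x9F."""
    return 0x00 <= b <= 0x1F or 0x80 <= b <= 0x9F

# Nominal space: 128 groups x 256 planes x 256 rows x 256 cells.
nominal = 128 * 256 ** 3

# Assumption: the group byte only ranges over 0x00-0x7F, the rest over 0x00-0xFF.
group_values = [b for b in range(0x80) if not is_control(b)]
other_values = [b for b in range(0x100) if not is_control(b)]

# One allowed group byte, then three allowed bytes for plane, row, and cell.
usable = len(group_values) * len(other_values) ** 3

print(f"{nominal:,}")  # 2,147,483,648
print(f"{usable:,}")   # 679,477,248
```

That leaves 96 usable group values and 192 usable values for each of the other three bytes, so 96 × 192 × 192 × 192 = 679,477,248.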

The characters could have been encoded in three ways:

1: UCS-4, four bytes for every character, which enabled the simple encoding of all characters.

2: UCS-2, two bytes for every character, which enabled the encoding of the first plane, 0x20, the Basic Multilingual Plane, which contained the first 36,864 codepoints; other planes and groups could have been accessed via ISO 2022 escape sequences.

3: UTF-1, which encoded all the characters in byte sequences of varying length (from 1 to 5 bytes), in which no byte contained control codes.

In 1990, two initiatives for a universal character set existed: Unicode, with 16 bits for every character for 65,536 possible characters, and ISO 10646, which I explained above. Software companies refused to accept the complexity and size requirements of the ISO standard and convinced a number of ISO National Bodies to vote against it. ISO officials realized that they could not continue to support the standard in its current state and negotiated its unification with Unicode. The following changes took place: the prohibition on control bytes was lifted (which allowed for codepoints like 0x0000103F (ဿ, Myanmar Letter Great Sa)), and the repertoire of the Basic Multilingual Plane was synchronized with that of Unicode.
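To see why lifting the control-byte prohibition mattered, here is a rough Python check of which bytes of a four-byte value fall in the forbidden ranges. The helper name `control_bytes` is hypothetical, and the big-endian group, plane, row, cell layout is an assumption made for the illustration.

```python
def control_bytes(code):
    """Return the bytes of a 4-byte value that lie in 0x00-0x1F or 0x80-0x9F."""
    octets = code.to_bytes(4, "big")  # assumed order: group, plane, row, cell
    return [b for b in octets if b <= 0x1F or 0x80 <= b <= 0x9F]

# Latin capital A as placed in the 1990 draft: no forbidden bytes.
print(control_bytes(0x20202041))  # []

# U+103F after unification: three of its four bytes are control values.
print(control_bytes(0x0000103F))  # [0, 0, 16]
```

Under the original policy a value like 0x0000103F could not have existed at all, since 0x00 and 0x10 are both control byte values.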

Some years later, the situation changed in Unicode: 65,536 characters were not enough, and Version 2.0 added the surrogate mechanism to encode up to 1,112,064 characters within 17 planes. ISO 10646 was then limited to contain only as many characters as can be encoded by UTF-16 (a bit over 1.1 million instead of more than 679 million). UCS-4 was then incorporated with the same limitation as UTF-32, although it is not very useful outside internal program data.
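The surrogate arithmetic can be sketched in Python like this. The function name `to_surrogates` is made up for the example; the constants 0xD800, 0xDC00, and 0x10000 are the standard UTF-16 values. It also shows where the 1,112,064 figure comes from and why 10FFFF is the end for UTF-16.

```python
def to_surrogates(cp):
    """Split a supplementary code point (U+10000..U+10FFFF) into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000             # 20-bit value, 0x00000..0xFFFFF
    high = 0xD800 + (offset >> 10)    # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits go into the low surrogate
    return high, low

# 17 planes of 65,536 code points, minus the 2,048 surrogate code points.
surrogate_count = 0xDFFF - 0xD800 + 1
print(17 * 0x10000 - surrogate_count)              # 1112064

# The range in this font's title sits at the very top of what UTF-16 can reach.
print([hex(u) for u in to_surrogates(0x10FFF0)])   # ['0xdbff', '0xdff0']
print([hex(u) for u in to_surrogates(0x10FFFF)])   # ['0xdbff', '0xdfff']
```

Both surrogates are at their maximum for 0x10FFFF, so no larger code point fits into a pair.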

TL;DR: UTF-16 arose from the original draft that was written in 1989 and published in 1990. That draft was unified with Unicode; then 65,536 characters were not enough, so surrogates were defined in Version 2.0.

Comment by Bryndan W. Meyerholt (BWM) 22nd February 2022

Typos: "encodede" should have been "encoded", "Multilanugal" should have been "Multilingual", and "IsO" should have been "ISO".

Comment by Bryndan W. Meyerholt (BWM) 22nd February 2022
