
composite characters and dotless ones



I am not sure if I mentioned this before, but in Unicode the canonical
decomposition of, for example
  <iacute(00ED)>
is
  <i(0069)>+<acute(0301)>
not
  <dotlessi(0131)>+<acute(0301)> .
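
This can be checked with a minimal sketch, assuming Python's standard
unicodedata module (not something the original post relied on) and
taking iacute to be U+00ED, LATIN SMALL LETTER I WITH ACUTE. The
variable names are illustrative only.

```python
import unicodedata

IACUTE = "\u00ed"      # LATIN SMALL LETTER I WITH ACUTE
DOTLESS_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I
ACUTE = "\u0301"       # COMBINING ACUTE ACCENT

# Canonical decomposition (NFD) of iacute uses the ordinary dotted i.
canonical = unicodedata.normalize("NFD", IACUTE)
print([f"{ord(c):04X}" for c in canonical])   # ['0069', '0301']

# The dotless-i sequence is a different character sequence: composing
# it (NFC) does not produce iacute.
dotless_seq = DOTLESS_I + ACUTE
recomposed = unicodedata.normalize("NFC", dotless_seq)
print(recomposed == IACUTE)                   # False

# Dotless i itself has no decomposition mapping at all.
dotless_mapping = unicodedata.decomposition(DOTLESS_I)
print(repr(dotless_mapping))                  # ''
```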

Indeed, page 6-7 explicitly states that these two sequences represent
distinct characters.  Further, it states that in cases where the dot
is preserved and the diacritic is added above the dot, the
decomposition uses a double diacritic:

  <i(0069)>+<overdot(0307)>+<acute(0301)>

hmmmm.

Such decomposition is necessary and for many purposes the canonical
Unicode one is the only correct one; its being "wrong" for making a
composite glyph is, of course, irrelevant to Unicode itself (but not
to typesetting applications that process Unicode documents).
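
For a typesetting application, the mapping from the canonical sequence
to the pieces of a composite glyph might look like the following
sketch. It is only an illustration under stated assumptions: Python's
unicodedata module is used, the function name glyph_base_sequence is
hypothetical, only the letter i is handled, and "accent above" is
approximated by canonical combining class 230.

```python
import unicodedata

DOTLESS_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I
DOT_ABOVE = "\u0307"   # COMBINING DOT ABOVE
ABOVE = 230            # canonical combining class of marks placed above


def glyph_base_sequence(text):
    """Hypothetical helper: return the sequence a composite-glyph
    builder might use, swapping i for dotless i under an accent above.

    The input is first put into canonical decomposed form (NFD), so the
    Unicode data itself is never changed; only the glyph choice differs.
    """
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    for k, ch in enumerate(decomposed):
        if ch == "i":
            # Find the first mark of class 230 attached to this i.
            first_above = None
            for mark in decomposed[k + 1:]:
                ccc = unicodedata.combining(mark)
                if ccc == 0:          # next base character: stop looking
                    break
                if ccc == ABOVE:
                    first_above = mark
                    break
            # An explicit dot above means the dot is to be preserved.
            if first_above is not None and first_above != DOT_ABOVE:
                ch = DOTLESS_I
        out.append(ch)
    return "".join(out)


print(glyph_base_sequence("\u00ed") == "\u0131\u0301")            # True
print(glyph_base_sequence("i\u0307\u0301") == "i\u0307\u0301")    # True
```

The first call shows iacute being drawn from a dotless i plus acute,
while the second leaves the explicit overdot sequence untouched.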


chris