
Re: Unicode and math symbols



martin:
    > I agree very much with you. However, I read Barbara's comments
    > as to that she wants to be closer to semantics than we would.

chris:
    I too was puzzled by what Barbara meant by "meaning" in the context of
    math symbols.

here's the context in which i used "meaning".

    unicode (and other codes) deal with *meaning*, not form.

unfortunately, someone has borrowed my (partial) copy of iso 10646,
so i can't check to see how the columns are labeled.  but what i
meant here is the words that get put into the description column.

    ... also, when the same shape
    can appear both as an ordinary symbol and as a relation, there are
    two different glyphs registered; the justification i used was that
    the side bearings are different, since a difference in meaning
    wasn't permitted.

same here, except the column in the glyph registry is headed "glyph
name or description".  a pure shape description isn't adequate
(although that's often what's present); this also gets into usage
and (minimally) presentation context.

here are several out-of-context quotes from an early version of the
draft character-glyph model generated by members of the u.s. technical
committees x3v1 (counterpart of international sc18/wg8) and x3l2
(international sc2/wg2).  these show the association i have with the
meaning of "meaning".

    From a historical perspective, few differences have been traditionally
    attributed to the notions of "character" and "glyph".  If used at all, the
    term "glyph" was associated with "the" visual image of a "character".  Most
    frequently, the term "character" has been (and still is) used to refer to
    both a unit of information and a visual shape associated with that unit of
    information.
			--------------------

    Consider for a moment the case with the unit of information meaning "one".
    In ISO/IEC 10646 not only are there a large number of characters which
    conceivably "represent" this "unit of information", but there are also a
    number of "characters" which represent a particular form associated with
    this meaning, i.e., the Arabic digit <1> form itself.

			--------------------

    In specifying characters for inclusion in a character set standard, SC2
    normally has recourse to the meaning of a character, and, in particular,
    has the option of unifying two or more forms if it is determined that those
    forms do not represent distinctions in meaning within a particular written
    language, or that the forms represent merely stylistic differences.  On
    the other hand, the glyph registration authority of ISO/IEC 10036 does not
    have recourse to such an analysis, and must, if so requested, register
    all elements of any font so that a unique identification for all glyph
    representations within a font is possible.  ...  If there is a set of
    criteria for distinguishing among two glyphs, it cannot be based on
    distinctions in meaning, but distinctions of form.

			--------------------

     1. A character conveys distinctions in meaning.  A character has no
        intrinsic appearance.

     2. A glyph conveys distinctions in form.  A glyph has no intrinsic meaning.

     3. One or more characters may be depicted by one or more glyph
        representations (instances of an abstract glyph) in a possibly context
        dependent fashion.

    This last point spells out the possible relationship between characters and
    glyph representations.  In its fully general form, this relationship is a
    context sensitive M to N mapping, M>0, N>0.

    In practice, because ISO/IEC 10646 contains "glyph-like" characters, it is
    expected that implementations may choose to "canonicalize" or "normalize"
    such characters by translating them to normative characters.  A display
    subsystem which employs such a technique may require character data be
    normalized prior to display.

    5.1   - Processing Domains: Content and Appearance

    Two primary processing domains may be applied to character information.
    The first pertains to the processing of the content, or meaning of the
    information; and, the second, to the presentation or imaging of the
    content.

(this draft was dated september 1993, and i was part of the joint
committee that drafted it.  i should also note that it has changed
a great deal since then, and i haven't seen the most recent versions.)
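
as a rough illustration of the points quoted above (not something taken
from the draft itself), the sketch below uses python's standard
unicodedata module.  the first part shows several characters that all
carry the meaning "one", and how the "glyph-like" ones collapse to the
plain digit under nfkc compatibility normalization.  nfkc is a later
unicode mechanism and is only assumed here to be analogous to the
"canonicalize" step the draft describes.  the second part is a toy
character-to-glyph mapping; the GLYPHS table and the glyph names in it
are hypothetical, made up purely to show an m to n relationship.

```python
import unicodedata

# characters that all mean "one" but differ in form:
# digit one, superscript one, circled one, fullwidth one,
# arabic-indic digit one, cjk ideograph "one"
ONES = ["1", "\u00b9", "\u2460", "\uff11", "\u0661", "\u4e00"]


def describe(ch):
    """Return (character name, numeric value, NFKC-normalized form)."""
    return (
        unicodedata.name(ch),
        unicodedata.numeric(ch),
        unicodedata.normalize("NFKC", ch),
    )


# hypothetical glyph table (an assumption for illustration only):
# keys are character sequences, values are lists of glyph names
GLYPHS = {
    "ffi": ["ffi.liga"],             # three characters, one glyph
    "fi": ["fi.liga"],               # two characters, one glyph
    "\u00e9": ["e", "acute.comb"],   # one character, two glyphs
}


def shape(text):
    """Greedy longest-match mapping from characters to glyph names."""
    glyphs, i = [], 0
    while i < len(text):
        for length in (3, 2, 1):
            chunk = text[i:i + length]
            if chunk in GLYPHS:
                glyphs.extend(GLYPHS[chunk])
                i += len(chunk)
                break
        else:
            # default case: one character maps to one glyph
            glyphs.append(text[i])
            i += 1
    return glyphs


if __name__ == "__main__":
    for ch in ONES:
        print(describe(ch))
    print(shape("office"))
    print(shape("caf\u00e9"))
```

note that the arabic-indic digit and the cjk ideograph keep their own
identity under normalization even though their numeric value is also
one: they are distinct characters, not presentation variants of <1>.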

sorry, i should have been more precise.
						-- bb