perl-unicode

Editing, cursor motion, and combining characters

1999-11-15 10:00:59
Following the discussion of editing and cursor motion with combining
characters, here is what I have learned over the past few years of
implementation experience:

  o  Cursor motion should be provided in both visual order and logical text
     order.

  o  Combining characters should not be skipped during cursor motion.  Passing
     over a combining character should provide some visual feedback.

     Markus proposed showing the group of base + combining characters in the
     status line of a terminal emulator.  The text editing classes in the
     Emule package from Slangsoft instead change the color of the combining
     character while leaving the cursor in place.  Other approaches have
     also been used.

  o  Deletion should not be done in terms of base character plus following
     combining characters.  When entering Vietnamese or Thai for example, this
     slows editing significantly.

  o  What works best is to delete one character at a time.  For scripts
     like Vietnamese and Thai, this allows deletion of individual accents,
     vowels and tones.  The only exception is control characters.

     Control characters have to be handled specially.  For example, the ZWJ
     and ZWNJ characters generally act as invisible base characters.  The
     explicit directional controls (e.g. an RLO/PDF pair) *surround* the
     text they affect, and one way to handle them is to delete the opening
     and closing controls together when there is no more text between them.
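
As a rough illustration of the cursor and deletion behaviour listed above,
here is a minimal Python sketch.  It assumes the buffer is a string of
decomposed code points and the cursor is an index into it.  The names
cluster_at and backspace are hypothetical, and treating every character in
general category M as a combining character is a simplification.  ZWJ and
ZWNJ are not handled here.

```python
import unicodedata

# Explicit directional openers (LRE, RLE, LRO, RLO) and their closer (PDF).
OPENERS = {"\u202A", "\u202B", "\u202D", "\u202E"}
PDF = "\u202C"


def is_mark(ch):
    """True for combining marks (general categories Mn, Mc, Me)."""
    return unicodedata.category(ch).startswith("M")


def cluster_at(text, index):
    """Return (start, end) of the base + combining group containing index.

    An editor could show text[start:end] in a status line, or recolor
    text[index], as feedback while the cursor passes over a mark.
    """
    start = index
    while start > 0 and is_mark(text[start]):
        start -= 1
    end = index + 1
    while end < len(text) and is_mark(text[end]):
        end += 1
    return start, end


def backspace(text, cursor):
    """Delete the single code point before the cursor.

    If that leaves a directional opener immediately followed by PDF,
    the now-empty pair is removed as well.
    """
    if cursor <= 0:
        return text, 0
    text = text[:cursor - 1] + text[cursor:]
    cursor -= 1
    if (0 < cursor < len(text)
            and text[cursor - 1] in OPENERS
            and text[cursor] == PDF):
        text = text[:cursor - 1] + text[cursor + 1:]
        cursor -= 1
    return text, cursor
```

With the decomposed Vietnamese sequence "e" + U+0323 + U+0302, one call to
backspace at the end removes only the circumflex and leaves "e" + U+0323.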

To support these behaviors, the editing system should store the text in
decomposed form.  Though it complicates some aspects of the editor design, it
is useful for other reasons.
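
A minimal sketch of keeping the buffer decomposed, assuming that
"decomposed form" means canonical decomposition (NFD) and using Python's
standard unicodedata module.  The function names to_storage and
to_interchange are hypothetical.

```python
import unicodedata


def to_storage(text):
    """Decompose incoming text before it enters the edit buffer."""
    return unicodedata.normalize("NFD", text)


def to_interchange(text):
    """Recompose buffer contents when saving or exporting, if desired."""
    return unicodedata.normalize("NFC", text)


# U+1EC7 (e with circumflex and dot below) becomes three code points,
# each of which can then be edited or deleted individually.
stored = to_storage("\u1EC7")
```

Decomposing once on input means the per-character deletion described above
needs no special cases for pre-composed letters.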

For a simple example, assume one font in use has all the pre-composed
Vietnamese vowels, but another font only has the combining accent and tone
marks.  Perhaps a third font has only a partial set of pre-composed
characters.  For this and other implementation reasons, it is much easier and
less expensive to compose glyphs from characters that have already been
decomposed than it is to decompose them during the rendering process.
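
The font example can be sketched as follows.  This is only an illustration
under stated assumptions: a font is modelled as a plain set of the code
points it covers (a hypothetical stand-in for a real font query), and the
names clusters, glyphs_for and shape are invented here.  For each base +
marks group it tries the longest prefix that composes to a single covered
code point and otherwise falls back to the separate base and marks.

```python
import unicodedata


def is_mark(ch):
    """True for combining marks (general categories Mn, Mc, Me)."""
    return unicodedata.category(ch).startswith("M")


def clusters(text):
    """Split decomposed text into base + following-marks groups."""
    group = ""
    for ch in text:
        if group and not is_mark(ch):
            yield group
            group = ""
        group += ch
    if group:
        yield group


def glyphs_for(cluster, coverage):
    """Choose the code points to draw for one cluster.

    coverage is the (assumed) set of code points the font can draw.
    """
    for k in range(len(cluster), 1, -1):
        composed = unicodedata.normalize("NFC", cluster[:k])
        if len(composed) == 1 and composed in coverage:
            # Draw the pre-composed glyph plus any remaining marks.
            return [composed] + list(cluster[k:])
    # Nothing pre-composed is available: draw base and marks separately.
    return list(cluster)


def shape(text, coverage):
    """Map decomposed text to the code points to render with this font."""
    out = []
    for cluster in clusters(text):
        out.extend(glyphs_for(cluster, coverage))
    return out
```

For "e" + U+0323 + U+0302, a font covering U+1EC7 yields that single glyph,
a font covering only U+1EB9 yields U+1EB9 followed by U+0302, and a font
with only the combining marks yields all three code points.  The sketch
only tries prefixes in stored order and does not check that the fallback
code points are themselves covered.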
-----------------------------------------------------------------------------
Mark Leisher
Computing Research Lab            I have never made but one prayer to God,
New Mexico State University       a very short one:
Box 30001, Dept. 3CRL                 "Oh Lord, make my enemies ridiculous."
Las Cruces, NM  88003             And God granted it.  -- Voltaire, letter