
From: ori@eigenstate.org
Subject: Re: [REPOST] ksh: utf8 full width character support for emacs.c
To: op@omarpolo.com, schwarze@usta.de
Cc: tech@openbsd.org
Date: Sun, 06 Apr 2025 22:48:12 -0400


Thread
  • Ingo Schwarze:

    [REPOST] ksh: utf8 full width character support for emacs.c

  • Quoth Ingo Schwarze <schwarze@usta.de>:
    > Hello Omar,
    > 
    > Omar Polo wrote on Sun, Mar 30, 2025 at 08:37:06PM +0200:
    > 
    > > grapheme clusters (i.e. what a user perceives as a "character")
    > > can be more than one code point long.
    > 
    > That is (more or less) accurate - and ludicrously complicated.
    > There is a long annex to the Unicode standard on this topic,
    >   Unicode Text Segmentation, Unicode Standard Annex #29
    >   https://www.unicode.org/reports/tr29/
    > 
    > Note that "grapheme clusters" are not the same as "user-perceived
    > characters", and there are several different types of grapheme clusters
    > (legacy, extended, tailored, ...).
    > 
    
    For extra trivia: there are some Indian languages (Kannada, IIRC,
    is an example) where a combining codepoint combines with multiple
    surrounding glyphs, and not just the one codepoint before it.
    
    Rendering text is hard.
    
    