From: Ingo Schwarze
Subject: ksh vi mode: make 'D' work with UTF-8 input
To: Pascal Stumpf, Anton Lindqvist
Cc: tech@openbsd.org
Date: Fri, 25 Apr 2025 23:04:05 +0200

Hello,

here is a simple patch to make the "delete to EOL" command (D)
work with UTF-8 characters in ksh(1) VI mode.

The problem is that after the deletion, the current implementation
backs up to the last *byte* remaining on the line, which may be a
UTF-8 continuation byte.  When you then insert anything, it gets
inserted into the middle of the UTF-8 sequence, resulting in invalid
encoding.

For example,

 1. Type two UTF-8 characters.
 2. Type one ASCII character.
 3. Press ESCAPE.
    The cursor now sits on the ASCII character.
 4. Press D.
    The ASCII character disappears to the yank buffer,
    and the cursor now appears to be sitting on the second
    UTF-8 character, but it is actually sitting on its last byte.
 5. Press P.
    The ASCII character from the yank buffer gets inserted
    into the middle of the UTF-8 character, resulting in a
    corrupted line similar to:

With the patch below, we get this desired result instead:

because after the del_range(), we have es->cursor == es->linelen
(because del_range() has the side effect of changing es->linelen)
and insert == 0 (because 'D' does not initiate insert mode), such
that, after the end of the switch block, the code enters the default
backup code

	while (es->cursor > 0)
		if (!isu8cont(es->cbuf[--es->cursor]))
			break;

backing up over the whole character and not just its last byte.

OK?
  Ingo


Index: bin/ksh/vi.c
===================================================================
RCS file: /cvs/src/bin/ksh/vi.c,v
diff -u -p -r1.62 vi.c
--- bin/ksh/vi.c	25 Apr 2025 18:28:33 -0000	1.62
+++ bin/ksh/vi.c	25 Apr 2025 20:23:14 -0000
@@ -865,8 +865,6 @@ vi_cmd(int argcnt, const char *cmd)
 		case 'D':
 			yank_range(es->cursor, es->linelen);
 			del_range(es->cursor, es->linelen);
-			if (es->cursor != 0)
-				es->cursor--;
 			break;
 
 		case 'g':
Index: regress/bin/ksh/edit/vi.sh
===================================================================
RCS file: /cvs/src/regress/bin/ksh/edit/vi.sh,v
diff -u -p -r1.11 vi.sh
--- regress/bin/ksh/edit/vi.sh	25 Apr 2025 18:28:33 -0000	1.11
+++ regress/bin/ksh/edit/vi.sh	25 Apr 2025 20:23:14 -0000
@@ -72,6 +72,8 @@ testseq "one 2.0\0033BD" " # one 2.0\b\b
 testseq "one ab.cd\0033bDa.\00332bD" \
 	" # one ab.cd\b\b \b\b\b..\b\b\b\b \b\b\b\b\b"
 testseq "one two\0033bCrep" " # one two\b\b\b \b\b\brep"
+testseq "\0302\0251\0303\0200a\0033DP" \
+	" # \0302\0251\0303\0200a\b \b\ba\0303\0200\b\b"
 
 # c: Change region.
 testseq "one two\0033cbrep" " # one two\b\b\bo \b\b\bro\beo\bpo\b"
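
For illustration only, and not part of the patch: the following standalone
sketch contrasts backing up by one byte with backing up by one character,
using the same byte sequence as the new regression test.  The names
is_u8cont(), backup_byte(), backup_char() and insert_at() are hypothetical
stand-ins written for this example; they are not the functions in vi.c,
and the real isu8cont() may differ in detail.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: nonzero if c is a UTF-8 continuation byte (10xxxxxx). */
static int
is_u8cont(unsigned char c)
{
	return (c & 0xc0) == 0x80;
}

/* Behaviour of the removed lines: step back by exactly one byte. */
static size_t
backup_byte(size_t cursor)
{
	return cursor > 0 ? cursor - 1 : 0;
}

/*
 * Same shape as the default backup loop quoted in the mail:
 * keep stepping back until the byte under the cursor starts a character.
 */
static size_t
backup_char(const char *buf, size_t cursor)
{
	while (cursor > 0)
		if (!is_u8cont((unsigned char)buf[--cursor]))
			break;
	return cursor;
}

/*
 * Insert n bytes from src into buf at byte offset pos.
 * Assumes the caller has left room for len + n bytes; returns the new length.
 */
static size_t
insert_at(char *buf, size_t len, size_t pos, const char *src, size_t n)
{
	memmove(buf + pos + n, buf + pos, len - pos);
	memcpy(buf + pos, src, n);
	return len + n;
}
```

With the four bytes 0xc2 0xa9 0xc3 0x80 left on the line and the cursor at
offset 4, backup_byte() yields offset 3, which is the continuation byte
0x80, so a following insert splits the second character.  backup_char()
yields offset 2, the start of that character, so the inserted byte lands
between the two characters and the line stays valid UTF-8.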