From: Ingo Schwarze
Subject: timing-dependent(?) display in xterm(1)
To: tech@openbsd.org
Date: Sat, 10 May 2025 16:19:06 +0200

Hello,

while working on VI command line editing mode in ksh(1), i stumbled
over the following, which i believe might be a quirk in xterm(1).
I'm not yet sure what is going on, hence the question mark after
the word "timing-dependent".

I'm running xterm(1) in the UTF-8 capable mode that is the default
on OpenBSD.

First observe that the command

   $ printf "\xc3\xa9\n"

causes xterm(1) to display an "e accent aigu" as expected.
This is not timing-dependent: the following two commands display
the same "e accent aigu" character just fine:

   $ printf "\xc3"; printf "\xa9\n"
   $ printf "\xc3"; sleep 1; printf "\xa9\n"

Now observe that the command

   $ printf "\x80\n"

displays a U+FFFD REPLACEMENT CHARACTER because a lonely UTF-8
continuation byte is not valid UTF-8.  So far, so good.

Now the strangeness begins.  The command

   $ printf "\x80x\n"

does *not* display the replacement character, but only the 'x'
character, which might be a bug (i'm not yet sure whether it is).

Finally, if i change the timing(?) as follows,

   $ printf "\x80"; printf "x\n"

i see both the replacement character and the 'x' as expected.

Does anybody have an idea what might be going on here?

Myself, i suspect this might be a bug in xterm(1) because it seems
to me what is displayed should only depend on the sequence of bytes
received, not on the times at which the individual bytes arrive.

I found this quirk after finding a bug in ed_mov_opt() in vi.c in
our ksh(1) and writing a patch to fix it (i did not yet send that
patch out because it is not yet sufficiently tested).  While testing
the patch, i noticed that even though it produces a stream of bytes
that i deem correct, xterm(1) sometimes omits a replacement character
even though it receives a UTF-8 continuation byte.
This problem persists even when i insert an fflush(3) call after
writing each byte to defeat stdio buffering in the shell.
The problem goes away when i attach egdb(1) to the ksh(1) process
and manually step through the ksh(1) code.  I suspect stepping helps
because it causes the bytes to arrive at xterm(1) one by one, with
sufficient time between them.

I must say Heisenbugs are among my favourites: as soon as you try
to observe them in a debugger, they are no longer there.  =:c(

I'm not yet sure which is the best course of action:

 (a) assume xterm(1) is broken, shrug, and fix ksh(1) only for now,
     arguing that the shell is more important than the terminal
     (and the code is likely much simpler in the shell than in the
     terminal, so progress will likely be faster)

 (b) or suspend work on ksh(1), debug xterm(1) first, fix that,
     then return to the shell once xterm(1) works (because one
     could maybe argue that a good shell buys us little without
     a working terminal)

 (c) or is there an even better option or explanation?

Yours,
  Ingo
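
P.S.
In case it helps with reproducing this, here is a minimal sketch that
wraps the experiments above into two helpers.  The function names
send_joined and send_split are made up for illustration, and octal
escapes are used instead of \x on the assumption that not every
printf(1) implementation understands \x.  The od(1) calls only confirm
that both helpers emit the same three bytes; what a terminal displays
for them is exactly the open question.

```shell
# Hypothetical helper: emit 0x80, 'x', newline with a single printf call.
send_joined() {
	printf '\200x\n'
}

# Hypothetical helper: emit the same bytes with two printf calls,
# optionally pausing for $1 whole seconds in between (default 0).
send_split() {
	delay=${1:-0}
	printf '\200'
	if [ "$delay" != 0 ]; then
		sleep "$delay"
	fi
	printf 'x\n'
}

# Sanity check: both variants should dump as the bytes 80 78 0a.
send_joined | od -An -tx1
send_split 1 | od -An -tx1
```

Running send_joined, send_split, and send_split 1 directly in the
terminal, without the pipe to od(1), then shows whether the display
differs although the byte stream is the same.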