
From: Mark Kettenis <mark.kettenis@xs4all.nl>
Subject: Re: watch(1): fix UTF-8
To: Job Snijders <job@openbsd.org>
Cc: tech@openbsd.org
Date: Wed, 21 May 2025 14:38:28 +0200

> Date: Wed, 21 May 2025 12:19:58 +0000
> From: Job Snijders <job@openbsd.org>
> 
> Florian noticed that this results in art which does not spark joy.
> 
> 	$ ftp https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt
> 	$ watch cat UTF-8-demo.txt
> 
> I took inspiration from tmux/tmux.c.
> 
> OK?

I think command-line utilities should respect my locale settings, not
force UTF-8 upon me.

So no, I don't think this is ok.
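
For illustration only, here is a rough sketch of what respecting the
locale could look like: adopt whatever LC_CTYPE the environment asks for
and let the standard multibyte functions decode text in that encoding.
The helper names init_ctype_from_env() and count_chars() are hypothetical
and are not part of watch.c or of the patch quoted below. Staying in the
"C" locale when the environment names an unsupported one is an assumption
of this sketch, not something proposed in the thread.

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: adopt the user's LC_CTYPE from the environment.
 * If setlocale() fails, the process simply stays in the default "C"
 * locale (an assumption of this sketch). Returns the active codeset name.
 */
static const char *
init_ctype_from_env(void)
{
	(void)setlocale(LC_CTYPE, "");
	return nl_langinfo(CODESET);
}

/*
 * Hypothetical helper: count the characters in s as decoded by the
 * current LC_CTYPE. Invalid or truncated sequences count as one
 * character per byte, so the loop always makes progress.
 */
static size_t
count_chars(const char *s)
{
	mbstate_t st;
	size_t len = strlen(s);
	size_t n = 0;
	size_t r;

	memset(&st, 0, sizeof(st));
	while (len > 0) {
		r = mbrtowc(NULL, s, len, &st);
		if (r == (size_t)-1 || r == (size_t)-2 || r == 0) {
			/* Undecodable byte: reset the state, skip one byte. */
			memset(&st, 0, sizeof(st));
			r = 1;
		}
		s += r;
		len -= r;
		n++;
	}
	return n;
}
```

A caller would run init_ctype_from_env() once at startup and then use
count_chars() (or similar mbrtowc()-based code) wherever text is measured,
so the result follows LC_ALL, LC_CTYPE or LANG instead of a hard-coded
UTF-8 locale.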

> Index: watch.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/watch/watch.c,v
> diff -u -p -r1.23 watch.c
> --- watch.c	21 May 2025 08:32:10 -0000	1.23
> +++ watch.c	21 May 2025 12:15:54 -0000
> @@ -26,6 +26,7 @@
>  #include <err.h>
>  #include <errno.h>
>  #include <event.h>
> +#include <langinfo.h>
>  #include <locale.h>
>  #include <paths.h>
>  #include <signal.h>
> @@ -124,7 +125,17 @@ main(int argc, char *argv[])
>  	struct event ev_sigint, ev_sighup, ev_sigterm, ev_sigwinch, ev_stdin;
>  	size_t len, rem;
>  	int i, ch;
> +	const char *s;
>  	char *p;
> +
> +	if (setlocale(LC_CTYPE, "en_US.UTF-8") == NULL &&
> +	    setlocale(LC_CTYPE, "C.UTF-8") == NULL) {
> +		if (setlocale(LC_CTYPE, "") == NULL)
> +			errx(1, "invalid LC_ALL, LC_CTYPE or LANG");
> +		s = nl_langinfo(CODESET);
> +		if (strcasecmp(s, "UTF-8") != 0 && strcasecmp(s, "UTF8") != 0)
> +			errx(1, "need UTF-8 locale (LC_CTYPE) but have %s", s);
> +	}
>  
>  	while ((ch = getopt(argc, argv, "cls:wx")) != -1)
>  		switch (ch) {
> 
>