From: Ingo Schwarze
Subject: Re: smokeping build fails with new perl (perl segfault, locale-related?)
To: Andrew Hewus Fresh
Cc: Alexander Bluhm, tech
Date: Fri, 17 May 2024 22:14:30 +0200

Hi Andrew,

Andrew Fresh wrote on Fri, May 17, 2024 at 09:59:10AM -0700:
> On Fri, May 17, 2024 at 06:00:32PM +0200, Alexander Bluhm wrote:
>> On Fri, May 17, 2024 at 04:47:23PM +0200, Alexander Bluhm wrote:

>>> Seems to be a general problem with new Perl, no smokeping involved.
>>>
>>> $ perl -MPOSIX -e 'setlocale(LC_NUMERIC,"")'
>>> Unknown locale category 4 at -e line 1.
>>> Segmentation fault (core dumped)

>> This fixes the segfault.  The warning stays.

> I tested this in bleadperl, and that fixes it too.  Unless it's specific
> to our build options and such, so I'll have to do more testing.

The crash is certainly specific to our build options.
In particular, in hints/openbsd.sh, we have:

  # OpenBSD's locale support is not that complete yet
  ccflags="-DNO_LOCALE_NUMERIC -DNO_LOCALE_COLLATE $ccflags"

You added that on 2024/05/14, i.e. a few days ago.

Now, apparently, Perl is supposed to build on systems that do not
provide any support for LC_NUMERIC, so we have definitely found an
upstream bug here.  Take this with a grain of salt - the file
/usr/src/gnu/usr.bin/perl/locale.c has become so ridiculously
complicated with Perl 5.38, with large amounts of #ifdef and
redirection, that it's hard to judge which combinations are
supposed to work.

Whether setting that option -DNO_LOCALE_NUMERIC on OpenBSD is the
best way to configure Perl is another question.

On the one hand, that LC_NUMERIC has no effect in the OpenBSD base
system is not a matter of "support is not that complete yet", but
instead that's a conscious decision, taken to improve reliability
and security.  See the CAVEATS section in setlocale(3) for details.
Consequently, telling Perl that OpenBSD does not support LC_NUMERIC
is unlikely to disable any important functionality we want to have,
so doing that might possibly be quite OK.

Then again, our setlocale(3) libc function is certainly designed to
correctly set and retrieve all parts of the locale, including
LC_NUMERIC, see the test program below.  Only the various C library
functions then ignore LC_NUMERIC after it is set.  So maybe it would
be more robust to just let Perl use our setlocale(3) support for
LC_NUMERIC, rather than diverting it into code paths that are
apparently poorly tested by upstream?

By the way, we also support the thread-safe locale functions
newlocale(3) and uselocale(3).  But while according to nm -u,
libperl.so does call setlocale(3), it does not call newlocale(3).
I did not investigate what exactly in our Perl build disables
newlocale(3).  But if Perl does a lot of switching back and forth
of locales (and comments in locale.c indicate that might be the
case), then enabling newlocale(3) might improve efficiency because
uselocale(3) is typically much faster than setlocale(3).
Yours,
  Ingo


 $ ./setlocale
setlocale(LC_ALL, NULL) = C
setlocale(LC_NUMERIC, "Eressea.UTF-8") = Eressea.UTF-8
setlocale(LC_ALL, NULL) = C/C/C/Eressea.UTF-8/C/C
setlocale(LC_NUMERIC, "C") = C
setlocale(LC_ALL, NULL) = C

 $ cat setlocale.c
#include <err.h>
#include <locale.h>
#include <stdio.h>

int
main(void)
{
	const char	*in;
	char		*out;

	if ((out = setlocale(LC_ALL, NULL)) == NULL)
		errx(1, "setlocale(LC_ALL, NULL) failed");
	printf("setlocale(LC_ALL, NULL) = %s\n", out);

	in = "Eressea.UTF-8";
	if ((out = setlocale(LC_NUMERIC, in)) == NULL)
		errx(1, "setlocale(LC_NUMERIC, \"%s\") failed", in);
	printf("setlocale(LC_NUMERIC, \"%s\") = %s\n", in, out);

	if ((out = setlocale(LC_ALL, NULL)) == NULL)
		errx(1, "setlocale(LC_ALL, NULL) failed");
	printf("setlocale(LC_ALL, NULL) = %s\n", out);

	in = "C";
	if ((out = setlocale(LC_NUMERIC, in)) == NULL)
		errx(1, "setlocale(LC_NUMERIC, \"%s\") failed", in);
	printf("setlocale(LC_NUMERIC, \"%s\") = %s\n", in, out);

	if ((out = setlocale(LC_ALL, NULL)) == NULL)
		errx(1, "setlocale(LC_ALL, NULL) failed");
	printf("setlocale(LC_ALL, NULL) = %s\n", out);

	return 0;
}