From: j@bitminer.ca
Subject: Re: arm64 LSE support in userland: introduce elf_aux_info?
To: Tech
Date: Thu, 11 Jul 2024 09:17:44 -0400

> > Feedback from other porters would be welcome.

Well, I'm sure you are getting a lot from @sthen and others offlist; here is my input regarding getauxval:

1) You are filtering the truth and providing a BSD-specific set of "interpretations" of the truth. The "truth" is the contents of the CPU-capability registers. Why provide an opinionated interpretation of these, instead of just readable copies?

I see two answers: detecting OpenBSD-specific kernel needs and capabilities (e.g. saving extra registers at context switch), and telling "porters" about the CPU capabilities.

For the kernel needs, and really anything in base, it makes sense. You control the kernel and need quick and easy access to CPU feature bits.

2) As for porters, look, Linux took a lousy path and cannot fit the permutations of all CPU generations into a couple of 64-bit registers. And FreeBSD took a similar path, and now you are looking at replicating that. The necessary logic is going to end up bit-stuffing combinations of features into fewer bits. If not this year, then by next decade. Yuch.

It's not the truth. It's an opinion.

Here is a good example: "blis", the high-performance linpack substitute. They have 1390 lines of code (bli_cpuid.c) to transform amd64, aarch64, arm7, and power capabilities into internally used compile-time and runtime flags. They already use getauxval, but only as an introduction to aarch64 CPU analysis. The rest is parsing the available CPU capability registers, or parsing /proc/cpuinfo on Linux.

"blis" uses every trick in the book and they still get it somewhat wrong (their aarch64 code is a mess). I'm working on an OpenBSD port and I have to parse /var/run/dmesg.boot to get the flags needed for aarch64.

Look at the patch suggested by @jca in elf.h for HWCAP2: there are only 19 bits left unused. How many years of ARM architecture evolution will that last? Three? Four?

+#define HWCAP2_HBC 0x0000100000000000ul

3) OK, I'm sounding grumpy, sorry. Here is my solution: provide readable copies of the CPU capability registers. Period. Do a getauxval if you want, but make the actual truth available to the code as simple copies of the registers. And of course make it available through sysctl (mostly done) as well as a libc equivalent.

thanks for reading

J
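
To make point 3 concrete, here is a minimal sketch of what a port could do with a plain copy of a capability register. It assumes the raw 64-bit value of ID_AA64ISAR0_EL1 has already been obtained by some means (sysctl, a libc call, whatever ends up existing); how it is fetched is deliberately left out. The helper names id_field and cpu_has_lse are made up for illustration. The only architectural fact relied on is that the Atomic field sits in bits 23:20 and that a value of 2 or higher means the LSE atomic instructions are implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ID_AA64ISAR0_EL1.Atomic occupies bits [23:20]; 0 means no LSE atomics. */
#define ISAR0_ATOMIC_SHIFT 20

/*
 * Hypothetical helper: pull one 4-bit ID field out of a raw register copy.
 * The ID registers are laid out as 4-bit fields on 4-bit boundaries.
 */
static inline unsigned
id_field(uint64_t reg, unsigned shift)
{
	assert(shift <= 60 && shift % 4 == 0);
	return (unsigned)((reg >> shift) & 0xf);
}

/*
 * Hypothetical helper: true if the raw ID_AA64ISAR0_EL1 copy reports LSE.
 * Higher field values are supersets of lower ones, hence ">= 2".
 */
static inline bool
cpu_has_lse(uint64_t id_aa64isar0)
{
	return id_field(id_aa64isar0, ISAR0_ATOMIC_SHIFT) >= 2;
}
```

With this shape, the port decides which fields it cares about, and a new CPU feature shows up as a new field value rather than as another bit that somebody has to allocate in HWCAP2.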