From: j@bitminer.ca
Subject: Re: arm64 LSE support in userland: introduce elf_aux_info?
To: Tech <tech@openbsd.org>
Date: Thu, 11 Jul 2024 09:17:44 -0400

> > Feedback from other porters would be welcome.

Well, I'm sure you are getting a lot from @sthen and others offlist,
but here is my input regarding getauxval:

1) you are filtering the truth and providing a BSD-specific set of
"interpretations" of the truth.

The "truth" is the contents of CPU-capability registers.  Why provide
an opinionated interpretation of these, instead of just readable
copies?

I see two answers: detecting OpenBSD-specific kernel needs and
capabilities (e.g. saving extra registers at context switch), and
telling "porters" about the CPU capabilities.

For the kernel needs, and really anything in base, it makes sense.
You control the kernel and need quick and easy access to CPU
feature bits.

2) As for porters, look, Linux took a lousy path and cannot fit the
permutations of all CPU generations into a couple of 64-bit registers.
And FreeBSD took a similar path, and now you are looking at replicating
that.  And the necessary logic is going to be bit-stuffing of
combinations of features into fewer bits.  If not this year then
by next decade.  Yuch.  It's not the truth.  It's an opinion.

Here is a good example, the "blis" high-performance linpack substitute.
They have 1390 lines of code (bli_cpuid.c) to transform amd64
capabilities, aarch64 capabilities, armv7, and power capabilities
to internally used compile and runtime flags.  They already use
getauxval but only as an introduction to aarch64 cpu analysis.  The
rest is parsing available CPU capability registers, or parsing
/proc/cpuinfo on Linux.

"blis" uses every trick in the book and they still get it somewhat
wrong (their aarch64 code is a mess.)  I'm working on an OpenBSD
port and I have to parse /var/run/dmesg.boot to get the flags needed
for aarch64.

Look at the patch suggested by @jca in elf.h for HWCAP2: there are
only 19 bits left unused.  How many years of ARM architecture
evolution will that last?  Three?  Four?

+#define        HWCAP2_HBC              0x0000100000000000ul
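
To check the "19 bits" arithmetic, here is a small sketch.  Only the
HWCAP2_HBC value comes from the quoted patch line; highest_bit() and
bits_left() are hypothetical helper names made up for illustration,
and the count assumes HBC is the highest flag allocated so far.

```c
/* Value copied from the quoted patch line. */
#define HWCAP2_HBC 0x0000100000000000ul

/* Index of the highest set bit in a 64-bit word, or -1 if none is set. */
static int
highest_bit(unsigned long long v)
{
	int n = -1;

	while (v != 0) {
		v >>= 1;
		n++;
	}
	return n;
}

/*
 * Number of bits above the highest allocated flag in a 64-bit word.
 * Assumes flags are handed out from bit 0 upwards without gaps.
 */
static int
bits_left(unsigned long long highest_flag)
{
	return 63 - highest_bit(highest_flag);
}
```

HWCAP2_HBC is bit 44, so bits_left(HWCAP2_HBC) gives 63 - 44 = 19.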

3) OK, I'm sounding grumpy, sorry.

Here is my solution: provide readable copies of CPU capability
registers.  Period.

Do a getauxval if you want, but make the actual truth available to
the code as simple copies of the registers.  And of course in sysctl
(mostly done) as well as a libc equivalent.
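
As a rough sketch of what a port could do with a plain register copy,
here is the LSE check done directly on a raw ID_AA64ISAR0_EL1 value.
How the copy reaches userland (sysctl, libc, something else) is left
open; id_field(), has_lse() and ISAR0_ATOMIC_SHIFT are hypothetical
names, not an existing API.  The field position (Atomic, bits 23:20,
value 2 or higher meaning the LSE atomics are implemented) is from
the ARM architecture documentation as I understand it.

```c
#include <stdint.h>

/* Hypothetical name: the Atomic field occupies bits [23:20]. */
#define ISAR0_ATOMIC_SHIFT 20

/* ID register feature fields are 4 bits wide; extract one by its shift. */
static unsigned
id_field(uint64_t reg, unsigned shift)
{
	return (unsigned)((reg >> shift) & 0xf);
}

/*
 * Return 1 if the raw ID_AA64ISAR0_EL1 copy reports LSE atomics.
 * Assumes the usual ID-field convention that larger values are
 * supersets of smaller ones, so ">= 2" also covers later extensions.
 */
static int
has_lse(uint64_t isar0)
{
	return id_field(isar0, ISAR0_ATOMIC_SHIFT) >= 2;
}
```

No HWCAP bit is needed for this: a feature added to the register
later is just another id_field() call on the same copy.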


thanks for reading


J