clang: enable blake3 asm optimizations on amd64?
On 2024/02/18 10:53, Mark Kettenis wrote:
> > Date: Sun, 18 Feb 2024 09:25:38 +0000
> > From: Stuart Henderson <stu@spacehopper.org>
> >
> > On 2024/02/17 20:59, Christian Weisgerber wrote:
> > > From some cursory grepping, I don't think BLAKE3 is used much if
> > > at all in clang. So I don't know if this buys us anything. It
> > > does not reduce the time required by clang to compile itself.
> >
> > Any idea how to make sure that this code is exercised?
> >
> > > +.if ${MACHINE_ARCH} == "amd64"
> > > +SRCS+= blake3_sse2_x86-64_unix.S \
> > > + blake3_sse41_x86-64_unix.S \
> > > + blake3_avx2_x86-64_unix.S \
> > > + blake3_avx512_x86-64_unix.S
> > > +.endif
> >
> > I have a suspicion that AVX512 might not work with OpenBSD.
> > We've had a diagnosed problem with this in node, and now a suspected
> > one in GHC.
>
> AVX512 doesn't work with OpenBSD. Code is supposed to check whether
> AVX512 has been enabled by the OS (in addition to checking the CPUID
> bits). But I wouldn't be surprised if there is code out there that
> just checks the CPUID bits and just assumes the OS has enabled support
> if they're present.
Ah, here was the commit that fixed node:
https://github.com/simdutf/simdutf/pull/243/commits/d00dc3cae3d22880cad7f3e60892f729a264ad2d
src/gnu/llvm/llvm/lib/Support/BLAKE3/blake3_dispatch.c does look like it
has a similar check (using xgetbv) so I think there's a good chance it
will work correctly.
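
For reference, the kind of check described above (CPUID bits plus an XCR0 read via xgetbv) can be sketched roughly as follows. This is an illustrative sketch, not the code from blake3_dispatch.c or simdutf: the function names (avx512_usable, read_xcr0, avx512_usable_on_this_machine, avx512_usable_examples) and the XCR0_* macros are made up for the example, and the hardware part assumes GCC or clang on x86-64 with <cpuid.h> available. The bit positions are the ones documented in the Intel SDM.

```c
#include <assert.h>
#include <stdint.h>

/* XCR0 state-component bits. The OS sets a bit once it is prepared to
 * save and restore the corresponding register state on context switch. */
#define XCR0_SSE        (1u << 1)  /* XMM registers */
#define XCR0_AVX        (1u << 2)  /* upper halves of YMM registers */
#define XCR0_OPMASK     (1u << 5)  /* k0-k7 mask registers */
#define XCR0_ZMM_HI256  (1u << 6)  /* upper halves of ZMM0-ZMM15 */
#define XCR0_HI16_ZMM   (1u << 7)  /* ZMM16-ZMM31 */

#define XCR0_AVX512_MASK \
    (XCR0_SSE | XCR0_AVX | XCR0_OPMASK | XCR0_ZMM_HI256 | XCR0_HI16_ZMM)

/* Pure decision logic: AVX512 is usable only if the CPU advertises it,
 * OSXSAVE is set, and the OS has enabled every required state component. */
int avx512_usable(int cpu_has_avx512f, int cpu_has_osxsave, uint64_t xcr0)
{
    if (!cpu_has_avx512f || !cpu_has_osxsave)
        return 0;
    return (xcr0 & XCR0_AVX512_MASK) == XCR0_AVX512_MASK;
}

/* Fixed cases that do not depend on the machine this runs on. */
void avx512_usable_examples(void)
{
    /* Hypothetical XCR0 with x87, SSE, AVX and all AVX512 state enabled. */
    assert(avx512_usable(1, 1, 0xE7));
    /* Hypothetical XCR0 where the OS enabled only x87, SSE and AVX:
     * the CPUID bit is present, but the code must not use AVX512. */
    assert(!avx512_usable(1, 1, 0x07));
}

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>

/* Read XCR0. Only valid when CPUID reports OSXSAVE. */
uint64_t read_xcr0(void)
{
    uint32_t eax, edx;
    __asm__ volatile("xgetbv" : "=a"(eax), "=d"(edx) : "c"(0));
    return ((uint64_t)edx << 32) | eax;
}

int avx512_usable_on_this_machine(void)
{
    unsigned int a, b, c, d;
    int osxsave, avx512f = 0;

    if (!__get_cpuid(1, &a, &b, &c, &d))
        return 0;
    osxsave = (c >> 27) & 1;            /* CPUID.1:ECX bit 27 */
    if (__get_cpuid_count(7, 0, &a, &b, &c, &d))
        avx512f = (b >> 16) & 1;        /* CPUID.(7,0):EBX bit 16 */
    if (!osxsave)
        return 0;                       /* xgetbv would fault without it */
    return avx512_usable(avx512f, osxsave, read_xcr0());
}
#endif
```

Code that stops after the CPUID test corresponds to calling avx512_usable() with the XCR0 comparison removed, which is the failure mode Mark describes.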