
From: Martin Pieuchot <mpi@grenadille.net>
Subject: Re: [PATCH] amd64: import optimized memcmp from FreeBSD]
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: tech <tech@openbsd.org>, Stuart <stu@spacehopper.org>
Date: Sat, 13 Sep 2025 11:06:19 +0200

On 13/09/25(Sat) 09:30, Mateusz Guzik wrote:
> On Sat, Jun 7, 2025 at 11:06 AM Martin Pieuchot <mpi@grenadille.net> wrote:
> >
> > On 06/06/25(Fri) 12:02, Crystal Kolipe wrote:
> > > I've ported a cut-down version of this code to i386 in case anybody wants to
> > > look for possible performance gains there.
> > >
> > > It could probably be improved further if there is interest.
> >
> > There is interest.  We just need somebody who will lead the effort, do
> > the testing, report back and integrate the comments from the list.
> >
> > This might not be trivial.
> >
> > Would you like to do it?  Thanks!
> >
> 
> Modulo lfence + retguard addition by hand, this is what has been present
> in FreeBSD for years now (both kernel and libc) so I would assume it
> works. ;)

Assuming is a good start, testing it is the next step.

> I understand apprehension concerning just grabbing this code, but I
> can't stress enough how utter garbage the stock BSD asm is for amd64
> (across all of them).

Apprehension is not the issue here.  What is missing is somebody doing
the actual work to push a diff into the tree.  I'd love to see this
happen; sadly I am not doing the work, and neither are you nor anybody
else.

> This is not a matter of a few % in a microbenchmark for the routine
> alone over the stock variant. What you have now is literally several
> times slower than my variant and translates to a visible loss in
> performance even in your kernel build.

I'm well aware.  I don't think we need to be convinced.  We need someone
to do it.

> At a bare minimum you could take the C variant from NetBSD which tries
> to do word-sized ops first before resorting to per-byte ops. It still
> avoids rep cmp, which is the crux of the problem, and achieves
> performance still lower than with my routine but at least in the same
> realm.
> 
> You can find it here:
> 
> https://cvsweb.netbsd.org/bsdweb.cgi/src/common/lib/libc/string/bcmp.c?rev=1.7.34.3;content-type=text%2Fplain
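
For readers who want the idea in the quoted paragraph spelled out, here is
a minimal sketch of a bcmp-style routine that compares word-sized chunks
first and falls back to per-byte compares for the tail.  It is not the
NetBSD code linked above and not the FreeBSD asm.  The name wordwise_bcmp
is hypothetical, and using memcpy() for the loads is an assumption made
here to keep unaligned access well-defined in portable C.

```c
#include <stddef.h>
#include <string.h>

/*
 * Illustrative sketch only (hypothetical name, not from any tree).
 * bcmp semantics: returns 0 if the buffers are equal, nonzero otherwise.
 */
static int
wordwise_bcmp(const void *b1, const void *b2, size_t len)
{
	const unsigned char *p1 = b1;
	const unsigned char *p2 = b2;
	unsigned long w1, w2;	/* 8 bytes on LP64 targets such as amd64 */

	/* Compare one word at a time while at least a full word remains. */
	while (len >= sizeof(w1)) {
		/* memcpy() avoids alignment and aliasing problems. */
		memcpy(&w1, p1, sizeof(w1));
		memcpy(&w2, p2, sizeof(w2));
		if (w1 != w2)
			return 1;
		p1 += sizeof(w1);
		p2 += sizeof(w2);
		len -= sizeof(w1);
	}

	/* Compare the remaining tail one byte at a time. */
	while (len-- > 0) {
		if (*p1++ != *p2++)
			return 1;
	}

	return 0;
}
```

Because it only reports equal or not equal, the word loop never has to
find which byte differs.  A memcmp replacement would also need the
ordering of the first mismatching byte, which this sketch does not
provide.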

Who are you talking to?  Who is the "you" that can take any variant?

What is preventing you from doing it?  Why are you trying to convince us
instead of sending diffs and asking for tests?

> Per my opening remark, copy_to_user, copyinstr, bcopy and whatever
> else is also artificially heavily penalized and all of that got
> patched up in FreeBSD.  Perf is up for grabs for the rather moderate
> effort of bringing things over.

Then please bring that to OpenBSD, I'd be delighted to review and test
your diffs.

Thanks,
Martin