From: Christian Schulte <cs@schulte.it>
Subject: Re: [PATCH] amd64: import optimized memcmp from FreeBSD
To: tech@openbsd.org
Date: Sat, 30 Nov 2024 14:29:39 +0100

On 11/29/24 17:08, Mateusz Guzik wrote:
> On Fri, Nov 29, 2024 at 4:49 PM Stuart Henderson <stu@spacehopper.org> wrote:
>>
>> On 2024/11/29 02:01, Mateusz Guzik wrote:
>>> The rep-prefixed cmps is incredibly slow even on modern CPUs.
>>>
>>> The new implementation uses regular cmp to do it.
>>>
>>> The code got augmented to account for retguard, otherwise it matches FreeBSD.
>>
>> Would that make sense in libc too?
>>
> 
> this and the other routines are faster than what's in openbsd libc,
> but they are not the optimal choice due to lack of simd. definitely an
> improvement for the time being.

This has been done before. E.g.:

<https://cvsweb.openbsd.org/src/lib/libc/arch/amd64/string/strlen.S>

It was removed nearly 13 years ago because the .c version was faster
(a gcc builtin back then), then re-added, based on NetBSD, a couple of
years later. When reading

sys/arch/amd64/amd64/copy.S

lately I thought about rewriting those functions to use SIMD, but never
looked into it further, because I do not know whether SIMD
registers/instructions can be used there at all. Can they? Instead
of revisiting the same routines every now and then, I would be in favour
of providing routines that use the most performant instructions the
architecture has to offer (SIMD), where possible, and being done with it.
I would not mind writing those, if they would be accepted.
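
For illustration only, a minimal userland sketch of what a SIMD compare
could look like in C with SSE2 intrinsics. The name simd_memcmp is
hypothetical, __builtin_ctz assumes gcc or clang, and this is neither
the FreeBSD routine nor a statement on whether SIMD registers are usable
in the kernel:

```c
#include <emmintrin.h>	/* SSE2 intrinsics */
#include <stddef.h>

/*
 * Hypothetical sketch: compare n bytes and return <0, 0 or >0 in the
 * manner of memcmp(3). Userland only; assumes SSE2 is available.
 */
static int
simd_memcmp(const void *a, const void *b, size_t n)
{
	const unsigned char *p = a, *q = b;

	/* Compare 16 bytes per iteration using unaligned loads. */
	while (n >= 16) {
		__m128i x = _mm_loadu_si128((const __m128i *)p);
		__m128i y = _mm_loadu_si128((const __m128i *)q);
		/* One mask bit per byte; a set bit means "bytes equal". */
		unsigned int eq =
		    (unsigned int)_mm_movemask_epi8(_mm_cmpeq_epi8(x, y));

		if (eq != 0xffff) {
			/* Lowest clear bit is the first differing byte. */
			unsigned int i =
			    (unsigned int)__builtin_ctz(~eq & 0xffff);
			return (int)p[i] - (int)q[i];
		}
		p += 16;
		q += 16;
		n -= 16;
	}

	/* Scalar tail for the remaining 0..15 bytes. */
	while (n-- > 0) {
		if (*p != *q)
			return (int)*p - (int)*q;
		p++;
		q++;
	}
	return 0;
}
```

The 16-byte loop replaces the per-byte work with one compare and one
mask extraction; the scalar tail keeps the sketch from reading past the
end of either buffer.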

-- 
Christian