[PATCH] amd64: import optimized memcmp from FreeBSD
On 11/29/24 17:08, Mateusz Guzik wrote:
> On Fri, Nov 29, 2024 at 4:49 PM Stuart Henderson <stu@spacehopper.org> wrote:
>>
>> On 2024/11/29 02:01, Mateusz Guzik wrote:
>>> The rep-prefixed cmps is incredibly slow even on modern CPUs.
>>>
>>> The new implementation uses regular cmp to do it.
>>>
>>> The code got augmented to account for retguard, otherwise it matches FreeBSD.
>>
>> Would that make sense in libc too?
>>
>
> this and the other routines are faster than what's in openbsd libc,
> but they are not the optimal choice due to lack of simd. definitely an
> improvement for the time being.

This has been done before. E.g.:
<https://cvsweb.openbsd.org/src/lib/libc/arch/amd64/string/strlen.S>
Removed nearly 13 years ago in favour of the .c version being faster
(gcc builtin back then), then readded based on NetBSD a couple of years
later.

When reading sys/arch/amd64/amd64/copy.S lately I thought about
rewriting those functions to use simd, but never looked further into
doing it, because I do not know if simd registers/instructions could be
used there at all. Could they?

Instead of working on the same thing every now and then, I would be in
favour of providing routines using the most performant instructions the
architecture has to offer (simd), if possible, and be done with it.

Would not mind writing those, if they would be accepted.

--
Christian
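
For readers unfamiliar with the idea quoted above (ordinary compares instead of rep-prefixed cmps), here is a rough C sketch. It is not the FreeBSD or OpenBSD routine, which is hand-written assembly; the name sketch_memcmp and the 8-byte chunk size are assumptions made purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative only: compare two buffers using plain loads and compares.
 * sketch_memcmp is a hypothetical name, not an existing libc or kernel symbol.
 */
int
sketch_memcmp(const void *a, const void *b, size_t n)
{
	const unsigned char *p = a, *q = b;

	/* Fast path: compare 8 bytes at a time with ordinary 64-bit compares. */
	while (n >= 8) {
		uint64_t x, y;

		memcpy(&x, p, sizeof(x));	/* alignment-safe load */
		memcpy(&y, q, sizeof(y));
		if (x != y)
			break;	/* the differing byte lies within this word */
		p += 8;
		q += 8;
		n -= 8;
	}

	/* Tail, or the word that differed: locate the first differing byte. */
	while (n > 0) {
		if (*p != *q)
			return (int)*p - (int)*q;
		p++;
		q++;
		n--;
	}
	return 0;
}
```

A compiler will typically turn the fixed-size memcpy calls into single loads, so the main loop becomes a load/compare/branch sequence with no string instruction involved. A SIMD variant would widen the chunk further, which is the open question raised in the message.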