Mailing List Archive

On Tue, Feb 17, 2026 at 11:53 AM Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> On Tue, Feb 17, 2026 at 9:23 AM Martin Pieuchot <mpi@grenadille.net> wrote:
> >
> > On 27/12/25(Sat) 06:24, Mateusz Guzik wrote:
> > > Example microbenchmark doing fstat in 8 threads (ops/s):
> > > before: 2305501
> > > after: 5891611 (+155%)
> > >
> > > Booted with WITNESS et al, survived several kernel builds no problem.
> > >
> > > Needs more testing of course.
> >
> > Here are some numbers building a kernel and libLLVM on a amd64 i9 w/ 24
> > cores and an arm64 Ampere Altra w/ 80 cores.
> >
> > vanilla - amd64 - kernel -j24
> > =============================
> > 0m47.40s real 10m41.77s user 4m00.12s system (cold)
> > 0m48.19s real 10m59.92s user 4m00.97s system
> > 0m49.22s real 11m19.23s user 4m06.63s system
> >
> > anderson - amd64 - kernel -j24
> > ==============================
> > 0m48.01s real 10m53.56s user 4m03.85s system (cold)
> > 0m49.65s real 11m26.71s user 4m07.10s system
> > 0m50.00s real 11m36.01s user 4m07.08s system
> >
> > vanilla - arm64 - kernel -j48
> > =============================
> > 1m10.41s real 14m06.83s user 24m28.19s system (cold)
> > 1m11.21s real 14m01.66s user 24m37.54s system
> > 1m11.37s real 14m04.17s user 24m33.37s system
> >
> > anderson - arm64 - kernel -j48
> > ==============================
> > 1m11.02s real 13m59.98s user 24m42.46s system (cold)
> > 1m11.48s real 14m01.83s user 24m43.20s system
> > 1m11.54s real 14m03.91s user 24m41.63s system
> >
> > vanilla - amd64 - libLLVM -j24
> > ==============================
> > 11m51.87s real 240m05.22s user 34m45.38s system (cold)
> > 11m55.73s real 240m17.56s user 35m47.06s system
> > 12m02.38s real 241m32.83s user 36m37.51s system
> >
> > anderson - amd64 - libLLVM -j24
> > ===============================
> > 11m56.16s real 241m59.39s user 34m52.01s system (cold)
> > 11m47.98s real 237m43.06s user 35m31.95s system
> > 11m56.02s real 239m36.04s user 36m06.48s system
> >
> > vanilla - arm64 - libLLVM -j48
> > ==============================
> > 18m37.50s real 569m58.77s user 267m49.27s system
> > 18m44.55s real 570m28.07s user 270m17.30s system
> > 18m45.46s real 569m55.88s user 271m13.74s system
> >
> > anderson - arm64 - libLLVM -j48
> > ===============================
> > 18m22.51s real 569m20.67s user 261m34.81s system
> > 18m31.35s real 571m37.11s user 262m50.73s system
> > 18m34.12s real 569m40.94s user 266m25.65s system
> >
> >
>
> No speed ups at all when building at lower scale lines up with my own
> results, which I mentioned here:
> https://marc.info/?l=openbsd-tech&m=176631121132731&w=2
>
> Per that post, the primary problem concerns page allocation and the
> way mutexes are implemented
>
> The small speed up at bigger scale is presumably also only there
> because of the above problem -- if it did not exist, the speed up
> would be bigger.

I guess a case for inclusion can be made by comparing a profile
before/after with -j 48. Given the win, I expect kernel_lock time
dropped significantly but it also increased cache-bouncing overhead
elsewhere, which in turn artificially lowered the win.
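
To put a number on the "small speed up at bigger scale", the quoted timings can be averaged and compared. The helper below is only a sketch for that arithmetic; the names parse_time, mean_time and pct_change are made up for illustration and are not part of any tool mentioned in the thread. It assumes the "NNmSS.SSs" format shown in the quoted output.

```c
#include <stdlib.h>

/*
 * Convert a "NNmSS.SSs" timing string (the format in the quoted
 * results) to seconds. Returns -1.0 if the string does not parse.
 */
double
parse_time(const char *s)
{
	char *end;
	long min;
	double sec;
	const char *p;

	min = strtol(s, &end, 10);
	if (end == s || *end != 'm')
		return -1.0;
	p = end + 1;
	sec = strtod(p, &end);
	if (end == p || *end != 's')
		return -1.0;
	return (double)min * 60.0 + sec;
}

/* Arithmetic mean, in seconds, of n timing strings. */
double
mean_time(const char *const runs[], int n)
{
	double sum = 0.0;
	int i;

	for (i = 0; i < n; i++)
		sum += parse_time(runs[i]);
	return sum / n;
}

/* Relative change of "after" versus "before" in percent (negative = faster). */
double
pct_change(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

Fed the three arm64 libLLVM -j48 runs, this gives a mean real time of about 1122.5s for vanilla and about 1109.3s for anderson, roughly 1.2% lower; the same calculation on the system column gives roughly 2.3% lower. With only three runs per configuration these are rough indications, not a statistical result.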

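For readers unfamiliar with the lock named in the subject line, here is a minimal userland sketch of an Anderson-style array lock, to show why it can cut cache-line bouncing compared to a ticket lock: each waiter spins on its own padded slot instead of on one shared word. This is not the patch under discussion and not OpenBSD kernel code. The names, NSLOTS, the 64-byte cache line and the use of C11 atomics are all assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NSLOTS		8	/* assumption: never more than NSLOTS contenders; power of two */
#define CACHELINE	64	/* assumption: 64-byte cache lines */

/* One flag per slot, each on its own cache line. */
struct anderson_slot {
	_Alignas(CACHELINE) atomic_bool can_enter;
};

struct anderson_lock {
	struct anderson_slot slots[NSLOTS];
	_Alignas(CACHELINE) atomic_uint next_ticket;
};

void
anderson_init(struct anderson_lock *l)
{
	int i;

	/* Only slot 0 starts open, so the first ticket holder gets in. */
	for (i = 0; i < NSLOTS; i++)
		atomic_init(&l->slots[i].can_enter, i == 0);
	atomic_init(&l->next_ticket, 0);
}

/* Acquire the lock; returns the slot to hand to anderson_lock_release(). */
unsigned
anderson_lock_acquire(struct anderson_lock *l)
{
	unsigned slot;

	slot = atomic_fetch_add_explicit(&l->next_ticket, 1,
	    memory_order_relaxed) % NSLOTS;
	/* Spin on our own slot only; other waiters spin on other lines. */
	while (!atomic_load_explicit(&l->slots[slot].can_enter,
	    memory_order_acquire))
		;
	/* Close our slot again so it is ready for reuse after wraparound. */
	atomic_store_explicit(&l->slots[slot].can_enter, false,
	    memory_order_relaxed);
	return slot;
}

/* Release the lock by opening the next slot in line. */
void
anderson_lock_release(struct anderson_lock *l, unsigned slot)
{
	atomic_store_explicit(&l->slots[(slot + 1) % NSLOTS].can_enter, true,
	    memory_order_release);
}
```

A release writes to exactly one waiter's cache line, whereas a ticket lock release updates a word that every waiter is polling. The trade-off is memory: one cache line per possible contender, and the sketch breaks if more than NSLOTS threads contend at once.
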
2026-02-17 08:23 Martin Pieuchot:
[PATCH] convert mpl ticket lock to anderson's lock
- 2026-02-17 10:53 Mateusz Guzik:
  [PATCH] convert mpl ticket lock to anderson's lock
- - 2026-02-17 13:50 Mateusz Guzik:
    [PATCH] convert mpl ticket lock to anderson's lock
  - 2026-02-17 18:28 Theo de Raadt:
    [PATCH] convert mpl ticket lock to anderson's lock
  - - 2026-02-17 19:25 Theo de Raadt:
      [PATCH] convert mpl ticket lock to anderson's lock