From: Klemens Nanni
Subject: Re: timing lld --threads for fun and profit
To: Landry Breuil, robert@openbsd.org, tech@openbsd.org
Date: Tue, 12 Nov 2024 13:13:20 +0000

08.11.2024 16:01, Stuart Henderson wrote:
> On 2024/11/08 13:26, Martin Pieuchot wrote:
>> On 08/11/24(Fri) 12:22, Landry Breuil wrote:
>>> [...]
>>> someone(tm) should look into patching lld to avoid using more than
>>> MAX(ncpu,5) threads ? In the meantime, i'll probably fix the firefox
>>> ports to avoid using MAKE_JOBS for lld but cap it at 5.
>>
>> Recent lld(1) include the following commits which limit the value to
>> 16 by default instead of the number of available CPUs.
>>
>> See the following commits:
>>
>> https://github.com/llvm/llvm-project/commit/a8788de1c3f3c8c3a591bd3aae2acee1b43b229a
>> https://github.com/llvm/llvm-project/commit/da68d2164efcc1f5e57f090e2ae2219056b120a0
>>
>> Robert do you see the same with chromium? Would it make sense to
>> backport these diff with a smaller value for OpenBSD?
>>
>
> Certainly helps for reorder_kernel.
>
> $ sysctl hw.{model,ncpu,version}
> hw.model=12th Gen Intel(R) Core(TM) i5-1245U
> hw.ncpu=12
> hw.version=ThinkPad T14 Gen 3
>
> # \time -l /usr/libexec/reorder_kernel
>         7.30 real         6.86 user         3.92 sys
>     879280  maximum resident set size
>          0  average shared memory size
>          0  average unshared data size
>          0  average unshared stack size
>     147812  minor page faults
>      79421  major page faults
>          0  swaps
>      11833  block input operations
>      15643  block output operations
>          1  messages sent
>          0  messages received
>         45  signals received
>      46512  voluntary context switches
>       7353  involuntary context switches
> # vi Makefile
> [...]
> $ grep ^LINKFL Makefile
> LINKFLAGS=	-T ld.script -X --warn-common -nopie -Wl,--threads=5
> LINKFLAGS+=	-S

Can someone explain to me why this works when done in
/usr/share/relink/kernel/GENERIC.MP/Makefile but fails when done in

Index: sys/arch/amd64/conf/Makefile.amd64
===================================================================
RCS file: /cvs/src/sys/arch/amd64/conf/Makefile.amd64,v
diff -u -p -r1.137 Makefile.amd64
--- sys/arch/amd64/conf/Makefile.amd64	7 Jun 2024 05:17:34 -0000	1.137
+++ sys/arch/amd64/conf/Makefile.amd64	12 Nov 2024 12:49:03 -0000
@@ -86,6 +86,7 @@ COPTIMIZE?=	-O2
 CFLAGS=		${DEBUG} ${CWARNFLAGS} ${CMACHFLAGS} ${COPTIMIZE} ${COPTS} ${PIPE}
 AFLAGS=		-D_LOCORE -x assembler-with-cpp ${CWARNFLAGS} ${CMACHFLAGS}
 LINKFLAGS=	-T ld.script -X --warn-common -nopie
+LINKFLAGS+=	-Wl,--threads=5
 
 HOSTCC?=	${CC}
 HOSTED_CPPFLAGS=${CPPFLAGS:S/^-nostdinc$//}

$ cd /usr/src/sys/arch/amd64/compile/GENERIC.MP
$ make config ; make
config -b /sys/arch/amd64/compile/GENERIC.MP/obj -s /sys /sys/arch/amd64/conf/GENERIC.MP
$ make
[...]
LD="ld" sh makegap.sh 0xcccccccc gapdummy.o
ld -T ld.script -X --warn-common -nopie -Wl,--threads=5 -o bsd ${SYSTEM_HEAD} vers.o ${OBJS}
ld: error: unknown argument '-Wl,--threads=5'
*** Error 1 in /sys/arch/amd64/compile/GENERIC.MP (Makefile:2225 'bsd': @echo ld -T ld.script -X --warn-common -nopie -Wl,--threads=5 -o bsd...)

Using --threads=5 (i.e. without -Wl,) works in both places...

> # \time -l /usr/libexec/reorder_kernel
>         0.41 real         0.26 user         0.09 sys
>     100920  maximum resident set size
>          0  average shared memory size
>          0  average unshared data size
>          0  average unshared stack size
>      17119  minor page faults
>         10  major page faults
>          0  swaps
>          5  block input operations
>         78  block output operations
>          1  messages sent
>          0  messages received
>         28  signals received
>        189  voluntary context switches
>          2  involuntary context switches
>

But then it no longer has this effect.

I looked at 'ktrace -di -tx' for both and both run /usr/bin/ld, so why does one fail, but not the other? What am I missing?