From: Alexander Bluhm
Subject: Re: Unlock udp(4) somove()
To: Vitaliy Makkoveev
Cc: tech@openbsd.org
Date: Sun, 14 Jul 2024 14:21:07 +0200

On Sat, Jul 13, 2024 at 06:02:27AM +0300, Vitaliy Makkoveev wrote:
> I propose to always schedule somove() thread as we do for tcp(4) case. I
> do understand that this brings delay to packet processing, but for non
> spliced udp sockets we always schedule soreceive() threads. So yes, the
> delay, but it is comparable with non spliced case and it should not be
> such terrible. The profit is obvious - many simultaneous somove()
> threads with the only shared solock() around pru_send(). Also bluhm@
> can easily make udp(4) input parallel, because sorwakeup() stops touch
> spliced peer.

When I introduced socket splicing, my experience was that a single
task makes TCP faster, and direct somove makes UDP faster.

The idea is that TCP data accumulates in the socket buffer.  The task
calls yield() to slow things down and moves huge chunks of data with
little overhead.  For UDP, on the other hand, best performance is
achieved when every packet is sent immediately; an additional task and
queueing is just overhead.

Of course that was measured before any unlocking was done.  Things may
have changed.  Currently my performance test machines are blocked by
other jobs.  I will report in a few days.

My idea was to grab so->so_lock around somove() in sorwakeup().  But I
have no diff for that yet, so I don't know if it works.
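To make the idea concrete, a rough and untested sketch on top of the
current splice dispatch in sorwakeup(), assuming the TCP/UDP split is
still keyed on PR_WANTRCVD (TCP queues the sosplice task, UDP calls
somove() directly).  The rw_enter_write()/rw_exit_write() pair around
the direct somove() is the new, purely hypothetical part; lock order
against the net lock has not been checked.

	if (so->so_rcv.sb_flags & SB_SPLICE) {
		if (so->so_proto->pr_flags & PR_WANTRCVD) {
			/* TCP: accumulate data, the sosplice task moves it. */
			task_add(sosplice_taskq, &so->so_splicetask);
		} else {
			/*
			 * UDP: move each packet immediately, serialized by
			 * the per-socket rwlock.  Untested idea, lock order
			 * against the net lock still needs to be verified.
			 */
			rw_enter_write(&so->so_lock);
			somove(so, M_DONTWAIT);
			rw_exit_write(&so->so_lock);
		}
	}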
> I successfully performed regress/sys/kern/sosplice many times, but I'm
> not sosplice() user, so I'm very interesting in independent tests.

Maybe I use more CPUs, or things change because I have a multi-machine
setup.  I managed to crash the kernel with your diff.

START sys/kern/sosplice/loop 2024-07-14T11:14:21Z
rm -f a.out [Ee]rrs mklog *.core y.tab.h *.log ktrace.out stamp-*
==== run-chain-ipv4-prototcp-count1 ====
ulimit -n 500; SUDO="" perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 1 -p tcp
==== run-chain-ipv4-prototcp-count2 ====
ulimit -n 500; SUDO="" perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 2 -p tcp
==== run-chain-ipv4-prototcp-count3 ====
ulimit -n 500; SUDO="" perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 3 -p tcp
==== run-chain-ipv4-prototcp-count127 ====
ulimit -n 500; SUDO="" perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 127 -p tcp
==== run-chain-ipv4-prototcp-count128 ====
ulimit -n 500; SUDO="" perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 128 -p tcp
==== run-chain-ipv4-prototcp-count129 ====
ulimit -n 500; SUDO="" perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 129 -p tcp
==== run-chain-ipv4-prototcp-count200 ====
ulimit -n 500; SUDO="" perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 200 -p tcp
==== run-loop-ipv4-prototcp ====
SUDO="" perl /usr/src/regress/sys/kern/sosplice/loop/loop.pl -4 -p tcp
==== run-chain-ipv4-protoudp-count1 ====
ulimit -n 500; SUDO="" perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 1 -p udp
==== run-chain-ipv4-protoudp-count2 ====
ulimit -n 500; SUDO="" perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 2 -p udp
==== run-chain-ipv4-protoudp-count3 ====
ulimit -n 500; SUDO="" perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 3 -p udp
==== run-chain-ipv4-protoudp-count127 ====
ulimit -n 500; SUDO="" perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 127 -p udp
==== run-chain-ipv4-protoudp-count128 ====
ulimit -n 500; SUDO="" perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 128 -p udp

Timeout, server ot1 not responding.
[-- MARK -- Sun Jul 14 13:10:00 2024]
panic: uvm_fault(0xd7acf3d4, 0x45e37000, 0, 1) -> e
Stopped at      db_enter+0x4:   popl    %ebp
    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
*518810  70054      0  0x18000002          0    1  perl
db_enter() at db_enter+0x4
panic(d0c6bdd9) at panic+0x7a
kpageflttrap(f6017ff8,45e376ab) at kpageflttrap+0x137
trap(f6017ff8) at trap+0x25f
calltrap() at calltrap+0xc
solock(f888247c) at solock+0x8
soclose(f888219c,0) at soclose+0x24b
soo_close(f9105704,d770db08) at soo_close+0x1e
fdrop(f9105704,d770db08) at fdrop+0x6a
closef(f9105704,d770db08) at closef+0x9d
fdrelease(d770db08,104) at fdrelease+0x7a
sys_close(d770db08,f6018160,f6018158) at sys_close+0x54
syscall(f60181a0) at syscall+0x41b
Xsyscall_untramp() at Xsyscall_untramp+0xa9
end of kernel
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports.  Insufficient info makes it difficult to find and fix bugs.
ddb{1}>
[-- MARK -- Sun Jul 14 13:15:00 2024]
ddb{1}> show panic
*cpu1: uvm_fault(0xd7acf3d4, 0x45e37000, 0, 1) -> e
ddb{1}> trace
db_enter() at db_enter+0x4
panic(d0c6bdd9) at panic+0x7a
kpageflttrap(f6017ff8,45e376ab) at kpageflttrap+0x137
trap(f6017ff8) at trap+0x25f
calltrap() at calltrap+0xc
solock(f888247c) at solock+0x8
soclose(f888219c,0) at soclose+0x24b
soo_close(f9105704,d770db08) at soo_close+0x1e
fdrop(f9105704,d770db08) at fdrop+0x6a
closef(f9105704,d770db08) at closef+0x9d
fdrelease(d770db08,104) at fdrelease+0x7a
sys_close(d770db08,f6018160,f6018158) at sys_close+0x54
syscall(f60181a0) at syscall+0x41b
Xsyscall_untramp() at Xsyscall_untramp+0xa9
end of kernel
ddb{1}> show register
ds      0x10
es      0x10
fs      0x20
gs      0
edi     0xd0c6bdd9  acx100_txpower_maxim+0x11632
esi     0
ebp     0xf6017f70
ebx     0xf5537624
edx     0x3fd
ecx     0x9f839e0c
eax     0x34
eip     0xd04d1684  db_enter+0x4
cs      0x8
eflags  0x202
esp     0xf6017f70
ss      0x10
db_enter+0x4:   popl    %ebp
ddb{1}> ps
PID TID PPID UID S FLAGS WAIT COMMAND
*70054 518810 6287 0 7 0x18000002 perl
6287 437197 75914 0 3 0x810008a sigsusp sh
75914 475347 16112 0 3 0x1810008a sigsusp make
16112 269880 88646 0 3 0x810008a sigsusp sh
7372 241566 0 0 3 0x14200 bored sosplice
88646 48939 47959 0 3 0x1810008a sigsusp make
79489 54558 36952 0 3 0x18100082 piperd gzip
36952 394482 47959 0 3 0x8100082 piperd pax
47959 163299 74133 0 3 0x18000082 piperd perl
74133 373192 9311 0 3 0x810008a sigsusp ksh
9311 422502 68234 0 3 0x18000098 kqread sshd-session
68234 128656 64665 0 3 0x18000092 kqread sshd-session
55010 169129 1 0 3 0x18100083 ttyin getty
21292 215376 1 0 3 0x18100083 ttyin getty
77752 313340 1 0 3 0x18100083 ttyin getty
54794 190470 1 0 3 0x18100083 ttyin getty
75108 399121 1 0 3 0x18100083 ttyin getty
71474 495094 1 0 3 0x18100083 ttyin getty
90381 254323 1 0 3 0x18100098 kqread cron
60267 279114 1 99 3 0x19100090 kqread sndiod
57918 143188 1 110 3 0x18100090 kqread sndiod
70082 514883 80485 95 3 0x19100092 kqread smtpd
49077 259407 80485 103 3 0x19100092 kqread smtpd
21400 393556 80485 95 3 0x19100092 kqread smtpd
12451 3055 80485 95 3 0x18100092 kqread smtpd
66040 475731 80485 95 3 0x19100092 kqread smtpd
85842 293491 80485 95 3 0x19100092 kqread smtpd
80485 501146 1 0 3 0x18100080 kqread smtpd
19222 375233 24870 91 3 0x18000092 kqread snmpd_metrics
98568 276831 24870 91 3 0x19100092 kqread snmpd
24870 97973 1 0 3 0x18100080 kqread snmpd
64665 240144 1 0 3 0x18000088 kqread sshd
10838 392048 0 0 3 0x14200 acct acct
2796 438359 0 0 3 0x14280 nfsidl nfsio
70630 150278 0 0 3 0x14280 nfsidl nfsio
66783 372686 0 0 3 0x14280 nfsidl nfsio
61143 413391 0 0 3 0x14280 nfsidl nfsio
69255 424357 1 0 3 0x18100080 kqread ntpd
26443 206969 44938 83 3 0x18100092 kqread ntpd
44938 423998 1 83 3 0x19100092 kqread ntpd
47289 235673 26855 74 3 0x19100092 bpf pflogd
26855 198014 1 0 3 0x18000080 sbwait pflogd
15087 400620 37960 73 3 0x19100090 kqread syslogd
37960 496260 1 0 3 0x18100082 sbwait syslogd
14125 350848 38418 77 3 0x18100092 kqread dhcpleased
22940 181885 38418 77 3 0x18100092 kqread dhcpleased
38418 478101 1 0 3 0x18000080 kqread dhcpleased
36279 475992 621 115 3 0x18100092 kqread slaacd
22685 403716 621 115 3 0x18100092 kqread slaacd
621 96825 1 0 3 0x18100080 kqread slaacd
33437 252448 0 0 3 0x14200 bored smr
82784 20232 0 0 3 0x14200 pgzero zerothread
99575 154226 0 0 3 0x14200 aiodoned aiodoned
38234 446220 0 0 3 0x14200 syncer update
10140 113952 0 0 3 0x14200 cleaner cleaner
74682 353433 0 0 3 0x14200 reaper reaper
47178 433325 0 0 3 0x14200 pgdaemon pagedaemon
47605 16220 0 0 3 0x14200 bored wsdisplay0
98446 398209 0 0 3 0x14200 usbtsk usbtask
74779 207873 0 0 3 0x14200 usbatsk usbatsk
27412 284451 0 0 3 0x14200 bored sensors
94001 57270 0 0 3 0x40014200 acpi0 acpi0
42423 327336 0 0 7 0x40014200 idle7
89543 490248 0 0 7 0x40014200 idle6
1491 239642 0 0 7 0x40014200 idle5
68189 321989 0 0 7 0x40014200 idle4
82999 132562 0 0 7 0x40014200 idle3
92964 379359 0 0 7 0x40014200 idle2
14916 466902 0 0 3 0x40014200 idle1
92848 439672 0 0 3 0x14200 bored softnet3
7985 174055 0 0 3 0x14200 bored softnet2
23344 138203 0 0 3 0x14200 bored softnet1
29716 347099 0 0 3 0x14200 bored softnet0
68566 270200 0 0 3 0x14200 bored systqmp
20550 486107 0 0 3 0x14200 bored systq
85119 19194 0 0 3 0x14200 tmoslp softclockmp
25863 361548 0 0 3 0x40014200 tmoslp softclock
31920 260228 0 0 7 0x40014200 idle0
39418 2809 0 0 3 0x14200 kmalloc kmthread
1 218001 0 0 3 0x8000082 wait init
0 0 -1 0 3 0x10200 scheduler swapper
ddb{1}> x/s version
version: OpenBSD 7.5-current (GENERIC.MP) #0: Sun Jul 14 10:22:41 CEST 2024\012
    root@ot1.obsd-lab.genua.de:/usr/src/sys/arch/i386/compile/GENERIC.MP\012

00000450 <solock>:              solock+0x8 == 0x458
solock():
/usr/src/sys/kern/uipc_socket2.c:350
     450:  55              push   %ebp
     451:  89 e5           mov    %esp,%ebp
     453:  8b 45 08        mov    0x8(%ebp),%eax
/usr/src/sys/kern/uipc_socket2.c:351
     456:  8b 08           mov    (%eax),%ecx
*    458:  8b 49 04        mov    0x4(%ecx),%ecx
     45b:  8b 11           mov    (%ecx),%edx
     45d:  b9 00 00 00 00  mov    $0x0,%ecx
     462:  83 fa 02        cmp    $0x2,%edx
     465:  74 0a           je     471
     467:  83 fa 18        cmp    $0x18,%edx
     46a:  74 05           je     471
/usr/src/sys/kern/uipc_socket2.c:357

kern/uipc_socket2.c

   348  void
   349  solock(struct socket *so)
   350  {
*  351          switch (so->so_proto->pr_domain->dom_family) {
   352          case PF_INET:
   353          case PF_INET6:
   354                  NET_LOCK();
   355                  break;
   356          default:
   357                  rw_enter_write(&so->so_lock);
   358                  break;
   359          }
   360  }

eax is 0x34, so I would say pointer to socket *so is invalid.

00000ce0 <soclose>:             soclose+0x24b == 0xf2b
soclose():
...
/usr/src/sys/kern/uipc_socket.c:497
     f25:  57              push   %edi
     f26:  e8 fc ff ff ff  call   f27
*    f2b:  83 c4 04        add    $0x4,%esp
/usr/src/sys/kern/uipc_socket.c:498

kern/uipc_socket.c

   492          if (issplicedback(so)) {
   493                  int freeing = SOSP_FREEING_WRITE;
   494
   495                  if (so->so_sp->ssp_soback == so)
   496                          freeing |= SOSP_FREEING_READ;
*  497                  solock(so2);
   498                  sounsplice(so->so_sp->ssp_soback, so, freeing);
   499                  sounlock(so2);
   500          }

It was called from here.

bluhm