Index | Thread | Search

From:
Alexander Bluhm <bluhm@openbsd.org>
Subject:
Re: Unlock udp(4) somove()
To:
Vitaliy Makkoveev <mvs@openbsd.org>
Cc:
tech@openbsd.org
Date:
Sun, 14 Jul 2024 14:21:07 +0200

Download raw body.

Thread
On Sat, Jul 13, 2024 at 06:02:27AM +0300, Vitaliy Makkoveev wrote:
> I propose to always schedule somove() thread as we do for tcp(4) case. I
> do understand that this brings delay to packet processing, but for non
> spliced udp sockets we always schedule soreceive() threads. So yes, the
> delay, but it is comparable with non spliced case and it should not be
> such terrible. The profit is obvious - many simultaneous somove()
> threads with the only shared solock() around pru_send(). Also  bluhm@
> can easily make udp(4) input parallel, because sorwakeup() stops touch
> spliced peer.

When I introduced socket splicing, my experience was that a single
task makes TCP faster, and direct somove makes UDP faster.

The idea is that TCP data accumulates in the socket buffer.  The
task calls yield() to slow things down, and moves huges chunks of
data with little overhead.

For UDP on the other hand best performance is achieved when every
packet is sent immediately.  Additional task and quueing is just
overhead.

Of course that was measuered before any unlocking was done.  Things
may have changed.  Currently my performance test machines are blocked
by other jobs.  I will report in a few days.

My idea was to grab so->so_lock around somove() in sorwakeup().
But I have no diff for that yet, so I don't know if its works.

> I successfully performed regress/sys/kern/sosplice many times, but I'm
> not sosplice() user, so I'm very interesting in independent tests.

Maybe I use more CPUs or things change as I have a multiple machine
setup.  I managed to crash kernel with your diff.

START	sys/kern/sosplice/loop	2024-07-14T11:14:21Z

rm -f a.out [Ee]rrs mklog *.core y.tab.h       *.log ktrace.out stamp-*  

==== run-chain-ipv4-prototcp-count1 ====
ulimit -n 500;  SUDO=""  perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 1 -p tcp 

==== run-chain-ipv4-prototcp-count2 ====
ulimit -n 500;  SUDO=""  perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 2 -p tcp 

==== run-chain-ipv4-prototcp-count3 ====
ulimit -n 500;  SUDO=""  perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 3 -p tcp 

==== run-chain-ipv4-prototcp-count127 ====
ulimit -n 500;  SUDO=""  perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 127 -p tcp 

==== run-chain-ipv4-prototcp-count128 ====
ulimit -n 500;  SUDO=""  perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 128 -p tcp 

==== run-chain-ipv4-prototcp-count129 ====
ulimit -n 500;  SUDO=""  perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 129 -p tcp 

==== run-chain-ipv4-prototcp-count200 ====
ulimit -n 500;  SUDO=""  perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 200 -p tcp 

==== run-loop-ipv4-prototcp ====
SUDO=""  perl /usr/src/regress/sys/kern/sosplice/loop/loop.pl -4 -p tcp 

==== run-chain-ipv4-protoudp-count1 ====
ulimit -n 500;  SUDO=""  perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 1 -p udp 

==== run-chain-ipv4-protoudp-count2 ====
ulimit -n 500;  SUDO=""  perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 2 -p udp 

==== run-chain-ipv4-protoudp-count3 ====
ulimit -n 500;  SUDO=""  perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 3 -p udp 

==== run-chain-ipv4-protoudp-count127 ====
ulimit -n 500;  SUDO=""  perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 127 -p udp 

==== run-chain-ipv4-protoudp-count128 ====
ulimit -n 500;  SUDO=""  perl /usr/src/regress/sys/kern/sosplice/loop/chain.pl -4 -c 128 -p udp 
Timeout, server ot1 not responding.

[-- MARK -- Sun Jul 14 13:10:00 2024]
panic: uvm_fault(0xd7acf3d4, 0x45e37000, 0, 1) -> e
Stopped at      db_enter+0x4:   popl    %ebp
    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
*518810  70054      0  0x18000002          0    1  perl
dter() at db_enter+0x4
panic(d0c6bdd9) at panic+0x7a
kpageflttrap(f6017ff8,45e376ab) at kpagefltt+0x137
trap(f6017ff8) at trap+0x25f
calltrap() at calltrap+0xc
solock(f888247c) at solock+0x8
soclose(f888219c,0) at soclose+0x24b
soo_close(f9105704,d770db08) at soo_close+0x1e
fdrop(f9105704,d770db08) at fdrop+0x6a
closef(f9105704,d770db08) at closef+0x9d
fdrelease(d770db08,104) at fdrelease+0x7a
sys_close(d770db08,f6018160,f6018158) at sys_close+0x54
syscall(f60181a0) at syscall+0x41b
Xsyscall_untramp() at Xsyscall_untramp+0xa9
end of kernel
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports.  Insufficient info makes it difficult to find and fix bugs.
ddb{1}> [-- MARK -- Sun Jul 14 13:15:00 2024]

ddb{1}> show panic
*cpu1: uvm_fault(0xd7acf3d4, 0x45e37000, 0, 1) -> e

ddb{1}> trace
db_enter() at db_enter+0x4
panic(d0c6bdd9) at panic+0x7a
kpageflttrap(f6017ff8,45e376ab) at kpageflttrap+0x137
trap(f6017ff8) at trap+0x25f
calltrap() at calltrap+0xc
solock(f888247c) at solock+0x8
soclose(f888219c,0) at soclose+0x24b
soo_close(f9105704,d770db08) at soo_close+0x1e
fdrop(f9105704,d770db08) at fdrop+0x6a
closef(f9105704,d770db08) at closef+0x9d
fdrelease(d770db08,104) at fdrelease+0x7a
sys_close(d770db08,f6018160,f6018158) at sys_close+0x54
syscall(f60181a0) at syscall+0x41b
Xsyscall_untramp() at Xsyscall_untramp+0xa9
end of kernel

ddb{1}> show register
ds                  0x10
es                  0x10
fs                  0x20
gs                     0
edi           0xd0c6bdd9        acx100_txpower_maxim+0x11632
esi                    0
ebp           0xf6017f70
ebx           0xf5537624
edx                0x3fd
ecx           0x9f839e0c
eax                 0x34
eip           0xd04d1684        db_enter+0x4
cs                   0x8
eflags             0x202
esp           0xf6017f70
ss                  0x10
db_enter+0x4:   popl    %ebp

ddb{1}> ps
   PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
*70054  518810   6287      0  7  0x18000002                perl
  6287  437197  75914      0  3   0x810008a  sigsusp       sh
 75914  475347  16112      0  3  0x1810008a  sigsusp       make
 16112  269880  88646      0  3   0x810008a  sigsusp       sh
  7372  241566      0      0  3     0x14200  bored         sosplice
 88646   48939  47959      0  3  0x1810008a  sigsusp       make
 79489   54558  36952      0  3  0x18100082  piperd        gzip
 36952  394482  47959      0  3   0x8100082  piperd        pax
 47959  163299  74133      0  3  0x18000082  piperd        perl
 74133  373192   9311      0  3   0x810008a  sigsusp       ksh
  9311  422502  68234      0  3  0x18000098  kqread        sshd-session
 68234  128656  64665      0  3  0x18000092  kqread        sshd-session
 55010  169129      1      0  3  0x18100083  ttyin         getty
 21292  215376      1      0  3  0x18100083  ttyin         getty
 77752  313340      1      0  3  0x18100083  ttyin         getty
 54794  190470      1      0  3  0x18100083  ttyin         getty
 75108  399121      1      0  3  0x18100083  ttyin         getty
 71474  495094      1      0  3  0x18100083  ttyin         getty
 90381  254323      1      0  3  0x18100098  kqread        cron
 60267  279114      1     99  3  0x19100090  kqread        sndiod
 57918  143188      1    110  3  0x18100090  kqread        sndiod
 70082  514883  80485     95  3  0x19100092  kqread        smtpd
 49077  259407  80485    103  3  0x19100092  kqread        smtpd
 21400  393556  80485     95  3  0x19100092  kqread        smtpd
 12451    3055  80485     95  3  0x18100092  kqread        smtpd
 66040  475731  80485     95  3  0x19100092  kqread        smtpd
 85842  293491  80485     95  3  0x19100092  kqread        smtpd
 80485  501146      1      0  3  0x18100080  kqread        smtpd
 19222  375233  24870     91  3  0x18000092  kqread        snmpd_metrics
 98568  276831  24870     91  3  0x19100092  kqread        snmpd
 24870   97973      1      0  3  0x18100080  kqread        snmpd
 64665  240144      1      0  3  0x18000088  kqread        sshd
 10838  392048      0      0  3     0x14200  acct          acct
  2796  438359      0      0  3     0x14280  nfsidl        nfsio
 70630  150278      0      0  3     0x14280  nfsidl        nfsio
 66783  372686      0      0  3     0x14280  nfsidl        nfsio
 61143  413391      0      0  3     0x14280  nfsidl        nfsio
 69255  424357      1      0  3  0x18100080  kqread        ntpd
 26443  206969  44938     83  3  0x18100092  kqread        ntpd
 44938  423998      1     83  3  0x19100092  kqread        ntpd
 47289  235673  26855     74  3  0x19100092  bpf           pflogd
 26855  198014      1      0  3  0x18000080  sbwait        pflogd
 15087  400620  37960     73  3  0x19100090  kqread        syslogd
 37960  496260      1      0  3  0x18100082  sbwait        syslogd
 14125  350848  38418     77  3  0x18100092  kqread        dhcpleased
 22940  181885  38418     77  3  0x18100092  kqread        dhcpleased
 38418  478101      1      0  3  0x18000080  kqread        dhcpleased
 36279  475992    621    115  3  0x18100092  kqread        slaacd
 22685  403716    621    115  3  0x18100092  kqread        slaacd
   621   96825      1      0  3  0x18100080  kqread        slaacd
 33437  252448      0      0  3     0x14200  bored         smr
 82784   20232      0      0  3     0x14200  pgzero        zerothread
 99575  154226      0      0  3     0x14200  aiodoned      aiodoned
 38234  446220      0      0  3     0x14200  syncer        update
 10140  113952      0      0  3     0x14200  cleaner       cleaner
 74682  353433      0      0  3     0x14200  reaper        reaper
 47178  433325      0      0  3     0x14200  pgdaemon      pagedaemon
 47605   16220      0      0  3     0x14200  bored         wsdisplay0
 98446  398209      0      0  3     0x14200  usbtsk        usbtask
 74779  207873      0      0  3     0x14200  usbatsk       usbatsk
 27412  284451      0      0  3     0x14200  bored         sensors
 94001   57270      0      0  3  0x40014200  acpi0         acpi0
 42423  327336      0      0  7  0x40014200                idle7
 89543  490248      0      0  7  0x40014200                idle6
  1491  239642      0      0  7  0x40014200                idle5
 68189  321989      0      0  7  0x40014200                idle4
 82999  132562      0      0  7  0x40014200                idle3
 92964  379359      0      0  7  0x40014200                idle2
 14916  466902      0      0  3  0x40014200                idle1
 92848  439672      0      0  3     0x14200  bored         softnet3
  7985  174055      0      0  3     0x14200  bored         softnet2
 23344  138203      0      0  3     0x14200  bored         softnet1
 29716  347099      0      0  3     0x14200  bored         softnet0
 68566  270200      0      0  3     0x14200  bored         systqmp
 20550  486107      0      0  3     0x14200  bored         systq
 85119   19194      0      0  3     0x14200  tmoslp        softclockmp
 25863  361548      0      0  3  0x40014200  tmoslp        softclock
 31920  260228      0      0  7  0x40014200                idle0
 39418    2809      0      0  3     0x14200  kmalloc       kmthread
     1  218001      0      0  3   0x8000082  wait          init
     0       0     -1      0  3     0x10200  scheduler     swapper

ddb{1}> x/s version
version:        OpenBSD 7.5-current (GENERIC.MP) #0: Sun Jul 14 10:22:41 CEST 2024\012    root@ot1.obsd-lab.genua.de:/usr/src/sys/arch/i386/compile/GENERIC.MP\012

00000450 <solock>:
solock+0x8 == 0x458
solock():
/usr/src/sys/kern/uipc_socket2.c:350
     450:       55                      push   %ebp
     451:       89 e5                   mov    %esp,%ebp
     453:       8b 45 08                mov    0x8(%ebp),%eax
/usr/src/sys/kern/uipc_socket2.c:351
     456:       8b 08                   mov    (%eax),%ecx
*    458:       8b 49 04                mov    0x4(%ecx),%ecx
     45b:       8b 11                   mov    (%ecx),%edx
     45d:       b9 00 00 00 00          mov    $0x0,%ecx
     462:       83 fa 02                cmp    $0x2,%edx
     465:       74 0a                   je     471 <solock+0x21>
     467:       83 fa 18                cmp    $0x18,%edx
     46a:       74 05                   je     471 <solock+0x21>
/usr/src/sys/kern/uipc_socket2.c:357

kern/uipc_socket2.c
   348  void
   349  solock(struct socket *so)
   350  {
*  351          switch (so->so_proto->pr_domain->dom_family) {
   352          case PF_INET:
   353          case PF_INET6:
   354                  NET_LOCK();
   355                  break;
   356          default:
   357                  rw_enter_write(&so->so_lock);
   358                  break;
   359          }
   360  }

eax is 0x34, so I would say pointer to socket *so is invalid.

00000ce0 <soclose>:
soclose+0x24b == 0xf2b
soclose():
...
/usr/src/sys/kern/uipc_socket.c:497
     f25:       57                      push   %edi
     f26:       e8 fc ff ff ff          call   f27 <soclose+0x247>
*    f2b:       83 c4 04                add    $0x4,%esp
/usr/src/sys/kern/uipc_socket.c:498

kern/uipc_socket.c
   492                  if (issplicedback(so)) {
   493                          int freeing = SOSP_FREEING_WRITE;
   494
   495                          if (so->so_sp->ssp_soback == so)
   496                                  freeing |= SOSP_FREEING_READ;
*  497                          solock(so2);
   498                          sounsplice(so->so_sp->ssp_soback, so, freeing);
   499                          sounlock(so2);
   500                  }

It was called from here.

bluhm