From: Vitaliy Makkoveev Subject: Re: ip sysctl atomic To: Alexander Bluhm Cc: tech@openbsd.org Date: Mon, 1 Jul 2024 22:09:37 +0300 Hi, On Tue, Apr 30, 2024 at 08:46:48PM +0200, Alexander Bluhm wrote: > Hi, > > As general MP safe sysctls are currently not wanted, let's implement > it the other way around. > mpi@ fixed pmap_pinit_pd_pae() in rev 1.72 of arch/i386/i386/pmapae.c. This problem was exposed with my sysctl(2) unlocking diff, so it was reverted. Now I'm successfully running dpb(1) on i386 machine provided by Hrvoje since May 22, so, I want to put it back. > Access to global integer variables in net.inet.ip is made MP safe > with read once. Some values are passed from ip_input() to ip_fragment() > with flags. This keeps the view of IP processing consistent even > if another CPU changes the sysctl value. > > I kept the sysctl code as it is, but added _mpsafe functions for > values that are used by multiple CPUs. > Current locked sysctl(2) wires userland pages to avoid context switch caused by copyin()/copyout(), so the current code is not applicable for unlocking. switch (name[0]) { case UDPCTL_BADDYNAMIC: NET_LOCK(); error = sysctl_struct(oldp, oldlenp, newp, newlen, baddynamicports.udp, sizeof(baddynamicports.udp)); NET_UNLOCK(); return (error); /* [skip] */ default: NET_LOCK(); error = sysctl_bounded_arr(udpctl_vars, nitems(udpctl_vars), name, namelen, oldp, oldlenp, newp, newlen); NET_UNLOCK(); return (error); } However, exclusively netlocked read-only access to integers is also bad for obvious reason. I don't think that unlocked modification is good, so I think to push desired lock down to sysctl_*() routines. The case of complicated data structures could require special lock for reading. Regardless on sysctl(2) unlocking, dedicated locks could be useful to avoid page wiring in most cases, because we need only to protect access to data structures, the copyin()/copyout() could be moved outside critical section. This is pseudo code, not the diff. int sysctl_int_bounded(void *oldp, size_t *oldlenp, void *newp, size_t newlen, int *valp, int minimum, int maximum, void (*lock)(void), void (*unlock)(void)) { int val = *valp; int error; /* read only */ if (newp == NULL || minimum > maximum) return (sysctl_rdint(oldp, oldlenp, newp, val)); if ((error = sysctl_int(oldp, oldlenp, newp, newlen, &val))) return (error); /* outside limits */ if (val < minimum || maximum < val) return (EINVAL); if (lock) (*lock)(); *valp = val; if (lock) (*unlock)(); return (0); } void udp_sysctl_lock(void) { NET_LOCK(); } void udp_sysctl_unlock(void) { NET_UNLOCK(); } const struct sysctl_bounded_args udpctl_vars[] = { { UDPCTL_CHECKSUM, &udpcksum, 0, 1, udp_sysctl_lock, udp_sysctl_unlock }, { UDPCTL_RECVSPACE, &udp_recvspace, 0, INT_MAX }, udp_sysctl_lock, udp_sysctl_unlock }, { UDPCTL_SENDSPACE, &udp_sendspace, 0, INT_MAX }, udp_sysctl_lock, udp_sysctl_unlock }, }; switch (name[0]) { case UDPCTL_BADDYNAMIC: NET_LOCK(); error = sysctl_struct(oldp, oldlenp, newp, newlen, baddynamicports.udp, sizeof(baddynamicports.udp)); NET_UNLOCK(); return (error); /* [skip] */ default: return sysctl_bounded_arr(udpctl_vars, nitems(udpctl_vars), name, namelen, oldp, oldlenp, newp, newlen); }