Index | Thread | Search

From:
Alexander Bluhm <bluhm@openbsd.org>
Subject:
Re: ip sysctl atomic
To:
tech@openbsd.org
Date:
Fri, 17 May 2024 21:28:26 +0200

Download raw body.

Thread
  • Vitaliy Makkoveev:

    ip sysctl atomic

  • On Fri, May 17, 2024 at 08:45:30PM +0200, Claudio Jeker wrote:
    > On Thu, May 16, 2024 at 09:42:01PM +0200, Alexander Bluhm wrote:
    > > On Thu, May 16, 2024 at 02:52:29PM +0200, Mark Kettenis wrote:
    > > > > You write it once into common memory and you read it
    > > > > once.  Thinking every time about the context whether it is safe or
    > > > > not to rely on the compiler is too difficult for me.
    > > > 
    > > > I feel your pain. The compilers make it pretty much impossible to
    > > > reason about these things.  But we can't wrap *every* global variables
    > > > access in the kernel in READ_ONCE() or WRITE_ONCE.  The code will
    > > > become unreadable.
    > > 
    > > When making code MP safe it feels better to mark every access
    > > explicitly.  But in the long run too many macros may be confusing.
    > 
    > The problem here is that it lures you into thinking this is safe but in
    > fact it probably isn't.
    
    It depends what you call "safe" in the context.
    
    - just access it and let the compiler do crazy things
    - use volatile to avoid compiler optimizations
    - memory barriers for cache coherencey and microcode reordering
    - atomic operations to coordinate multiple CPUs
    - lock and mutex that do all of the things above
    
    > > > I wonder if you should just declare all those global variables that
    > > > you access with READ_ONCE() should be just declared volatile.
    > > 
    > > I can live with this solution, diff is below:
    > > - Mark integer in sysctl_int_bounded volatile.
    > > - Declare globals volatile.
    > > - Use consistent ip_ prefix for global variables.
    > > - Add some locking comments.
    > > - in IP input, forward, output use flags for consistent view
    > > 
    > > ok?
    > 
    > I really dislike this diff. It adds volatile to things that really don't
    > need it and it does not fix the underlying issue that we want a consitent
    > view of the globals during packet delivery.
    
    Then we keep exclusive net lock in sysctl.  It guarantees that nothing
    changes while a packet is processed.
    
    > e.g. when a function checks ip_forwarding and then calls a 2nd function
    > which also checks ip_forwarding then you can't ensure that both see the
    > same value. This can be a very nasty footgun.
    
    This is why I pass flags.  I think the other sysctl integers are
    independent.  But who knows, only net lock has no risk.  Everything
    else needs manual inspection of the packet path.
    
    > Now if that doesn't matter then there is no need to make the globals
    > volatile and to do anything complicated in sysctl. Just change the
    > value. Maybe issue a memory barrier to flush out the write so other CPUs
    > see the new value ASAP but there is no need for volatile hacks.
    
    Just reading a value on one CPU which can be changed by another CPU
    is wrong.  Due to compiler optimizations that can end up in multiple
    inconsistent reads.  kettenis@ suggested volatile against this, I
    wanted READ_ONCE().  Except style it is the same.
    
    Memory barriers are not needed and were rejected by kettenis@.
    
    I don't want atomic operations as they are not needed and are slower
    than necessary.
    
    > I know you changed the ip_forwarding to use local flags / state instead
    > which is the right approach to make this less nasty but your diff is
    > changing too many values at once to keep track of where they are used and
    > how. Again doing it this way does not require volatile.
    
    I can write the diff a fourth time.  But I would like to know in
    advance what part could be accepted.
    
    bluhm
    
    > > Index: kern/kern_sysctl.c
    > > ===================================================================
    > > RCS file: /data/mirror/openbsd/cvs/src/sys/kern/kern_sysctl.c,v
    > > diff -u -p -r1.427 kern_sysctl.c
    > > --- kern/kern_sysctl.c	12 Apr 2024 16:07:09 -0000	1.427
    > > +++ kern/kern_sysctl.c	16 May 2024 19:22:05 -0000
    > > @@ -1003,21 +1003,36 @@ sysctl_securelevel_int(void *oldp, size_
    > >   */
    > >  int
    > >  sysctl_int_bounded(void *oldp, size_t *oldlenp, void *newp, size_t newlen,
    > > -    int *valp, int minimum, int maximum)
    > > +    volatile int *valp, int minimum, int maximum)
    > >  {
    > > -	int val = *valp;
    > > +	int oldval, newval;
    > >  	int error;
    > >  
    > >  	/* read only */
    > > -	if (newp == NULL || minimum > maximum)
    > > -		return (sysctl_rdint(oldp, oldlenp, newp, val));
    > > +	if (newp != NULL && minimum > maximum)
    > > +		return (EPERM);
    > >  
    > > -	if ((error = sysctl_int(oldp, oldlenp, newp, newlen, &val)))
    > > -		return (error);
    > > -	/* outside limits */
    > > -	if (val < minimum || maximum < val)
    > > +	if (oldp && *oldlenp < sizeof(int))
    > > +		return (ENOMEM);
    > > +	if (newp && newlen != sizeof(int))
    > >  		return (EINVAL);
    > > -	*valp = val;
    > > +	*oldlenp = sizeof(int);
    > > +
    > > +	/* copyin() may sleep, call it first */
    > > +	if (newp) {
    > > +		if ((error = copyin(newp, &newval, sizeof(int))))
    > > +			return (error);
    > > +		/* outside limits */
    > > +		if (newval < minimum || maximum < newval)
    > > +			return (EINVAL);
    > > +	}
    > > +	if (oldp) {
    > > +		oldval = *valp;
    > > +		if ((error = copyout(&oldval, oldp, sizeof(int))))
    > > +			return (error);
    > > +	}
    > > +	if (newp)
    > > +		*valp = newval;
    > >  	return (0);
    > >  }
    > >  
    > > Index: net/pf.c
    > > ===================================================================
    > > RCS file: /data/mirror/openbsd/cvs/src/sys/net/pf.c,v
    > > diff -u -p -r1.1196 pf.c
    > > --- net/pf.c	14 May 2024 08:26:13 -0000	1.1196
    > > +++ net/pf.c	16 May 2024 19:22:05 -0000
    > > @@ -7958,12 +7958,17 @@ done:
    > >  		switch (pd.naf) {
    > >  		case AF_INET:
    > >  			if (pd.dir == PF_IN) {
    > > -				if (ipforwarding == 0) {
    > > +				int flags;
    > > +
    > > +				if (ip_forwarding == 0) {
    > >  					ipstat_inc(ips_cantforward);
    > >  					action = PF_DROP;
    > >  					break;
    > >  				}
    > > -				ip_forward(pd.m, ifp, NULL, 1);
    > > +				flags = IP_FORWARDING | IP_REDIRECT;
    > > +				if (ip_directedbcast)
    > > +					SET(flags, IP_ALLOWBROADCAST);
    > > +				ip_forward(pd.m, ifp, NULL, flags);
    > >  			} else
    > >  				ip_output(pd.m, NULL, NULL, 0, NULL, NULL, 0);
    > >  			break;
    > > Index: netinet/if_ether.c
    > > ===================================================================
    > > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/if_ether.c,v
    > > diff -u -p -r1.267 if_ether.c
    > > --- netinet/if_ether.c	18 Dec 2023 13:30:44 -0000	1.267
    > > +++ netinet/if_ether.c	16 May 2024 19:22:05 -0000
    > > @@ -69,6 +69,7 @@
    > >   *  Locks used to protect struct members in this file:
    > >   *	a	atomic operations
    > >   *	I	immutable after creation
    > > + *	u	unlocked integer access
    > >   *	K	kernel lock
    > >   *	m	arp mutex, needed when net lock is shared
    > >   *	N	net lock
    > > @@ -86,8 +87,8 @@ struct llinfo_arp {
    > >  
    > >  /* timer values */
    > >  int 	arpt_prune = (5 * 60);	/* [I] walk list every 5 minutes */
    > > -int 	arpt_keep = (20 * 60);	/* [a] once resolved, cache for 20 minutes */
    > > -int 	arpt_down = 20;	/* [a] once declared down, don't send for 20 secs */
    > > +volatile int arpt_keep = (20 * 60); /* [u] once resolved, cache 20 minutes */
    > > +volatile int arpt_down = 20;	/* [u] once declared down, don't send 20 secs */
    > >  
    > >  struct mbuf *arppullup(struct mbuf *m);
    > >  void arpinvalidate(struct rtentry *);
    > > Index: netinet/if_ether.h
    > > ===================================================================
    > > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/if_ether.h,v
    > > diff -u -p -r1.92 if_ether.h
    > > --- netinet/if_ether.h	14 Feb 2024 22:41:48 -0000	1.92
    > > +++ netinet/if_ether.h	16 May 2024 19:22:05 -0000
    > > @@ -245,8 +245,8 @@ struct	arpcom {
    > >  	const struct ether_brport *ac_brport;
    > >  };
    > >  
    > > -extern int arpt_keep;				/* arp resolved cache expire */
    > > -extern int arpt_down;				/* arp down cache expire */
    > > +extern volatile int arpt_keep;		/* arp resolved cache expire */
    > > +extern volatile int arpt_down;		/* arp down cache expire */
    > >  
    > >  extern u_int8_t etherbroadcastaddr[ETHER_ADDR_LEN];
    > >  extern u_int8_t etheranyaddr[ETHER_ADDR_LEN];
    > > Index: netinet/in_pcb.c
    > > ===================================================================
    > > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/in_pcb.c,v
    > > diff -u -p -r1.302 in_pcb.c
    > > --- netinet/in_pcb.c	19 Apr 2024 10:13:58 -0000	1.302
    > > +++ netinet/in_pcb.c	16 May 2024 19:22:05 -0000
    > > @@ -105,13 +105,18 @@ const struct in_addr zeroin_addr;
    > >  const union inpaddru zeroin46_addr;
    > >  
    > >  /*
    > > + * Locks used to protect global variables in this file:
    > > + *	u	unlocked integer access
    > > + */
    > > +
    > > +/*
    > >   * These configure the range of local port addresses assigned to
    > >   * "unspecified" outgoing connections/packets/whatever.
    > >   */
    > > -int ipport_firstauto = IPPORT_RESERVED;
    > > -int ipport_lastauto = IPPORT_USERRESERVED;
    > > -int ipport_hifirstauto = IPPORT_HIFIRSTAUTO;
    > > -int ipport_hilastauto = IPPORT_HILASTAUTO;
    > > +volatile int ipport_firstauto = IPPORT_RESERVED;	/* [u] */
    > > +volatile int ipport_lastauto = IPPORT_USERRESERVED;	/* [u] */
    > > +volatile int ipport_hifirstauto = IPPORT_HIFIRSTAUTO;	/* [u] */
    > > +volatile int ipport_hilastauto = IPPORT_HILASTAUTO;	/* [u] */
    > >  
    > >  struct baddynamicports baddynamicports;
    > >  struct baddynamicports rootonlyports;
    > > Index: netinet/ip_icmp.c
    > > ===================================================================
    > > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/ip_icmp.c,v
    > > diff -u -p -r1.192 ip_icmp.c
    > > --- netinet/ip_icmp.c	16 Sep 2023 09:33:27 -0000	1.192
    > > +++ netinet/ip_icmp.c	16 May 2024 19:22:05 -0000
    > > @@ -589,7 +589,7 @@ reflect:
    > >  		struct sockaddr_in ssrc;
    > >  		struct rtentry *newrt = NULL;
    > >  
    > > -		if (icmp_rediraccept == 0 || ipforwarding == 1)
    > > +		if (icmp_rediraccept == 0 || ip_forwarding == 1)
    > >  			goto freeit;
    > >  		if (code > 3)
    > >  			goto badcode;
    > > Index: netinet/ip_input.c
    > > ===================================================================
    > > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/ip_input.c,v
    > > diff -u -p -r1.394 ip_input.c
    > > --- netinet/ip_input.c	8 May 2024 13:01:30 -0000	1.394
    > > +++ netinet/ip_input.c	16 May 2024 19:22:05 -0000
    > > @@ -83,33 +83,41 @@
    > >  #include <netinet/ip_carp.h>
    > >  #endif
    > >  
    > > +/*
    > > + * Locks used to protect global variables in this file:
    > > + *	I	immutable after creation
    > > + *	a	atomic operations
    > > + *	u	unlocked integer access
    > > + *	N	net lock
    > > + *	F	fragment reassembly ipq_mutex
    > > + */
    > > +
    > >  /* values controllable via sysctl */
    > > -int	ipforwarding = 0;
    > > -int	ipmforwarding = 0;
    > > -int	ipmultipath = 0;
    > > -int	ipsendredirects = 1;
    > > +volatile int	ip_forwarding = 0;		/* [u] */
    > > +volatile int	ip_mforwarding = 0;		/* [u] */
    > > +int	ipmultipath = 0;			/* [N] */
    > > +volatile int	ip_sendredirects = 1;		/* [u] */
    > >  int	ip_dosourceroute = 0;
    > > -int	ip_defttl = IPDEFTTL;
    > > +volatile int	ip_defttl = IPDEFTTL;		/* [u] */
    > >  int	ip_mtudisc = 1;
    > >  int	ip_mtudisc_timeout = IPMTUDISCTIMEOUT;
    > > -int	ip_directedbcast = 0;
    > > +volatile int	ip_directedbcast = 0;		/* [u] */
    > >  
    > >  /* Protects `ipq' and `ip_frags'. */
    > >  struct mutex	ipq_mutex = MUTEX_INITIALIZER(IPL_SOFTNET);
    > >  
    > > -/* IP reassembly queue */
    > > -LIST_HEAD(, ipq) ipq;
    > > +LIST_HEAD(, ipq) ipq;			/* [F] reassembly queue */
    > >  
    > >  /* Keep track of memory used for reassembly */
    > > -int	ip_maxqueue = 300;
    > > -int	ip_frags = 0;
    > > +volatile int	ip_maxqueue = 300;	/* [u] maximum fragments in queue */
    > > +int		ip_frags;		/* [F] number of fragments */
    > >  
    > >  const struct sysctl_bounded_args ipctl_vars[] = {
    > >  #ifdef MROUTING
    > >  	{ IPCTL_MRTPROTO, &ip_mrtproto, SYSCTL_INT_READONLY },
    > >  #endif
    > > -	{ IPCTL_FORWARDING, &ipforwarding, 0, 2 },
    > > -	{ IPCTL_SENDREDIRECTS, &ipsendredirects, 0, 1 },
    > > +	{ IPCTL_FORWARDING, &ip_forwarding, 0, 2 },
    > > +	{ IPCTL_SENDREDIRECTS, &ip_sendredirects, 0, 1 },
    > >  	{ IPCTL_DEFTTL, &ip_defttl, 0, 255 },
    > >  	{ IPCTL_DIRECTEDBCAST, &ip_directedbcast, 0, 1 },
    > >  	{ IPCTL_IPPORT_FIRSTAUTO, &ipport_firstauto, 0, 65535 },
    > > @@ -117,7 +125,7 @@ const struct sysctl_bounded_args ipctl_v
    > >  	{ IPCTL_IPPORT_HIFIRSTAUTO, &ipport_hifirstauto, 0, 65535 },
    > >  	{ IPCTL_IPPORT_HILASTAUTO, &ipport_hilastauto, 0, 65535 },
    > >  	{ IPCTL_IPPORT_MAXQUEUE, &ip_maxqueue, 0, 10000 },
    > > -	{ IPCTL_MFORWARDING, &ipmforwarding, 0, 1 },
    > > +	{ IPCTL_MFORWARDING, &ip_mforwarding, 0, 1 },
    > >  	{ IPCTL_ARPTIMEOUT, &arpt_keep, 0, INT_MAX },
    > >  	{ IPCTL_ARPDOWN, &arpt_down, 0, INT_MAX },
    > >  };
    > > @@ -137,8 +145,8 @@ static struct mbuf_queue	ipsendraw_mq;
    > >  extern struct niqueue		arpinq;
    > >  
    > >  int	ip_ours(struct mbuf **, int *, int, int);
    > > -int	ip_dooptions(struct mbuf *, struct ifnet *);
    > > -int	in_ouraddr(struct mbuf *, struct ifnet *, struct route *);
    > > +int	ip_dooptions(struct mbuf *, struct ifnet *, int);
    > > +int	in_ouraddr(struct mbuf *, struct ifnet *, struct route *, int);
    > >  
    > >  int		ip_fragcheck(struct mbuf **, int *);
    > >  struct mbuf *	ip_reass(struct ipqent *, struct ipq *);
    > > @@ -431,7 +439,7 @@ ip_input_if(struct mbuf **mp, int *offp,
    > >  #if NPF > 0
    > >  	struct in_addr odst;
    > >  #endif
    > > -	int pfrdr = 0;
    > > +	int flags = 0;
    > >  
    > >  	KASSERT(*offp == 0);
    > >  
    > > @@ -461,9 +469,15 @@ ip_input_if(struct mbuf **mp, int *offp,
    > >  		goto bad;
    > >  
    > >  	ip = mtod(m, struct ip *);
    > > -	pfrdr = odst.s_addr != ip->ip_dst.s_addr;
    > > +	if (odst.s_addr != ip->ip_dst.s_addr)
    > > +		SET(flags, IP_REDIRECT);
    > >  #endif
    > >  
    > > +	if (ip_forwarding != 0)
    > > +		SET(flags, IP_FORWARDING);
    > > +	if (ip_directedbcast)
    > > +		SET(flags, IP_ALLOWBROADCAST);
    > > +
    > >  	hlen = ip->ip_hl << 2;
    > >  
    > >  	/*
    > > @@ -472,7 +486,7 @@ ip_input_if(struct mbuf **mp, int *offp,
    > >  	 * error was detected (causing an icmp message
    > >  	 * to be sent and the original packet to be freed).
    > >  	 */
    > > -	if (hlen > sizeof (struct ip) && ip_dooptions(m, ifp)) {
    > > +	if (hlen > sizeof (struct ip) && ip_dooptions(m, ifp, flags)) {
    > >  		m = *mp = NULL;
    > >  		goto bad;
    > >  	}
    > > @@ -483,7 +497,7 @@ ip_input_if(struct mbuf **mp, int *offp,
    > >  		goto out;
    > >  	}
    > >  
    > > -	switch(in_ouraddr(m, ifp, &ro)) {
    > > +	switch(in_ouraddr(m, ifp, &ro, flags)) {
    > >  	case 2:
    > >  		goto bad;
    > >  	case 1:
    > > @@ -500,7 +514,7 @@ ip_input_if(struct mbuf **mp, int *offp,
    > >  		m->m_flags |= M_MCAST;
    > >  
    > >  #ifdef MROUTING
    > > -		if (ipmforwarding && ip_mrouter[ifp->if_rdomain]) {
    > > +		if (ip_mforwarding && ip_mrouter[ifp->if_rdomain]) {
    > >  			int error;
    > >  
    > >  			if (m->m_flags & M_EXT) {
    > > @@ -565,7 +579,7 @@ ip_input_if(struct mbuf **mp, int *offp,
    > >  	/*
    > >  	 * Not for us; forward if possible and desirable.
    > >  	 */
    > > -	if (ipforwarding == 0) {
    > > +	if (!ISSET(flags, IP_FORWARDING)) {
    > >  		ipstat_inc(ips_cantforward);
    > >  		goto bad;
    > >  	}
    > > @@ -585,7 +599,7 @@ ip_input_if(struct mbuf **mp, int *offp,
    > >  	}
    > >  #endif /* IPSEC */
    > >  
    > > -	ip_forward(m, ifp, &ro, pfrdr);
    > > +	ip_forward(m, ifp, &ro, flags);
    > >  	*mp = NULL;
    > >  	rtfree(ro.ro_rt);
    > >  	return IPPROTO_DONE;
    > > @@ -807,7 +821,7 @@ ip_deliver(struct mbuf **mp, int *offp, 
    > >  #undef IPSTAT_INC
    > >  
    > >  int
    > > -in_ouraddr(struct mbuf *m, struct ifnet *ifp, struct route *ro)
    > > +in_ouraddr(struct mbuf *m, struct ifnet *ifp, struct route *ro, int flags)
    > >  {
    > >  	struct rtentry		*rt;
    > >  	struct ip		*ip;
    > > @@ -837,7 +851,8 @@ in_ouraddr(struct mbuf *m, struct ifnet 
    > >  		 * if it is received on the interface with that address.
    > >  		 */
    > >  		if (ISSET(rt->rt_flags, RTF_BROADCAST) &&
    > > -		    (!ip_directedbcast || rt->rt_ifidx == ifp->if_index)) {
    > > +		    (!ISSET(flags, IP_ALLOWBROADCAST) ||
    > > +		    rt->rt_ifidx == ifp->if_index)) {
    > >  			match = 1;
    > >  
    > >  			/* Make sure M_BCAST is set */
    > > @@ -876,7 +891,8 @@ in_ouraddr(struct mbuf *m, struct ifnet 
    > >  				break;
    > >  			}
    > >  		}
    > > -	} else if (ipforwarding == 0 && rt->rt_ifidx != ifp->if_index &&
    > > +	} else if (!ISSET(flags, IP_FORWARDING) &&
    > > +	    rt->rt_ifidx != ifp->if_index &&
    > >  	    !((ifp->if_flags & IFF_LOOPBACK) || (ifp->if_type == IFT_ENC) ||
    > >  	    (m->m_pkthdr.pf.flags & PF_TAG_TRANSLATE_LOCALHOST))) {
    > >  		/* received on wrong interface. */
    > > @@ -1132,11 +1148,13 @@ ip_slowtimo(void)
    > >  void
    > >  ip_flush(void)
    > >  {
    > > -	int max = 50;
    > > +	int limit, max = 50;
    > >  
    > >  	MUTEX_ASSERT_LOCKED(&ipq_mutex);
    > >  
    > > -	while (!LIST_EMPTY(&ipq) && ip_frags > ip_maxqueue * 3 / 4 && --max) {
    > > +	limit = ip_maxqueue * 3 / 4;
    > > +
    > > +	while (!LIST_EMPTY(&ipq) && ip_frags > limit && --max) {
    > >  		ipstat_inc(ips_fragdropped);
    > >  		ip_freef(LIST_FIRST(&ipq));
    > >  	}
    > > @@ -1150,7 +1168,7 @@ ip_flush(void)
    > >   * 0 if the packet should be processed further.
    > >   */
    > >  int
    > > -ip_dooptions(struct mbuf *m, struct ifnet *ifp)
    > > +ip_dooptions(struct mbuf *m, struct ifnet *ifp, int flags)
    > >  {
    > >  	struct ip *ip = mtod(m, struct ip *);
    > >  	unsigned int rtableid = m->m_pkthdr.ph_rtableid;
    > > @@ -1371,8 +1389,8 @@ ip_dooptions(struct mbuf *m, struct ifne
    > >  		}
    > >  	}
    > >  	KERNEL_UNLOCK();
    > > -	if (forward && ipforwarding > 0) {
    > > -		ip_forward(m, ifp, NULL, 1);
    > > +	if (forward && ISSET(flags, IP_FORWARDING)) {
    > > +		ip_forward(m, ifp, NULL, flags | IP_REDIRECT);
    > >  		return (1);
    > >  	}
    > >  	return (0);
    > > @@ -1514,7 +1532,7 @@ const u_char inetctlerrmap[PRC_NCMDS] = 
    > >   * of codes and types.
    > >   *
    > >   * If not forwarding, just drop the packet.  This could be confusing
    > > - * if ipforwarding was zero but some routing protocol was advancing
    > > + * if ip_forwarding was zero but some routing protocol was advancing
    > >   * us as a gateway to somewhere.  However, we must let the routing
    > >   * protocol deal with that.
    > >   *
    > > @@ -1522,7 +1540,7 @@ const u_char inetctlerrmap[PRC_NCMDS] = 
    > >   * via a source route.
    > >   */
    > >  void
    > > -ip_forward(struct mbuf *m, struct ifnet *ifp, struct route *ro, int srcrt)
    > > +ip_forward(struct mbuf *m, struct ifnet *ifp, struct route *ro, int flags)
    > >  {
    > >  	struct mbuf mfake, *mcopy;
    > >  	struct ip *ip = mtod(m, struct ip *);
    > > @@ -1588,7 +1606,7 @@ ip_forward(struct mbuf *m, struct ifnet 
    > >  	if ((rt->rt_ifidx == ifp->if_index) &&
    > >  	    (rt->rt_flags & (RTF_DYNAMIC|RTF_MODIFIED)) == 0 &&
    > >  	    satosin(rt_key(rt))->sin_addr.s_addr != 0 &&
    > > -	    ipsendredirects && !srcrt &&
    > > +	    ip_sendredirects && !ISSET(flags, IP_REDIRECT) &&
    > >  	    !arpproxy(satosin(rt_key(rt))->sin_addr, m->m_pkthdr.ph_rtableid)) {
    > >  		if ((ip->ip_src.s_addr & ifatoia(rt->rt_ifa)->ia_netmask) ==
    > >  		    ifatoia(rt->rt_ifa)->ia_net) {
    > > @@ -1602,9 +1620,7 @@ ip_forward(struct mbuf *m, struct ifnet 
    > >  		}
    > >  	}
    > >  
    > > -	error = ip_output(m, NULL, ro,
    > > -	    (IP_FORWARDING | (ip_directedbcast ? IP_ALLOWBROADCAST : 0)),
    > > -	    NULL, NULL, 0);
    > > +	error = ip_output(m, NULL, ro, flags | IP_FORWARDING, NULL, NULL, 0);
    > >  	rt = ro->ro_rt;
    > >  	if (error)
    > >  		ipstat_inc(ips_cantforward);
    > > @@ -1781,10 +1797,8 @@ ip_sysctl(int *name, u_int namelen, void
    > >  		NET_UNLOCK();
    > >  		return (error);
    > >  	default:
    > > -		NET_LOCK();
    > >  		error = sysctl_bounded_arr(ipctl_vars, nitems(ipctl_vars),
    > >  		    name, namelen, oldp, oldlenp, newp, newlen);
    > > -		NET_UNLOCK();
    > >  		return (error);
    > >  	}
    > >  	/* NOTREACHED */
    > > Index: netinet/ip_ipip.c
    > > ===================================================================
    > > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/ip_ipip.c,v
    > > diff -u -p -r1.101 ip_ipip.c
    > > --- netinet/ip_ipip.c	11 Feb 2024 01:27:45 -0000	1.101
    > > +++ netinet/ip_ipip.c	16 May 2024 19:22:05 -0000
    > > @@ -481,7 +481,7 @@ ipip_output(struct mbuf **mp, struct tdb
    > >  		ip6o->ip6_vfc &= ~IPV6_VERSION_MASK;
    > >  		ip6o->ip6_vfc |= IPV6_VERSION;
    > >  		ip6o->ip6_plen = htons(m->m_pkthdr.len - sizeof(*ip6o));
    > > -		ip6o->ip6_hlim = ip_defttl;
    > > +		ip6o->ip6_hlim = ip6_defhlim;
    > >  		in6_embedscope(&ip6o->ip6_src, &tdb->tdb_src.sin6, NULL, NULL);
    > >  		in6_embedscope(&ip6o->ip6_dst, &tdb->tdb_dst.sin6, NULL, NULL);
    > >  
    > > Index: netinet/ip_mroute.c
    > > ===================================================================
    > > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/ip_mroute.c,v
    > > diff -u -p -r1.142 ip_mroute.c
    > > --- netinet/ip_mroute.c	6 Apr 2024 14:23:27 -0000	1.142
    > > +++ netinet/ip_mroute.c	16 May 2024 19:22:05 -0000
    > > @@ -98,7 +98,7 @@ int mcast_debug = 1;
    > >  struct socket	*ip_mrouter[RT_TABLEID_MAX + 1];
    > >  struct rttimer_queue ip_mrouterq;
    > >  uint64_t	 mrt_count[RT_TABLEID_MAX + 1];
    > > -int		ip_mrtproto = IGMP_DVMRP;    /* for netstat only */
    > > +int		ip_mrtproto = IGMP_DVMRP;    /* [I] for netstat only */
    > >  
    > >  struct mrtstat	mrtstat;
    > >  
    > > Index: netinet/ip_output.c
    > > ===================================================================
    > > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/ip_output.c,v
    > > diff -u -p -r1.399 ip_output.c
    > > --- netinet/ip_output.c	16 May 2024 13:01:04 -0000	1.399
    > > +++ netinet/ip_output.c	16 May 2024 19:22:05 -0000
    > > @@ -326,7 +326,7 @@ reroute:
    > >  			 * above, will be forwarded by the ip_input() routine,
    > >  			 * if necessary.
    > >  			 */
    > > -			if (ipmforwarding && ip_mrouter[ifp->if_rdomain] &&
    > > +			if (ip_mforwarding && ip_mrouter[ifp->if_rdomain] &&
    > >  			    (flags & IP_FORWARDING) == 0) {
    > >  				int rv;
    > >  
    > > @@ -428,7 +428,7 @@ sendit:
    > >  #endif
    > >  
    > >  #ifdef IPSEC
    > > -	if ((flags & IP_FORWARDING) && ipforwarding == 2 &&
    > > +	if ((flags & IP_FORWARDING) && ip_forwarding == 2 &&
    > >  	    (!ipsec_in_use ||
    > >  	    m_tag_find(m, PACKET_TAG_IPSEC_IN_DONE, NULL) == NULL)) {
    > >  		error = EHOSTUNREACH;
    > > Index: netinet/ip_var.h
    > > ===================================================================
    > > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/ip_var.h,v
    > > diff -u -p -r1.117 ip_var.h
    > > --- netinet/ip_var.h	17 Apr 2024 20:48:51 -0000	1.117
    > > +++ netinet/ip_var.h	16 May 2024 19:22:05 -0000
    > > @@ -206,26 +206,26 @@ struct ipoffnxt {
    > >  /* flags passed to ip_output */
    > >  #define	IP_FORWARDING		0x1		/* most of ip header exists */
    > >  #define	IP_RAWOUTPUT		0x2		/* raw ip header exists */
    > > +#define	IP_REDIRECT		0x4	/* redirected by pf or source route */
    > >  #define	IP_ALLOWBROADCAST	SO_BROADCAST	/* can send broadcast packets */
    > >  #define	IP_MTUDISC		0x0800		/* pmtu discovery, set DF */
    > >  
    > >  extern struct ipstat ipstat;
    > > -extern int ip_defttl;			/* default IP ttl */
    > > +extern volatile int ip_defttl;		/* default IP ttl */
    > >  
    > >  #define IPMTUDISCTIMEOUT (10 * 60)	/* as per RFC 1191 */
    > >  
    > >  extern int ip_mtudisc;			/* mtu discovery */
    > >  extern int ip_mtudisc_timeout;		/* seconds to timeout mtu discovery */
    > >  
    > > -extern int ipport_firstauto;		/* min port for port allocation */
    > > -extern int ipport_lastauto;		/* max port for port allocation */
    > > -extern int ipport_hifirstauto;		/* min dynamic/private port number */
    > > -extern int ipport_hilastauto;		/* max dynamic/private port number */
    > > -extern int ipforwarding;		/* enable IP forwarding */
    > > -#ifdef MROUTING
    > > -extern int ipmforwarding;		/* enable multicast forwarding */
    > > -#endif
    > > +extern volatile int ipport_firstauto;	/* min port for port allocation */
    > > +extern volatile int ipport_lastauto;	/* max port for port allocation */
    > > +extern volatile int ipport_hifirstauto;	/* min dynamic/private port number */
    > > +extern volatile int ipport_hilastauto;	/* max dynamic/private port number */
    > > +extern volatile int ip_forwarding;	/* enable IP forwarding */
    > > +extern volatile int ip_mforwarding;	/* enable multicast forwarding */
    > >  extern int ipmultipath;			/* enable multipath routing */
    > > +extern volatile int ip_directedbcast;	/* accept all broadcast packets */
    > >  extern unsigned int la_hold_total;
    > >  
    > >  extern const struct pr_usrreqs rip_usrreqs;
    > > Index: netinet6/ip6_forward.c
    > > ===================================================================
    > > RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/ip6_forward.c,v
    > > diff -u -p -r1.117 ip6_forward.c
    > > --- netinet6/ip6_forward.c	16 Apr 2024 12:56:39 -0000	1.117
    > > +++ netinet6/ip6_forward.c	16 May 2024 19:22:05 -0000
    > > @@ -75,7 +75,7 @@
    > >   * of codes and types.
    > >   *
    > >   * If not forwarding, just drop the packet.  This could be confusing
    > > - * if ipforwarding was zero but some routing protocol was advancing
    > > + * if ip6_forwarding was zero but some routing protocol was advancing
    > >   * us as a gateway to somewhere.  However, we must let the routing
    > >   * protocol deal with that.
    > >   *
    > > Index: sys/sysctl.h
    > > ===================================================================
    > > RCS file: /data/mirror/openbsd/cvs/src/sys/sys/sysctl.h,v
    > > diff -u -p -r1.235 sysctl.h
    > > --- sys/sysctl.h	1 Oct 2023 15:58:12 -0000	1.235
    > > +++ sys/sysctl.h	16 May 2024 19:22:05 -0000
    > > @@ -1035,7 +1035,7 @@ struct ctldebug {
    > >   */
    > >  struct sysctl_bounded_args {
    > >  	int mib;     /* identifier shared with userspace as a CTL_ #define */
    > > -	int *var;    /* never NULL */
    > > +	volatile int *var;    /* never NULL */
    > >  	int minimum; /* checking is disabled if minimum == maximum  */
    > >  	int maximum; /* read-only variable if minimum > maximum */
    > >  };
    > > @@ -1058,7 +1058,8 @@ int sysctl_int_lower(void *, size_t *, v
    > >  int sysctl_int(void *, size_t *, void *, size_t, int *);
    > >  int sysctl_rdint(void *, size_t *, void *, int);
    > >  int sysctl_securelevel_int(void *, size_t *, void *, size_t, int *);
    > > -int sysctl_int_bounded(void *, size_t *, void *, size_t, int *, int, int);
    > > +int sysctl_int_bounded(void *, size_t *, void *, size_t, volatile int *,
    > > +    int, int);
    > >  int sysctl_bounded_arr(const struct sysctl_bounded_args *, u_int,
    > >      int *, u_int, void *, size_t *, void *, size_t);
    > >  int sysctl_quad(void *, size_t *, void *, size_t, int64_t *);
    > 
    > -- 
    > :wq Claudio
    
    
    
  • Vitaliy Makkoveev:

    ip sysctl atomic