From: Alexander Bluhm Subject: Re: keep socket lock in sonewconn To: Vitaliy Makkoveev Cc: tech@openbsd.org Date: Sun, 26 Jan 2025 00:10:55 +0100 On Sat, Jan 25, 2025 at 06:37:49PM +0300, Vitaliy Makkoveev wrote: > On Fri, Jan 24, 2025 at 10:02:09AM +0100, Alexander Bluhm wrote: > > Hi, > > > > For TCP input unlocking we need a consistent lock of a newly created > > socket. Instead of releasing the lock in sonewconn() and grabbing > > it again later, I think it is better that sonewconn() returns a > > locked socket. > > > > Currently I only changed syn_cache_get() which calls in_pcbsounlock_rele() > > at the end. Following diffs will push the unlock into tcp_input(). > > > > ok? > > UNIX sockets are much complicated in this path, because we need to > handle 3 sockets and keep right locking order between them. Also, at > his stage newly created UNIX socket linked to `so_q0' and accessible > only from garbage collector, which will not close it. > > However, this is not true for TCP socket, because connection is > already established and concurrent FIN or RST packet could start > demolition. > > ok mvs, but keep in mind locking order between sockets, we always lock > listening socket aka `head' first. Maybe we can simplify locking dance unp_connect() a bit. But I wanted to be carefull and did a minimal change. > Isn't it better to pass `oldso' to sonewconn()? Does the `head' looks > more reasonable here? > > > + oldso = so; > > + oldinp = inp; > > so = sonewconn(so, SS_ISCONNECTED, M_DONTWAIT); "old" is quite generic. TCP does not talk about "head". How about "listen" socket? ok? bluhm Index: netinet/tcp_input.c =================================================================== RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/tcp_input.c,v diff -u -p -r1.424 tcp_input.c --- netinet/tcp_input.c 25 Jan 2025 22:06:41 -0000 1.424 +++ netinet/tcp_input.c 25 Jan 2025 23:04:10 -0000 @@ -3532,10 +3532,10 @@ syn_cache_get(struct sockaddr *src, stru { struct syn_cache *sc; struct syn_cache_head *scp; - struct inpcb *inp, *oldinp; + struct socket *listenso; + struct inpcb *inp, *listeninp; struct tcpcb *tp = NULL; struct mbuf *am; - struct socket *oldso; u_int rtableid; NET_ASSERT_LOCKED(); @@ -3573,35 +3573,35 @@ syn_cache_get(struct sockaddr *src, stru * connection when the SYN arrived. If we can't create * the connection, abort it. */ - oldso = so; - oldinp = inp; - so = sonewconn(so, SS_ISCONNECTED, M_DONTWAIT); + listenso = so; + listeninp = inp; + so = sonewconn(listenso, SS_ISCONNECTED, M_DONTWAIT); if (so == NULL) goto resetandabort; soassertlocked(so); soref(so); inp = sotoinpcb(so); + tp = intotcpcb(inp); #ifdef IPSEC /* - * We need to copy the required security levels - * from the old pcb. Ditto for any other - * IPsec-related information. + * We need to copy the required security levels from the listen pcb. + * Ditto for any other IPsec-related information. */ - inp->inp_seclevel = oldinp->inp_seclevel; + inp->inp_seclevel = listeninp->inp_seclevel; #endif /* IPSEC */ #ifdef INET6 if (ISSET(inp->inp_flags, INP_IPV6)) { - KASSERT(ISSET(oldinp->inp_flags, INP_IPV6)); + KASSERT(ISSET(listeninp->inp_flags, INP_IPV6)); - inp->inp_ipv6.ip6_hlim = oldinp->inp_ipv6.ip6_hlim; - inp->inp_hops = oldinp->inp_hops; + inp->inp_ipv6.ip6_hlim = listeninp->inp_ipv6.ip6_hlim; + inp->inp_hops = listeninp->inp_hops; } else #endif { - KASSERT(!ISSET(oldinp->inp_flags, INP_IPV6)); + KASSERT(!ISSET(listeninp->inp_flags, INP_IPV6)); - inp->inp_ip.ip_ttl = oldinp->inp_ip.ip_ttl; + inp->inp_ip.ip_ttl = listeninp->inp_ip.ip_ttl; inp->inp_options = ip_srcroute(m); if (inp->inp_options == NULL) { inp->inp_options = sc->sc_ipopts; @@ -3639,8 +3639,7 @@ syn_cache_get(struct sockaddr *src, stru } (void) m_free(am); - tp = intotcpcb(inp); - tp->t_flags = sototcpcb(oldso)->t_flags & (TF_NOPUSH|TF_NODELAY); + tp->t_flags = intotcpcb(listeninp)->t_flags & (TF_NOPUSH|TF_NODELAY); if (sc->sc_request_r_scale != 15) { tp->requested_s_scale = sc->sc_requested_s_scale; tp->request_r_scale = sc->sc_request_r_scale;