From: Alexander Bluhm <bluhm@openbsd.org>
Subject: Re: Don't take solock() in sosend() and soreceive() paths for unix(4) sockets
To: Vitaliy Makkoveev <mvs@openbsd.org>
Cc: tech@openbsd.org
Date: Fri, 3 May 2024 00:26:32 +0200

On Wed, May 01, 2024 at 01:01:22AM +0300, Vitaliy Makkoveev wrote:
> On Tue, Apr 30, 2024 at 11:01:04PM +0300, Vitaliy Makkoveev wrote:
> > Use `sb_mtx' mutex(9) and `sb_lock' rwlock(9) to protect `so_snd' and
> > `so_rcv' of unix(4) sockets.
> > 
> > The transmission of unix(4) sockets already half-unlocked because
> > connected peer is not locked by solock() during sbappend*() call. Since
> > the `so_snd' is protected by `sb_mtx' mutex(9) the re-locking is not
> > required in uipc_rcvd() too.
> > 
> > SB_OWNLOCK became redundant with SB_MTXLOCK, so remove it. SB_MTXLOCK
> > was kept because checks against SB_MTXLOCK within sb*() routines look
> > more consistent to me.
> > 
> > Please note, the unlocked peer `so2' of unix(4) can't be disconnected
> > while solock() is held on `so'. That's why unlocked sorwakeup() and
> > sowwakeup() are fine, corresponding paths will never be followed.
> > 
> 
> Add forgotten fifofs chunks. Against previous, this diff does direct
> `so_rcv' dispose and cleanup in sofree(). This sockets is almost dead
> and unlinked from everywhere include spliced peer. So concurrent
> sotask() thread will just exit. This required to solve inode and so_rcv
> locks ordering. Also this removes re-locking from sofree() for all
> sockets.

Passed regress with witness.

One question, otherwise OK bluhm@

> @@ -509,10 +513,6 @@ uipc_send(struct socket *so, struct mbuf
>  			goto out;
>  	}
>  
> -	if (so->so_snd.sb_state & SS_CANTSENDMORE) {
> -		error = EPIPE;
> -		goto dispose;
> -	}
>  	if (unp->unp_conn == NULL) {
>  		error = ENOTCONN;
>  		goto dispose;
> @@ -525,11 +525,23 @@ uipc_send(struct socket *so, struct mbuf
>  	 * send buffer counts to maintain backpressure.
>  	 * Wake up readers.
>  	 */
> +	/*
> +	 * sbappend*() should be serialized together
> +	 * with so_snd modification.
> +	 */
>  	mtx_enter(&so2->so_rcv.sb_mtx);
> +	mtx_enter(&so->so_snd.sb_mtx);
> +	if (so->so_snd.sb_state & SS_CANTSENDMORE) {
> +		mtx_leave(&so->so_snd.sb_mtx);
> +		mtx_leave(&so2->so_rcv.sb_mtx);
> +		error = EPIPE;
> +		goto dispose;
> +	}
>  	if (control) {
>  		if (sbappendcontrol(so2, &so2->so_rcv, m, control)) {
>  			control = NULL;
>  		} else {
> +			mtx_leave(&so->so_snd.sb_mtx);
>  			mtx_leave(&so2->so_rcv.sb_mtx);
>  			error = ENOBUFS;
>  			goto dispose;

You change the order of EPIPE and ENOTCONN error checks here.
The difference is relevant, EPIPE may signal a SIGPIPE.

When writing into a socket that stops working, the error should be
EPIPE.  If there are two processes, A is writing to B, and B
disconnects the socket, then A should get a EPIPE.

Do you change the behavior?

In sosend() we have a SS_CANTSENDMORE check with EPIPE error.
So maybe the uipc_send() errors are irrelevant.

bluhm