From: Claudio Jeker Subject: Re: [patch 2 of 3] iscsid(8): avoid double free & race during cleanup To: Jack Burton Cc: tech@openbsd.org Date: Wed, 15 Jan 2025 14:46:02 +0100 On Tue, Dec 31, 2024 at 06:33:08PM +1030, Jack Burton wrote: > Avoid double free in two places. Both occur only during cleanup after > receiving SIGTERM (which explains why they've gone unnoticed for so > long: I only noticed them while testing my next patch, but I've > confirmed that they do occur even with a pristine current tree). To > trigger the first bug, either: (a) at least one connection needs to > have failed before receiving SIGTERM; or (b) at least one connection in > full-feature phase needs to be to a target *other than* the netbsd > iscsi target from ports (no idea why that one seems to be immune). > > pdu_readbuf_free() had a fairly obvious double free. It's called > (at least) twice during session shutdown. The fix was also obvious. This one is strange. pdu_readbuf_free() is only called by conn_free() which removes and frees the connection. So a second call to pdu_readbuf_free with the smae prbuf is only possible as a use after free. I wonder if the 2nd bug is actually the cause for this one and fixing that one fixes this as well. > Fixing that one unmasked a latent race condition during shutdown. > Depending on whether session reestablishment attempts result in any > new connections entering the login phase before ISCSI_EXIT_WAIT > expires, EITHER conn_free() would get called multiple times on the > same connection OR there'd be an outright access after free. > > To avoid adding extra states to the fsm, simply check exit_rounds > (which gets set in shutdown_cb) before freeing connections in > sess_do_free(): if it's set then we're about to exit anyway, so it > doesn't matter that some connections may get reallocated without > freeing the old ones. Uhm. I don't understand how conn_free() can be called multiple times. It means that the s->connections TAILQ is corrupted in some ways. One issue is that sess_do_reinstatement() does a double TAILQ_REMOVE(). So maybe that is the root cause. Could you give the below diff a try? I'm not sure if the code should retrigger shutdowns or not in the shutdown_cb. Also something that is strange is that sess_do_free is actually not freeing the session. Another thing to look into (its been a long time I looked at the iscsi RFC). > With this patch applied, iscsid(8) no longer crashes during cleanup > after receiving SIGTERM, regardless what targets it's connected to or > whether those connections are running or failed at the time. > > This patch is independent of the other two in the set. -- :wq Claudio Index: session.c =================================================================== RCS file: /cvs/src/usr.sbin/iscsid/session.c,v diff -u -p -r1.9 session.c --- session.c 8 Mar 2023 04:43:13 -0000 1.9 +++ session.c 15 Jan 2025 13:40:50 -0000 @@ -430,7 +430,6 @@ sess_do_reinstatement(struct session *s, if (c->state & CONN_FAILED) { conn_fsm(c, CONN_EV_FREE); - TAILQ_REMOVE(&s->connections, c, entry); conn_free(c); } }