Index | Thread | Search

From:
Claudio Jeker <cjeker@diehard.n-r-g.com>
Subject:
Re: [patch 2 of 3] iscsid(8): avoid double free & race during cleanup
To:
Jack Burton <jack@saosce.com.au>
Cc:
tech@openbsd.org
Date:
Wed, 15 Jan 2025 14:46:02 +0100

Download raw body.

Thread
On Tue, Dec 31, 2024 at 06:33:08PM +1030, Jack Burton wrote:
> Avoid double free in two places.  Both occur only during cleanup after
> receiving SIGTERM (which explains why they've gone unnoticed for so
> long: I only noticed them while testing my next patch, but I've
> confirmed that they do occur even with a pristine current tree).  To
> trigger the first bug, either: (a) at least one connection needs to
> have failed before receiving SIGTERM; or (b) at least one connection in
> full-feature phase needs to be to a target *other than* the netbsd
> iscsi target from ports (no idea why that one seems to be immune).
> 
> pdu_readbuf_free() had a fairly obvious double free.  It's called
> (at least) twice during session shutdown.  The fix was also obvious.

This one is strange. pdu_readbuf_free() is only called by conn_free()
which removes and frees the connection. So a second call to
pdu_readbuf_free with the smae prbuf is only possible as a use after free.

I wonder if the 2nd bug is actually the cause for this one and fixing that
one fixes this as well.
 
> Fixing that one unmasked a latent race condition during shutdown.
> Depending on whether session reestablishment attempts result in any
> new connections entering the login phase before ISCSI_EXIT_WAIT
> expires, EITHER conn_free() would get called multiple times on the
> same connection OR there'd be an outright access after free.
> 
> To avoid adding extra states to the fsm, simply check exit_rounds
> (which gets set in shutdown_cb) before freeing connections in
> sess_do_free(): if it's set then we're about to exit anyway, so it
> doesn't matter that some connections may get reallocated without
> freeing the old ones.

Uhm. I don't understand how conn_free() can be called multiple times.
It means that the s->connections TAILQ is corrupted in some ways.

One issue is that sess_do_reinstatement() does a double TAILQ_REMOVE().
So maybe that is the root cause. Could you give the below diff a try?

I'm not sure if the code should retrigger shutdowns or not in the
shutdown_cb.

Also something that is strange is that sess_do_free is actually not
freeing the session. Another thing to look into (its been a long time I
looked at the iscsi RFC).

> With this patch applied, iscsid(8) no longer crashes during cleanup
> after receiving SIGTERM, regardless what targets it's connected to or
> whether those connections are running or failed at the time.
> 
> This patch is independent of the other two in the set.

-- 
:wq Claudio

Index: session.c
===================================================================
RCS file: /cvs/src/usr.sbin/iscsid/session.c,v
diff -u -p -r1.9 session.c
--- session.c	8 Mar 2023 04:43:13 -0000	1.9
+++ session.c	15 Jan 2025 13:40:50 -0000
@@ -430,7 +430,6 @@ sess_do_reinstatement(struct session *s,
 
 		if (c->state & CONN_FAILED) {
 			conn_fsm(c, CONN_EV_FREE);
-			TAILQ_REMOVE(&s->connections, c, entry);
 			conn_free(c);
 		}
 	}