Index | Thread | Search

From:
Vitaliy Makkoveev <mvs@openbsd.org>
Subject:
Re: Kill `inp_notify' list remains
To:
Alexander Bluhm <bluhm@openbsd.org>
Cc:
OpenBSD Tech <tech@openbsd.org>
Date:
Sat, 21 Dec 2024 00:32:06 +0300

Download raw body.

Thread

> On 20 Dec 2024, at 23:16, Alexander Bluhm <bluhm@openbsd.org> wrote:
> 
> On Fri, Dec 20, 2024 at 04:08:14AM +0300, Vitaliy Makkoveev wrote:
>> This was the list, where PCBs were temporary linked to avoid sleep with 
>> `inpt_mtx' mutex(9) held. in_pcbnotifyall() and in6_pcbnotify are the
>> last list users, so I propose to switch them to in_pcb_iterator() too,
>> moreover they already do in_pcb_is_iterator() check.
>> 
>> Note, in_pcb_iterator() does in_pcbref() and in_pcbunref() for
>> corresponding `inp', unlocked dereference is safe and no extra
>> reference release required.
>> 
>> Outside divert(4) sockets search loop, sysctl_file() is the only
>> `inp_queue' foreach loops runner. It takes exclusive netlock and holds
>> it while processing all inet PCB tables. I want to switch these loops to
>> the iterators and push shared solock into `inp_queue' loops.
>> 
>>> From this:
>> 
>> 	NET_LOCK();
>> 	mtx_enter(&tcbtable.inpt_mtx);
>> 	TAILQ_FOREACH(inp, &tcbtable.inpt_queue, inp_queue)
>> 	FILLSO(inp->inp_socket);
>> 	mtx_leave(&tcbtable.inpt_mtx);
>> 
>> to this:
>> 
>> 	mtx_enter(&tcbtable.inpt_mtx);
>> 	while ((inp = in_pcb_iterator(&tcbtable, inp, &iter)) != NULL) {
>> 		mtx_leave(&tcbtable.inpt_mtx);
>> 		NET_LOCK_SHARED();
>> 		if (inp->inp_socket)
>> 			FILLSO(inp->inp_socket);
>> 		NET_UNLOCK_SHARED();
>> 		mtx_enter(&tcbtable.inpt_mtx);
>> 	}
>> 	mtx_leave(&tcbtable.inpt_mtx);
>> 
>> In the future, net lock could be replaced with solock() and pushed down
>> to FILLSO(), so copyout() could be moved outside net lock too.
> 
> I would expect that we need per socket lock for the code above.
> Something has to gaurantee that socket does not change while sysctl
> is reading its values.  Shared net lock is not enough.

This is the hypothetic code. It is not yet clean to me, how to do
inp_socket dereference safe, at least for tcp sockets. This time
direct NET_LOCK_SHARED() is enough for that, so this example is
correct. Anyway, FILLSO() and fill_file() need more refactoring to
unlock them.