Index | Thread | Search

From:
Vitaliy Makkoveev <mvs@openbsd.org>
Subject:
Re: Use WRITE_ONCE() to se set so_error in sosend()/soreceive()
To:
Claudio Jeker <cjeker@diehard.n-r-g.com>
Cc:
Alexander Bluhm <bluhm@openbsd.org>, Mark Kettenis <mark.kettenis@xs4all.nl>, OpenBSD tech <tech@openbsd.org>
Date:
Sat, 11 Jan 2025 18:00:56 +0300

Download raw body.

Thread
> On 11 Jan 2025, at 16:15, Claudio Jeker <cjeker@diehard.n-r-g.com> wrote:
> 
> On Fri, Jan 10, 2025 at 10:26:39PM +0100, Alexander Bluhm wrote:
>> On Tue, Jan 07, 2025 at 04:00:31PM +0100, Martin Pieuchot wrote:
>>> On 07/01/25(Tue) 13:57, Vitaliy Makkoveev wrote:
>>>> 
>>>> 
>>>>> On 7 Jan 2025, at 13:45, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>>>>> 
>>>>>> From: Vitaliy Makkoveev <mvs@openbsd.org>
>>>>>> Date: Mon, 6 Jan 2025 20:07:19 +0300
>>>>>> 
>>>>>>> On 6 Jan 2025, at 19:45, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>>>>>>> 
>>>>>>>> Date: Mon, 6 Jan 2025 18:58:16 +0300
>>>>>>>> From: Vitaliy Makkoveev <mvs@openbsd.org>
>>>>>>>> 
>>>>>>>> This is the lockless access, so do it.
>>>>>>> 
>>>>>>> Sorry, but once again, I don't think this makes sense.  Atomic access
>>>>>>> doesn't mean blindly replacing assignments with
>>>>>>> READ_ONCE()/WRITE_ONCE() or atomic_load_int()/atomic_store_int().  The
>>>>>>> diff below makes me strongly suspect accessing so_error without
>>>>>>> holding some sort of lock isn't actually safe.  At the very minimum
>>>>>>> you're missing some memory barriers here!
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> This is not the blindly replacing, we need WRITE_ONCE()/READ_ONCE()
>>>>>> here to prevent reordering and be sure, so_error value is not cached.
>>>>> 
>>>>> But using READ_ONCE()/WRITE_ONCE() isn't enough to guarantee ordering
>>>>> between threads.  You need proper memory barriers for that.
>>>>> 
>>>> 
>>>> I???m not interesting in order between threads here, so_error value
>>>> could be lost even in the good old kernel locked or uniprocessor
>>>> times. I only want it to be loaded or writed each do {} while()
>>>> iteration, that???s the reason of *_ONCE() to exist.
>>> 
>>> I don't understand why do we need READ_ONCE() to ensure the value is
>>> loaded for every iteration of the loop.  Isn't already the case?  If
>>> not do the other fields `sb_state', `so_state', etc also need it?
>>> 
>>>> The synchronisation in this place made by dedicated locks.
>>>> 
>>>>> Now the question is what are you ordering against?  That isn't
>>>>> immediately obvious to me; marking a variable or struct member as
>>>>> "atomic" in the header file doesn't really help here...
>> 
>> The locking strategy for so_error is not that clear.
>> 
>> We do unlocked access here.  We want to avoid extra mutex just to
>> transfer the integer variable so_error to userland.  I personally
>> oppose accessing variables without any protections when used in
>> more than one thread.  READ_ONCE()/WRITE_ONCE() is the bare minumum
>> to avoid compiler misoptimizations.  So the code with the diff is
>> better than before.  The question is, is it enough or does it look
>> like false safety?
>> 
>> Code like this looks odd:
>> 		if ((error = READ_ONCE(so->so_error))) {
>> 			WRITE_ONCE(so->so_error, 0);
>> I would at least expect an atomic swap operation.  Userland threads
>> are no problem, they are locked, but the softnet thread can modify
>> so->so_error in between.  This was not the case in the old days,
>> BSD4.4 had splnet() around this code.
>> 
>> So userland may see only one error, when two softnet threads write
>> two errors.  But this can always happen as softnet threads just
>> write to the variable.  Userland sees at least one error, and that
>> is important.
>> 
>> Next question is, do we need memory barriers?  The code does several
>> checks and returns different errors to userland.  So in theory the
>> values stored in so->so_error and so->so_state & SS_ISCONNECTED
>> could be reordered.  The userland could miss one error and do the
>> SS_ISCONNECTED handling instead.  But again, does it matter when
>> the result of a TCP FIN is processed before an error from an ICMP
>> packet?  Such events are not really synchronized by the stack.
>> 
>> So I would say the proposed chunk is ok.
>> 
>> Same arguments for the MSG_PEEK chunk.
>> 
>> Next question was whether READ_ONCE() is necessary to read so->so_error
>> in each loop iteration.  In my opinion READ_ONCE() is more important
>> than WRITE_ONCE().  An compiler can optimize away read operations
>> from loops.  In this case sbwait() acts as a compiler barrier.  But
>> I prefer explicit READ_ONCE() in this case.  Better than guessing
>> what the compiler might do.
>> 
>> OK bluhm@ for the READ_ONCE() chunk.
>> 
>> I am also in favor for the WRITE_ONCE() chunks, but there are other
>> opinions.  We should find a consensus how to handle such integer
>> access.
> 
> I don't understand this discussion. Is so_error a real atomic value or
> does it depend on other state changes as well?
> 
> When I look at the code e.g. tcp_drop() sets so_errno but also closes the
> connection. All of this should happen in an atomic fashion not just
> assigning errno to so->so_error.

This atomicity depends on the the path, because now send, receive
and state path each relies on it’s own lock. The transmission paths
rely on socket buffer locks and the socket buffer states.

Note, even if the only lock protects the whole layers, only one path
reads and resets so_error. You can’t predict which one will be the
winner.

> 
> Also doing a READ_ONCE() followed by a WRITE_ONCE() is not an atomic
> operation. So I really think this may lure people into thinking that
> this is safe while in reality it is not.
> 

I do NOT want atomicity in this path. This path relies on sb_state,
not so_error state. Note, this patch is sb_mtx locked, and the
soisdisconnected() needs to take sb_mtx before modify sb_state
or so_state.

> Skipping proper locking because it may be expensive is a very bad choice.
> We really should make sure the code is correct even if that requires
> additional locks (or proper understanding of which locks protect what).

This path was unlocked many times ago. Do you have any problems?