Download raw body.
Use WRITE_ONCE() to se set so_error in sosend()/soreceive()
> On 11 Jan 2025, at 16:19, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>
>> Date: Sat, 11 Jan 2025 14:15:09 +0100
>> From: Claudio Jeker <cjeker@diehard.n-r-g.com>
>>
>> On Fri, Jan 10, 2025 at 10:26:39PM +0100, Alexander Bluhm wrote:
>>> On Tue, Jan 07, 2025 at 04:00:31PM +0100, Martin Pieuchot wrote:
>>>> On 07/01/25(Tue) 13:57, Vitaliy Makkoveev wrote:
>>>>>
>>>>>
>>>>>> On 7 Jan 2025, at 13:45, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>>>>>>
>>>>>>> From: Vitaliy Makkoveev <mvs@openbsd.org>
>>>>>>> Date: Mon, 6 Jan 2025 20:07:19 +0300
>>>>>>>
>>>>>>>> On 6 Jan 2025, at 19:45, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
>>>>>>>>
>>>>>>>>> Date: Mon, 6 Jan 2025 18:58:16 +0300
>>>>>>>>> From: Vitaliy Makkoveev <mvs@openbsd.org>
>>>>>>>>>
>>>>>>>>> This is the lockless access, so do it.
>>>>>>>>
>>>>>>>> Sorry, but once again, I don't think this makes sense. Atomic access
>>>>>>>> doesn't mean blindly replacing assignments with
>>>>>>>> READ_ONCE()/WRITE_ONCE() or atomic_load_int()/atomic_store_int(). The
>>>>>>>> diff below makes me strongly suspect accessing so_error without
>>>>>>>> holding some sort of lock isn't actually safe. At the very minimum
>>>>>>>> you're missing some memory barriers here!
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> This is not the blindly replacing, we need WRITE_ONCE()/READ_ONCE()
>>>>>>> here to prevent reordering and be sure, so_error value is not cached.
>>>>>>
>>>>>> But using READ_ONCE()/WRITE_ONCE() isn't enough to guarantee ordering
>>>>>> between threads. You need proper memory barriers for that.
>>>>>>
>>>>>
>>>>> I???m not interesting in order between threads here, so_error value
>>>>> could be lost even in the good old kernel locked or uniprocessor
>>>>> times. I only want it to be loaded or writed each do {} while()
>>>>> iteration, that???s the reason of *_ONCE() to exist.
>>>>
>>>> I don't understand why do we need READ_ONCE() to ensure the value is
>>>> loaded for every iteration of the loop. Isn't already the case? If
>>>> not do the other fields `sb_state', `so_state', etc also need it?
>>>>
>>>>> The synchronisation in this place made by dedicated locks.
>>>>>
>>>>>> Now the question is what are you ordering against? That isn't
>>>>>> immediately obvious to me; marking a variable or struct member as
>>>>>> "atomic" in the header file doesn't really help here...
>>>
>>> The locking strategy for so_error is not that clear.
>>>
>>> We do unlocked access here. We want to avoid extra mutex just to
>>> transfer the integer variable so_error to userland. I personally
>>> oppose accessing variables without any protections when used in
>>> more than one thread. READ_ONCE()/WRITE_ONCE() is the bare minumum
>>> to avoid compiler misoptimizations. So the code with the diff is
>>> better than before. The question is, is it enough or does it look
>>> like false safety?
>>>
>>> Code like this looks odd:
>>> if ((error = READ_ONCE(so->so_error))) {
>>> WRITE_ONCE(so->so_error, 0);
>>> I would at least expect an atomic swap operation. Userland threads
>>> are no problem, they are locked, but the softnet thread can modify
>>> so->so_error in between. This was not the case in the old days,
>>> BSD4.4 had splnet() around this code.
>>>
>>> So userland may see only one error, when two softnet threads write
>>> two errors. But this can always happen as softnet threads just
>>> write to the variable. Userland sees at least one error, and that
>>> is important.
>>>
>>> Next question is, do we need memory barriers? The code does several
>>> checks and returns different errors to userland. So in theory the
>>> values stored in so->so_error and so->so_state & SS_ISCONNECTED
>>> could be reordered. The userland could miss one error and do the
>>> SS_ISCONNECTED handling instead. But again, does it matter when
>>> the result of a TCP FIN is processed before an error from an ICMP
>>> packet? Such events are not really synchronized by the stack.
>>>
>>> So I would say the proposed chunk is ok.
>>>
>>> Same arguments for the MSG_PEEK chunk.
>>>
>>> Next question was whether READ_ONCE() is necessary to read so->so_error
>>> in each loop iteration. In my opinion READ_ONCE() is more important
>>> than WRITE_ONCE(). An compiler can optimize away read operations
>>> from loops. In this case sbwait() acts as a compiler barrier. But
>>> I prefer explicit READ_ONCE() in this case. Better than guessing
>>> what the compiler might do.
>>>
>>> OK bluhm@ for the READ_ONCE() chunk.
>>>
>>> I am also in favor for the WRITE_ONCE() chunks, but there are other
>>> opinions. We should find a consensus how to handle such integer
>>> access.
>>
>> I don't understand this discussion. Is so_error a real atomic value or
>> does it depend on other state changes as well?
>>
>> When I look at the code e.g. tcp_drop() sets so_errno but also closes the
>> connection. All of this should happen in an atomic fashion not just
>> assigning errno to so->so_error.
>>
>> Also doing a READ_ONCE() followed by a WRITE_ONCE() is not an atomic
>> operation. So I really think this may lure people into thinking that
>> this is safe while in reality it is not.
>>
>> Skipping proper locking because it may be expensive is a very bad choice.
>> We really should make sure the code is correct even if that requires
>> additional locks (or proper understanding of which locks protect what).
>
> Thanks Claudio. That is exactly the point I'm trying to make.
What about to understand how locking works in this area before
make any conclusions?
Don’t worry, I don't want to pushing this diff anymore.
Use WRITE_ONCE() to se set so_error in sosend()/soreceive()