From: Vitaliy Makkoveev Subject: Re: Use WRITE_ONCE() to se set so_error in sosend()/soreceive() To: Mark Kettenis Cc: Claudio Jeker , bluhm@openbsd.org, tech@openbsd.org Date: Sat, 11 Jan 2025 18:25:11 +0300 > On 11 Jan 2025, at 16:19, Mark Kettenis wrote: > >> Date: Sat, 11 Jan 2025 14:15:09 +0100 >> From: Claudio Jeker >> >> On Fri, Jan 10, 2025 at 10:26:39PM +0100, Alexander Bluhm wrote: >>> On Tue, Jan 07, 2025 at 04:00:31PM +0100, Martin Pieuchot wrote: >>>> On 07/01/25(Tue) 13:57, Vitaliy Makkoveev wrote: >>>>> >>>>> >>>>>> On 7 Jan 2025, at 13:45, Mark Kettenis wrote: >>>>>> >>>>>>> From: Vitaliy Makkoveev >>>>>>> Date: Mon, 6 Jan 2025 20:07:19 +0300 >>>>>>> >>>>>>>> On 6 Jan 2025, at 19:45, Mark Kettenis wrote: >>>>>>>> >>>>>>>>> Date: Mon, 6 Jan 2025 18:58:16 +0300 >>>>>>>>> From: Vitaliy Makkoveev >>>>>>>>> >>>>>>>>> This is the lockless access, so do it. >>>>>>>> >>>>>>>> Sorry, but once again, I don't think this makes sense. Atomic access >>>>>>>> doesn't mean blindly replacing assignments with >>>>>>>> READ_ONCE()/WRITE_ONCE() or atomic_load_int()/atomic_store_int(). The >>>>>>>> diff below makes me strongly suspect accessing so_error without >>>>>>>> holding some sort of lock isn't actually safe. At the very minimum >>>>>>>> you're missing some memory barriers here! >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> This is not the blindly replacing, we need WRITE_ONCE()/READ_ONCE() >>>>>>> here to prevent reordering and be sure, so_error value is not cached. >>>>>> >>>>>> But using READ_ONCE()/WRITE_ONCE() isn't enough to guarantee ordering >>>>>> between threads. You need proper memory barriers for that. >>>>>> >>>>> >>>>> I???m not interesting in order between threads here, so_error value >>>>> could be lost even in the good old kernel locked or uniprocessor >>>>> times. I only want it to be loaded or writed each do {} while() >>>>> iteration, that???s the reason of *_ONCE() to exist. >>>> >>>> I don't understand why do we need READ_ONCE() to ensure the value is >>>> loaded for every iteration of the loop. Isn't already the case? If >>>> not do the other fields `sb_state', `so_state', etc also need it? >>>> >>>>> The synchronisation in this place made by dedicated locks. >>>>> >>>>>> Now the question is what are you ordering against? That isn't >>>>>> immediately obvious to me; marking a variable or struct member as >>>>>> "atomic" in the header file doesn't really help here... >>> >>> The locking strategy for so_error is not that clear. >>> >>> We do unlocked access here. We want to avoid extra mutex just to >>> transfer the integer variable so_error to userland. I personally >>> oppose accessing variables without any protections when used in >>> more than one thread. READ_ONCE()/WRITE_ONCE() is the bare minumum >>> to avoid compiler misoptimizations. So the code with the diff is >>> better than before. The question is, is it enough or does it look >>> like false safety? >>> >>> Code like this looks odd: >>> if ((error = READ_ONCE(so->so_error))) { >>> WRITE_ONCE(so->so_error, 0); >>> I would at least expect an atomic swap operation. Userland threads >>> are no problem, they are locked, but the softnet thread can modify >>> so->so_error in between. This was not the case in the old days, >>> BSD4.4 had splnet() around this code. >>> >>> So userland may see only one error, when two softnet threads write >>> two errors. But this can always happen as softnet threads just >>> write to the variable. Userland sees at least one error, and that >>> is important. >>> >>> Next question is, do we need memory barriers? The code does several >>> checks and returns different errors to userland. So in theory the >>> values stored in so->so_error and so->so_state & SS_ISCONNECTED >>> could be reordered. The userland could miss one error and do the >>> SS_ISCONNECTED handling instead. But again, does it matter when >>> the result of a TCP FIN is processed before an error from an ICMP >>> packet? Such events are not really synchronized by the stack. >>> >>> So I would say the proposed chunk is ok. >>> >>> Same arguments for the MSG_PEEK chunk. >>> >>> Next question was whether READ_ONCE() is necessary to read so->so_error >>> in each loop iteration. In my opinion READ_ONCE() is more important >>> than WRITE_ONCE(). An compiler can optimize away read operations >>> from loops. In this case sbwait() acts as a compiler barrier. But >>> I prefer explicit READ_ONCE() in this case. Better than guessing >>> what the compiler might do. >>> >>> OK bluhm@ for the READ_ONCE() chunk. >>> >>> I am also in favor for the WRITE_ONCE() chunks, but there are other >>> opinions. We should find a consensus how to handle such integer >>> access. >> >> I don't understand this discussion. Is so_error a real atomic value or >> does it depend on other state changes as well? >> >> When I look at the code e.g. tcp_drop() sets so_errno but also closes the >> connection. All of this should happen in an atomic fashion not just >> assigning errno to so->so_error. >> >> Also doing a READ_ONCE() followed by a WRITE_ONCE() is not an atomic >> operation. So I really think this may lure people into thinking that >> this is safe while in reality it is not. >> >> Skipping proper locking because it may be expensive is a very bad choice. >> We really should make sure the code is correct even if that requires >> additional locks (or proper understanding of which locks protect what). > > Thanks Claudio. That is exactly the point I'm trying to make. What about to understand how locking works in this area before make any conclusions? Don’t worry, I don't want to pushing this diff anymore.