Mailing List Archive

On Wed, May 01, 2024 at 01:16:37PM +0200, Mark Kettenis wrote: > Well, you did exactly what I tried to discourage you from doing. I > don't think this is "mpsafe" at all. Let me explain... I think we have a different understanding how lockless access should work. > > + /* copyin() may sleep, call it first */ > > This comment will become rather misleading once we drop the vslock() > for certain sysctls like mvs@ is suggesting; copyout(9) may sleep as > well at that point. I doubt that sleeping copyout(9) in sysctl(2) is a step in the right direction. Getting consistent information will be harder, you will fall into various NFS traps, and you cannot hold mutex when copying kernel data to userland. > What does "mpsafe" mean here? The READ_ONCE() and WRITE_ONCE() here > don't really achieve anything and from a user standpoint this could > still yield inconsistent values, where two threads changing the sysctl > value concurrently could return any "old" value. And by that I mean > that if the value of the sysctl is initailly A, thread 1 changes it to > B and thread 2 changes it to C, the final value may end up as C, but > thread 2 might return either A or B as the old value. A truly > "mpsafe" version should probably use atomic_swap_uint(9). Otherwise > you might just as wel just remove the locking around the existing > sysctl_bounded_arr() call. "mpsafe" with READ_ONCE() and WRITE_ONCE() means the compiler does exactly one memory access. When accessing one integer this guarantee may be enough. If there are more than one variable or multiple memory locations involved, the surrounding code needs barriers or locks. Basically the diff checks that the IP code has no such variables depending on each other. Otherwise it reads them once and passes the value along in register or stack. Multiple sysctl reads in thead A B C are serialized by the rwlock sysctl_lock. In my previous diff a added a comment there. Actually I was concerned about it, tried fix to it with a mutex, until I discovered it cannon happen due to sysctl_lock. Just removing the net lock around sysctl_bounded_arr() call feels wrong. The bare minimum for integer access without lock is the READ_ONCE() macro. I use it to mark lockless reads where another thread may modify the value. Of course when using it, you have to be careful that the surrounding code actually uses it only once and not in correlation with some other value. Currently we have the big hammer net lock. A mutex for changing a single integer is overkill. Atomic read and store could be implemented, but is not necessary. Reading values that another CPU may modify without volatile is wrong. Otherwise the compile may read multiple times or cache the value. So READ_ONCE() and WRITE_ONCE() seems the way to go. How would you implement this? bluhm

2024-04-30 18:46 Alexander Bluhm:
ip sysctl atomic
- 2024-05-01 11:16 Mark Kettenis:
  ip sysctl atomic
- - 2024-05-02 11:21 Alexander Bluhm:
    ip sysctl atomic
  - - 2024-05-04 20:22 Mark Kettenis:
      ip sysctl atomic
    - - 2024-05-04 20:34 Theo de Raadt:
        ip sysctl atomic