Mailing List Archive

From:: Alexander Bluhm <alexander.bluhm@gmx.net>
Subject:: Re: pf inpcb link mutex
To:: Alexandr Nedvedicky <sashan@fastmail.net>
Cc:: tech@openbsd.org
Date:: Mon, 1 Jan 2024 22:22:37 +0100

Download raw body.

Thread

2024-01-01 19:48 Alexandr Nedvedicky:
pf inpcb link mutex
- 2024-01-01 20:08 Vitaliy Makkoveev:
  pf inpcb link mutex
- 2024-01-01 21:22 Alexander Bluhm:
  pf inpcb link mutex

On Mon, Jan 01, 2024 at 08:48:15PM +0100, Alexandr Nedvedicky wrote:
> Hello,
> 
> On Mon, Jan 01, 2024 at 08:26:00PM +0100, Alexander Bluhm wrote:
> > On Mon, Jan 01, 2024 at 07:10:37PM +0100, Alexandr Nedvedicky wrote:
> > >     in other words: can we use inpcb::inp_mtx in your diff? this way we
> > >     can avoid introduction of global mutex.
> > 
> > This was also my first attempt.  Below is an earlier version of the
> > diff.  It has races that I could not fix.
> > 
> > pf_state_key_unlink_inpcb() dereferences inp = sk->sk_inp.  There
> > is no guarantee that another CPU does not change sk_inp.  This code
> > is not MP safe as the mutex protects the life time of inp.
> > +       if (inp == NULL)
> > +               return;
> > +       mtx_enter(&inp->inp_mtx);
> > 
> > Another problem is in pf_inp_lookup().  The inp can be freed between
> > READ_ONCE() and in_pcbref().
> >         if (!pf_state_key_isvalid(sk))
> >                 pf_mbuf_unlink_state_key(m);
> >         else
> > +               inp = READ_ONCE(sk->sk_inp);
> > 
> >         in_pcbref(inp);
> > 
> > A global mutex protects inp = sk->sk_inp.  So I changed it.  Note
> > that pf_inp_mtx covers only very short sections of code.  I do not
> > expect much lock contention.
> > 
> 
>     thanks for clarification. I understand now. I'm fine with using a
>     a global mutex in pf(4). it will allow you to get progress.
> 
>     I think for a long run we will need to introduce reference counting
>     for inp instances. so every packet/mbuf and state key will
>     get a reference. I think we will need similar mechanism to
>     what we have for state keys in pf(4) now.

We refcount inp in pf state key, but we don't do it in mbuf.  Idea
is that protocol layer is holding inp while mbuf is passed down to
pf.  So m->m_pkthdr.pf.inp is accessed without lock or refcount.

>     feel free to go ahead with pf_inp_mtx diff. this is OK by sashan

Thanks

bluhm

2024-01-01 19:48 Alexandr Nedvedicky:
pf inpcb link mutex
- 2024-01-01 20:08 Vitaliy Makkoveev:
  pf inpcb link mutex
- 2024-01-01 21:22 Alexander Bluhm:
  pf inpcb link mutex