From: David Gwynne
Subject: Re: vio: Enable multiqueue
To: Mark Kettenis, Stefan Fritsch
Cc: hrvoje@srce.hr, tech@openbsd.org
Date: Mon, 27 Jan 2025 14:41:06 +1000

On Sun, Jan 26, 2025 at 08:50:15PM +0100, Mark Kettenis wrote:
> > Date: Sun, 26 Jan 2025 19:47:04 +0100 (CET)
> > From: Stefan Fritsch
> >
> > On Sun, 26 Jan 2025, Stefan Fritsch wrote:
> >
> > > On 21.01.25 at 20:03, Hrvoje Popovski wrote:
> > > > On 21.1.2025. 19:26, Stefan Fritsch wrote:
> > > > > Hi,
> > > > >
> > > > > On 16.01.25 at 19:19, Hrvoje Popovski wrote:
> > > > > > > > > this diff finally enables multiqueue for vio(4). It goes on
> > > > > > > > > top of the "virtio: Support unused virtqueues" diff from my
> > > > > > > > > previous mail.
> > > > > > > > >
> > > > > > > > > The distribution of packets to the enabled queues is not
> > > > > > > > > optimal. To improve this, one would need the optional RSS
> > > > > > > > > (receive-side scaling) feature which is difficult to
> > > > > > > > > configure with libvirt/qemu and therefore usually not
> > > > > > > > > available on hypervisors. Things may improve with future
> > > > > > > > > libvirt versions. RSS support is not included in this diff.
> > > > > > > > > But even without RSS, we have seen some nice performance
> > > > > > > > > gains.
> > > > > > > >
> > > > > > > > I'm hitting this diff with forwarding setup over ipsec for two
> > > > > > > > days and doing ifconfig up/down and hosts seem stable.
> > > > > > > > Forwarding performance is the same as without this diff.
> > > > > > > >
> > > > > > > > I'm sending traffic from host connected to obsd1 vio2 then
> > > > > > > > that traffic goes over ipsec link between obsd1 vio1 - obsd2
> > > > > > > > vio1 and traffic exits from obsd2 vio3 to other host
> > > > > > >
> > > > > > > Thanks for testing. Since the traffic distribution is done
> > > > > > > heuristically by the hypervisor, it is often not optimal. I
> > > > > > > think it is particularly bad for your case because the
> > > > > > > hypervisor will think that all ipsec traffic belongs to one
> > > > > > > flow and put it into the same queue.
> > > > > > >
> > > > > > > I will try to improve it a bit, but in general things get
> > > > > > > better if you communicate with many peers.
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > it seems that even with plain forwarding all traffic on egress
> > > > > > interfaces is going to one queue. On ingress interface interrupts
> > > > > > are spread nicely. Maybe because of that forwarding performance
> > > > > > is the same as without multiqueue vio.
> > > > >
> > > > > Thanks again for the testing.
> > > > >
> > > > > I don't see this on my test setup. Could you please check the packet
> > > > > stats in
> > >
> > > One thing that is different in my setup is that I have pf enabled. If I
> > > disable pf, all packets go out on queue 0, too. This is due to the fact
> > > that we don't get a hash from the NIC that we can put into
> > > m_pkthdr.ph_flowid. pf will fill that in. If I enable pf on your setup,
> > > all outgoing queues are used. However the pktgen script with 254 source
> > > and 254 dst addresses requires around 129000 states. Increasing the
> > > limit in pf.conf leads to forwarding getting a bit faster (20%?) than
> > > with a single queue, though the rate varies quite a bit.
> > >
> > > I need to think a bit more about this and how we could improve the
> > > situation.
> >
> > With this diff and pf off, I get about twice the forwarding speed on your
> > test setup (4 vcpus/4 queues).
> >
> > To everyone: Is this something that could possibly be committed? In cases
> > where the NIC gives us RSS hashes it should not change anything.
>
> Doesn't this gratuitously reorder forwarded packets?

it does allow packets in "conversations" to be reordered, yes.

there's no guarantee that packets within a conversation are always
processed on a specific cpu, which is what allows packets to be
reordered. this is true for both softnet threads when forwarding, and
userland programs producing packets.

fwiw, flowids are not defined to be a toeplitz hash, they're best effort
set to something that hopes to spread packets out over things like
softnet threads, ports in lacp aggregations, and tx queues on network
cards. we try and make things that can do toeplitz use toeplitz, but if
that's not available then whatever is handy is still better than
nothing.

in this situation id have vio set the flowid to the rx queue id.

> > diff --git a/sys/net/ifq.c b/sys/net/ifq.c
> > index 7368aa50a57..547cfb26d84 100644
> > --- a/sys/net/ifq.c
> > +++ b/sys/net/ifq.c
> > @@ -903,6 +903,8 @@ priq_idx(unsigned int nqueues, const struct mbuf *m)
> >  
> >  	if (ISSET(m->m_pkthdr.csum_flags, M_FLOWID))
> >  		flow = m->m_pkthdr.ph_flowid;
> > +	else
> > +		flow = cpu_number();
> >  
> >  	return (flow % nqueues);
> > }
> > 
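To make the suggestion above concrete, here is a minimal sketch of the idea:
have the driver's rx path stamp each mbuf with the index of the rx ring it
arrived on. The helper name and its queue-index argument are hypothetical and
not vio(4)'s actual code; only the ph_flowid field and the M_FLOWID flag are
taken from the diff above.

/*
 * Illustrative sketch only: give each received packet a flowid equal
 * to the rx queue it arrived on.  priq_idx() above then reduces
 * ph_flowid modulo the number of tx queues, so packets that share an
 * rx ring keep hitting the same tx ring and stay ordered as long as
 * the hypervisor keeps a flow on one rx queue.
 */
#include <sys/param.h>
#include <sys/mbuf.h>

static inline void
rxq_set_flowid(struct mbuf *m, unsigned int qidx)
{
	m->m_pkthdr.ph_flowid = qidx;			/* rx ring index */
	SET(m->m_pkthdr.csum_flags, M_FLOWID);		/* mark it valid */
}

With something like that in the driver, forwarded packets would already carry
a flowid, and the cpu_number() fallback in the quoted priq_idx() diff would
mostly be left covering locally generated traffic that never picked one up.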