From: Stefan Fritsch
Subject: Re: vio: Enable multiqueue
To: David Gwynne
Cc: Mark Kettenis, hrvoje@srce.hr, tech@openbsd.org
Date: Mon, 27 Jan 2025 19:24:24 +0100

On Mon, 27 Jan 2025, David Gwynne wrote:

> On Sun, Jan 26, 2025 at 08:50:15PM +0100, Mark Kettenis wrote:
> > > Date: Sun, 26 Jan 2025 19:47:04 +0100 (CET)
> > > From: Stefan Fritsch
> > >
> > > On Sun, 26 Jan 2025, Stefan Fritsch wrote:
> > >
> > > > One thing that is different in my setup is that I have pf
> > > > enabled. If I disable pf, all packets go out on queue 0, too.
> > > > This is due to the fact that we don't get a hash from the NIC
> > > > that we can put into m_pkthdr.ph_flowid; pf will fill that in.
> > > > If I enable pf on your setup, all outgoing queues are used.
> > > > However, the pktgen script with 254 source and 254 dst
> > > > addresses requires around 129000 states. Increasing the limit
> > > > in pf.conf leads to forwarding getting a bit faster (20%?)
> > > > than with a single queue, though the rate varies quite a bit.
> > > >
> > > > I need to think a bit more about this and how we could improve
> > > > the situation.
> > >
> > > With this diff and pf off, I get about twice the forwarding
> > > speed on your test setup (4 vcpus/4 queues).
> > >
> > > To everyone: Is this something that could possibly be committed?
> > > In cases where the NIC gives us RSS hashes it should not change
> > > anything.
> >
> > Doesn't this gratuitously reorder forwarded packets?
>
> it does allow packets in "conversations" to be reordered, yes.
>
> there's no guarantee that packets within a conversation are always
> processed on a specific cpu, which is what allows packets to be
> reordered. this is true for both softnet threads when forwarding,
> and userland programs producing packets.
>
> fwiw, flowids are not defined to be a toeplitz hash, they're best
> effort set to something that hopes to spread packets out over things
> like softnet threads, ports in lacp aggregations, and tx queues on
> network cards. we try and make things that can do toeplitz use
> toeplitz, but if that's not available then whatever is handy is
> still better than nothing.
>
> in this situation id have vio set the flowid to the rx queue id.

The diff below does that. Do we expect any disadvantage if we set this
simplistic flowid and pf then does not set the "properly" calculated
one?

Each of the diff below and Chris Cappuccio's ifq_pkt_hash() diff speeds
up udp forwarding without pf to 2-3 times single-queue performance. But
the variation is rather high, so I cannot determine whether one diff is
better than the other. For udp with pf and for tcp forwarding, I cannot
see a difference either.

Is the diff below preferable over Chris's approach?

diff --git a/sys/dev/pv/if_vio.c b/sys/dev/pv/if_vio.c
index 1adf34d1cb6..01d42a38ce3 100644
--- a/sys/dev/pv/if_vio.c
+++ b/sys/dev/pv/if_vio.c
@@ -1514,6 +1514,10 @@ vio_rxeof(struct vio_queue *vioq)
 		hdr = mtod(m, struct virtio_net_hdr *);
 		m_adj(m, sc->sc_hdr_size);
 		m0 = mlast = m;
+		if (virtio_has_feature(vsc, VIRTIO_NET_F_MQ)) {
+			m->m_pkthdr.ph_flowid = vioq->viq_ifiq->ifiq_idx;
+			SET(m->m_pkthdr.csum_flags, M_FLOWID);
+		}
 		if (VIO_HAVE_MRG_RXBUF(sc))
 			bufs_left = hdr->num_buffers - 1;
 		else
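
For readers following along, a minimal sketch of the idea behind the
diff: the stack can map a packet's ph_flowid onto one of the
interface's tx queues, so packets of one flow stay on one queue while
distinct flows spread out. pick_txq() and the modulo mapping below are
assumptions for illustration only, not the actual OpenBSD ifq
selection code:

#include <sys/param.h>
#include <sys/mbuf.h>

/*
 * Illustrative sketch only: pick_txq() is a hypothetical helper and
 * the modulo mapping is an assumption; the in-tree ifq code may
 * select queues differently.  With the diff above, a packet received
 * on vio rx queue N carries ph_flowid == N, so a mapping like this
 * keeps any one flow on a single tx queue while spreading flows
 * across the available queues.
 */
static unsigned int
pick_txq(const struct mbuf *m, unsigned int nqueues)
{
	/* only trust ph_flowid when M_FLOWID says it was set */
	if (ISSET(m->m_pkthdr.csum_flags, M_FLOWID))
		return (m->m_pkthdr.ph_flowid % nqueues);
	return (0);	/* no flowid: fall back to queue 0 */
}

This also shows why the "simplistic" flowid still avoids reordering
within a flow: all packets of a conversation arrive on the same rx
queue, so they get the same flowid and hence the same tx queue.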