vmd: add checksum offload for guests
On Wed, Jan 28, 2026 at 11:00:30AM -0500, Dave Voutila wrote:
> David Gwynne <david@gwynne.id.au> writes:
>
> > On Sat, Jan 17, 2026 at 10:56:36PM +0100, Jan Klemkow wrote:
> >> On Sat, Jan 17, 2026 at 11:38:50AM -0500, Dave Voutila wrote:
> >> > Mike Larkin <mlarkin@nested.page> writes:
> >> > > On Fri, Jan 16, 2026 at 07:38:16PM +0100, Jan Klemkow wrote:
> >> > >> On Thu, Jan 15, 2026 at 02:08:43PM -0800, Mike Larkin wrote:
> >> > >> > Does this "just work" no matter what guests I run? That's really all I care
> >> > >> > about.
> >> > >>
> >> > >> Here is my current diff for checksum offloading in vmd(8).
> >> > >>
> >> > >> I tested the following combination of features:
> >> > >>
> >> > >> - Debian/Linux and OpenBSD-current guests
> >> > >> - OpenBSD-current vio(4) w/o all offloading features
> >> > >> - Linux, OpenBSD and Hostsystem via veb(4) and vlan(4)
> >> > >> - IPv4 and IPv6 with tcpbench(1)
> >> > >> - local interface locked lladdr
> >> > >> - local interface dhcp
> >> > >>
> >> > >> Further tests are welcome!
> >> > >
> >> > > Not sure about dv@, but I can't really review this. it's hundreds of lines
> >> > > of changes in vmd vionet that require a level of understanding of tap(4) and
> >> > > in virtio/vionet (and the network stack in general) that I don't have.
> >> > > When I did the original vionet in vmd years ago it was pretty straightforward
> >> > > since the spec (for *all* virtio) was only like 20 pages. I was able to write
> >> > > that code in a weekend. now that we have bolted on all this other stuff, I
> >> > > don't feel comfortable giving oks in this area anymore since there is no way
> >> > > I can look at this and know if it's right or not. I think you need a network
> >> > > stack person to ok this, *and* explain what the ramifications are for vmd
> >> > > in general. It looks like vmd is doing inspection of every packet now? I
> >> > > dont think we want that.
> >> >
> >> > I've spent time digging into this and better understand it now. I'm also
> >> > happy now with how the current diff isn't expanding pledges for vionet.
> >> >
> >> > It feels overkill to have to poke every packet,
> >
> > if vmd is going to provide virtio net to the guest, and you want to
> > provide the offload features to the guest, then something on the
> > host side has to implement the virtio net header and fiddle with
> > the packets this way.
> >
> >> It don't have to be this way. My first versions of this diff was
> >> without all this packet parsing stuff in vmd(8)[1]. I'll try reproduce
> >> the old version till c2k25 to show you the difference.
> >>
> >> [1]: https://marc.info/?l=openbsd-tech&m=172381275602917
> >
> > that's not the whole story though. if vmd doesn't do it, then the kernel
> > has to. either way, the work has to be done, but i strongly believe
> > that it's better to handle the virtio net header in vmd from a whole
> > system perspective.
> >
> > i get that there's a desire to make vmd as thin and simple as
> > possible, but that doesn't make sense to me if the kernel has to
> > bear the cost of increased complexity and reduced flexibility
> > instead.
>
> I'm ok with the design of putting the logic in vmd and agree with
> keeping that muck out of the kernel.

In the kernel we already have ether_extract_headers(), which handles
all the packet parsing edge cases.  It's just a question of whether we
use virtio_net_hdr as the interface or our own tun_hdr.  For me, it's
not about shuffling complexity from the kernel to userland or the other
way around.
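(For context, the interface being discussed is the virtio_net_hdr that
is prepended to every frame once the checksum/GSO features are
negotiated.  A rough sketch of its layout, per the virtio spec; the
exact definition and naming in the vmd/vio(4) headers may differ:)

struct virtio_net_hdr {
#define VIRTIO_NET_HDR_F_NEEDS_CSUM	1	/* guest asks host to finish csum */
#define VIRTIO_NET_HDR_F_DATA_VALID	2	/* host attests csum is good */
	uint8_t		flags;
#define VIRTIO_NET_HDR_GSO_NONE		0
#define VIRTIO_NET_HDR_GSO_TCPV4	1
#define VIRTIO_NET_HDR_GSO_UDP		3
#define VIRTIO_NET_HDR_GSO_TCPV6	4
	uint8_t		gso_type;
	uint16_t	hdr_len;	/* length of the headers to replicate */
	uint16_t	gso_size;	/* MSS for GSO segments */
	uint16_t	csum_start;	/* start of the region to checksum */
	uint16_t	csum_offset;	/* where to store the result */
	uint16_t	num_buffers;	/* only with VIRTIO_NET_F_MRG_RXBUF */
};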
I'm interested in keeping the overall complexity as small as possible.
Give me some time to show you the other version of the diff.

> It seems like we need the TSO features to get performance gains. If
> that's the case, I'd rather we look at TSO and consider this part of the
> implementation. On its own, the checksum offload looks underwhelming.
>
> If TSO is going to bring another layer of complexity it's best that
> surface now. :)

The additional complexity for VIRTIO_NET_F_HOST_TSO is low.  The whole
diff is mostly about the basic infrastructure to bring the information
in virtio_net_hdr into the kernel and back.  I'll try to include
VIRTIO_NET_F_HOST_TSO in the diff as well, to show you the result.

> >> > but I do manage to see a
> >> > small improvement in the one test I did using iperf3 sending from host
> >> > to guest. It's only about 1-2% gain in throughput on my Intel x1c gen10
> >> > and less than 1% on my newer Ryzen AI 350 machine. (This was using a
> >> > -current snapshot for the guest.)
> >> >
> >> > I did this both with the "local interface" (where we already inspect
> >> > each packet to intercept DHCP packets) and one added to a veb(4) device
> >> > with and accompanying host-side vport(4).
> >> >
> >> > My hypothesis is the gain is mostly due to offloading work from the
> >> > single-vcpu guest to the host vionet tx or rx threads.
> >> >
> >> > Is it worth it? Especially knowing we're technically shortcutting the
> >> > actual spec as written by attesting for every packet checksum being
> >> > good? /shrug
> >> >
> >> > Does someone have a better benchmark showing this moves the needle?
> >> Its not worth it, to benchmark the checksum offloading here. I don't do
> >> this diff for checksum offloading. This is just a dependency for LRO and
> >> TSO in vmd(8) which is the real performance kicker.
> >>
> >> I already showed mlarkin@ at the h2k24 that TSO 10x the network
> >> performance with a PoC diff for TSO in vmd(4). It pushed the guest to
> >> host network performance from ~1 Gbit/s to ~10 Gbit/s back than.
> >>
> >> Thanks,
> >> Jan
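(As an illustration of the kind of per-packet work the host side takes
on when the guest sets VIRTIO_NET_HDR_F_NEEDS_CSUM: the guest leaves a
partial pseudo-header sum in the field at csum_start + csum_offset and
expects the host to finish it.  The sketch below is hypothetical; the
function name and buffer handling are invented and vmd's actual code in
the diff will differ:)

#include <sys/types.h>
#include <arpa/inet.h>
#include <string.h>

#define VIRTIO_NET_HDR_F_NEEDS_CSUM	1

static void
finish_guest_csum(uint8_t *pkt, size_t len, uint8_t flags,
    uint16_t csum_start, uint16_t csum_offset)
{
	uint32_t	 sum = 0;
	uint16_t	 word, result;
	size_t		 i;

	if (!(flags & VIRTIO_NET_HDR_F_NEEDS_CSUM))
		return;
	/* Refuse obviously bogus offsets supplied by the guest. */
	if ((size_t)csum_start + csum_offset + sizeof(uint16_t) > len)
		return;

	/*
	 * 16-bit one's-complement sum over pkt[csum_start..len), including
	 * the field pre-seeded by the guest with the pseudo-header sum.
	 */
	for (i = csum_start; i + 1 < len; i += 2) {
		memcpy(&word, pkt + i, sizeof(word));
		sum += ntohs(word);
	}
	if (i < len)
		sum += pkt[i] << 8;	/* pad an odd trailing byte */
	while (sum > 0xffff)
		sum = (sum & 0xffff) + (sum >> 16);

	result = htons(~sum & 0xffff);
	memcpy(pkt + csum_start + csum_offset, &result, sizeof(result));
}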