From: Jan Klemkow
Subject: Re: vmd: add checksum offload for guests
To: Dave Voutila
Cc: David Gwynne, Mike Larkin, Klemens Nanni, Alexander Bluhm, tech@openbsd.org
Date: Sat, 7 Feb 2026 00:28:05 +0100

On Wed, Jan 28, 2026 at 11:00:30AM -0500, Dave Voutila wrote:
> David Gwynne writes:
>
> > On Sat, Jan 17, 2026 at 10:56:36PM +0100, Jan Klemkow wrote:
> >> On Sat, Jan 17, 2026 at 11:38:50AM -0500, Dave Voutila wrote:
> >> > Mike Larkin writes:
> >> > > On Fri, Jan 16, 2026 at 07:38:16PM +0100, Jan Klemkow wrote:
> >> > >> On Thu, Jan 15, 2026 at 02:08:43PM -0800, Mike Larkin wrote:
> >> > >> > Does this "just work" no matter what guests I run? That's
> >> > >> > really all I care about.
> >> > >>
> >> > >> Here is my current diff for checksum offloading in vmd(8).
> >> > >>
> >> > >> I tested the following combinations of features:
> >> > >>
> >> > >> - Debian/Linux and OpenBSD-current guests
> >> > >> - OpenBSD-current vio(4) w/o all offloading features
> >> > >> - Linux, OpenBSD and the host system via veb(4) and vlan(4)
> >> > >> - IPv4 and IPv6 with tcpbench(1)
> >> > >> - local interface with locked lladdr
> >> > >> - local interface with dhcp
> >> > >>
> >> > >> Further tests are welcome!
> >> > >
> >> > > Not sure about dv@, but I can't really review this.  It's
> >> > > hundreds of lines of changes in vmd vionet that require a level
> >> > > of understanding of tap(4) and of virtio/vionet (and the network
> >> > > stack in general) that I don't have.  When I did the original
> >> > > vionet in vmd years ago it was pretty straightforward, since the
> >> > > spec (for *all* of virtio) was only about 20 pages.  I was able
> >> > > to write that code in a weekend.  Now that we have bolted on all
> >> > > this other stuff, I don't feel comfortable giving OKs in this
> >> > > area anymore, since there is no way I can look at this and know
> >> > > if it's right or not.  I think you need a network stack person
> >> > > to OK this, *and* explain what the ramifications are for vmd in
> >> > > general.
> >> > > It looks like vmd is doing inspection of every packet now?  I
> >> > > don't think we want that.
> >> >
> >> > I've spent time digging into this and better understand it now.
> >> > I'm also happy now with how the current diff isn't expanding
> >> > pledges for vionet.
> >> >
> >> > It feels overkill to have to poke every packet,
>
> > if vmd is going to provide virtio net to the guest, and you want to
> > provide the offload features to the guest, then something on the
> > host side has to implement the virtio net header and fiddle with
> > the packets this way.
> >
> >> It doesn't have to be this way.  My first version of this diff was
> >> without all this packet parsing stuff in vmd(8)[1].  I'll try to
> >> reproduce the old version before c2k25 to show you the difference.
> >>
> >> [1]: https://marc.info/?l=openbsd-tech&m=172381275602917
> >
> > that's not the whole story though. if vmd doesn't do it, then the
> > kernel has to. either way, the work has to be done, but i strongly
> > believe that it's better to handle the virtio net header in vmd
> > from a whole-system perspective.
> >
> > i get that there's a desire to make vmd as thin and simple as
> > possible, but that doesn't make sense to me if the kernel has to
> > bear the cost of increased complexity and reduced flexibility
> > instead.
>
> I'm ok with the design of putting the logic in vmd and agree with
> keeping that muck out of the kernel.
>
> It seems like we need the TSO features to get performance gains.  If
> that's the case, I'd rather we look at TSO and consider this part of
> the implementation.  On its own, the checksum offload looks
> underwhelming.
>
> If TSO is going to bring another layer of complexity it's best that
> it surface now. :)

This diff brings TSO to vmd(8) and tun(4).  It depends on the checksum
diff I sent a few minutes ago.  It improves guest-to-host network
performance: in my setup it goes from 1.4 Gbit/s to 10/25 Gbit/s.
With local interfaces in my vm.conf(5) I got 25 Gbit/s; without local
interfaces, 10 Gbit/s.  This will not work properly with
bridge(4)/veb(4) at the moment, which is why it is in PoC state; it
requires some further changes in veb(4) and friends.  I send this diff
separately to show the additional complexity of TSO.

diff --git a/sys/net/if_tun.c b/sys/net/if_tun.c
index 7241db48d50..06690b85b23 100644
--- a/sys/net/if_tun.c
+++ b/sys/net/if_tun.c
@@ -107,7 +107,9 @@ int	tundebug = TUN_DEBUG;
 
 #define TUN_IF_CAPS ( \
 	VIRTIO_NET_F_CSUM | \
-	VIRTIO_NET_F_GUEST_CSUM \
+	VIRTIO_NET_F_GUEST_CSUM | \
+	VIRTIO_NET_F_HOST_TSO4 | \
+	VIRTIO_NET_F_HOST_TSO6 \
 )
 
 void	tunattach(int);
@@ -182,8 +184,13 @@ struct virtio_net_hdr {
 #define VIRTIO_NET_HDR_F_NEEDS_CSUM	1	/* flags */
 #define VIRTIO_NET_HDR_F_DATA_VALID	2	/* flags */
 
+#define VIRTIO_NET_HDR_GSO_TCPV4	1	/* gso_type */
+#define VIRTIO_NET_HDR_GSO_TCPV6	4	/* gso_type */
+
 #define VIRTIO_NET_F_CSUM		(1ULL<<0)
 #define VIRTIO_NET_F_GUEST_CSUM		(1ULL<<1)
+#define VIRTIO_NET_F_HOST_TSO4		(1ULL<<11)
+#define VIRTIO_NET_F_HOST_TSO6		(1ULL<<12)
 
 void
 tunattach(int n)
@@ -678,6 +685,10 @@ tun_set_capabilities(struct tun_softc *sc, const struct tun_capabilities *cap)
 		SET(sc->sc_if.if_capabilities, IFCAP_CSUM_UDPv4);
 		SET(sc->sc_if.if_capabilities, IFCAP_CSUM_UDPv6);
 	}
+
+	if (ISSET(sc->sc_cap.tun_if_capabilities, VIRTIO_NET_F_HOST_TSO4) ||
+	    ISSET(sc->sc_cap.tun_if_capabilities, VIRTIO_NET_F_HOST_TSO6))
+		SET(sc->sc_if.if_capabilities, IFCAP_LRO);
 	NET_UNLOCK();
 
 	return (0);
@@ -1039,6 +1050,12 @@ tun_dev_write(dev_t dev, struct uio *uio, int ioflag, int align)
 			SET(m0->m_pkthdr.csum_flags, M_UDP_CSUM_IN_OK);
 		}
 	}
+
+	if (vh.gso_type == VIRTIO_NET_HDR_GSO_TCPV4 ||
+	    vh.gso_type == VIRTIO_NET_HDR_GSO_TCPV6) {
+		m0->m_pkthdr.ph_mss = vh.gso_size;
+		SET(m0->m_pkthdr.csum_flags, M_TCP_TSO);
+	}
 	}
 
 	tun_input_process(ifp, m0);
diff --git a/usr.sbin/vmd/vionet.c b/usr.sbin/vmd/vionet.c
index c639add0225..3726efb05f1 100644
--- a/usr.sbin/vmd/vionet.c
+++ b/usr.sbin/vmd/vionet.c
@@ -306,7 +306,9 @@ vionet_update_offload(struct virtio_dev *dev)
 	msg.irq = dev->irq;
 	msg.type = VIODEV_MSG_TUNSCAP;
 	msg.data = dev->driver_feature &
-	    (VIRTIO_NET_F_CSUM | VIRTIO_NET_F_GUEST_CSUM);
+	    (VIRTIO_NET_F_CSUM | VIRTIO_NET_F_GUEST_CSUM |
+	    VIRTIO_NET_F_HOST_TSO4 | VIRTIO_NET_F_HOST_TSO6);
+
 	ret = imsg_compose_event2(&dev->async_iev, IMSG_DEVOP_MSG, 0, 0, -1,
 	    &msg, sizeof(msg), ev_base_main);
diff --git a/usr.sbin/vmd/virtio.c b/usr.sbin/vmd/virtio.c
index 00b770d8ec2..0eab5a2dbf9 100644
--- a/usr.sbin/vmd/virtio.c
+++ b/usr.sbin/vmd/virtio.c
@@ -1037,7 +1037,8 @@ virtio_init(struct vmd_vm *vm, int child_cdrom,
 		virtio_dev_init(vm, dev, id, VIONET_QUEUE_SIZE_DEFAULT,
 		    VIRTIO_NET_QUEUES, (VIRTIO_NET_F_MAC | VIRTIO_NET_F_CSUM |
-		    VIRTIO_NET_F_GUEST_CSUM | VIRTIO_F_VERSION_1));
+		    VIRTIO_NET_F_GUEST_CSUM | VIRTIO_NET_F_HOST_TSO4 |
+		    VIRTIO_NET_F_HOST_TSO6 | VIRTIO_F_VERSION_1));
 
 		if (pci_add_bar(id, PCI_MAPREG_TYPE_IO, virtio_pci_io,
 		    dev) == -1) {
diff --git a/usr.sbin/vmd/virtio.h b/usr.sbin/vmd/virtio.h
index e261fec5f39..cc4c6837c90 100644
--- a/usr.sbin/vmd/virtio.h
+++ b/usr.sbin/vmd/virtio.h
@@ -317,6 +317,8 @@ struct virtio_net_hdr {
 #define VIRTIO_NET_F_CSUM	(1<<0)
 #define VIRTIO_NET_F_GUEST_CSUM	(1<<1)
 #define VIRTIO_NET_F_MAC	(1<<5)
+#define VIRTIO_NET_F_HOST_TSO4	(1<<11)
+#define VIRTIO_NET_F_HOST_TSO6	(1<<12)
 
 enum vmmci_cmd {
 	VMMCI_NONE = 0,
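For reviewers who want to see the header handling in isolation: here is
a small self-contained sketch (not part of the diff) of how a
guest-supplied virtio_net_hdr maps to host-side offload state in
tun_dev_write().  The struct layout and VIRTIO_NET_HDR_* constants
follow the virtio spec; M_TCP_CSUM_OUT and M_TCP_TSO are illustrative
stand-ins for the kernel's real mbuf csum_flags, and
vio_hdr_to_csum_flags() is a made-up helper name, not kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* virtio_net_hdr layout per the virtio spec */
struct virtio_net_hdr {
	uint8_t		flags;
	uint8_t		gso_type;
	uint16_t	hdr_len;
	uint16_t	gso_size;
	uint16_t	csum_start;
	uint16_t	csum_offset;
};

#define VIRTIO_NET_HDR_F_NEEDS_CSUM	1	/* flags */
#define VIRTIO_NET_HDR_GSO_NONE		0	/* gso_type */
#define VIRTIO_NET_HDR_GSO_TCPV4	1	/* gso_type */
#define VIRTIO_NET_HDR_GSO_TCPV6	4	/* gso_type */

/* illustrative stand-ins for the kernel's mbuf csum_flags bits */
#define M_TCP_CSUM_OUT	0x01
#define M_TCP_TSO	0x02

/*
 * Map a guest-supplied header to host-side offload flags.  Note that
 * the GSO type is an enumeration carried in gso_type, not a bit in
 * flags (NEEDS_CSUM and GSO_TCPV4 are both 1, so testing the wrong
 * field silently aliases), and that gso_size becomes the MSS used
 * later for segmentation.
 */
static int
vio_hdr_to_csum_flags(const struct virtio_net_hdr *vh, uint16_t *mss)
{
	int csum_flags = 0;

	if (vh->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM)
		csum_flags |= M_TCP_CSUM_OUT;

	if (vh->gso_type == VIRTIO_NET_HDR_GSO_TCPV4 ||
	    vh->gso_type == VIRTIO_NET_HDR_GSO_TCPV6) {
		*mss = vh->gso_size;
		csum_flags |= M_TCP_TSO;
	}

	return (csum_flags);
}
```

The point of separating the two fields like this is exactly the extra
complexity TSO brings on top of plain checksum offload: checksum state
lives in flags/csum_start/csum_offset, while segmentation state lives
in gso_type/gso_size.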