
From:
Jan Klemkow <jan@openbsd.org>
Subject:
Re: vmd: add checksum offload for guests
To:
Dave Voutila <dv@sisu.io>
Cc:
David Gwynne <david@gwynne.id.au>, Mike Larkin <mlarkin@nested.page>, Klemens Nanni <kn@openbsd.org>, Alexander Bluhm <bluhm@openbsd.org>, tech@openbsd.org
Date:
Sat, 7 Feb 2026 00:28:05 +0100

On Wed, Jan 28, 2026 at 11:00:30AM -0500, Dave Voutila wrote:
> David Gwynne <david@gwynne.id.au> writes:
> 
> > On Sat, Jan 17, 2026 at 10:56:36PM +0100, Jan Klemkow wrote:
> >> On Sat, Jan 17, 2026 at 11:38:50AM -0500, Dave Voutila wrote:
> >> > Mike Larkin <mlarkin@nested.page> writes:
> >> > > On Fri, Jan 16, 2026 at 07:38:16PM +0100, Jan Klemkow wrote:
> >> > >> On Thu, Jan 15, 2026 at 02:08:43PM -0800, Mike Larkin wrote:
> >> > >> > Does this "just work" no matter what guests I run? That's really all I care
> >> > >> > about.
> >> > >>
> >> > >> Here is my current diff for checksum offloading in vmd(8).
> >> > >>
> >> > >> I tested the following combination of features:
> >> > >>
> >> > >>  - Debian/Linux and OpenBSD-current guests
> >> > >>  - OpenBSD-current vio(4) w/o all offloading features
> >> > >>  - Linux, OpenBSD and Hostsystem via veb(4) and vlan(4)
> >> > >>  - IPv4 and IPv6 with tcpbench(1)
> >> > >>  - local interface locked lladdr
> >> > >>  - local interface dhcp
> >> > >>
> >> > >> Further tests are welcome!
> >> > >
> >> > > Not sure about dv@, but I can't really review this. it's hundreds of lines
> >> > > of changes in vmd vionet that require a level of understanding of tap(4) and
> >> > > in virtio/vionet (and the network stack in general) that I don't have.
> >> > > When I did the original vionet in vmd years ago it was pretty straightforward
> >> > > since the spec (for *all* virtio) was only like 20 pages. I was able to write
> >> > > that code in a weekend. now that we have bolted on all this other stuff, I
> >> > > don't feel comfortable giving oks in this area anymore since there is no way
> >> > > I can look at this and know if it's right or not. I think you need a network
> >> > > stack person to ok this, *and* explain what the ramifications are for vmd
> >> > > in general. It looks like vmd is doing inspection of every packet now? I
> >> > > dont think we want that.
> >> >
> >> > I've spent time digging into this and better understand it now. I'm also
> >> > happy now with how the current diff isn't expanding pledges for vionet.
> >> >
> >> > It feels overkill to have to poke every packet,
> >
> > if vmd is going to provide virtio net to the guest, and you want to
> > provide the offload features to the guest, then something on the
> > host side has to implement the virtio net header and fiddle with
> > the packets this way.
> >
> >> It don't have to be this way. My first versions of this diff was
> >> without all this packet parsing stuff in vmd(8)[1].  I'll try reproduce
> >> the old version till c2k25 to show you the difference.
> >>
> >> [1]: https://marc.info/?l=openbsd-tech&m=172381275602917
> >
> > that's not the whole story though. if vmd doesn't do it, then the kernel
> > has to. either way, the work has to be done, but i strongly believe
> > that it's better to handle the virtio net header in vmd from a whole
> > system perspective.
> >
> > i get that there's a desire to make vmd as thin and simple as
> > possible, but that doesn't make sense to me if the kernel has to
> > bear the cost of increased complexity and reduced flexibility
> > instead.
> 
> I'm ok with the design of putting the logic in vmd and agree with
> keeping that muck out of the kernel.
> 
> It seems like we need the TSO features to get performance gains. If
> that's the case, I'd rather we look at TSO and consider this part of the
> implementation. On its own, the checksum offload looks underwhelming.
> 
> If TSO is going to bring another layer of complexity it's best that
> surface now. :)

This diff brings TSO to vmd(8) and tun(4) and depends on my checksum
diff from a few minutes ago.  It improves guest-to-host network
performance: in my setup it goes from 1.4 Gbit/s up to 10 or 25 Gbit/s.

With local interfaces in my vm.conf(5) I got 25 Gbit/s; without local
interfaces, 10 Gbit/s.

This will not work properly with bridge(4)/veb(4) at the moment, which
is why it's in PoC state.  It requires some further changes to veb(4)
and friends.

I'm sending this diff separately to show the additional complexity of
TSO.

diff --git a/sys/net/if_tun.c b/sys/net/if_tun.c
index 7241db48d50..06690b85b23 100644
--- a/sys/net/if_tun.c
+++ b/sys/net/if_tun.c
@@ -107,7 +107,9 @@ int	tundebug = TUN_DEBUG;
 
 #define TUN_IF_CAPS ( \
 	VIRTIO_NET_F_CSUM | \
-	VIRTIO_NET_F_GUEST_CSUM \
+	VIRTIO_NET_F_GUEST_CSUM | \
+	VIRTIO_NET_F_HOST_TSO4 | \
+	VIRTIO_NET_F_HOST_TSO6 \
 )
 
 void	tunattach(int);
@@ -182,8 +184,13 @@ struct virtio_net_hdr {
 #define VIRTIO_NET_HDR_F_NEEDS_CSUM	1 /* flags */
 #define VIRTIO_NET_HDR_F_DATA_VALID	2 /* flags */
 
+#define VIRTIO_NET_HDR_GSO_TCPV4	1 /* gso_type */
+#define VIRTIO_NET_HDR_GSO_TCPV6	4 /* gso_type */
+
 #define VIRTIO_NET_F_CSUM	(1ULL<<0)
 #define VIRTIO_NET_F_GUEST_CSUM	(1ULL<<1)
+#define VIRTIO_NET_F_HOST_TSO4	(1ULL<<11)
+#define VIRTIO_NET_F_HOST_TSO6	(1ULL<<12)
 
 void
 tunattach(int n)
@@ -678,6 +685,10 @@ tun_set_capabilities(struct tun_softc *sc, const struct tun_capabilities *cap)
 		SET(sc->sc_if.if_capabilities, IFCAP_CSUM_UDPv4);
 		SET(sc->sc_if.if_capabilities, IFCAP_CSUM_UDPv6);
 	}
+
+	if (ISSET(sc->sc_cap.tun_if_capabilities, VIRTIO_NET_F_HOST_TSO4) ||
+	    ISSET(sc->sc_cap.tun_if_capabilities, VIRTIO_NET_F_HOST_TSO6))
+		SET(sc->sc_if.if_capabilities, IFCAP_LRO);
 	NET_UNLOCK();
 
 	return (0);
@@ -1039,6 +1050,12 @@ tun_dev_write(dev_t dev, struct uio *uio, int ioflag, int align)
 				SET(m0->m_pkthdr.csum_flags, M_UDP_CSUM_IN_OK);
 			}
 		}
+
+		if (vh.gso_type == VIRTIO_NET_HDR_GSO_TCPV4 ||
+		    vh.gso_type == VIRTIO_NET_HDR_GSO_TCPV6) {
+			m0->m_pkthdr.ph_mss = vh.gso_size;
+			SET(m0->m_pkthdr.csum_flags, M_TCP_TSO);
+		}
 	}
 
 	tun_input_process(ifp, m0);
diff --git a/usr.sbin/vmd/vionet.c b/usr.sbin/vmd/vionet.c
index c639add0225..3726efb05f1 100644
--- a/usr.sbin/vmd/vionet.c
+++ b/usr.sbin/vmd/vionet.c
@@ -306,7 +306,8 @@ vionet_update_offload(struct virtio_dev *dev)
 	msg.irq = dev->irq;
 	msg.type = VIODEV_MSG_TUNSCAP;
 	msg.data = dev->driver_feature &
-	    (VIRTIO_NET_F_CSUM | VIRTIO_NET_F_GUEST_CSUM);
+	    (VIRTIO_NET_F_CSUM | VIRTIO_NET_F_GUEST_CSUM |
+	     VIRTIO_NET_F_HOST_TSO4 | VIRTIO_NET_F_HOST_TSO6);
 
 	ret = imsg_compose_event2(&dev->async_iev, IMSG_DEVOP_MSG, 0, 0, -1,
 	    &msg, sizeof(msg), ev_base_main);
diff --git a/usr.sbin/vmd/virtio.c b/usr.sbin/vmd/virtio.c
index 00b770d8ec2..0eab5a2dbf9 100644
--- a/usr.sbin/vmd/virtio.c
+++ b/usr.sbin/vmd/virtio.c
@@ -1037,7 +1037,8 @@ virtio_init(struct vmd_vm *vm, int child_cdrom,
 			virtio_dev_init(vm, dev, id, VIONET_QUEUE_SIZE_DEFAULT,
 			    VIRTIO_NET_QUEUES,
 			    (VIRTIO_NET_F_MAC | VIRTIO_NET_F_CSUM |
-				VIRTIO_NET_F_GUEST_CSUM | VIRTIO_F_VERSION_1));
+				VIRTIO_NET_F_GUEST_CSUM | VIRTIO_NET_F_HOST_TSO4 |
+				VIRTIO_NET_F_HOST_TSO6 | VIRTIO_F_VERSION_1));
 
 			if (pci_add_bar(id, PCI_MAPREG_TYPE_IO, virtio_pci_io,
 			    dev) == -1) {
diff --git a/usr.sbin/vmd/virtio.h b/usr.sbin/vmd/virtio.h
index e261fec5f39..cc4c6837c90 100644
--- a/usr.sbin/vmd/virtio.h
+++ b/usr.sbin/vmd/virtio.h
@@ -317,6 +317,8 @@ struct virtio_net_hdr {
 #define VIRTIO_NET_F_CSUM	(1<<0)
 #define VIRTIO_NET_F_GUEST_CSUM	(1<<1)
 #define VIRTIO_NET_F_MAC	(1<<5)
+#define VIRTIO_NET_F_HOST_TSO4	(1<<11)
+#define VIRTIO_NET_F_HOST_TSO6	(1<<12)
 
 enum vmmci_cmd {
 	VMMCI_NONE = 0,