From: Brent Cook <busterb@gmail.com>
Subject: Re: Sysupgrade on i386 vmm to current snapshot hangs on boot into upgrade kernel
To: Dave Voutila <dv@sisu.io>
Cc: Mike Larkin <mlarkin@nested.page>, Alexander Bluhm <bluhm@openbsd.org>, tech@openbsd.org
Date: Mon, 6 Apr 2026 08:46:02 -0500

On Mon, Apr 6, 2026 at 7:06 AM Dave Voutila <dv@sisu.io> wrote:

> Brent Cook <busterb@gmail.com> writes:
>
> > On Sun, Apr 5, 2026 at 2:36 PM Mike Larkin <mlarkin@nested.page> wrote:
> >
> >  On Sun, Apr 05, 2026 at 01:08:51PM -0500, Brent Cook wrote:
> >  > On Sun, Apr 05, 2026 at 04:58:11AM -0500, Brent Cook wrote:
> >  > > On Tue, Mar 31, 2026 at 1:35 PM Alexander Bluhm <bluhm@openbsd.org> wrote:
> >  > >
> >  > > > On Tue, Mar 31, 2026 at 09:30:30AM -0700, Mike Larkin wrote:
> >  > > > > rebuild the host kernel with HZ=1000 and see if running that
> >  > > > > makes the problem go away.
> >  > > >
> >  > > > I have option HZ=1000 in host's sys/conf/GENERIC now.  It does
> >  > > > not help.  Guest's GENERIC is unmodified.
> >  > > >
> >  > > > When the i386 guest boots, its GENERIC either says
> >  > > > cpu0: AMD EPYC 73F3 16-Core Processor ("AuthenticAMD" 686-class, 512KB L2 cache) 3.50 GHz, 19-01-01
> >  > > > or
> >  > > > cpu0: AMD EPYC 73F3 16-Core Processor ("AuthenticAMD" 686-class, 512KB L2 cache) 74 MHz, 19-01-01
> >  > > >
> >  > > >
> >  > > Same thing here, but this is a clue why RAMDISK_CD hangs but GENERIC
> >  > > does not, at least for me.
> >  > >
> >  > > Adding pvclock0 allows RAMDISK_CD to boot 100% of the time in my
> >  > > local vmm machine.
> >  > >
> >  > > diff --git a/sys/arch/i386/conf/RAMDISK_CD
> b/sys/arch/i386/conf/RAMDISK_CD
> >  > > index c00f6dcbbb4..24bd4e465a9 100644
> >  > > --- a/sys/arch/i386/conf/RAMDISK_CD
> >  > > +++ b/sys/arch/i386/conf/RAMDISK_CD
> >  > > @@ -28,6 +28,7 @@ config                bsd root on rd0a swap on
> rd0b and
> >  > > wd0b and sd0b
> >  > >  mainbus0       at root
> >  > >
> >  > >  pvbus0         at mainbus0     # Paravirtual device bus
> >  > > +pvclock0       at pvbus?       # KVM/VMD paravirtual clock
> >  > >
> >  > >  acpi0          at bios?
> >  > >  #acpitimer*    at acpi?
> >  > >
> >  > >
> >  >
> >  > Ugh, that last paste was ugly, sorry. Switching mail clients:
> >  >
> >  > Would anyone be able to give this diff a try? It's working locally; I
> >  > was able to reliably get the i8254 to calibrate to 4GHz, and see no hangs
> >  > either in RAMDISK_CD or when switching sysctl
> >  > kern.timecounter.hardware=i8254. Now my i386 VMM machine can
> >  > sysupgrade reliably.
> >  >
> >  > The main change here is to set the timestamp right in the vcpu thread
> >  > instead of later in the event loop in i8253_reset. Otherwise, a latch
> >  > command issued before i8253_reset runs will be using a new start
> >  > value, but an old stale ts value. Maybe this makes the clock_gettime in
> >  > i8253_reset unnecessary, but I left it in.
> >  >
> >  > This also adjusts a couple of the register states to match that of
> >  > Bhyve when the mode is updated or there is a latch command:
> >  >
> >  >
> >  > https://github.com/freebsd/freebsd-src/blob/main/sys/amd64/vmm/io/vatpit.c#L239
> >  > https://github.com/freebsd/freebsd-src/blob/main/sys/amd64/vmm/io/vatpit.c#L330
> >  >
> >  > Hopefully this makes sense, or if it's the wrong approach, at least
> >  > inspires the correct fix.
> >  >
> >
> >  this makes sense to me; I think the key fix here is the resetting of last_r
> >  to 2, indicating that 2 bytes must be read. this makes sense after a latch
> >  and makes more sense WRT the 8253 state machine.
> >
> >  The other bits may or may not be needed but they seem correct also and/or
> >  harmless, so I'd say leave them in for consistency.
> >
> >  I don't have any machines that exhibit the issue so as long as you get this
> >  tested on some various guest OSes, ok mlarkin. (eg test linux, openbsd amd64
> >  and openbsd i386, make sure nothing gets broken)
> >
> >  -ml
> >
> > Thanks, so far everything is looking good on Debian i386/amd64, Alpine
> > x86_64, and the OpenBSDs.
> >
> > Alpine x86 32-bit is dying when booting the installer with what looks
> > like a memory probe that isn't handled yet by vmm.
> >
> > vmx_handle_np_fault: unknown memory type 2 for GPA 0xc4e89a0c
> >
> >  But I think that is a pre-existing issue.
>
> Yeah that looks like the result of unemulated mmio and you can disregard
> that for moving forward with this diff. Can you share the Alpine version
> and if it's iso vs. disk image?
>

It was the ISO, started like so:

doas vmctl start -m 1G -L -i 1 -c -r alpine-standard-3.23.3-x86.iso -d alpine-x86.qcow alpine-x86

Giving it 4G fills in the memory space though, and allows it to boot
successfully.


>
> Comment below though...
>
> >  > diff --git a/usr.sbin/vmd/i8253.c b/usr.sbin/vmd/i8253.c
> >  > index 9e32d9382b6..091716a2f32 100644
> >  > --- a/usr.sbin/vmd/i8253.c
> >  > +++ b/usr.sbin/vmd/i8253.c
> >  > @@ -262,12 +262,14 @@ vcpu_exit_i8253(struct vm_run_params *vrp)
> >  >                                           ticks % i8253_channel[sel].start;
> >  >                               } else
> >  >                                       i8253_channel[sel].olatch = 0;
> >  > +                             i8253_channel[sel].last_r = 2;
>
> I'm not convinced the above line is correct given how last_r and last_w
> are working in i8253.c...but they themselves may be incorrect. Both
> fields are used only to index the last byte of the 2-byte register being
> read or written and values other than 0 and 1 are never tested.
>
> Apparently we're initializing last_r to 1. Checks are only done for if
> last_r == 0 or non-zero.
>
> Given that, I think this _should_ be set to 1, not 2.
>
> Honestly this could just be a big mistake in the emulation...but
> semantically 2 doesn't fit the rest of the code.
>

Yeah, got it - it should be a 1 here. I got things confused when comparing
different i8253 emulations.

vmm is using it as a binary state value, while bhyve uses 'olbyte' as a kind
of tri-state: when it gets to 0, the output latch follows the counter
instead of shifting bytes from the last latched value:

https://github.com/freebsd/freebsd-src/blob/main/sys/amd64/vmm/io/vatpit.c#L396
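
To make the difference concrete, here is a rough sketch of the two read-state
schemes side by side. This is not the vmd or bhyve source: the struct and
function names (toggle_chan, tristate_chan, and so on) are made up for
illustration, the fields only borrow the names last_r, olatch, olbyte and
frbyte from the discussion, and the polarity of last_r (non-zero meaning
"low byte next") is my assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-state model: last_r only selects which byte is next. */
struct toggle_chan {
	uint16_t olatch;	/* latched counter value */
	uint8_t last_r;		/* assumed: non-zero = low byte next, 0 = high */
};

static void
toggle_latch(struct toggle_chan *c, uint16_t count)
{
	c->olatch = count;
	c->last_r = 1;		/* restart the read sequence at the low byte */
}

static uint8_t
toggle_read(struct toggle_chan *c)
{
	if (c->last_r == 0) {
		c->last_r = 1;
		return (c->olatch >> 8);
	}
	c->last_r = 0;
	return (c->olatch & 0xff);
}

/* Hypothetical tri-state model: olbyte counts latched bytes left to read. */
struct tristate_chan {
	uint16_t counter;	/* stand-in for the live counter */
	uint8_t ol[2];		/* ol[1] = low byte, ol[0] = high byte */
	uint8_t olbyte;		/* 2 or 1: latched bytes pending; 0: follow counter */
	uint8_t frbyte;		/* byte toggle used for unlatched reads */
};

static void
tristate_latch(struct tristate_chan *c)
{
	c->ol[1] = c->counter & 0xff;
	c->ol[0] = c->counter >> 8;
	c->olbyte = 2;
}

static uint8_t
tristate_read(struct tristate_chan *c)
{
	uint8_t v;

	if (c->olbyte == 0) {
		/* Latch drained: read straight from the running counter. */
		v = c->frbyte ? (c->counter >> 8) : (c->counter & 0xff);
		c->frbyte ^= 1;
		return (v);
	}
	return (c->ol[--c->olbyte]);
}

/* Walk both models through latch, two reads, then a third read. */
int
pit_sketch_selfcheck(void)
{
	struct toggle_chan t = { 0, 1 };
	struct tristate_chan b = { 0x1234, { 0, 0 }, 0, 0 };

	toggle_latch(&t, 0x1234);
	assert(toggle_read(&t) == 0x34);
	assert(toggle_read(&t) == 0x12);
	assert(toggle_read(&t) == 0x34);	/* wraps back to the latched low byte */

	tristate_latch(&b);
	b.counter = 0x1200;			/* counter keeps running after the latch */
	assert(tristate_read(&b) == 0x34);
	assert(tristate_read(&b) == 0x12);
	assert(tristate_read(&b) == 0x00);	/* latch drained: live low byte */
	return (0);
}
```

In the two-state sketch, 2 has no meaning of its own and behaves like any
other non-zero value, which is why 1 is the value that fits there.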


>
> >  >                               goto ret;
> >  >                       } else if (rw != TIMER_16BIT) {
> >  >                               log_warnx("%s: i8253 PIT: unsupported counter "
> >  >                                   "%d rw mode 0x%x selected", __func__,
> >  >                                   sel, (rw & TIMER_16BIT));
> >  >                       }
> >  > +                     i8253_channel[sel].last_w = 0;
> >  >                       i8253_channel[sel].mode = (out_data & 0xe) >> 1;
> >  >
> >  >                       goto ret;
> >  > @@ -293,6 +295,9 @@ vcpu_exit_i8253(struct vm_run_params *vrp)
> >  >                               if (i8253_channel[sel].start == 0)
> >  >                                       i8253_channel[sel].start = 0xffff;
> >  >
> >  > +                             clock_gettime(CLOCK_MONOTONIC,
> >  > +                                 &i8253_channel[sel].ts);
> >  > +
> >  >                               DPRINTF("%s: channel %d reset, mode=%d, "
> >  >                                   "start=%d\n", __func__,
> >  >                                   sel, i8253_channel[sel].mode,
>