
From: Martin Pieuchot <mpi@grenadille.net>
Subject: Re: km_alloc(9), UAREA and kv_pageable
To: Mark Kettenis <mark.kettenis@xs4all.nl>
Cc: miod@openbsd.org, tech@openbsd.org
Date: Wed, 23 Oct 2024 09:05:55 +0200

    On 22/10/24(Tue) 23:15, Mark Kettenis wrote:
    > > Date: Sun, 20 Oct 2024 21:42:02 +0200
    > > From: Mark Kettenis <mark.kettenis@xs4all.nl>
    > > 
    > > > Date: Sun, 20 Oct 2024 19:47:23 +0200
    > > > From: Martin Pieuchot <mpi@grenadille.net>
    > > > 
    > > > On 20/10/24(Sun) 14:36, Mark Kettenis wrote:
    > > > > > Date: Sun, 20 Oct 2024 11:37:57 +0200
    > > > > > From: Martin Pieuchot <mpi@grenadille.net>
    > > > > 
    > > > > Sorry, I didn't have time yet to investigate.  But I'd rather not
    > > > > change the km_alloc() API if it is not necessary, especially when
    > > > > it introduces a somewhat vague concept like "managed" that doesn't
    > > > > even make sense in the context where it gets used.
    > > > > 
    > > > > > On 10/10/24(Thu) 10:40, Martin Pieuchot wrote:
    > > > > > > On 09/10/24(Wed) 13:39, Mark Kettenis wrote:
    > > > > > > > > Date: Wed, 9 Oct 2024 13:16:25 +0200
    > > > > > > > > From: Martin Pieuchot <mpi@grenadille.net>
    > > > > > > > > 
    > > > > > > > > On 02/10/24(Wed) 17:28, Martin Pieuchot wrote:
    > > > > > > > > > It is currently impossible to use km_alloc(9) to get a managed mapping
    > > > > > > > > > backed with physical pages allocated up front.  I guess this is why
    > > > > > > > > > uvm_km_kmemalloc_pla() is still used to allocate the UAREA.
    > > > > > > > > > 
    > > > > > > > > > To fix that, I suggest the diff below which turns the "kp_pageable" flag
    > > > > > > > > > of km_alloc(9) into a "kp_managed".  This new flag no longer implies
    > > > > > > > > > "kp_nomem" so I updated the description and fixed the remaining XXX to
    > > > > > > > > > preserve existing behavior of the `kp_pageable' global.
    > > > > > > > > > 
    > > > > > > > > > With this change I believe we could also get rid of uvm_km_zalloc() used
    > > > > > > > > > in the i386 pmap.
    > > > > > > > > > 
    > > > > > > > > > Comments?  Oks?
    > > > > > > > > 
    > > > > > > > > Anyone?
    > > > > > > > 
    > > > > > > > I don't really understand why this needs to be "managed" memory.  We
    > > > > > > > stopped paging out the uarea ages ago, and I wonder if all this
    > > > > > > > complexity is just from the time when we did.  Can't this just be
    > > > > > > > kp_zero memory?
    > > > > > > 
    > > > > > > On amd64 using kp_zero results in the kernel faulting in a loop from which
    > > > > > > it cannot recover.  Could you please look at it and tell me if there is a
    > > > > > > bug somewhere else?
    > > > > 
    > > > > Since the memory is no longer "managed", setguardpage() needs to use
    > > > > pmap_kremove() instead of pmap_remove().
    > > > 
    > > > Nice find!
    > > > 
    > > > > The following diff works for me on both amd64 and arm64.
    > > > 
    > > > Lovely, should we test this on more architectures?  Or are you confident
    > > > with it as it is?
    > > 
    > > Looks like amd64 is the only architecture that does that guardpage
    > > thing.  So I think this is fine.
    > 
    > However, I think this diff (that is committed now) leaks a page for
    > each proc that we create.  This didn't happen with the old code
    > because the kernel object kept track of the pages that we removed
    > using pmap_remove(), and those pages would be freed when the kernel
    > stack was unmapped using uvm_unmap().
    > 
    > This no longer happens when we use km_alloc(), so we have to free that
    > page ourselves.
    > 
    > ok?
    
    I find it easier to understand with an intermediate variable `va' used 
    for both pmap_extract() and pmap_kremove(), like:
    
    va = (vaddr_t)p->p_addr + PAGE_SIZE;
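
    Folding that into your diff, the whole function would read roughly like
    this (untested sketch, just your diff plus the `va' variable):

    void
    setguardpage(struct proc *p)
    {
            struct vm_page *pg = NULL;
            vaddr_t va = (vaddr_t)p->p_addr + PAGE_SIZE;
            paddr_t pa;

            /* Look up the page backing the guard page before unmapping it. */
            if (pmap_extract(pmap_kernel(), va, &pa))
                    pg = PHYS_TO_VM_PAGE(pa);
            /* Tear down the unmanaged mapping... */
            pmap_kremove(va, PAGE_SIZE);
            pmap_update(pmap_kernel());
            /* ...and free the page ourselves, since no kernel object tracks it. */
            if (pg != NULL)
                    uvm_pagefree(pg);
    }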
    
    Ok mpi@ either way.
    
    > P.S. I have a diff that moves this code to MI code that I'm currently
    >      testing.  But this fix should go in first.
    
    Lovely!
    
    > Index: arch/amd64/amd64/vm_machdep.c
    > ===================================================================
    > RCS file: /cvs/src/sys/arch/amd64/amd64/vm_machdep.c,v
    > diff -u -p -r1.48 vm_machdep.c
    > --- arch/amd64/amd64/vm_machdep.c	21 Oct 2024 18:27:34 -0000	1.48
    > +++ arch/amd64/amd64/vm_machdep.c	22 Oct 2024 21:05:16 -0000
    > @@ -135,8 +135,15 @@ cpu_exit(struct proc *p)
    >  void
    >  setguardpage(struct proc *p)
    >  {
    > +	struct vm_page *pg = NULL;
    > +	paddr_t pa;
    > +
    > +	if (pmap_extract(pmap_kernel(), (vaddr_t)p->p_addr + PAGE_SIZE, &pa))
    > +		pg = PHYS_TO_VM_PAGE(pa);
    >  	pmap_kremove((vaddr_t)p->p_addr + PAGE_SIZE, PAGE_SIZE);
    >  	pmap_update(pmap_kernel());
    > +	if (pg)
    > +		uvm_pagefree(pg);
    >  }
    >  
    >  struct kmem_va_mode kv_physwait = {
    > 
    > 
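
    With Mark's kp_zero suggestion, the u-area allocation boils down to
    asking km_alloc(9) for wired, zeroed pages instead of going through
    uvm_km_kmemalloc_pla(), something along the lines of the sketch below
    (illustrative only, not the committed MI diff mentioned in the P.S.):

            vaddr_t uaddr;

            /* Zeroed, wired pages mapped at any kernel VA; may sleep. */
            uaddr = (vaddr_t)km_alloc(USPACE, &kv_any, &kp_zero, &kd_waitok);
            if (uaddr == 0)
                    return (ENOMEM);        /* allocation can still fail */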
    
    
    