From: Anton Lindqvist
Subject: Re: km_alloc(9), UAREA and kv_pageable
To: Mark Kettenis
Cc: mpi@grenadille.net, miod@openbsd.org, tech@openbsd.org
Date: Thu, 24 Oct 2024 06:01:52 +0200

On Tue, Oct 22, 2024 at 11:15:52PM +0200, Mark Kettenis wrote:
> > Date: Sun, 20 Oct 2024 21:42:02 +0200
> > From: Mark Kettenis
> >
> > > Date: Sun, 20 Oct 2024 19:47:23 +0200
> > > From: Martin Pieuchot
> > >
> > > On 20/10/24(Sun) 14:36, Mark Kettenis wrote:
> > > > > Date: Sun, 20 Oct 2024 11:37:57 +0200
> > > > > From: Martin Pieuchot
> > > >
> > > > Sorry, I didn't have time yet to investigate.  But I'd rather not
> > > > change the km_alloc() API if it is not necessary, especially when
> > > > it introduces a somewhat vague concept like "managed" that doesn't
> > > > even make sense in the context where it gets used.
> > > >
> > > > > On 10/10/24(Thu) 10:40, Martin Pieuchot wrote:
> > > > > > On 09/10/24(Wed) 13:39, Mark Kettenis wrote:
> > > > > > > > Date: Wed, 9 Oct 2024 13:16:25 +0200
> > > > > > > > From: Martin Pieuchot
> > > > > > > >
> > > > > > > > On 02/10/24(Wed) 17:28, Martin Pieuchot wrote:
> > > > > > > > > It is currently impossible to use km_alloc(9) to get a
> > > > > > > > > managed mapping backed with physical pages allocated up
> > > > > > > > > front.  I guess this is why uvm_km_kmemalloc_pla() is
> > > > > > > > > still used to allocate the UAREA.
> > > > > > > > >
> > > > > > > > > To fix that, I suggest the diff below which turns the
> > > > > > > > > "kp_pageable" flag of km_alloc(9) into a "kp_managed".
> > > > > > > > > This new flag no longer implies "kp_nomem", so I updated
> > > > > > > > > the description and fixed the remaining XXX to preserve
> > > > > > > > > the existing behavior of the `kp_pageable' global.
> > > > > > > > >
> > > > > > > > > With this change I believe we could also get rid of
> > > > > > > > > uvm_km_zalloc() used in the i386 pmap.
> > > > > > > > >
> > > > > > > > > Comments?  Oks?
> > > > > > > >
> > > > > > > > Anyone?
> > > > > > >
> > > > > > > I don't really understand why this needs to be "managed"
> > > > > > > memory.  We stopped paging out the uarea ages ago, and I
> > > > > > > wonder if all this complexity is just from the time when we
> > > > > > > did.  Can't this just be kp_zero memory?
> > > > > >
> > > > > > On amd64, using kp_zero results in the kernel faulting in a
> > > > > > loop from which it cannot recover.  Could you please look at
> > > > > > it and tell me if there is a bug somewhere else?
> > > >
> > > > Since the memory is no longer "managed", setguardpage() needs to
> > > > use pmap_kremove() instead of pmap_remove().
> > >
> > > Nice find!
> > >
> > > > The following diff works for me on both amd64 and arm64.
> > >
> > > Lovely, should we test this on more architectures?  Or are you
> > > confident with it as it is?
> >
> > Looks like amd64 is the only architecture that does that guardpage
> > thing.  So I think this is fine.
>
> However, I think this diff (that is committed now) leaks a page for
> each proc that we create.  This didn't happen with the old code
> because the kernel object kept track of the pages that we removed
> using pmap_remove(), and those pages would be freed when the kernel
> stack was unmapped using uvm_unmap().
>
> This no longer happens when we use km_alloc(), so we have to free
> that page ourselves.
>
> ok?

I have experienced my amd64 regress machine becoming unstable after
this went in.
Programs die seemingly at random with this type of failure:

  ld.so: dev-limit: free(): error: chunk canary corrupted

The proposed fix solves the issue; ok anton@ fwiw.

> P.S. I have a diff that moves this code to MI code that I'm currently
> testing.  But this fix should go in first.
>
> Index: arch/amd64/amd64/vm_machdep.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/amd64/vm_machdep.c,v
> diff -u -p -r1.48 vm_machdep.c
> --- arch/amd64/amd64/vm_machdep.c	21 Oct 2024 18:27:34 -0000	1.48
> +++ arch/amd64/amd64/vm_machdep.c	22 Oct 2024 21:05:16 -0000
> @@ -135,8 +135,15 @@ cpu_exit(struct proc *p)
>  void
>  setguardpage(struct proc *p)
>  {
> +	struct vm_page *pg = NULL;
> +	paddr_t pa;
> +
> +	if (pmap_extract(pmap_kernel(), (vaddr_t)p->p_addr + PAGE_SIZE, &pa))
> +		pg = PHYS_TO_VM_PAGE(pa);
>  	pmap_kremove((vaddr_t)p->p_addr + PAGE_SIZE, PAGE_SIZE);
>  	pmap_update(pmap_kernel());
> +	if (pg)
> +		uvm_pagefree(pg);
>  }
>
>  struct kmem_va_mode kv_physwait = {
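
For anyone skimming the thread, the ownership rule behind the fix, as a
minimal standalone sketch rather than code from the tree (the helper
name and the exact header set are mine; the logic mirrors the
setguardpage() diff above):

/*
 * Pages entered with pmap_kenter_pa() are "unmanaged": no UVM object
 * tracks them, so pmap_kremove() only tears down the translation and
 * the physical page would otherwise be leaked.  The caller has to
 * look the page up while the mapping still exists and free it.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <uvm/uvm_extern.h>
#include <uvm/uvm_page.h>

void
unmap_unmanaged_page(vaddr_t va)	/* hypothetical helper */
{
	struct vm_page *pg = NULL;
	paddr_t pa;

	/* Recover the physical page before the translation goes away. */
	if (pmap_extract(pmap_kernel(), va, &pa))
		pg = PHYS_TO_VM_PAGE(pa);

	/* Unmanaged teardown: UVM will not free the page for us. */
	pmap_kremove(va, PAGE_SIZE);
	pmap_update(pmap_kernel());

	/* Hand the page back; skipping this leaks one page per call. */
	if (pg != NULL)
		uvm_pagefree(pg);
}

By contrast, a managed mapping torn down with pmap_remove() leaves the
page tracked by its UVM object, which is why the old uvm_km-based code
never leaked: uvm_unmap() freed the page later.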