From: Martin Pieuchot
Subject: Re: uvm_fault_upper_lookup: delay pmap_extract()
To: Mark Kettenis
Cc: tech@openbsd.org, tb@openbsd.org
Date: Thu, 28 Nov 2024 17:18:36 +0100

On 28/11/24(Thu) 11:33, Mark Kettenis wrote:
> > Date: Thu, 28 Nov 2024 10:01:15 +0100
> > From: Martin Pieuchot
> > 
> > uvm_fault_upper_lookup() enters existing neighbor pages found in the
> > amap.
> > 
> > The diff below reorders the checks in the loop to avoid calling the
> > expensive pmap_extract(9) when there is no corresponding anon or when
> > the page is PG_RELEASED|PG_BUSY and won't be entered anyway.
> 
> Out of curiosity, do you know what part of pmap_extract() is
> expensive?  Is it the locking?  Or is it the PTE (descriptor) lookup?

First of all, it isn't the current spinning bottleneck, which is the
pageqlock.  (I plan to tackle that soon, now that the page daemon is in
a testable shape.)

pmap_extract() accounts for ~3.3% and ~3.8% of the total CPU time spent
in the fault handler when building libLLVM with make -j24 on my i9
machine (hot and cold numbers respectively).  Almost all of this time
(>80%) is spent in the mutex.  That is what makes it expensive compared
to a pointer or flag comparison.

With my diff these numbers drop to ~3.1% and ~3.6% (hot and cold
respectively), and the time spent in pmap_enter(9) also decreases from
~0.5%.  Overall I see a ~1% improvement with the two changes.

Are you ok with the diff?

> > Index: uvm/uvm_fault.c
> > ===================================================================
> > RCS file: /cvs/src/sys/uvm/uvm_fault.c,v
> > diff -u -p -r1.146 uvm_fault.c
> > --- uvm/uvm_fault.c	27 Nov 2024 10:58:07 -0000	1.146
> > +++ uvm/uvm_fault.c	28 Nov 2024 08:35:10 -0000
> > @@ -841,10 +841,11 @@ uvm_fault_upper_lookup(struct uvm_faulti
> >  {
> >  	struct vm_amap *amap = ufi->entry->aref.ar_amap;
> >  	struct vm_anon *anon;
> > +	struct vm_page *pg;
> >  	boolean_t shadowed;
> >  	vaddr_t currva;
> >  	paddr_t pa;
> > -	int lcv;
> > +	int lcv, entered = 0;
> > 
> >  	/* locked: maps(read), amap(if there) */
> >  	KASSERT(amap == NULL ||
> > @@ -859,16 +860,6 @@ uvm_fault_upper_lookup(struct uvm_faulti
> >  	shadowed = FALSE;
> >  	for (lcv = 0; lcv < flt->npages; lcv++, currva += PAGE_SIZE) {
> >  		/*
> > -		 * dont play with VAs that are already mapped
> > -		 * except for center)
> > -		 */
> > -		if (lcv != flt->centeridx &&
> > -		    pmap_extract(ufi->orig_map->pmap, currva, &pa)) {
> > -			pages[lcv] = PGO_DONTCARE;
> > -			continue;
> > -		}
> > -
> > -		/*
> >  		 * unmapped or center page. check if any anon at this level.
> >  		 */
> >  		if (amap == NULL || anons[lcv] == NULL) {
> > @@ -884,12 +875,20 @@ uvm_fault_upper_lookup(struct uvm_faulti
> >  			shadowed = TRUE;
> >  			continue;
> >  		}
> > +
> >  		anon = anons[lcv];
> > +		pg = anon->an_page;
> > +
> >  		KASSERT(anon->an_lock == amap->am_lock);
> > -		if (anon->an_page &&
> > -		    (anon->an_page->pg_flags & (PG_RELEASED|PG_BUSY)) == 0) {
> > +
> > +		/*
> > +		 * ignore busy pages.
> > +		 * don't play with VAs that are already mapped.
> > +		 */
> > +		if (pg && (pg->pg_flags & (PG_RELEASED|PG_BUSY)) == 0 &&
> > +		    !pmap_extract(ufi->orig_map->pmap, currva, &pa)) {
> >  			uvm_lock_pageq();
> > -			uvm_pageactivate(anon->an_page);	/* reactivate */
> > +			uvm_pageactivate(pg);	/* reactivate */
> >  			uvm_unlock_pageq();
> >  			counters_inc(uvmexp_counters, flt_namap);
> > 
> > @@ -902,13 +901,14 @@ uvm_fault_upper_lookup(struct uvm_faulti
> >  		 * that we enter these right now.
> >  		 */
> >  		(void) pmap_enter(ufi->orig_map->pmap, currva,
> > -		    VM_PAGE_TO_PHYS(anon->an_page) | flt->pa_flags,
> > +		    VM_PAGE_TO_PHYS(pg) | flt->pa_flags,
> >  		    (anon->an_ref > 1) ?
> >  		    (flt->enter_prot & ~PROT_WRITE) : flt->enter_prot,
> >  		    PMAP_CANFAIL);
> > +		entered++;
> >  		}
> >  	}
> > -	if (flt->npages > 1)
> > +	if (entered > 0)
> >  		pmap_update(ufi->orig_map->pmap);
> > 
> >  	return shadowed;
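
The gain comes from C's left-to-right short-circuit evaluation: after the
reorder, the NULL-pointer and flag tests, which cost only a few
instructions, run before the mutex-protected pmap_extract(9) lookup, so
anon-less or busy pages never pay for the lock.  Below is a minimal
userland sketch of the same pattern; the names (mapping_exists,
maybe_enter, struct page) are hypothetical stand-ins for the kernel
structures and calls, not the real UVM API.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel structures and flags. */
struct page { int flags; };
#define PG_BUSY		0x01
#define PG_RELEASED	0x02

/*
 * Stand-in for pmap_extract(9): imagine it takes a mutex before
 * walking the page tables, which is what makes it costly.
 */
static bool
mapping_exists(unsigned long va)
{
	printf("expensive lookup for va %#lx\n", va);
	return false;
}

static void
maybe_enter(struct page *pg, unsigned long va)
{
	/*
	 * Cheap tests first: a pointer comparison and a flag word
	 * test cost a couple of instructions.  Because && evaluates
	 * left to right and stops at the first false operand,
	 * mapping_exists() only runs when the page is present and
	 * not busy/released, mirroring the reordered check above.
	 */
	if (pg != NULL &&
	    (pg->flags & (PG_RELEASED | PG_BUSY)) == 0 &&
	    !mapping_exists(va))
		printf("entering page at va %#lx\n", va);
}

int
main(void)
{
	struct page busy = { PG_BUSY };
	struct page idle = { 0 };

	maybe_enter(NULL, 0x1000);	/* no anon page: no lookup */
	maybe_enter(&busy, 0x2000);	/* busy page: no lookup */
	maybe_enter(&idle, 0x3000);	/* only this one pays for the lookup */
	return 0;
}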