From: Mark Kettenis Subject: Re: Document vm_page locking To: Martin Pieuchot Cc: tech@openbsd.org Date: Wed, 19 Feb 2025 14:26:48 +0100 > Date: Tue, 18 Feb 2025 12:08:37 +0100 > From: Martin Pieuchot > > On 17/02/25(Mon) 19:40, Mark Kettenis wrote: > > > Date: Mon, 17 Feb 2025 16:45:48 +0100 > > > From: Martin Pieuchot > > > > > > Diff below documents which fields from `vm_page' are protected by which > > > lock/mechanism. > > > > > > I left out fields used by pmaps and `pg_version' which is only for debug > > > purpose and not needed for what I need to achieve: reduce the contention > > > on the `pageqlock'. > > > > > > ok? > > > > Why are you changing some of the types from u_int -> uint32_t? > > In general I do such change because uint32_t is C99 and explicitly > encode its size. Well, u_int is just our shorthand for unsigned int and that's C99 too. IMHO, encoding the size only really makes sense if the size of the type is actually relevant for the code. I think that is somewhat questionable for counters like pg_version and wire_count. Maybe a bit more relevant for flags like pg_flags where you need to make sure you have enough bits to store all the flag bits. However atomic_setbits_int(9) and atomic_clearbits_int(9) that we use to manipulate the flags use "unsigned int *", so uint32_t simply is the wrong type and it would be inconsistent with the rest of the kernel. > Here it is to reduce the diff with NetBSD which use such type, has a > different order of fields and some comments about in which cache line > on LP64 they fit in. > > Such pieces of information are helping me investigate what can be done > to improve the performances of the VM. > > > > Index: uvm/uvm_page.h > > > =================================================================== > > > RCS file: /cvs/src/sys/uvm/uvm_page.h,v > > > diff -u -p -r1.71 uvm_page.h > > > --- uvm/uvm_page.h 13 May 2024 01:15:53 -0000 1.71 > > > +++ uvm/uvm_page.h 2 Feb 2025 19:06:19 -0000 > > > @@ -85,28 +85,31 @@ > > > * and offset to which this page belongs (for pageout), > > > * and sundry status bits. > > > * > > > - * Fields in this structure are possibly locked by the lock on the page > > > - * queues (P). > > > + * Locks used to protect struct members in this file: > > > + * I immutable after creation > > > + * a atomic operations > > > + * Q uvm.pageqlock > > > + * F uvm.fpageqlock > > > + * o owner lock (uobject->vmobjlock or uanon->an_lock) > > > */ > > > > > > TAILQ_HEAD(pglist, vm_page); > > > > > > struct vm_page { > > > - TAILQ_ENTRY(vm_page) pageq; /* queue info for FIFO > > > - * queue or free list (P) */ > > > - RBT_ENTRY(vm_page) objt; /* object tree */ > > > + TAILQ_ENTRY(vm_page) pageq; /* [Q] LRU or free page queue */ > > > + RBT_ENTRY(vm_page) objt; /* [o] object tree */ > > > > > > - struct vm_anon *uanon; /* anon (P) */ > > > - struct uvm_object *uobject; /* object (P) */ > > > - voff_t offset; /* offset into object (P) */ > > > + struct vm_anon *uanon; /* [o] anon */ > > > + struct uvm_object *uobject; /* [o] object */ > > > + voff_t offset; /* [o] offset into object */ > > > > > > - u_int pg_flags; /* object flags [P] */ > > > + uint32_t pg_flags; /* [a] object flags */ > > > > > > - u_int pg_version; /* version count */ > > > - u_int wire_count; /* wired down map refs [P] */ > > > + uint32_t pg_version; /* version count */ > > > + uint32_t wire_count; /* [o] wired down map refs */ > > > > > > - paddr_t phys_addr; /* physical address of page */ > > > - psize_t fpgsz; /* free page range size */ > > > + paddr_t phys_addr; /* [I] physical address */ > > > + psize_t fpgsz; /* [F] free page range size */ > > > > > > struct vm_page_md mdpage; /* pmap-specific data */ > > > > > > @@ -287,9 +290,9 @@ int vm_physseg_find(paddr_t, int *); > > > > > > #define UVM_PAGEZERO_TARGET (uvmexp.free / 8) > > > > > > -#define VM_PAGE_TO_PHYS(entry) ((entry)->phys_addr) > > > +#define VM_PAGE_TO_PHYS(pg) ((pg)->phys_addr) > > > > > > -#define VM_PAGE_IS_FREE(entry) ((entry)->pg_flags & PQ_FREE) > > > +#define VM_PAGE_IS_FREE(pg) ((pg)->pg_flags & PQ_FREE) > > > > > > #define PADDR_IS_DMA_REACHABLE(paddr) \ > > > (dma_constraint.ucr_low <= paddr && dma_constraint.ucr_high > paddr) > > > > > > > > > > > >