From: David Gwynne
Subject: Re: per-CPU page caches for page faults
To: Martin Pieuchot
Cc: Openbsd Tech
Date: Mon, 1 Apr 2024 12:15:17 +1000

> On 1 Apr 2024, at 03:00, Martin Pieuchot wrote:
>
> On 19/03/24(Tue) 15:06, David Gwynne wrote:
>> On Mon, Mar 18, 2024 at 08:13:43PM +0100, Martin Pieuchot wrote:
>>> Diff below attaches a 16 page array to the "struct cpu_info" and
>>> uses it as a cache to reduce contention on the global pmemrange
>>> mutex.
>>>
>>> Measured performance improvements are between 7% and 13% with 16
>>> CPUs and between 19% and 33% with 32 CPUs.  -current OpenBSD doesn't
>>> scale above 32 CPUs, so it wouldn't be fair to compare numbers of
>>> jobs spread across more CPUs.  However, as you can see below, this
>>> limitation is no longer true with this diff.
>>>
>>> kernel
>>> ------
>>> 16:  1m47.93s real    11m24.18s user    10m55.78s system
>>> 32:  2m33.30s real    11m46.08s user    32m32.35s system  (BC cold)
>>>      2m02.36s real    11m55.12s user    21m40.66s system
>>> 64:  2m00.72s real    11m59.59s user    25m47.63s system
>>>
>>> libLLVM
>>> -------
>>> 16: 30m45.54s real   363m25.35s user   150m34.05s system
>>> 32: 24m29.88s real   409m49.80s user   311m02.54s system
>>> 64: 29m22.63s real   404m16.20s user   771m31.26s system
>>> 80: 30m12.49s real   398m07.01s user   816m01.71s system
>>>
>>> kernel+percpucaches(16)
>>> ------
>>> 16:  1m30.17s real    11m19.29s user     6m42.08s system
>>> 32:  2m02.28s real    11m42.13s user    23m42.64s system  (BC cold)
>>>      1m22.82s real    11m41.72s user     8m50.12s system
>>> 64:  1m23.47s real    11m56.99s user     9m42.00s system
>>> 80:  1m24.63s real    11m44.24s user    10m38.00s system
>>>
>>> libLLVM+percpucaches(16)
>>> -------
>>> 16: 28m38.73s real   363m34.69s user    95m45.68s system
>>> 32: 19m57.71s real   415m17.23s user   174m47.83s system
>>> 64: 18m59.50s real   450m17.79s user   406m05.42s system
>>> 80: 19m02.26s real   452m35.11s user   473m09.05s system
>>>
>>> Still, the most important impact of this diff is the reduction of
>>> %sys time: it drops from ~40% with 16 CPUs and from ~55% with 32
>>> CPUs or more.
>>>
>>> What is the idea behind this diff?  With a significant number of
>>> CPUs (16 or more), grabbing a global mutex for every page allocation
>>> & free creates a lot of contention, resulting in many CPU cycles
>>> wasted in system (kernel) time.  The idea of this diff is to add
>>> another layer on top of the global allocator to allocate and free
>>> pages in batch.  Note that, in this diff, this cache is only used
>>> for page faults.
>>>
>>> The number 16 was chosen after careful testing on an 80-CPU Ampere
>>> machine.  I tried to keep it as small as possible while making sure
>>> that multiple parallel page faults on a large number of CPUs do not
>>> result in contention.  I'd argue that "stealing" at most 64k per CPU
>>> is acceptable on any MP system.
>>>
>>> The diff includes 3 new counters visible in "systat uvm" and
>>> "vmstat -s".
>>>
>>> When the page daemon kicks in, we drain the cache of the current
>>> CPU, which is the best we can do without adding too much complexity.
>>>
>>> I only tested amd64 and arm64, that's why there is such a define in
>>> uvm/uvm_page.c.  I'd be happy to hear from tests on other
>>> architectures and different topologies.  You'll need to edit
>>> $arch/include/cpu.h and modify the define.
>>>
>>> This diff is really interesting because it now allows us to clearly
>>> see which syscalls are contending a lot.  Without surprise it's
>>> kbind(2), munmap(2) and mprotect(2).  It also shows which workloads
>>> are VFS-bound.  That is what the "Buffer-Cache Cold" (BC Cold)
>>> numbers represent above.  With a small number of CPUs we don't see
>>> much difference between the two.
>>>
>>> Comments?
>>
>> i like the idea, and i like the improvements.
>>
>> this is basically the same problem that jeff bonwick deals with in
>> his magazines and vmem paper about the changes he made to the solaris
>> slab allocator to make it scale on machines with a bunch of cpus.
>> that's the reference i used when i implemented per cpu caches in
>> pools, and it's probably worth following here as well. the only
>> real change i'd want you to make is to introduce the "previously
>> loaded magazine" to mitigate thrashing as per section 3.1 in the
>> paper.
>>
>> pretty exciting though.
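
to be explicit about the shape i meant there, here's a rough standalone
sketch of the get side with the previously loaded magazine check. every
name in it is made up for illustration; it's not the pool code and not
your diff:

/*
 * sketch of a magazine-backed get path, as per section 3.1 of the
 * bonwick paper. two magazines per cpu: when the loaded magazine is
 * empty but the previous one has objects, swap them instead of going
 * straight back to the depot (the global allocator).
 */

#define MAGSZ	8

struct magazine {
	void	*m_objs[MAGSZ];
	int	 m_nobjs;
};

struct cpu_cache {
	struct magazine	*cc_loaded;	/* magazine allocations come from */
	struct magazine	*cc_prev;	/* previously loaded magazine */
};

void	*depot_alloc(void);		/* stand-in for the global allocator */

void *
cache_get(struct cpu_cache *cc)
{
	struct magazine *m;

	/* fast path: the loaded magazine still holds objects */
	if (cc->cc_loaded->m_nobjs > 0)
		return cc->cc_loaded->m_objs[--cc->cc_loaded->m_nobjs];

	/*
	 * the loaded magazine is empty.  if the previous magazine has
	 * objects, swapping the two satisfies this request and up to
	 * MAGSZ - 1 more without touching the depot.  without this
	 * check, a workload sitting right on a magazine boundary would
	 * hit the depot on every get/put pair.
	 */
	if (cc->cc_prev->m_nobjs > 0) {
		m = cc->cc_loaded;
		cc->cc_loaded = cc->cc_prev;
		cc->cc_prev = m;
		return cc->cc_loaded->m_objs[--cc->cc_loaded->m_nobjs];
	}

	/* both magazines are empty: fall back to the depot */
	return depot_alloc();
}

the put side is the mirror image: frees go into the loaded magazine,
and a full loaded magazine swaps with the previous one before anything
gets pushed back to the depot. your uvm_pmr_cache_get/put below follow
this shape, just with an index instead of two pointers.
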
> New version that should address all previous comments:
>
> - Use 2 magazines of 8 pages and imitate the pool_cache code.  The
>   miss/hit ratio can be observed to be 1/8 with "systat uvm".
>
> - Ensure that uvm_pmr_getpages() won't fail with highly fragmented
>   memory, and do not wake up the pagedaemon if it fails to fully
>   reload a magazine.
>
> - Use __HAVE_UVM_PERCPU & provide UP versions of cache_get/cache_put().
>
> - Change amap_wipeout() to call uvm_anfree() to fill the cache instead
>   of bypassing it by calling uvm_pglistfree().
>
> - Include a fix for incorrect decrementing of `uvm.swpgonly' in
>   uvm_anon_release() (should be committed independently).
>
> I didn't do any measurement with this version, but robert@ said it
> shaves off 30 minutes compared to the previous one for a chromium
> build w/ 32 CPUs (from 4.5h down to 4h).

so a chromium build with your first diff is 4.5h? or a vanilla kernel
is 4.5h?

> Comments? Tests?
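
fwiw the footprint bound from your first mail is unchanged by the
magazine split: two 8 page magazines is still 16 pages per cpu, so with
4k pages that's at most 16 * 4096 = 64k parked per cpu, same as the
array in the first diff.
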
> Index: usr.bin/systat/uvm.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/systat/uvm.c,v
> diff -u -p -r1.6 uvm.c
> --- usr.bin/systat/uvm.c	27 Nov 2022 23:18:54 -0000	1.6
> +++ usr.bin/systat/uvm.c	29 Mar 2024 20:56:32 -0000
> @@ -80,11 +80,10 @@ struct uvmline uvmline[] = {
>  	{ &uvmexp.zeropages, &last_uvmexp.zeropages, "zeropages",
>  	    &uvmexp.pageins, &last_uvmexp.pageins, "pageins",
>  	    &uvmexp.fltrelckok, &last_uvmexp.fltrelckok, "fltrelckok" },
> -	{ &uvmexp.reserve_pagedaemon, &last_uvmexp.reserve_pagedaemon,
> -	    "reserve_pagedaemon",
> +	{ &uvmexp.percpucaches, &last_uvmexp.percpucaches, "percpucaches",
>  	    &uvmexp.pgswapin, &last_uvmexp.pgswapin, "pgswapin",
>  	    &uvmexp.fltanget, &last_uvmexp.fltanget, "fltanget" },
> -	{ &uvmexp.reserve_kernel, &last_uvmexp.reserve_kernel, "reserve_kernel",
> +	{ NULL, NULL, NULL,
>  	    &uvmexp.pgswapout, &last_uvmexp.pgswapout, "pgswapout",
>  	    &uvmexp.fltanretry, &last_uvmexp.fltanretry, "fltanretry" },
>  	{ NULL, NULL, NULL,
> @@ -143,13 +142,13 @@ struct uvmline uvmline[] = {
>  	    NULL, NULL, NULL },
>  	{ &uvmexp.pagesize, &last_uvmexp.pagesize, "pagesize",
>  	    &uvmexp.pdpending, &last_uvmexp.pdpending, "pdpending",
> -	    NULL, NULL, NULL },
> +	    NULL, NULL, "Per-CPU Counters" },
>  	{ &uvmexp.pagemask, &last_uvmexp.pagemask, "pagemask",
>  	    &uvmexp.pddeact, &last_uvmexp.pddeact, "pddeact",
> -	    NULL, NULL, NULL },
> +	    &uvmexp.pcphit, &last_uvmexp.pcphit, "pcphit" },
>  	{ &uvmexp.pageshift, &last_uvmexp.pageshift, "pageshift",
>  	    NULL, NULL, NULL,
> -	    NULL, NULL, NULL }
> +	    &uvmexp.pcpmiss, &last_uvmexp.pcpmiss, "pcpmiss" }
>  };
>  
>  field_def fields_uvm[] = {
> Index: usr.bin/vmstat/vmstat.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/vmstat/vmstat.c,v
> diff -u -p -r1.155 vmstat.c
> --- usr.bin/vmstat/vmstat.c	4 Dec 2022 23:50:50 -0000	1.155
> +++ usr.bin/vmstat/vmstat.c	29 Mar 2024 20:56:32 -0000
> @@ -513,7 +513,12 @@ dosum(void)
>  	    uvmexp.reserve_pagedaemon);
>  	(void)printf("%11u pages reserved for kernel\n",
>  	    uvmexp.reserve_kernel);
> +	(void)printf("%11u pages in per-cpu caches\n",
> +	    uvmexp.percpucaches);
>  
> +	/* per-cpu cache */
> +	(void)printf("%11u per-cpu cache hits\n", uvmexp.pcphit);
> +	(void)printf("%11u per-cpu cache misses\n", uvmexp.pcpmiss);
>  	/* swap */
>  	(void)printf("%11u swap pages\n", uvmexp.swpages);
>  	(void)printf("%11u swap pages in use\n", uvmexp.swpginuse);
> Index: uvm/uvm_amap.c
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_amap.c,v
> diff -u -p -r1.92 uvm_amap.c
> --- sys/uvm/uvm_amap.c	11 Apr 2023 00:45:09 -0000	1.92
> +++ sys/uvm/uvm_amap.c	30 Mar 2024 17:30:10 -0000
> @@ -482,7 +482,6 @@ amap_wipeout(struct vm_amap *amap)
>  	int slot;
>  	struct vm_anon *anon;
>  	struct vm_amap_chunk *chunk;
> -	struct pglist pgl;
>  
>  	KASSERT(rw_write_held(amap->am_lock));
>  	KASSERT(amap->am_ref == 0);
> @@ -495,7 +494,6 @@ amap_wipeout(struct vm_amap *amap)
>  		return;
>  	}
>  
> -	TAILQ_INIT(&pgl);
>  	amap_list_remove(amap);
>  
>  	AMAP_CHUNK_FOREACH(chunk, amap) {
> @@ -515,12 +513,10 @@ amap_wipeout(struct vm_amap *amap)
>  		 */
>  		refs = --anon->an_ref;
>  		if (refs == 0) {
> -			uvm_anfree_list(anon, &pgl);
> +			uvm_anfree(anon);
>  		}
>  	}
> -	/* free the pages */
> -	uvm_pglistfree(&pgl);
>  
>  	/*
>  	 * Finally, destroy the amap.
> Index: sys/uvm/uvm_anon.c
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_anon.c,v
> diff -u -p -r1.57 uvm_anon.c
> --- sys/uvm/uvm_anon.c	27 Oct 2023 19:13:51 -0000	1.57
> +++ sys/uvm/uvm_anon.c	30 Mar 2024 09:21:19 -0000
> @@ -116,7 +116,7 @@ uvm_anfree_list(struct vm_anon *anon, st
>  			uvm_unlock_pageq();	/* free the daemon */
>  		}
>  	} else {
> -		if (anon->an_swslot != 0 && anon->an_swslot != SWSLOT_BAD) {
> +		if (anon->an_swslot > 0) {
>  			/* This page is no longer only in swap. */
>  			KASSERT(uvmexp.swpgonly > 0);
>  			atomic_dec_int(&uvmexp.swpgonly);
> @@ -260,7 +260,8 @@ uvm_anon_release(struct vm_anon *anon)
>  	uvm_unlock_pageq();
>  	KASSERT(anon->an_page == NULL);
>  	lock = anon->an_lock;
> -	uvm_anfree(anon);
> +	uvm_anon_dropswap(anon);
> +	pool_put(&uvm_anon_pool, anon);
>  	rw_exit(lock);
>  	/* Note: extra reference is held for PG_RELEASED case. */
>  	rw_obj_free(lock);
> Index: sys/uvm/uvm_page.c
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_page.c,v
> diff -u -p -r1.174 uvm_page.c
> --- sys/uvm/uvm_page.c	13 Feb 2024 10:16:28 -0000	1.174
> +++ sys/uvm/uvm_page.c	31 Mar 2024 12:16:46 -0000
> @@ -75,6 +75,7 @@
>  #include
>  
>  #include <uvm/uvm.h>
> +#include <uvm/uvm_percpu.h>
>  
>  /*
>   * for object trees
> @@ -120,6 +121,10 @@ static void uvm_pageinsert(struct vm_pag
>  static void	uvm_pageremove(struct vm_page *);
>  int		uvm_page_owner_locked_p(struct vm_page *);
>  
> +struct vm_page	*uvm_pmr_getone(int);
> +struct vm_page	*uvm_pmr_cache_get(int);
> +void		 uvm_pmr_cache_put(struct vm_page *);
> +
>  /*
>   * inline functions
>   */
> @@ -877,13 +882,11 @@ uvm_pagerealloc_multi(struct uvm_object
>   * => only one of obj or anon can be non-null
>   * => caller must activate/deactivate page if it is not wired.
>   */
> -
>  struct vm_page *
>  uvm_pagealloc(struct uvm_object *obj, voff_t off, struct vm_anon *anon,
>      int flags)
>  {
> -	struct vm_page *pg;
> -	struct pglist pgl;
> +	struct vm_page *pg = NULL;
>  	int pmr_flags;
>  
>  	KASSERT(obj == NULL || anon == NULL);
> @@ -906,13 +909,10 @@ uvm_pagealloc(struct uvm_object *obj, vo
>  
>  	if (flags & UVM_PGA_ZERO)
>  		pmr_flags |= UVM_PLA_ZERO;
> -	TAILQ_INIT(&pgl);
> -	if (uvm_pmr_getpages(1, 0, 0, 1, 0, 1, pmr_flags, &pgl) != 0)
> -		goto fail;
> -
> -	pg = TAILQ_FIRST(&pgl);
> -	KASSERT(pg != NULL && TAILQ_NEXT(pg, pageq) == NULL);
>  
> +	pg = uvm_pmr_cache_get(pmr_flags);
> +	if (pg == NULL)
> +		return NULL;
>  	uvm_pagealloc_pg(pg, obj, off, anon);
>  	KASSERT((pg->pg_flags & PG_DEV) == 0);
>  	if (flags & UVM_PGA_ZERO)
> @@ -921,9 +921,6 @@ uvm_pagealloc(struct uvm_object *obj, vo
>  		atomic_setbits_int(&pg->pg_flags, PG_CLEAN);
>  
>  	return pg;
> -
> -fail:
> -	return NULL;
>  }
>  
>  /*
> @@ -1025,7 +1022,7 @@ void
>  uvm_pagefree(struct vm_page *pg)
>  {
>  	uvm_pageclean(pg);
> -	uvm_pmr_freepages(pg, 1);
> +	uvm_pmr_cache_put(pg);
>  }
>  
>  /*
> @@ -1398,3 +1395,153 @@ uvm_pagecount(struct uvm_constraint_rang
>  	}
>  	return sz;
>  }
> +
> +struct vm_page *
> +uvm_pmr_getone(int flags)
> +{
> +	struct vm_page *pg;
> +	struct pglist pgl;
> +
> +	TAILQ_INIT(&pgl);
> +	if (uvm_pmr_getpages(1, 0, 0, 1, 0, 1, flags, &pgl) != 0)
> +		return NULL;
> +
> +	pg = TAILQ_FIRST(&pgl);
> +	KASSERT(pg != NULL && TAILQ_NEXT(pg, pageq) == NULL);
> +
> +	return pg;
> +}
> +
> +#if defined(MULTIPROCESSOR) && defined(__HAVE_UVM_PERCPU)
> +
> +/*
> + * Reload a magazine.
> + */
> +int
> +uvm_pmr_cache_alloc(struct uvm_pmr_cache_item *upci)
> +{
> +	struct vm_page *pg;
> +	struct pglist pgl;
> +	int flags = UVM_PLA_NOWAIT|UVM_PLA_NOWAKE;
> +	int npages = UVM_PMR_CACHEMAGSZ;
> +
> +	KASSERT(upci->upci_npages == 0);
> +
> +	TAILQ_INIT(&pgl);
> +	if (uvm_pmr_getpages(npages, 0, 0, 1, 0, npages, flags, &pgl))
> +		return -1;
> +
> +	while ((pg = TAILQ_FIRST(&pgl)) != NULL) {
> +		TAILQ_REMOVE(&pgl, pg, pageq);
> +		upci->upci_pages[upci->upci_npages] = pg;
> +		upci->upci_npages++;
> +	}
> +	atomic_add_int(&uvmexp.percpucaches, npages);
> +
> +	return 0;
> +}
> +
> +struct vm_page *
> +uvm_pmr_cache_get(int flags)
> +{
> +	struct uvm_pmr_cache *upc = &curcpu()->ci_uvm;
> +	struct uvm_pmr_cache_item *upci;
> +	struct vm_page *pg;
> +
> +	upci = &upc->upc_magz[upc->upc_actv];
> +	if (upci->upci_npages == 0) {
> +		unsigned int prev;
> +
> +		prev = (upc->upc_actv == 0) ? 1 : 0;
> +		upci = &upc->upc_magz[prev];
> +		if (upci->upci_npages == 0) {
> +			atomic_inc_int(&uvmexp.pcpmiss);
> +			if (uvm_pmr_cache_alloc(upci))
> +				return uvm_pmr_getone(flags);
> +		}
> +		/* Swap magazines */
> +		upc->upc_actv = prev;
> +	} else {
> +		atomic_inc_int(&uvmexp.pcphit);
> +	}
> +
> +	atomic_dec_int(&uvmexp.percpucaches);
> +	upci->upci_npages--;
> +	pg = upci->upci_pages[upci->upci_npages];
> +
> +	if (flags & UVM_PLA_ZERO)
> +		uvm_pagezero(pg);
> +
> +	return pg;
> +}
> +
> +void
> +uvm_pmr_cache_free(struct uvm_pmr_cache_item *upci)
> +{
> +	struct pglist pgl;
> +	int i;
> +
> +	TAILQ_INIT(&pgl);
> +	for (i = 0; i < upci->upci_npages; i++)
> +		TAILQ_INSERT_TAIL(&pgl, upci->upci_pages[i], pageq);
> +
> +	uvm_pmr_freepageq(&pgl);
> +
> +	atomic_sub_int(&uvmexp.percpucaches, upci->upci_npages);
> +	upci->upci_npages = 0;
> +	memset(upci->upci_pages, 0, sizeof(upci->upci_pages));
> +}
> +
> +void
> +uvm_pmr_cache_put(struct vm_page *pg)
> +{
> +	struct uvm_pmr_cache *upc = &curcpu()->ci_uvm;
> +	struct uvm_pmr_cache_item *upci;
> +
> +	upci = &upc->upc_magz[upc->upc_actv];
> +	if (upci->upci_npages >= UVM_PMR_CACHEMAGSZ) {
> +		unsigned int prev;
> +
> +		prev = (upc->upc_actv == 0) ? 1 : 0;
> +		upci = &upc->upc_magz[prev];
> +		if (upci->upci_npages > 0)
> +			uvm_pmr_cache_free(upci);
> +
> +		/* Swap magazines */
> +		upc->upc_actv = prev;
> +		KASSERT(upci->upci_npages == 0);
> +	}
> +
> +	upci->upci_pages[upci->upci_npages] = pg;
> +	upci->upci_npages++;
> +	atomic_inc_int(&uvmexp.percpucaches);
> +}
> +
> +void
> +uvm_pmr_cache_drain(void)
> +{
> +	struct uvm_pmr_cache *upc = &curcpu()->ci_uvm;
> +
> +	uvm_pmr_cache_free(&upc->upc_magz[0]);
> +	uvm_pmr_cache_free(&upc->upc_magz[1]);
> +}
> +
> +#else /* !(MULTIPROCESSOR && __HAVE_UVM_PERCPU) */
> +
> +struct vm_page *
> +uvm_pmr_cache_get(int flags)
> +{
> +	return uvm_pmr_getone(flags);
> +}
> +
> +void
> +uvm_pmr_cache_put(struct vm_page *pg)
> +{
> +	uvm_pmr_freepages(pg, 1);
> +}
> +
> +void
> +uvm_pmr_cache_drain(void)
> +{
> +}
> +#endif /* MULTIPROCESSOR */
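
one caveat worth spelling out for testers: if i'm reading
uvm_pmr_cache_drain() right, it only empties the magazines of the cpu
the pagedaemon happens to be running on, so pages parked in other cpus'
magazines (up to 64k each) stay put until those cpus fault or free
again. that's the simplicity trade-off you mention above, just making
it explicit.
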
> Index: sys/uvm/uvm_pdaemon.c
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_pdaemon.c,v
> diff -u -p -r1.110 uvm_pdaemon.c
> --- sys/uvm/uvm_pdaemon.c	24 Mar 2024 10:29:35 -0000	1.110
> +++ sys/uvm/uvm_pdaemon.c	30 Mar 2024 12:53:39 -0000
> @@ -80,6 +80,7 @@
>  #endif
>  
>  #include <uvm/uvm.h>
> +#include <uvm/uvm_percpu.h>
>  
>  #include "drm.h"
>  
> @@ -262,6 +263,8 @@ uvm_pageout(void *arg)
>  #if NDRM > 0
>  		drmbackoff(size * 2);
>  #endif
> +		uvm_pmr_cache_drain();
> +
>  		/*
>  		 * scan if needed
>  		 */
> Index: sys/uvm/uvm_percpu.h
> ===================================================================
> RCS file: sys/uvm/uvm_percpu.h
> diff -N sys/uvm/uvm_percpu.h
> --- /dev/null	1 Jan 1970 00:00:00 -0000
> +++ sys/uvm/uvm_percpu.h	30 Mar 2024 12:54:47 -0000
> @@ -0,0 +1,45 @@
> +/*	$OpenBSD$	*/
> +
> +/*
> + * Copyright (c) 2024 Martin Pieuchot
> + *
> + * Permission to use, copy, modify, and distribute this software for any
> + * purpose with or without fee is hereby granted, provided that the above
> + * copyright notice and this permission notice appear in all copies.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
> + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
> + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
> + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
> + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
> + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
> + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
> + */
> +
> +#ifndef _UVM_UVM_PCPU_H_
> +#define _UVM_UVM_PCPU_H_
> +
> +/*
> + * We want the per-CPU cache size to be as small as possible, while at
> + * the same time getting rid of the `uvm_lock_fpageq' contention.
> + */
> +#define UVM_PMR_CACHEMAGSZ	8	/* # of pages in a magazine */
> +
> +struct vm_page;
> +
> +/* Magazine */
> +struct uvm_pmr_cache_item {
> +	struct vm_page	*upci_pages[UVM_PMR_CACHEMAGSZ];
> +	int		 upci_npages;	/* # of pages in magazine */
> +};
> +
> +/* Per-CPU cache */
> +struct uvm_pmr_cache {
> +	struct uvm_pmr_cache_item upc_magz[2];	/* magazines */
> +	int		 upc_actv;	/* index of active magazine */
> +
> +};
> +
> +void	uvm_pmr_cache_drain(void);
> +
> +#endif /* _UVM_UVM_PCPU_H_ */
> Index: sys/uvm/uvmexp.h
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvmexp.h,v
> diff -u -p -r1.12 uvmexp.h
> --- sys/uvm/uvmexp.h	24 Mar 2024 10:29:35 -0000	1.12
> +++ sys/uvm/uvmexp.h	29 Mar 2024 21:04:16 -0000
> @@ -66,7 +66,7 @@ struct uvmexp {
>  	int zeropages;		/* [F] number of zero'd pages */
>  	int reserve_pagedaemon; /* [I] # of pages reserved for pagedaemon */
>  	int reserve_kernel;	/* [I] # of pages reserved for kernel */
> -	int unused01;		/* formerly anonpages */
> +	int percpucaches;	/* [a] # of pages in per-CPU caches */
>  	int vnodepages;		/* XXX # of pages used by vnode page cache */
>  	int vtextpages;		/* XXX # of pages used by vtext vnodes */
>  
> @@ -101,8 +101,8 @@ struct uvmexp {
>  	int syscalls;	/* system calls */
>  	int pageins;	/* [p] pagein operation count */
>  			/* pageouts are in pdpageouts below */
> -	int unused07;	/* formerly obsolete_swapins */
> -	int unused08;	/* formerly obsolete_swapouts */
> +	int pcphit;	/* [a] # of pagealloc from per-CPU cache */
> +	int pcpmiss;	/* [a] # of times a per-CPU cache was empty */
>  	int pgswapin;	/* pages swapped in */
>  	int pgswapout;	/* pages swapped out */
>  	int forks;	/* forks */
> Index: sys/arch/amd64/include/cpu.h
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/include/cpu.h,v
> diff -u -p -r1.163 cpu.h
> --- sys/arch/amd64/include/cpu.h	25 Feb 2024 19:15:50 -0000	1.163
> +++ sys/arch/amd64/include/cpu.h	30 Mar 2024 12:55:27 -0000
> @@ -53,6 +53,7 @@
>  #include
>  #include
>  #include
> +#include <uvm/uvm_percpu.h>
>  
>  #ifdef _KERNEL
>  
> @@ -201,6 +202,8 @@ struct cpu_info {
>  
>  #ifdef MULTIPROCESSOR
>  	struct srp_hazard ci_srp_hazards[SRP_HAZARD_NUM];
> +#define __HAVE_UVM_PERCPU
> +	struct uvm_pmr_cache ci_uvm;	/* [o] page cache */
>  #endif
>  
>  	struct ksensordev ci_sensordev;
> Index: sys/arch/arm64/include/cpu.h
> ===================================================================
> RCS file: /cvs/src/sys/arch/arm64/include/cpu.h,v
> diff -u -p -r1.43 cpu.h
> --- sys/arch/arm64/include/cpu.h	25 Feb 2024 19:15:50 -0000	1.43
> +++ sys/arch/arm64/include/cpu.h	30 Mar 2024 12:55:55 -0000
> @@ -108,6 +108,7 @@ void arm32_vector_init(vaddr_t, int);
>  #include
>  #include
>  #include
> +#include <uvm/uvm_percpu.h>
>  
>  struct cpu_info {
>  	struct device *ci_dev;		/* Device corresponding to this CPU */
> @@ -161,6 +162,8 @@ struct cpu_info {
>  
>  #ifdef MULTIPROCESSOR
>  	struct srp_hazard ci_srp_hazards[SRP_HAZARD_NUM];
> +#define __HAVE_UVM_PERCPU
> +	struct uvm_pmr_cache ci_uvm;
>  	volatile int ci_flags;
>  
>  	volatile int ci_ddb_paused;
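
re tests on other archs: the two cpu.h hunks above are the whole
per-arch pattern, so for anyone wanting to try, it should look
something like this in $arch/include/cpu.h (hypothetical and untested,
cribbed from the amd64 hunk):

#include <uvm/uvm_percpu.h>

with the other includes at the top, and then inside the MULTIPROCESSOR
section of struct cpu_info:

#define __HAVE_UVM_PERCPU
	struct uvm_pmr_cache ci_uvm;	/* [o] page cache */
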