From: Bob Beck <beck@obtuse.com>
Subject: Re: Prevent recursion in bufbackoff() by discarding DMA pages
To: Mark Kettenis <mark.kettenis@xs4all.nl>
Cc: Bob Beck <beck@obtuse.com>, Martin Pieuchot <mpi@grenadille.net>, tech@openbsd.org, beck@openbsd.org
Date: Thu, 3 Oct 2024 09:58:08 -0600

This diff makes a lot more sense..

I have no issue with not waiting for memory, we just shouldn’t. But always throwing it away is kind of silly, (unless we concertedly move to a model where we move away from two regions conscientiously)

As it sits today, I really don’t want “just toss it away” to go in as a hack because we don’t want to deal with the UVM functions, then have someone figure out later they have cache performance issues that make no sense and start complaining. 

The *intention* of a backoff case is to *not* wait to allocate memory to flip, but flip if it *is* available. Which if we want to put the onus on the caller to decide that as mark is suggesting seems fine to me. 

Of course will need testing. 

> On Oct 1, 2024, at 3:52 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote:
> 
>> Date: Mon, 30 Sep 2024 10:00:55 +0200
>> From: Martin Pieuchot <mpi@grenadille.net>
>> 
>> Currently when bufbackoff(), called by the page daemon, ends up reducing
>> the DMA cache it tries to flip every buffer.  Flipping buffers implies
>> allocating new pages which is something we want to avoid in such
>> situation.
> 
> Well, if there is plenty of "high" memory, allocating pages should not
> be a problem.
> 
>> On top of that, once uvm_pagerealloc_multi() failed, bufbackoff() is
>> called again, creating a recursion.
> 
> Yes, that is bad.
> 
>> To prevent these scenarios and speed-up the page daemon I'd like to
>> commit the diff below which always discard DMA buffers when they are
>> being freed from bufbackoff().
>> 
>> The downside of this approach is that we now discard buffer if only
>> DMA-reachable memory is missing.
>> 
>> ok?
> 
> I wonder if we should do something like the diff below instead.  This
> removes the bufbackoff() call from buf_realloc_pages() and gives full
> control of the UVM_PLA_* flags to the callers.
> 
> This makes buf_flip_high() entirely opportunistic.  And buf_flip_dma()
> can still make the buffer cache back off by waking up the pagedeamon.
> 
> Simplifying the code like this fells better than adding more plaster.
> Or am I missing something?  I didn't really test this under memory
> pressure yet...
> 
> P.S. I wonder if we should simplify buf_alloc_pages() in the same way.
> 
> 
> Index: kern/vfs_bio.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/vfs_bio.c,v
> diff -u -p -r1.213 vfs_bio.c
> --- kern/vfs_bio.c 3 Feb 2024 18:51:58 -0000 1.213
> +++ kern/vfs_bio.c 1 Oct 2024 09:34:55 -0000
> @@ -382,7 +382,8 @@ buf_flip_high(struct buf *bp)
> splassert(IPL_BIO);
> 
> /* Attempt to move the buffer to high memory if we can */
> - if (buf_realloc_pages(bp, &high_constraint, UVM_PLA_NOWAIT) == 0) {
> + if (buf_realloc_pages(bp, &high_constraint,
> +    UVM_PLA_NOWAIT | UVM_PLA_NOWAKE) == 0) {
> KASSERT(!ISSET(bp->b_flags, B_DMA));
> bcstats.highflips++;
> ret = 0;
> Index: kern/vfs_biomem.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/vfs_biomem.c,v
> diff -u -p -r1.51 vfs_biomem.c
> --- kern/vfs_biomem.c 24 Oct 2021 00:02:25 -0000 1.51
> +++ kern/vfs_biomem.c 1 Oct 2024 09:34:55 -0000
> @@ -340,20 +340,8 @@ buf_realloc_pages(struct buf *bp, struct
> pmap_update(pmap_kernel());
> }
> 
> - do {
> - r = uvm_pagerealloc_multi(bp->b_pobj, bp->b_poffs,
> -    bp->b_bufsize, UVM_PLA_NOWAIT | UVM_PLA_NOWAKE, where);
> - if (r == 0)
> - break;
> - } while ((bufbackoff(where, atop(bp->b_bufsize)) == 0));
> -
> - /*
> - * bufbackoff() failed, so there's no more we can do without
> - * waiting.  If allowed do, make that attempt.
> - */
> - if (r != 0 && (flags & UVM_PLA_WAITOK))
> - r = uvm_pagerealloc_multi(bp->b_pobj, bp->b_poffs,
> -    bp->b_bufsize, flags, where);
> + r = uvm_pagerealloc_multi(bp->b_pobj, bp->b_poffs,
> +    bp->b_bufsize, flags, where);
> 
> /*
> * If the allocation has succeeded, we may be somewhere different.