From: David Gwynne <david@gwynne.id.au>
Subject: sparc64 bus_dmamap_load_mbuf() improvements
To: tech@openbsd.org
Date: Sat, 7 Sep 2024 10:44:57 +1000

i've been kicking this around a bit with ix(4) and jumbos.

ix(4) uses the intel mbuf cluster allocator for rx descriptors, which
are basically 2k buffers that are offset slightly. this means some of
these buffers straddle pages, which hits a couple of edge conditions
in the sparc64 bus_dmamap_load_mbuf code.

the first edge condition is that the code only merges memory on
contiguous physical pages into a single segment if the merged length
is strictly less than the maximum segment size, so a merge that lands
exactly on the maximum is refused. if ix(4) is configured so the max
seg size for rx descriptors is 2k, and that 2k straddles pages, then
_bus_dmamap_load_mbuf should merge them, but current doesn't.
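
to make that first case concrete, here is a minimal standalone sketch
of the two merge tests side by side. struct seg is a cut-down stand-in
for bus_dma_segment_t, and can_merge_old()/can_merge_new() are names
made up for illustration, they don't exist in the tree. only the
comparisons are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* hypothetical stand-in: just the two fields the diff touches */
struct seg {
	unsigned long	ds_addr;
	long		ds_len;
};

/* current test: the merged length must be strictly below maxsegsz */
static bool
can_merge_old(const struct seg *prev, unsigned long pa, long incr,
    long maxsegsz)
{
	assert(maxsegsz > 0);
	return (pa == prev->ds_addr + prev->ds_len &&
	    prev->ds_len + incr < maxsegsz);
}

/* test from the diff: a merged length equal to maxsegsz is accepted */
static bool
can_merge_new(const struct seg *prev, unsigned long pa, long incr,
    long maxsegsz)
{
	assert(maxsegsz > 0);
	return (pa == prev->ds_addr + prev->ds_len &&
	    prev->ds_len + incr <= maxsegsz);
}
```

with a 1k piece at 0x1c00 followed by a contiguous 1k piece at 0x2000
and a 2k maximum, the old test refuses the merge and the new one
accepts it.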

the second tweak is around how the code splits segments at the
maximum segment size. at the moment the code won't merge contig pages
into a single segment if the new segment would be larger than the
maximum segment size. this diff makes it so it merges and then splits.

if you have a dmamap that supports a maximum of 2 2k segments, and
you give it a 4k buffer that straddles pages, the current code won't
cope with it because it will want to use 3 segments cos of how the
memory is laid out. by merging the pages and then splitting, it
should result in a valid dmamap.
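
a minimal standalone sketch of the merge-then-split step, for anyone
who wants to play with the arithmetic outside the kernel. the inner
block mirrors the diff; everything around it is an assumption: struct
seg and struct chunk are cut-down stand-ins, load_chunks() is a made-up
name, and the physical pieces are handed in directly instead of being
derived from the mbuf's virtual addresses.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical stand-in for bus_dma_segment_t */
struct seg {
	unsigned long	ds_addr;
	long		ds_len;
};

/* hypothetical: one physically contiguous piece within a single page */
struct chunk {
	unsigned long	pa;
	long		len;
};

/* returns the number of segments used, or -1 if nsegs is not enough */
static int
load_chunks(const struct chunk *c, size_t nchunks, struct seg *segs,
    int nsegs, long maxsegsz)
{
	size_t n;
	int i = 0;

	assert(maxsegsz > 0);

	for (n = 0; n < nchunks; n++) {
		unsigned long pa = c[n].pa;
		long incr = c[n].len;

		if (i > 0) {
			struct seg *pseg = &segs[i - 1];
			if (pa == pseg->ds_addr + pseg->ds_len) {
				long nlen = pseg->ds_len + incr;
				if (nlen <= maxsegsz) {
					/* fits: grow the previous segment */
					pseg->ds_len = nlen;
					continue;
				}
				/* top up the previous segment ... */
				pseg->ds_len = maxsegsz;
				/* ... and carry the remainder forward */
				pa = pseg->ds_addr + maxsegsz;
				incr = nlen - maxsegsz;
			}
		}
		if (i == nsegs)
			return (-1);
		segs[i].ds_addr = pa;
		segs[i].ds_len = incr;
		i++;
	}
	return (i);
}
```

feeding it a 4k buffer laid out as 1k at 0x1c00 plus 3k at 0x2000, with
two segments of at most 2k available, gives two 2k segments starting at
0x1c00 and 0x2400.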

i think we've gotten away without these tweaks because a lot of our
drivers create rx dmamaps that overprovision the mappings they support,
particularly the ones commonly used on sparc64.

ok?

Index: machdep.c
===================================================================
RCS file: /cvs/src/sys/arch/sparc64/sparc64/machdep.c,v
diff -u -p -r1.218 machdep.c
--- machdep.c	22 May 2024 05:51:49 -0000	1.218
+++ machdep.c	7 Sep 2024 00:36:21 -0000
@@ -991,12 +991,20 @@ _bus_dmamap_load_mbuf(bus_dma_tag_t t, b
 			buflen -= incr;
 			vaddr += incr;
 
-			if (i > 0 && pa == (segs[i - 1].ds_addr +
-			    segs[i - 1].ds_len) && ((segs[i - 1].ds_len + incr)
-			    < map->_dm_maxsegsz)) {
-				/* Hey, waddyaknow, they're contiguous */
-				segs[i - 1].ds_len += incr;
-				continue;
+			if (i > 0) {
+				bus_dma_segment_t *pseg = &segs[i - 1];
+				if (pa == pseg->ds_addr + pseg->ds_len) {
+					/* waddyaknow, they're contiguous */
+					long nlen = pseg->ds_len + incr;
+					if (nlen <= map->_dm_maxsegsz) {
+						pseg->ds_len = nlen;
+						continue;
+					}
+					pseg->ds_len = map->_dm_maxsegsz;
+
+					pa = pseg->ds_addr + map->_dm_maxsegsz;
+					incr = nlen - map->_dm_maxsegsz;
+				}
 			}
 			segs[i].ds_addr = pa;
 			segs[i].ds_len = incr;