From: David Gwynne
Subject: sparc64 bus_dmamap_load_mbuf() improvements
To: tech@openbsd.org
Date: Sat, 7 Sep 2024 10:44:57 +1000

i've been kicking this around a bit with ix(4) and jumbos.

ix(4) uses the intel mbuf cluster allocator for rx descriptors, which
are basically 2k buffers that are offset slightly. this means some of
these buffers straddle pages, which hits a couple of edge conditions
in the sparc64 bus_dmamap_load_mbuf code.

the first edge condition is where the code fails to merge memory on
contiguous physical pages into a single segment unless it's less than
the maximum segment size. if ix(4) is configured so the max seg size
for rx descriptors is 2k, and that 2k straddles pages, then
_bus_dmamap_load_mbuf should merge them, but current doesn't.

the second tweak is around how the code splits segments around the
maximum segment size. at the moment the code won't merge contig pages
into a single segment if the new segment will be larger than the
segment size. this diff makes it so it merges and then splits. if you
have a dmamap that supports a maximum of 2 2k segments, and you give
it a 4k buffer that straddles pages, the current code won't cope with
it because it will want to use 3 segments cos of how the memory is
laid out. by merging the pages and then splitting, it should result
in a valid dmamap.

i think we've gotten away without these tweaks because a lot of our
drivers create rx dmamaps that overprovision the mappings they
support, particularly the ones commonly used on sparc64.

ok?

Index: machdep.c
===================================================================
RCS file: /cvs/src/sys/arch/sparc64/sparc64/machdep.c,v
diff -u -p -r1.218 machdep.c
--- machdep.c	22 May 2024 05:51:49 -0000	1.218
+++ machdep.c	7 Sep 2024 00:36:21 -0000
@@ -991,12 +991,20 @@ _bus_dmamap_load_mbuf(bus_dma_tag_t t, b
 			buflen -= incr;
 			vaddr += incr;
 
-			if (i > 0 && pa == (segs[i - 1].ds_addr +
-			    segs[i - 1].ds_len) && ((segs[i - 1].ds_len + incr)
-			    < map->_dm_maxsegsz)) {
-				/* Hey, waddyaknow, they're contiguous */
-				segs[i - 1].ds_len += incr;
-				continue;
+			if (i > 0) {
+				bus_dma_segment_t *pseg = &segs[i - 1];
+				if (pa == pseg->ds_addr + pseg->ds_len) {
+					/* waddyaknow, they're contiguous */
+					long nlen = pseg->ds_len + incr;
+					if (nlen <= map->_dm_maxsegsz) {
+						pseg->ds_len = nlen;
+						continue;
+					}
+					pseg->ds_len = map->_dm_maxsegsz;
+
+					pa = pseg->ds_addr + map->_dm_maxsegsz;
+					incr = nlen - map->_dm_maxsegsz;
+				}
 			}
 			segs[i].ds_addr = pa;
 			segs[i].ds_len = incr;
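
for anyone who wants to poke at the arithmetic outside the kernel, here
is a rough userland sketch of the merge-then-split step from the diff
above. it is not the kernel code: struct seg and seg_append() are
made-up names standing in for bus_dma_segment_t and the loop body, and
the nsegs_max check is an assumption added so the sketch can report
running out of segments. it also assumes the leftover after a split
fits in one segment.

```c
#include <assert.h>

/* hypothetical stand-in for bus_dma_segment_t, not the kernel type */
struct seg {
	unsigned long	addr;	/* physical address of the segment */
	long		len;	/* length of the segment in bytes */
};

/*
 * append the physical chunk [pa, pa + incr) to segs[], where i is the
 * number of segments already in use.  a chunk that is contiguous with
 * the previous segment is merged into it; if the merged length goes
 * over maxsegsz, the previous segment is filled to maxsegsz and the
 * remainder starts a new segment.
 *
 * returns the new segment count, or -1 if nsegs_max would be exceeded.
 */
static int
seg_append(struct seg *segs, int i, int nsegs_max, long maxsegsz,
    unsigned long pa, long incr)
{
	assert(maxsegsz > 0 && incr > 0);

	if (i > 0) {
		struct seg *pseg = &segs[i - 1];

		if (pa == pseg->addr + pseg->len) {
			long nlen = pseg->len + incr;

			if (nlen <= maxsegsz) {
				/* fits, including the exact-fit case */
				pseg->len = nlen;
				return (i);
			}

			/* fill the previous segment, carry the rest over */
			pseg->len = maxsegsz;
			pa = pseg->addr + maxsegsz;
			incr = nlen - maxsegsz;
		}
	}

	if (i >= nsegs_max)
		return (-1);

	segs[i].addr = pa;
	segs[i].len = incr;
	return (i + 1);
}
```

with 8k pages, a 4k buffer starting at physical 0x11c00 arrives as a 1k
chunk followed by a 3k chunk at 0x12000. feeding those through
seg_append() with maxsegsz set to 2k should give two 2k segments rather
than needing a third. a 2k buffer split 1k/1k across the same page
boundary should end up as a single 2k segment, which is the case the
old "<" comparison missed.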