From: Miroslav Cimerman <mc@doas.su>
Subject: softraid: dangerous assembly, fix chunk sorting
To: "tech@openbsd.org" <tech@openbsd.org>
Cc: Miroslav Cimerman <mc@doas.su>
Date: Mon, 30 Jun 2025 15:49:21 +0000

Dear tech@,

Offline chunks can skip IDs during assembly, which makes it
dangerous to partially assemble striped volumes.

Here is what I mean by skipping IDs, using a RAID 5 example:
Volume      Status               Size Device
softraid0 0 Online          104202240 sd0     RAID5
          0 Online           52125696 0:0.0   noencl <wd1a>
          1 Online           52125696 0:1.0   noencl <wd2a>
          2 Online           52125696 0:2.0   noencl <wd3a>

After rebooting without the last chunk (wd3), the volume becomes:
Volume      Status               Size Device
softraid0 0 Degraded        104202240 sd0     RAID5
          0 Online           52125696 0:0.0   noencl <wd1a>
          1 Offline                 0 0:1.0   noencl <>
          2 Online           52125696 0:2.0   noencl <wd2a> 

The chunk that was originally last is now Offline and has
chunk_id = 1, while the chunk that was originally second (wd2)
is now last. Among other things, this means that the data and
parity chunks are effectively swapped in the first stripe.

This happens because no metadata fields are loaded for the
Offline chunk. With everything zero, including scm_chunk_id,
it is sorted in right after the first chunk.

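Fix: skip the chunk ID comparison for Offline chunks, so that
they fall into the unoccupied slots instead of being sorted by
a zeroed ID. The diff is at the end of this message.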
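
To illustrate the ordering before and after the change, here is
a small standalone sketch of the insertion loop. It is not the
kernel code: struct chunk, ST_ONLINE/ST_OFFLINE, insert_sorted()
and slot_of() are hypothetical stand-ins for illustration, and
it assumes the Offline placeholder is inserted after the two
Online chunks, as in the scenario above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the chunk fields the sort looks at. */
enum { ST_ONLINE = 0, ST_OFFLINE = 1 };

struct chunk {
	unsigned int	 id;		/* plays the role of scm_chunk_id */
	int		 status;	/* plays the role of scm_status */
	const char	*name;
	struct chunk	*next;
};

/*
 * Insert entry into the list at *head, walking it like the loop in
 * the diff. With skip_offline == 0 this is the old comparison. With
 * skip_offline != 0 an Offline entry never triggers the break, so it
 * lands after every chunk already in the list.
 */
void
insert_sorted(struct chunk **head, struct chunk *entry, int skip_offline)
{
	struct chunk *c1, *c2 = NULL;

	assert(head != NULL && entry != NULL);

	for (c1 = *head; c1 != NULL; c1 = c1->next) {
		if (c1->id > entry->id &&
		    (!skip_offline || entry->status != ST_OFFLINE))
			break;
		c2 = c1;
	}
	if (c2 == NULL) {
		/* nothing sorts before entry: insert at the head */
		entry->next = *head;
		*head = entry;
	} else {
		/* insert right after the last chunk we walked past */
		entry->next = c2->next;
		c2->next = entry;
	}
}

/* Return the position of entry in the list, or -1 if it is absent. */
int
slot_of(const struct chunk *head, const struct chunk *entry)
{
	int i;

	for (i = 0; head != NULL; head = head->next, i++)
		if (head == entry)
			return i;
	return -1;
}
```

Inserting wd1a (id 0), wd2a (id 1) and then a zeroed Offline
placeholder with skip_offline == 0 puts the placeholder in slot 1
and wd2a in slot 2. With skip_offline != 0 the placeholder stays
behind wd2a in slot 2.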

With the diff applied, a new volume in the same scenario:
Volume      Status               Size Device
softraid0 0 Degraded        104202240 sd0     RAID5
          0 Online           52125696 0:0.0   noencl <wd1a>
          1 Online           52125696 0:1.0   noencl <wd2a>
          2 Offline                 0 0:2.0   noencl <>


However, existing volumes whose chunk IDs have already been
mangled must be repaired manually.

--
Miroslav Cimerman


Index: dev/softraid.c
===================================================================
RCS file: /cvs/src/sys/dev/softraid.c,v
diff -u -p -u -r1.433 softraid.c
--- dev/softraid.c	8 Jan 2025 23:40:40 -0000	1.433
+++ dev/softraid.c	30 Jun 2025 14:49:05 -0000
@@ -270,7 +270,8 @@ sr_meta_attach(struct sr_discipline *sd,
 		chunk2 = NULL;
 		SLIST_FOREACH(chunk1, cl, src_link) {
 			if (chunk1->src_meta.scmi.scm_chunk_id >
-			    ch_entry->src_meta.scmi.scm_chunk_id)
+			    ch_entry->src_meta.scmi.scm_chunk_id &&
+			    ch_entry->src_meta.scm_status != BIOC_SDOFFLINE)
 				break;
 			chunk2 = chunk1;
 		}