Index | Thread | Search

From:
Jeremie Courreges-Anglas <jca@wxcvbn.org>
Subject:
Re: tar(1) write format selection
To:
Caspar Schutijser <caspar@schutijser.com>
Cc:
tech@openbsd.org, millert@openbsd.org
Date:
Tue, 16 Apr 2024 16:48:30 +0200

Download raw body.

Thread
On Tue, Apr 16, 2024 at 03:27:54PM +0200, Caspar Schutijser:
> On Mon, Apr 15, 2024 at 07:48:22PM +0200, Jeremie Courreges-Anglas wrote:
[...]
> > @@ -1042,7 +1061,7 @@ tar_options(int argc, char **argv)
> >  		break;
> >  	case ARCHIVE:
> >  	case APPND:
> > -		frmt = &(fsub[Oflag ? F_OTAR : F_TAR]);
> > +		frmt = &fsub[format];
> 
> This chunk made me look the feature to append files to archives.
> There is a problem here:
> 
> $ tar cf /tmp/test-1.tar tar.1
> $ tar rf /tmp/test-1.tar pax.1
> $ ./obj/tar cf /tmp/test-1.tar tar.1
> $ ./obj/tar rf /tmp/test-1.tar pax.1 
> tar: Cannot mix current archive format ustar with pax
> 
> If I search for "mix", I find code in ar_subs.c
> 
> I think the problem is that get_arc() can not identify the new pax
> file format correctly, and instead thinks the file format is ustar.
> 
> So we should fix that first, otherwise appending files to archives
> will break.

Right, I missed that in my testing, thanks.  We should fix this
before changing the default.

> I'll see if I can find out how to do that but I'm also still looking
> at other parts of the diff.

Diff for that below.  Since ustar_id() matches both on pax and ustar
archives, we need a more specialized function.  pax format should also
come first in ford[].

Note: AFAIK it is legal to have a pax archive that doesn't start with
a global or extended header, in that case we will fallback to ustar
for appending.

ok?


Index: extern.h
===================================================================
--- extern.h.orig	2024-04-16 12:01:41.499000123 +0100
+++ extern.h	2024-04-16 15:00:20.589159225 +0100
@@ -284,6 +284,7 @@
 int ustar_id(char *, int);
 int ustar_rd(ARCHD *, char *);
 int ustar_wr(ARCHD *);
+int pax_id(char *, int);
 int pax_wr(ARCHD *);
 
 /*
Index: options.c
===================================================================
--- options.c.orig	2024-04-16 14:49:59.569096551 +0100
+++ options.c	2024-04-16 15:02:20.349162527 +0100
@@ -228,7 +228,7 @@
 /* 9: gzip, to detect failure to use -z */
 	{NULL, 0, 4, 0, 0, 0, 0, gzip_id},
 /* 10: POSIX PAX */
-	{"pax", 5120, BLKMULT, 0, 1, BLKMULT, 0, ustar_id, no_op,
+	{"pax", 5120, BLKMULT, 0, 1, BLKMULT, 0, pax_id, no_op,
 	ustar_rd, tar_endrd, no_op, pax_wr, tar_endwr, tar_trail,
 	tar_opt},
 #endif
@@ -249,7 +249,7 @@
  * of archive we are dealing with. This helps to properly id archive formats
  * some formats may be subsets of others....
  */
-int ford[] = {5, 4, 9, 8, 7, 6, 3, 2, 1, 0, -1};
+int ford[] = {10, 5, 4, 9, 8, 7, 6, 3, 2, 1, 0, -1};
 
 /*
  * Do we have -C anywhere and what is it?
Index: tar.c
===================================================================
--- tar.c.orig	2024-04-16 12:01:41.499000123 +0100
+++ tar.c	2024-04-16 15:06:35.199093744 +0100
@@ -1392,6 +1392,46 @@
 }
 
 /*
+ * pax_id()
+ *	determine if a block given to us is a valid pax header.
+ * Return:
+ *	0 if a pax header, -1 otherwise
+ */
+#ifndef SMALL
+int
+pax_id(char *blk, int size)
+{
+	HD_USTAR *hd;
+
+	if (size < BLKMULT)
+		return(-1);
+	hd = (HD_USTAR *)blk;
+
+	/*
+	 * check for block of zero's first, a simple and fast test then check
+	 * ustar magic cookie. We should use TMAGLEN, but some USTAR archive
+	 * programs are fouled up and create archives missing the \0. Last we
+	 * check the checksum and the type flag. If ok we have to assume it is
+	 * a valid pax header.
+	 */
+	if (hd->prefix[0] == '\0' && hd->name[0] == '\0')
+		return(-1);
+	if (strncmp(hd->magic, TMAGIC, TMAGLEN - 1) != 0)
+		return(-1);
+	if (asc_ul(hd->chksum,sizeof(hd->chksum),OCT) != tar_chksm(blk,BLKMULT))
+		return(-1);
+	/*
+	 * It is valid for a pax formatted archive not to start with
+	 * a global header nor with an extended header. In that case
+	 * we'll fall back to ustar in append mode.
+	 */
+	if (hd->typeflag == XHDRTYPE || hd->typeflag == GHDRTYPE)
+		return(0);
+	return (-1);
+}
+#endif
+
+/*
  * pax_wr()
  *	Write out a pax format archive.
  *	Have to check for file types that cannot be stored.  Be careful of the


-- 
jca