From: Jeremie Courreges-Anglas Subject: Re: tar(1) write format selection To: Caspar Schutijser Cc: tech@openbsd.org, millert@openbsd.org Date: Tue, 16 Apr 2024 16:48:30 +0200 On Tue, Apr 16, 2024 at 03:27:54PM +0200, Caspar Schutijser: > On Mon, Apr 15, 2024 at 07:48:22PM +0200, Jeremie Courreges-Anglas wrote: [...] > > @@ -1042,7 +1061,7 @@ tar_options(int argc, char **argv) > > break; > > case ARCHIVE: > > case APPND: > > - frmt = &(fsub[Oflag ? F_OTAR : F_TAR]); > > + frmt = &fsub[format]; > > This chunk made me look the feature to append files to archives. > There is a problem here: > > $ tar cf /tmp/test-1.tar tar.1 > $ tar rf /tmp/test-1.tar pax.1 > $ ./obj/tar cf /tmp/test-1.tar tar.1 > $ ./obj/tar rf /tmp/test-1.tar pax.1 > tar: Cannot mix current archive format ustar with pax > > If I search for "mix", I find code in ar_subs.c > > I think the problem is that get_arc() can not identify the new pax > file format correctly, and instead thinks the file format is ustar. > > So we should fix that first, otherwise appending files to archives > will break. Right, I missed that in my testing, thanks. We should fix this before changing the default. > I'll see if I can find out how to do that but I'm also still looking > at other parts of the diff. Diff for that below. Since ustar_id() matches both on pax and ustar archives, we need a more specialized function. pax format should also come first in ford[]. Note: AFAIK it is legal to have a pax archive that doesn't start with a global or extended header, in that case we will fallback to ustar for appending. ok? Index: extern.h =================================================================== --- extern.h.orig 2024-04-16 12:01:41.499000123 +0100 +++ extern.h 2024-04-16 15:00:20.589159225 +0100 @@ -284,6 +284,7 @@ int ustar_id(char *, int); int ustar_rd(ARCHD *, char *); int ustar_wr(ARCHD *); +int pax_id(char *, int); int pax_wr(ARCHD *); /* Index: options.c =================================================================== --- options.c.orig 2024-04-16 14:49:59.569096551 +0100 +++ options.c 2024-04-16 15:02:20.349162527 +0100 @@ -228,7 +228,7 @@ /* 9: gzip, to detect failure to use -z */ {NULL, 0, 4, 0, 0, 0, 0, gzip_id}, /* 10: POSIX PAX */ - {"pax", 5120, BLKMULT, 0, 1, BLKMULT, 0, ustar_id, no_op, + {"pax", 5120, BLKMULT, 0, 1, BLKMULT, 0, pax_id, no_op, ustar_rd, tar_endrd, no_op, pax_wr, tar_endwr, tar_trail, tar_opt}, #endif @@ -249,7 +249,7 @@ * of archive we are dealing with. This helps to properly id archive formats * some formats may be subsets of others.... */ -int ford[] = {5, 4, 9, 8, 7, 6, 3, 2, 1, 0, -1}; +int ford[] = {10, 5, 4, 9, 8, 7, 6, 3, 2, 1, 0, -1}; /* * Do we have -C anywhere and what is it? Index: tar.c =================================================================== --- tar.c.orig 2024-04-16 12:01:41.499000123 +0100 +++ tar.c 2024-04-16 15:06:35.199093744 +0100 @@ -1392,6 +1392,46 @@ } /* + * pax_id() + * determine if a block given to us is a valid pax header. + * Return: + * 0 if a pax header, -1 otherwise + */ +#ifndef SMALL +int +pax_id(char *blk, int size) +{ + HD_USTAR *hd; + + if (size < BLKMULT) + return(-1); + hd = (HD_USTAR *)blk; + + /* + * check for block of zero's first, a simple and fast test then check + * ustar magic cookie. We should use TMAGLEN, but some USTAR archive + * programs are fouled up and create archives missing the \0. Last we + * check the checksum and the type flag. If ok we have to assume it is + * a valid pax header. + */ + if (hd->prefix[0] == '\0' && hd->name[0] == '\0') + return(-1); + if (strncmp(hd->magic, TMAGIC, TMAGLEN - 1) != 0) + return(-1); + if (asc_ul(hd->chksum,sizeof(hd->chksum),OCT) != tar_chksm(blk,BLKMULT)) + return(-1); + /* + * It is valid for a pax formatted archive not to start with + * a global header nor with an extended header. In that case + * we'll fall back to ustar in append mode. + */ + if (hd->typeflag == XHDRTYPE || hd->typeflag == GHDRTYPE) + return(0); + return (-1); +} +#endif + +/* * pax_wr() * Write out a pax format archive. * Have to check for file types that cannot be stored. Be careful of the -- jca