Index | Thread | Search

From:
Caspar Schutijser <caspar@schutijser.com>
Subject:
Re: tar(1) write format selection
To:
Jeremie Courreges-Anglas <jca@wxcvbn.org>
Cc:
tech@openbsd.org, millert@openbsd.org
Date:
Tue, 16 Apr 2024 17:10:24 +0200

Download raw body.

Thread
On Tue, Apr 16, 2024 at 04:48:30PM +0200, Jeremie Courreges-Anglas wrote:
> On Tue, Apr 16, 2024 at 03:27:54PM +0200, Caspar Schutijser:
> > On Mon, Apr 15, 2024 at 07:48:22PM +0200, Jeremie Courreges-Anglas wrote:
> [...]
> > > @@ -1042,7 +1061,7 @@ tar_options(int argc, char **argv)
> > >  		break;
> > >  	case ARCHIVE:
> > >  	case APPND:
> > > -		frmt = &(fsub[Oflag ? F_OTAR : F_TAR]);
> > > +		frmt = &fsub[format];
> > 
> > This chunk made me look the feature to append files to archives.
> > There is a problem here:
> > 
> > $ tar cf /tmp/test-1.tar tar.1
> > $ tar rf /tmp/test-1.tar pax.1
> > $ ./obj/tar cf /tmp/test-1.tar tar.1
> > $ ./obj/tar rf /tmp/test-1.tar pax.1 
> > tar: Cannot mix current archive format ustar with pax
> > 
> > If I search for "mix", I find code in ar_subs.c
> > 
> > I think the problem is that get_arc() can not identify the new pax
> > file format correctly, and instead thinks the file format is ustar.
> > 
> > So we should fix that first, otherwise appending files to archives
> > will break.
> 
> Right, I missed that in my testing, thanks.  We should fix this
> before changing the default.
> 
> > I'll see if I can find out how to do that but I'm also still looking
> > at other parts of the diff.
> 
> Diff for that below.  Since ustar_id() matches both on pax and ustar
> archives, we need a more specialized function.  pax format should also
> come first in ford[].
> 
> Note: AFAIK it is legal to have a pax archive that doesn't start with
> a global or extended header, in that case we will fallback to ustar
> for appending.
> 
> ok?

Thanks, that looks good to me and works well in my tests.

OK caspar@, but let's also give millert@ an opportunity to respond.

Caspar

> 
> 
> Index: extern.h
> ===================================================================
> --- extern.h.orig	2024-04-16 12:01:41.499000123 +0100
> +++ extern.h	2024-04-16 15:00:20.589159225 +0100
> @@ -284,6 +284,7 @@
>  int ustar_id(char *, int);
>  int ustar_rd(ARCHD *, char *);
>  int ustar_wr(ARCHD *);
> +int pax_id(char *, int);
>  int pax_wr(ARCHD *);
>  
>  /*
> Index: options.c
> ===================================================================
> --- options.c.orig	2024-04-16 14:49:59.569096551 +0100
> +++ options.c	2024-04-16 15:02:20.349162527 +0100
> @@ -228,7 +228,7 @@
>  /* 9: gzip, to detect failure to use -z */
>  	{NULL, 0, 4, 0, 0, 0, 0, gzip_id},
>  /* 10: POSIX PAX */
> -	{"pax", 5120, BLKMULT, 0, 1, BLKMULT, 0, ustar_id, no_op,
> +	{"pax", 5120, BLKMULT, 0, 1, BLKMULT, 0, pax_id, no_op,
>  	ustar_rd, tar_endrd, no_op, pax_wr, tar_endwr, tar_trail,
>  	tar_opt},
>  #endif
> @@ -249,7 +249,7 @@
>   * of archive we are dealing with. This helps to properly id archive formats
>   * some formats may be subsets of others....
>   */
> -int ford[] = {5, 4, 9, 8, 7, 6, 3, 2, 1, 0, -1};
> +int ford[] = {10, 5, 4, 9, 8, 7, 6, 3, 2, 1, 0, -1};
>  
>  /*
>   * Do we have -C anywhere and what is it?
> Index: tar.c
> ===================================================================
> --- tar.c.orig	2024-04-16 12:01:41.499000123 +0100
> +++ tar.c	2024-04-16 15:06:35.199093744 +0100
> @@ -1392,6 +1392,46 @@
>  }
>  
>  /*
> + * pax_id()
> + *	determine if a block given to us is a valid pax header.
> + * Return:
> + *	0 if a pax header, -1 otherwise
> + */
> +#ifndef SMALL
> +int
> +pax_id(char *blk, int size)
> +{
> +	HD_USTAR *hd;
> +
> +	if (size < BLKMULT)
> +		return(-1);
> +	hd = (HD_USTAR *)blk;
> +
> +	/*
> +	 * check for block of zero's first, a simple and fast test then check
> +	 * ustar magic cookie. We should use TMAGLEN, but some USTAR archive
> +	 * programs are fouled up and create archives missing the \0. Last we
> +	 * check the checksum and the type flag. If ok we have to assume it is
> +	 * a valid pax header.
> +	 */
> +	if (hd->prefix[0] == '\0' && hd->name[0] == '\0')
> +		return(-1);
> +	if (strncmp(hd->magic, TMAGIC, TMAGLEN - 1) != 0)
> +		return(-1);
> +	if (asc_ul(hd->chksum,sizeof(hd->chksum),OCT) != tar_chksm(blk,BLKMULT))
> +		return(-1);
> +	/*
> +	 * It is valid for a pax formatted archive not to start with
> +	 * a global header nor with an extended header. In that case
> +	 * we'll fall back to ustar in append mode.
> +	 */
> +	if (hd->typeflag == XHDRTYPE || hd->typeflag == GHDRTYPE)
> +		return(0);
> +	return (-1);
> +}
> +#endif
> +
> +/*
>   * pax_wr()
>   *	Write out a pax format archive.
>   *	Have to check for file types that cannot be stored.  Be careful of the
> 
> 
> -- 
> jca