Index | Thread | Search

From:
Marc Espie <marc.espie.openbsd@gmail.com>
Subject:
Re: patch: tsort has no business handling NUL bytes
To:
tech@openbsd.org
Date:
Wed, 24 Jun 2026 13:34:22 +0200

Download raw body.

Thread
On Wed, Jun 24, 2026 at 01:26:25PM +0200, Jeremie Courreges-Anglas wrote:
> On Wed, Jun 24, 2026 at 01:18:48PM +0200, Marc Espie wrote:
> > The Opengroup text says:
> > file
> >     A pathname of a text file to order. If no file operand is given, the standard input shall be used.
> > 
> > Keyword: text
> > 
> > a NUL byte has no place in a text file.
> 
> I agree that we shouldn't attempt to support NUL bytes.  Pretending
> that we do so in one place bears the question whether the rest of the
> code is safe, and it most probably isn't.  Actually, read_hints()
> right below has the same "issue".
> 
> Here's the diff which I hadn't sent yet because it didn't go through a
> bulk build.  Sooon...  I didn't bother printing the file name because
> of stdin handling.
> 
> 
> Index: tsort.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/tsort/tsort.c,v
> diff -u -p -r1.39 tsort.c
> --- tsort.c	23 Jun 2026 08:27:37 -0000	1.39
> +++ tsort.c	23 Jun 2026 09:54:53 -0000
> @@ -313,18 +313,20 @@ read_pairs(FILE *f, struct ohash *h, int
>  	while ((str = fgetln(f, &size)) != NULL) {
>  		char *sentinel;
>  
> +		if (memchr(str, '\0', size))
> +			errx(1, "NUL byte detected in input");
> +
>  		sentinel = str + size;
>  		for (;;) {
>  			char *e;
>  
>  			while (str < sentinel &&
> -			    (isspace((unsigned char)*str) || *str == '\0'))
> +			    isspace((unsigned char)*str))
>  				str++;
>  			if (str == sentinel)
>  				break;
>  			for (e = str;
> -			    e < sentinel && !isspace((unsigned char)*e) &&
> -			    *e != '\0'; e++)
> +			    e < sentinel && !isspace((unsigned char)*e); e++)
>  				continue;
>  			if (toggle) {
>  				a = node_lookup(h, str, e);
> @@ -362,6 +364,9 @@ read_hints(FILE *f, struct ohash *h, int
>  
>  	while ((str = fgetln(f, &size)) != NULL) {
>  		char *sentinel;
> +
> +		if (memchr(str, '\0', size))
> +			errx(1, "NUL byte detected in input");
>  
>  		sentinel = str + size;
>  		for (;;) {
> 
> 
> -- 
> jca
More efficient than my patch. Just one small tweak: those functions take
the file name, so the error message can mention it explicitly. Please do so
like for other errx messages in there.