From: Marc Espie Subject: Re: patch: tsort has no business handling NUL bytes To: tech@openbsd.org Date: Wed, 24 Jun 2026 13:34:22 +0200 On Wed, Jun 24, 2026 at 01:26:25PM +0200, Jeremie Courreges-Anglas wrote: > On Wed, Jun 24, 2026 at 01:18:48PM +0200, Marc Espie wrote: > > The Opengroup text says: > > file > > A pathname of a text file to order. If no file operand is given, the standard input shall be used. > > > > Keyword: text > > > > a NUL byte has no place in a text file. > > I agree that we shouldn't attempt to support NUL bytes. Pretending > that we do so in one place bears the question whether the rest of the > code is safe, and it most probably isn't. Actually, read_hints() > right below has the same "issue". > > Here's the diff which I hadn't sent yet because it didn't go through a > bulk build. Sooon... I didn't bother printing the file name because > of stdin handling. > > > Index: tsort.c > =================================================================== > RCS file: /cvs/src/usr.bin/tsort/tsort.c,v > diff -u -p -r1.39 tsort.c > --- tsort.c 23 Jun 2026 08:27:37 -0000 1.39 > +++ tsort.c 23 Jun 2026 09:54:53 -0000 > @@ -313,18 +313,20 @@ read_pairs(FILE *f, struct ohash *h, int > while ((str = fgetln(f, &size)) != NULL) { > char *sentinel; > > + if (memchr(str, '\0', size)) > + errx(1, "NUL byte detected in input"); > + > sentinel = str + size; > for (;;) { > char *e; > > while (str < sentinel && > - (isspace((unsigned char)*str) || *str == '\0')) > + isspace((unsigned char)*str)) > str++; > if (str == sentinel) > break; > for (e = str; > - e < sentinel && !isspace((unsigned char)*e) && > - *e != '\0'; e++) > + e < sentinel && !isspace((unsigned char)*e); e++) > continue; > if (toggle) { > a = node_lookup(h, str, e); > @@ -362,6 +364,9 @@ read_hints(FILE *f, struct ohash *h, int > > while ((str = fgetln(f, &size)) != NULL) { > char *sentinel; > + > + if (memchr(str, '\0', size)) > + errx(1, "NUL byte detected in input"); > > sentinel = str + size; > for (;;) { > > > -- > jca More efficient than my patch. Just one small tweak: those functions take the file name, so the error message can mention it explicitly. Please do so like for other errx messages in there.