Index | Thread | Search

From:
Florian Obser <florian@openbsd.org>
Subject:
Re: unwind: support wildcard in blacklist
To:
tech@openbsd.org
Date:
Sat, 14 Sep 2024 20:18:14 +0200

Download raw body.

Thread
I had hopped I'd find the time to review this, but it's been two weeks
and it didn't happen. It's getting too close to release so it won't make
it for 7.6.
I put it on my todo list to look at it mid October at the earliest. Feel
free to prod me if it doesn't happen then.

On 2024-08-30 00:34 +02, Kirill A. Korinsky <kirill@korins.ky> wrote:
> Hi Florian,
>
> On Sun, 25 Aug 2024 15:00:48 +0200,
> Florian Obser <florian@openbsd.org> wrote:
>> 
>> On 2024-08-24 11:25 UTC, Klemens Nanni <kn@openbsd.org> wrote:
>> > 06.07.2024 18:40, Kirill A. Korinsky пишет:
>> >> Folks,
>> >> 
>> >> Here a reminder about this diff.
>> >> 
>> >> I'm using it for about two weeks and it jsut works.
>> >> 
>> >> The diff changes symantic of blacklist into:
>> >> 
>> >>      type list file [log]
>> >
>> >        block|pass list file [log]
>> >
>> 
>> This sounds like a solution in search of a problem. I don't know a world
>> in which you can allow list a set of domains and expect things to work.
>>
>> Where do you get this list? Are you sending an email to
>> HOSTSMASTER@SRI-NIC.ARPA?
>>
>
> My needs is only wildcard in blacklist but the white list seems cheap to add
> and was suggested by Solena [1].
>
>> It also creates an incredible amount of churn, hiding the changes
>> needed for substring matching. And there are changes in there that I
>> don't understand.
>
> To make things easy, I had nuke that part of the diff. Now it is simpler and
> only implements whitelist.
>
>> 
>> I'm also worried that we're now reversing (twice for some reason?) every
>> qname. Is that cheap?
>>
>
> Two reverse by used algorithm costs as much as strlen(). Right now it
> iterates twice through dname: for strlen() and for two reverse(). I can
> rewrite it to a more complicated function that calculates the length of the
> string and does not reverse in place. But keep in mind that dname is limited
> to 255 characters and it should be fast enough.
>
>> I think the syntax of starting with a dot to mean any sub-label no
>> matter how deep is the least worst option.
>> 
>
> Footnotes:
> [1]  https://marc.info/?l=openbsd-tech&m=171926832626211&w=2
>
> An updated diff:
>
> diff --git sbin/unwind/frontend.c sbin/unwind/frontend.c
> index b10f9e384e5..5581af8ec41 100644
> --- sbin/unwind/frontend.c
> +++ sbin/unwind/frontend.c
> @@ -118,6 +118,8 @@ TAILQ_HEAD(, pending_query)	 pending_queries;
>  struct bl_node {
>  	RB_ENTRY(bl_node)	 entry;
>  	char			*domain;
> +	int			 len;
> +	int			 wildcard;
>  };
>  
>  __dead void		 frontend_shutdown(void);
> @@ -171,6 +173,19 @@ RB_GENERATE(bl_tree, bl_node, entry, bl_cmp)
>  struct dns64_prefix	*dns64_prefixes;
>  int			 dns64_prefix_count;
>  
> +static void
> +reverse(char *begin, char *end)
> +{
> +	char	t;
> +
> +	while (begin < --end) {
> +		t = *begin;
> +		*begin = *end;
> +		*end = t;
> +		++begin;
> +	}
> +}
> +
>  void
>  frontend_sig_handler(int sig, short event, void *bula)
>  {
> @@ -734,7 +749,7 @@ handle_query(struct pending_query *pq)
>  {
>  	struct query_imsg	 query_imsg;
>  	struct bl_node		 find;
> -	int			 rcode;
> +	int			 rcode, matched;
>  	char			*str;
>  	char			 dname[LDNS_MAX_DOMAINLEN + 1];
>  	char			 qclass_buf[16];
> @@ -791,8 +806,13 @@ handle_query(struct pending_query *pq)
>  	log_debug("%s: %s %s %s ?", ip_port((struct sockaddr *)&pq->from),
>  	    dname, qclass_buf, qtype_buf);
>  
> +	find.len = strlen(dname);
> +	find.wildcard = 0;
> +	reverse(dname, dname + find.len);
>  	find.domain = dname;
> -	if (RB_FIND(bl_tree, &bl_head, &find) != NULL) {
> +	matched = (RB_FIND(bl_tree, &bl_head, &find) != NULL);
> +	reverse(dname, dname + find.len);
> +	if (matched) {
>  		if (frontend_conf->blocklist_log)
>  			log_info("blocking %s", dname);
>  		error_answer(pq, LDNS_RCODE_REFUSED);
> @@ -1542,14 +1562,20 @@ parse_blocklist(int fd)
>  			if (linelen >= 2 && line[linelen - 2] != '.')
>  				line[linelen - 1] = '.';
>  			else
> -				line[linelen - 1] = '\0';
> +				line[linelen-- - 1] = '\0';
>  		}
>  
> +		if (line[0] == '#')
> +		    continue;
> +
>  		bl_node = malloc(sizeof *bl_node);
>  		if (bl_node == NULL)
>  			fatal("%s: malloc", __func__);
>  		if ((bl_node->domain = strdup(line)) == NULL)
>  			fatal("%s: strdup", __func__);
> +		reverse(bl_node->domain, bl_node->domain + linelen);
> +		bl_node->len = linelen;
> +		bl_node->wildcard = line[0] == '.';
>  		if (RB_INSERT(bl_tree, &bl_head, bl_node) != NULL) {
>  			log_warnx("duplicate blocked domain \"%s\"", line);
>  			free(bl_node->domain);
> @@ -1564,7 +1590,12 @@ parse_blocklist(int fd)
>  
>  int
>  bl_cmp(struct bl_node *e1, struct bl_node *e2) {
> -	return (strcasecmp(e1->domain, e2->domain));
> +	if (e1->wildcard == e2->wildcard)
> +		return (strcasecmp(e1->domain, e2->domain));
> +	else if (e1->wildcard)
> +		return (strncasecmp(e1->domain, e2->domain, e1->len));
> +	else /* e2->wildcard */
> +		return (strncasecmp(e1->domain, e2->domain, e2->len));
>  }
>  
>  void
> diff --git sbin/unwind/unwind.conf.5 sbin/unwind/unwind.conf.5
> index dbcec1bd693..0856249f5f7 100644
> --- sbin/unwind/unwind.conf.5
> +++ sbin/unwind/unwind.conf.5
> @@ -69,9 +69,10 @@ If a domain from this list is queried,
>  .Nm unwind
>  answers with a return code of
>  .Dv REFUSED .
> -With
>  .Cm log
> -blocked queries are logged.
> +blocked queries are logged. The list supports limited wildcard
> +syntax: domains starting with . (dot) are treated as any subdomains
> +on that zone.
>  .It Ic forwarder Brq Ar address Oo Ic port Ar number Oc Oo Oo Ic authentication name Ar name Oc Ic DoT Oc ...
>  A list of addresses of DNS name servers to forward queries to.
>  .Ic port
>
>
> -- 
>
> wbr, Kirill
>

-- 
In my defence, I have been left unsupervised.