From:
Bjorn Ketelaars <bket@openbsd.org>
Subject:
Re: fq_codel: align with RFC 8289 and RFC 8290 (2/2)
To:
Alexandr Nedvedicky <sashan@fastmail.net>
Cc:
tech@openbsd.org, mikeb@openbsd.org
Date:
Tue, 11 Nov 2025 22:07:26 +0100

On Tue 11/11/2025 14:12, Alexandr Nedvedicky wrote:
> Hello,
> 
> the changes look mostly good. I think there is one finishing touch needed.
> Perhaps someone with more solid background on network interfaces will either
> confirm (or wave away) my concern. 
> 
> </snip>
> On Mon, Oct 27, 2025 at 08:27:07PM +0100, Bjorn Ketelaars wrote:
> > diff --git sys/net/fq_codel.c sys/net/fq_codel.c
> > index ad8ed7022c4..af2ea41e86d 100644
> > --- sys/net/fq_codel.c
> > +++ sys/net/fq_codel.c
> > @@ -116,6 +116,8 @@ struct fqcodel {
> >  	/* stats */
> >  	struct fqcodel_pktcntr	 xmit_cnt;
> >  	struct fqcodel_pktcntr	 drop_cnt;
> > +
> > +	struct mbuf_list	 pending_drops;
> >  };
> 
>     I think the newly introduced list should also be dropped
>     in the fqcodel_purge() function, which is called when the
>     interface goes down. The missing 'ml_purge(&fqc->pending_drops);'
>     in fqcodel_purge() is the only nit I could spot.
> 
>     Otherwise the diff looks OK to me. Let's wait and see if
>     someone else chimes in here.


Thank you for your feedback, and for catching this! I think you are
correct that pending_drops needs to be handled in fqcodel_purge().
However, I think we should use ml_enlist(ml, &fqc->pending_drops) rather
than ml_purge(&fqc->pending_drops).

My reasoning is that fqcodel_purge() follows the same pattern as
codel_purge(): it takes an ml parameter specifically to collect packets
that need to be freed, and the caller is responsible for actually
freeing them. Using ml_enlist() maintains this approach and is
consistent with how codel_purge() handles its queue.

The ml_purge() call is appropriate in fqcodel_pf_free() because that is
the final cleanup, where the structure is torn down entirely, not a
purge operation that needs to return packets to a caller.
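
To illustrate the distinction, here is a toy sketch (not the kernel
code) of the two cleanup patterns, using simplified stand-ins for
struct mbuf and struct mbuf_list. The names mirror the kernel API
(ml_enqueue, ml_enlist, ml_purge), but the types and the pl_ prefix
are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct pkt {			/* stand-in for struct mbuf */
	struct pkt *next;
};

struct pkt_list {		/* stand-in for struct mbuf_list */
	struct pkt *head, *tail;
	int len;
};

static void
pl_init(struct pkt_list *pl)
{
	pl->head = pl->tail = NULL;
	pl->len = 0;
}

static void
pl_enqueue(struct pkt_list *pl, struct pkt *p)
{
	p->next = NULL;
	if (pl->tail != NULL)
		pl->tail->next = p;
	else
		pl->head = p;
	pl->tail = p;
	pl->len++;
}

/* like ml_enlist(): append all of src onto dst, leaving src empty */
static void
pl_enlist(struct pkt_list *dst, struct pkt_list *src)
{
	if (src->head == NULL)
		return;
	if (dst->tail != NULL)
		dst->tail->next = src->head;
	else
		dst->head = src->head;
	dst->tail = src->tail;
	dst->len += src->len;
	pl_init(src);
}

/* like ml_purge(): free everything now; only valid at final teardown */
static int
pl_purge(struct pkt_list *pl)
{
	struct pkt *p, *next;
	int freed = 0;

	for (p = pl->head; p != NULL; p = next) {
		next = p->next;
		free(p);
		freed++;
	}
	pl_init(pl);
	return (freed);
}

/*
 * The fqcodel_purge() pattern: hand pending packets over to the
 * caller's list instead of freeing them here.
 */
static void
purge(struct pkt_list *pending, struct pkt_list *ml)
{
	pl_enlist(ml, pending);
}
```

The caller then frees everything it collected in one place, which is
what the interface-down path does with the ml it passes in.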

Updated diff below. It makes sense to have an extra pair of eyes check
it.


diff --git sys/net/fq_codel.c sys/net/fq_codel.c
index ad8ed7022c4..1ddb90afcff 100644
--- sys/net/fq_codel.c
+++ sys/net/fq_codel.c
@@ -116,6 +116,8 @@ struct fqcodel {
 	/* stats */
 	struct fqcodel_pktcntr	 xmit_cnt;
 	struct fqcodel_pktcntr	 drop_cnt;
+
+	struct mbuf_list	 pending_drops;
 };
 
 unsigned int	 fqcodel_idx(unsigned int, const struct mbuf *);
@@ -529,7 +531,7 @@ fqcodel_enq(struct fqcodel *fqc, struct mbuf *m)
 	struct flow *flow;
 	unsigned int backlog = 0;
 	int64_t now;
-	int i;
+	int i, ndrop;
 
 	flow = classify_flow(fqc, m);
 	if (flow == NULL)
@@ -548,8 +550,20 @@ fqcodel_enq(struct fqcodel *fqc, struct mbuf *m)
 	}
 
 	/*
-	 * Check the limit for all queues and remove a packet
-	 * from the longest one.
+	 * Flush pending_drops first to let PF account them individually.
+	 * When batch dropping (below), we queue multiple packets but can
+	 * only return one per enqueue call to maintain the interface
+	 * contract.
+	 */
+	if (!ml_empty(&fqc->pending_drops))
+		return (ml_dequeue(&fqc->pending_drops));
+
+	/*
+	 * If total queue length exceeds the limit, find the flow with the
+	 * largest backlog and drop up to half of its packets, with a
+	 * maximum of 64, from the head. Implements RFC 8290, section 4.1
+	 * batch drop to handle overload efficiently. Dropped packets are
+	 * queued in pending_drops.
 	 */
 	if (fqc->qlength > fqc->qlimit) {
 		for (i = 0; i < fqc->nflows; i++) {
@@ -560,16 +574,23 @@ fqcodel_enq(struct fqcodel *fqc, struct mbuf *m)
 		}
 
 		KASSERT(flow != NULL);
-		m = codel_commit(&flow->cd, NULL);
 
-		fqc->drop_cnt.packets++;
-		fqc->drop_cnt.bytes += m->m_pkthdr.len;
+		ndrop = MIN(MAX(codel_qlength(&flow->cd) / 2, 1), 64);
 
-		fqc->qlength--;
+		for (i = 0; i < ndrop; i++) {
+			m = codel_commit(&flow->cd, NULL);
+			if (m == NULL)
+				break;
+			fqc->drop_cnt.packets++;
+			fqc->drop_cnt.bytes += m->m_pkthdr.len;
+			fqc->qlength--;
+			ml_enqueue(&fqc->pending_drops, m);
+		}
 
-		DPRINTF("%s: dropping from flow %u\n", __func__,
-		    flow->id);
-		return (m);
+		DPRINTF("%s: batch-dropped %d/%d pkts from flow %u\n", __func__,
+		    i, ndrop, flow->id);
+
+		return (ml_dequeue(&fqc->pending_drops));
 	}
 
 	return (NULL);
@@ -687,6 +708,7 @@ fqcodel_purge(struct fqcodel *fqc, struct mbuf_list *ml)
 
 	for (i = 0; i < fqc->nflows; i++)
 		codel_purge(&fqc->flows[i].cd, ml);
+	ml_enlist(ml, &fqc->pending_drops);
 	fqc->qlength = 0;
 }
 
@@ -728,6 +750,7 @@ fqcodel_pf_alloc(struct ifnet *ifp)
 
 	SIMPLEQ_INIT(&fqc->newq);
 	SIMPLEQ_INIT(&fqc->oldq);
+	ml_init(&fqc->pending_drops);
 
 	return (fqc);
 }
@@ -782,6 +805,7 @@ fqcodel_pf_free(void *arg)
 {
 	struct fqcodel *fqc = arg;
 
+	ml_purge(&fqc->pending_drops);
 	codel_freeparams(&fqc->cparams);
 	free(fqc->flows, M_DEVBUF, fqc->nflows * sizeof(struct flow));
 	free(fqc, M_DEVBUF, sizeof(struct fqcodel));
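
For reference, the batch size computed in the diff is
MIN(MAX(codel_qlength() / 2, 1), 64): half the longest flow's backlog,
at least one packet, capped at 64. A standalone check of that clamp
(batch_drop_size is a name invented here, not from the diff):

```c
#include <assert.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* mirrors the ndrop computation in fqcodel_enq() above */
static int
batch_drop_size(int qlen)
{
	return (MIN(MAX(qlen / 2, 1), 64));
}
```

So a flow with a backlog of 1-3 packets loses a single packet (as
before the diff), and very long flows are trimmed by at most 64 packets
per enqueue, matching the RFC 8290, section 4.1 guidance.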