
From: Yuichiro NAITO <naito.yuichiro@gmail.com>
Subject: iavf patch [4/4]: TX queue direct dispatch
To: tech@openbsd.org
Date: Fri, 07 Feb 2025 16:26:34 +0900

After applying the previous 3 patches, I tested packet forwarding via
iavf interfaces and found that the performance is unstable. In the worst
case, the ipgen (*1) results are as follows.

*1: https://github.com/iij/ipgen

framesize|0G  1G   2G   3G   4G   5G   6G   7G   8G   9G   10Gbps
---------+----+----+----+----+----+----+----+----+----+----+
      64 |##                                                   297.60Mbps,   581259/14880952pps,   3.91%
     128 |###                                                  592.77Mbps,   578873/ 8445945pps,   6.85%
     512 |#############                                       2406.01Mbps,   587406/ 2349624pps,  25.00%
    1024 |#######################                             4597.74Mbps,   561247/ 1197318pps,  46.88%
    1280 |#############################                       5764.50Mbps,   562939/  961538pps,  58.55%
    1408 |###############################                     6181.42Mbps,   548777/  875350pps,  62.69%
    1518 |##################################                  6785.59Mbps,   558761/  812743pps,  68.75%

The best case results are as follows.

framesize|0G  1G   2G   3G   4G   5G   6G   7G   8G   9G   10Gbps
---------+----+----+----+----+----+----+----+----+----+----+
      64 |###                                                  520.83Mbps,  1017253/14880952pps,   6.84%
     128 |#####                                                945.95Mbps,   923776/ 8445945pps,  10.94%
     512 |######################                              4325.96Mbps,  1056142/ 2349624pps,  44.95%
    1024 |#########################################           8059.75Mbps,   983856/ 1197318pps,  82.17%
    1280 |##################################################  9844.81Mbps,   961407/  961538pps,  99.99%
    1408 |############################################        8625.06Mbps,   765719/  875350pps,  87.48%
    1518 |###############################################     9253.08Mbps,   761947/  812743pps,  93.75%

These two cases have identical conditions: the same machine, the same NIC,
the same kernel, just tested again.

While testing, I see many IPIs in the `systat vm` output, as follows.

```
   1 users Load 0.75 0.22 0.08                     openiavf.yuisoft.co 15:45:33

            memory totals (in KB)            PAGING   SWAPPING     Interrupts
           real   virtual     free           in  out   in  out   186338 total
Active    34600     34600  7584080   ops                                com0
All      505348    505348 16489452   pages                              mpi0
                                                                        uhci0
Proc:r  d  s  w    Csw   Trp   Sys   Int   Sof  Flt       forks         iavf0
     2    54    330462         147 20281   101   69       fkppw   16218 iavf0:0
                                                          fksvm         iavf0:1
   5.0%Int   5.0%Spn  24.5%Sys   0.0%Usr  65.5%Idle       pwait         iavf0:3
|    |    |    |    |    |    |    |    |    |    |       relck         iavf1
|||@@============                                         rlkok         iavf1:0
                                                          noram    4063 iavf1:2
Namei         Sys-cache    Proc-cache    No-cache         ndcpy         iavf1:3
    Calls     hits    %    hits     %    miss   %         fltcp       3 vmx0:0
                                                          zfod        1 vmx0:1
                                                          cow           vmx0:2
Disks   sd0   cd0                                   67411 fmin        3 vmx0:3
seeks                                               89881 ftarg     523 clock
xfers                                                     itarg  165527 ipi
speed                                                   2 wired
  sec                                                     pdfre
                                                          pdscn
                                                          pzidl  502484 IPKTS
                                                       12 kmape  502499 OPKTS
```

I'm running OpenBSD current on an ESXi virtual machine, so IPIs are
used to wake up idle CPUs. In a network driver, RX and TX queue
processing runs on a softnet taskq. I checked how many packets are
processed on these queues by adding TRACEPOINTs to the ifq_start and
ifiq_process functions. ifq_start is the function that kicks the TX
taskq, and ifiq_process is the function that runs in the RX taskq.

diff --git a/sys/net/ifq.c b/sys/net/ifq.c
index 3c3b141fb58..84913242965 100644
@@ -121,6 +122,8 @@ ifq_serialize(struct ifqueue *ifq, struct task *t)
 void
 ifq_start(struct ifqueue *ifq)
 {
+	TRACEPOINT(ifq, start, ifq_len(ifq));
+
 	if (ifq_len(ifq) >= min(ifq->ifq_if->if_txmit, ifq->ifq_maxlen)) {
 		task_del(ifq->ifq_softnet, &ifq->ifq_bundle);
 		ifq_run_start(ifq);
@@ -862,6 +865,8 @@ ifiq_process(void *arg)
 	ml_init(&ifiq->ifiq_ml);
 	mtx_leave(&ifiq->ifiq_mtx);
 
+	TRACEPOINT(ifiq, process, ml_len(&ml));
+
 	if_input_process(ifiq->ifiq_if, &ml);
 }
 
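For btrace to attach to these tracepoints, matching static probes also
have to be registered with dt(4). A minimal sketch of the
sys/dev/dt/dt_prov_static.c side (the argument type strings are an
assumption; ifq_len()/ml_len() return an unsigned int):

```
/* Sketch only: static probe definitions matching the TRACEPOINT calls above. */
DT_STATIC_PROBE1(ifq, start, "unsigned int");
DT_STATIC_PROBE1(ifiq, process, "unsigned int");

/* ... plus the matching entries in the dtps_static[] array: */
	&_DT_STATIC_P(ifq, start),
	&_DT_STATIC_P(ifiq, process),
```
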
The btrace script is as follows.

```
tracepoint:ifq:start
{
	@start = lhist(arg0, 0, 10, 1);
}

tracepoint:ifiq:process
{
	@process = lhist(arg0, 0, 100, 10);
}
```
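With dt(4) enabled (kern.allowdt=1), the script can be saved to a file
and run as, e.g., `btrace ifq.bt` while the forwarding test is running
(the file name is arbitrary).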

The btrace results are as follows.

```
@start:
[1, 2)        233557 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
@process:
[0, 10)         5341 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[10, 20)        1316 |@@@@@@@@@@@@                                        |
[20, 30)         893 |@@@@@@@@                                            |
[30, 40)         554 |@@@@@                                               |
[40, 50)         717 |@@@@@@                                              |
[50, 60)          83 |                                                    |
[60, 70)          23 |                                                    |
[70, 80)         168 |@                                                   |
[80, 90)         112 |@                                                   |
[90, 100)       1004 |@@@@@@@@@                                           |
```

This means that the TX taskq processes only 1 packet at a time.
Since ifq_len returns 1, which is less than the default if_txmit, the
TX taskq is always kicked by the following code, and an idle CPU is
often woken up by an IPI.

```
void
ifq_start(struct ifqueue *ifq)
{
	if (ifq_len(ifq) >= min(ifq->ifq_if->if_txmit, ifq->ifq_maxlen)) {
		task_del(ifq->ifq_softnet, &ifq->ifq_bundle);
		ifq_run_start(ifq);
	} else
		task_add(ifq->ifq_softnet, &ifq->ifq_bundle);
}
```

If ifq_len returns a value greater than or equal to if_txmit, the TX
taskq isn't kicked and ifq_run_start is called directly in the sending
context, so no idle CPU has to be woken up.

So I set if_txmit = 1 in the iavf driver, which makes ifq_start always
take the direct dispatch path. The packet forwarding performance becomes
stable; the average is around 800k pps.

framesize|0G  1G   2G   3G   4G   5G   6G   7G   8G   9G   10Gbps
---------+----+----+----+----+----+----+----+----+----+----+
      64 |###                                                  440.85Mbps,   861035/14880952pps,   5.79%
     128 |#####                                                883.72Mbps,   863011/ 8445945pps,  10.22%
     512 |#################                                   3308.29Mbps,   807687/ 2349624pps,  34.38%
    1024 |###################################                 6895.36Mbps,   841719/ 1197318pps,  70.30%
    1280 |############################################        8623.26Mbps,   842115/  961538pps,  87.58%
    1408 |#################################################   9619.30Mbps,   853986/  875350pps,  97.56%
    1518 |############################################        8636.21Mbps,   711150/  812743pps,  87.50%

OK?

diff --git a/sys/dev/pci/if_iavf.c b/sys/dev/pci/if_iavf.c
index 204dbfc2637..bcf345de9ec 100644
--- a/sys/dev/pci/if_iavf.c
+++ b/sys/dev/pci/if_iavf.c
@@ -1052,6 +1052,7 @@ iavf_attach(struct device *parent, struct device *self, void *aux)
 	ifp->if_softc = sc;
 	ifp->if_flags = IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST;
 	ifp->if_xflags = IFXF_MPSAFE;
+	ifp->if_txmit = 1;
 	ifp->if_ioctl = iavf_ioctl;
 	ifp->if_qstart = iavf_start;
 	ifp->if_watchdog = iavf_watchdog;

-- 
Yuichiro NAITO (naito.yuichiro@gmail.com)