iavf patch [4/4]: TX queue direct dispatch
After applying the previous 3 patches, while testing packet forwarding
via iavf interfaces, I found that the packet transfer performance is
unstable. In the worst case, the ipgen (*1) results are as follows.
*1: https://github.com/iij/ipgen
framesize|0G 1G 2G 3G 4G 5G 6G 7G 8G 9G 10Gbps
---------+----+----+----+----+----+----+----+----+----+----+
64 |## 297.60Mbps, 581259/14880952pps, 3.91%
128 |### 592.77Mbps, 578873/ 8445945pps, 6.85%
512 |############# 2406.01Mbps, 587406/ 2349624pps, 25.00%
1024 |####################### 4597.74Mbps, 561247/ 1197318pps, 46.88%
1280 |############################# 5764.50Mbps, 562939/ 961538pps, 58.55%
1408 |############################### 6181.42Mbps, 548777/ 875350pps, 62.69%
1518 |################################## 6785.59Mbps, 558761/ 812743pps, 68.75%
The best-case results are as follows.
framesize|0G 1G 2G 3G 4G 5G 6G 7G 8G 9G 10Gbps
---------+----+----+----+----+----+----+----+----+----+----+
64 |### 520.83Mbps, 1017253/14880952pps, 6.84%
128 |##### 945.95Mbps, 923776/ 8445945pps, 10.94%
512 |###################### 4325.96Mbps, 1056142/ 2349624pps, 44.95%
1024 |######################################### 8059.75Mbps, 983856/ 1197318pps, 82.17%
1280 |################################################## 9844.81Mbps, 961407/ 961538pps, 99.99%
1408 |############################################ 8625.06Mbps, 765719/ 875350pps, 87.48%
1518 |############################################### 9253.08Mbps, 761947/ 812743pps, 93.75%
These 2 cases have the same conditions: the same machine, the same NIC,
and the same kernel, just tested again.
During the testing, I see many IPIs in the `systat vm` output, as follows.
```
1 users Load 0.75 0.22 0.08 openiavf.yuisoft.co 15:45:33
memory totals (in KB) PAGING SWAPPING Interrupts
real virtual free in out in out 186338 total
Active 34600 34600 7584080 ops com0
All 505348 505348 16489452 pages mpi0
uhci0
Proc:r d s w Csw Trp Sys Int Sof Flt forks iavf0
2 54 330462 147 20281 101 69 fkppw 16218 iavf0:0
fksvm iavf0:1
5.0%Int 5.0%Spn 24.5%Sys 0.0%Usr 65.5%Idle pwait iavf0:3
| | | | | | | | | | | relck iavf1
|||@@============ rlkok iavf1:0
noram 4063 iavf1:2
Namei Sys-cache Proc-cache No-cache ndcpy iavf1:3
Calls hits % hits % miss % fltcp 3 vmx0:0
zfod 1 vmx0:1
cow vmx0:2
Disks sd0 cd0 67411 fmin 3 vmx0:3
seeks 89881 ftarg 523 clock
xfers itarg 165527 ipi
speed 2 wired
sec pdfre
pdscn
pzidl 502484 IPKTS
12 kmape 502499 OPKTS
```
I'm running OpenBSD -current on an ESXi virtual machine, so IPIs are
used to wake up idle CPUs. In a network driver, RX and TX queue
processing runs on the softnet taskq. I checked how many packets are
processed on these queues by adding a TRACEPOINT to the ifq_start and
ifiq_process functions. The ifq_start function kicks the TX taskq, and
the ifiq_process function is called in the RX taskq.
diff --git a/sys/net/ifq.c b/sys/net/ifq.c
index 3c3b141fb58..84913242965 100644
--- a/sys/net/ifq.c
+++ b/sys/net/ifq.c
@@ -121,6 +122,8 @@ ifq_serialize(struct ifqueue *ifq, struct task *t)
void
ifq_start(struct ifqueue *ifq)
{
+ TRACEPOINT(ifq, start, ifq_len(ifq));
+
if (ifq_len(ifq) >= min(ifq->ifq_if->if_txmit, ifq->ifq_maxlen)) {
task_del(ifq->ifq_softnet, &ifq->ifq_bundle);
ifq_run_start(ifq);
@@ -862,6 +865,8 @@ ifiq_process(void *arg)
ml_init(&ifiq->ifiq_ml);
mtx_leave(&ifiq->ifiq_mtx);
+ TRACEPOINT(ifiq, process, ml_len(&ml));
+
if_input_process(ifiq->ifiq_if, &ml);
}
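For completeness: as far as I understand the dt(4) static provider, such
tracepoints also have to be declared and registered in
sys/dev/dt/dt_prov_static.c before btrace can attach to them. A rough
sketch of what that could look like (macro and array names written from
memory and the argument type is assumed, so this is not part of the diff
above and may need adjusting to the actual tree):
```
/* sys/dev/dt/dt_prov_static.c: sketch only, adjust to the actual tree */

/* declare one-argument static probes for the two new tracepoints */
DT_STATIC_PROBE1(ifq, start, "unsigned int");
DT_STATIC_PROBE1(ifiq, process, "unsigned int");

/* and list them in the static probe table */
struct dt_probe *const dtps_static[] = {
	/* ... existing static probes ... */
	&_DT_STATIC_P(ifq, start),
	&_DT_STATIC_P(ifiq, process),
};
```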
The btrace script is as follows.
```
tracepoint:ifq:start
{
@start = lhist(arg0, 0, 10, 1);
}
tracepoint:ifiq:process
{
@process = lhist(arg0, 0, 100, 10);
}
```
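The script can be run with btrace(8) while traffic is being forwarded,
e.g. `btrace ifq.bt` (the file name here is just an example); the
aggregated maps are printed when btrace exits.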
The btrace results are as follows.
```
@start:
[1, 2) 233557 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
@process:
[0, 10) 5341 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[10, 20) 1316 |@@@@@@@@@@@@ |
[20, 30) 893 |@@@@@@@@ |
[30, 40) 554 |@@@@@ |
[40, 50) 717 |@@@@@@ |
[50, 60) 83 | |
[60, 70) 23 | |
[70, 80) 168 |@ |
[80, 90) 112 |@ |
[90, 100) 1004 |@@@@@@@@@ |
```
This means that the TX taskq processes only 1 packet at a time.
If ifq_len returns 1, the TX taskq is always kicked by the
following code, and an idle CPU will often be woken up.
```
void
ifq_start(struct ifqueue *ifq)
{
if (ifq_len(ifq) >= min(ifq->ifq_if->if_txmit, ifq->ifq_maxlen)) {
task_del(ifq->ifq_softnet, &ifq->ifq_bundle);
ifq_run_start(ifq);
} else
task_add(ifq->ifq_softnet, &ifq->ifq_bundle);
}
```
If ifq_len returns a value greater than or equal to if_txmit,
the TX taskq isn't kicked and ifq_run_start is dispatched directly.
So I set if_txmit = 1 in the iavf driver, and the packet forwarding
performance becomes stable. The average performance is around 800k pps.
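To spell out the effect (this is only an illustration, not part of the
patch): with if_txmit = 1 and ifq_maxlen >= 1,
min(ifq->ifq_if->if_txmit, ifq->ifq_maxlen) is 1, so ifq_start()
effectively becomes:
```
/*
 * Illustration only: what ifq_start() effectively does when
 * ifp->if_txmit == 1 (assuming ifq_maxlen >= 1).  A single queued
 * packet already satisfies the threshold, so the frame is transmitted
 * in the calling context and no softnet task (and no IPI to wake an
 * idle CPU) is needed.
 */
void
ifq_start(struct ifqueue *ifq)
{
	if (ifq_len(ifq) >= 1) {
		task_del(ifq->ifq_softnet, &ifq->ifq_bundle);
		ifq_run_start(ifq);
	} else
		task_add(ifq->ifq_softnet, &ifq->ifq_bundle);
}
```
The ipgen results with if_txmit = 1 are shown below.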
framesize|0G 1G 2G 3G 4G 5G 6G 7G 8G 9G 10Gbps
---------+----+----+----+----+----+----+----+----+----+----+
64 |### 440.85Mbps, 861035/14880952pps, 5.79%
128 |##### 883.72Mbps, 863011/ 8445945pps, 10.22%
512 |################# 3308.29Mbps, 807687/ 2349624pps, 34.38%
1024 |################################### 6895.36Mbps, 841719/ 1197318pps, 70.30%
1280 |############################################ 8623.26Mbps, 842115/ 961538pps, 87.58%
1408 |################################################# 9619.30Mbps, 853986/ 875350pps, 97.56%
1518 |############################################ 8636.21Mbps, 711150/ 812743pps, 87.50%
OK?
diff --git a/sys/dev/pci/if_iavf.c b/sys/dev/pci/if_iavf.c
index 204dbfc2637..bcf345de9ec 100644
--- a/sys/dev/pci/if_iavf.c
+++ b/sys/dev/pci/if_iavf.c
@@ -1052,6 +1052,7 @@ iavf_attach(struct device *parent, struct device *self, void *aux)
ifp->if_softc = sc;
ifp->if_flags = IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST;
ifp->if_xflags = IFXF_MPSAFE;
+ ifp->if_txmit = 1;
ifp->if_ioctl = iavf_ioctl;
ifp->if_qstart = iavf_start;
ifp->if_watchdog = iavf_watchdog;
--
Yuichiro NAITO (naito.yuichiro@gmail.com)