-
Kristian Evensen authored
When measuring the throughput (iperf3 + TCP) while routing on a not-so-powerful device (Mediatek MT7621, 880MHz CPU), I noticed that I achieved significantly lower speeds with QMI-based modems than for example a USB LAN dongle. The CPU was saturated in all of my tests. With the dongle I got ~300 Mbit/s, while I only measured ~200 Mbit/s with the modems. All offloads, etc. were switched off for the dongle, and I configured the modems to use QMAP (16k aggregation). The tests with the dongle were performed in my local (gigabit) network, while the LTE network the modems were connected to delivers 700-800 Mbit/s. Profiling the kernel revealed the cause of the performance difference. In qmimux_rx_fixup(), an SKB is allocated for each packet contained in the URB. This SKB has too little headroom, causing the check in skb_cow() (called from ip_forward()) to fail. pskb_expand_head() is then called and the SKB is reallocated. In the output from perf, I see that a significant amount of time is spent in pskb_expand_head() + support functions. In order to ensure that the SKB has enough headroom, this commit increases the amount of memory allocated in qmimux_rx_fixup() by LL_MAX_HEADER. The reason for using LL_MAX_HEADER and not a more accurate value, is that we do not know the type of the outgoing network interface. After making this change, I achieve the same throughput with the modems as with the dongle. Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Link: https://lore.kernel.org/r/20210106122403.1321180-1-kristian.evensen@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
2e423387