Re: [RFC v1] hand off skb list to other cpu to submit to upper layer
- From: "Zhang, Yanmin" <yanmin_zhang@xxxxxxxxxxxxxxx>
- Date: Wed, 25 Feb 2009 15:20:23 +0800
On Wed, 2009-02-25 at 14:36 +0800, Herbert Xu wrote:
> Zhang, Yanmin <yanmin_zhang@xxxxxxxxxxxxxxx> wrote:
> > Subject: hand off skb list to other cpu to submit to upper layer
> > From: Zhang Yanmin <yanmin.zhang@xxxxxxxxxxxxxxx>
> >
> > Recently, I have been investigating an ip_forward performance issue with a 10G IXGBE NIC.
> > I run the test on 2 machines. Each machine has 2 10G NICs. The 1st one sends
> > packets with pktgen. The 2nd receives the packets on one NIC and forwards them out
> > through the 2nd NIC. As the NICs support multi-queue, I bind the queues to different logical
> > cpus of different physical cpus while considering cache sharing carefully.
> >
> > Compared with the sending speed of the 1st machine, the forwarding speed is not good, only
> > about 60% of the sending speed. As a matter of fact, the IXGBE driver starts NAPI when an
> > interrupt arrives. When ip_forward=1, the receiver collects a packet and forwards it out
> > immediately. So although IXGBE collects packets with NAPI, the forwarding has a large impact
> > on collection. As IXGBE runs very fast, it drops packets quickly. The better way is for the
> > receiving cpu to do nothing other than collect packets.
>
> This doesn't make sense. With multiqueue RX, every core should be
> working to receive its fraction of the traffic and forwarding them
> out.

Thanks for your comments.
I never said the core can't receive and forward packets at the same time.
I mean the performance isn't good.
> So you shouldn't have any idle cores to begin with. The fact
> that you do means that multiqueue RX hasn't maximised its utility,
> so you should tackle that instead of trying redirect traffic away
> from the cores that are receiving.

From Stephen's explanation, the packets are being sent with different SRC/DST address
pairs, by which the hardware delivers packets to different queues. We can't expect
the NIC to always put packets into the queues evenly.

The behavior is that IXGBE is very fast and a cpu can't collect packets in time if it
collects packets and forwards them at the same time. That causes IXGBE to drop packets.
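
To illustrate why hashing on SRC/DST address pairs does not guarantee an even spread, here is a minimal sketch. It is only a toy model: `zlib.crc32` stands in for the NIC's real receive hash, and the function names, addresses, and packet counts are made up for illustration.

```python
import random
import zlib


def queue_for_flow(src: str, dst: str, num_queues: int) -> int:
    """Map a SRC/DST address pair to an RX queue index.

    zlib.crc32 is only a stand-in for the hardware hash (an assumption).
    """
    key = f"{src}->{dst}".encode()
    return zlib.crc32(key) % num_queues


def packets_per_queue(flows, packets_per_flow, num_queues):
    """Count packets per queue when every flow sends the same number of packets."""
    load = [0] * num_queues
    for src, dst in flows:
        load[queue_for_flow(src, dst, num_queues)] += packets_per_flow
    return load


def random_flows(n, seed=0):
    """Generate n hypothetical SRC/DST address pairs (seeded, so repeatable)."""
    rng = random.Random(seed)
    return [
        (f"10.0.{rng.randrange(256)}.{rng.randrange(256)}",
         f"10.1.{rng.randrange(256)}.{rng.randrange(256)}")
        for _ in range(n)
    ]


if __name__ == "__main__":
    # Few flows tend to leave some queues much busier than others;
    # many flows even the load out.
    for n_flows in (8, 64, 4096):
        print(n_flows, "flows ->", packets_per_queue(random_flows(n_flows), 1000, 8))
```

With a single address pair, every packet lands on one queue, which is the extreme case of the imbalance described above.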
> Of course for NICs that don't support multiqueue RX, or where the
> number of RX queues is less than the number of cores, then a scheme
> like yours may be useful.

The IXGBE NIC does support a large number of RX queues. By default, it creates
CPU_NUM queues. But the performance is not good when we bind queues to
cpus evenly. One reason is cache miss/ping-pong. The forwarder machine has
2 physical cpus and every physical cpu has 8 logical threads. All 8 logical cpus
share the last level cache. In my ip_forward testing with pktgen, binding queues
to the 8 logical cpus of one physical cpu gives a 40% improvement over binding
queues to all 16 logical cpus. So the optimized scenario just needs the IXGBE driver
to create 8 queues.
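
A minimal sketch of how such a binding could be planned, assuming the 8 queue interrupts are numbered 64 to 71 and that logical cpus 0 to 7 sit on one physical cpu. Both are hypothetical; the real IRQ numbers come from /proc/interrupts and the real topology from /sys/devices/system/cpu/cpuN/topology/physical_package_id. The script only prints the masks in the hex format that /proc/irq/<N>/smp_affinity takes; it does not write anything.

```python
def affinity_mask(cpu: int) -> str:
    """Return a hex bitmask with only `cpu` set.

    Limited to cpu ids below 32, where the mask is a single hex word.
    """
    if not 0 <= cpu < 32:
        raise ValueError("this sketch only handles cpu ids 0..31")
    return format(1 << cpu, "x")


def plan_bindings(queue_irqs, cpus):
    """Assign queue IRQs round-robin over the given logical cpus.

    Returns a list of (irq, cpu, mask) tuples.
    """
    cpus = list(cpus)
    plan = []
    for i, irq in enumerate(queue_irqs):
        cpu = cpus[i % len(cpus)]
        plan.append((irq, cpu, affinity_mask(cpu)))
    return plan


if __name__ == "__main__":
    hypothetical_irqs = range(64, 72)   # assumed IRQ numbers of the 8 RX queues
    package0_cpus = range(8)            # assumed logical cpus sharing one last level cache
    for irq, cpu, mask in plan_bindings(hypothetical_irqs, package0_cpus):
        print(f"echo {mask} > /proc/irq/{irq}/smp_affinity   # queue irq {irq} -> cpu {cpu}")
```

Keeping all queues inside one package is the point of the layout: the cpus that collect and forward the same packets then share the last level cache.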
If a machine has a couple of NICs and every NIC has CPU_NUM queues,
binding them evenly might cause more cache-miss/ping-pong. I didn't test
the scenario with multiple receiving NICs as I couldn't get enough hardware.
Yanmin