Re: RFC: issues concerning the next NAPI interface



On Fri, Aug 24, 2007 at 09:04:56PM +0200, Bodo Eggert wrote:
> Linas Vepstas <linas@xxxxxxxxxxxxxx> wrote:
>> On Fri, Aug 24, 2007 at 03:59:16PM +0200, Jan-Bernd Themann wrote:
>>> 3) On modern systems the incoming packets are processed very fast. Especially
>>> on SMP systems when we use multiple queues we process only a few packets
>>> per napi poll cycle. So NAPI does not work very well here and the interrupt
>>> rate is still high.
>>
>> worst-case network ping-pong app: send one
>> packet, wait for reply, send one packet, etc.
>
> Possible solution / possible brainfart:
>
> Introduce a timer, but don't start to use it to combine packets unless you
> receive n packets within the timeframe. If you receive less than m packets
> within one timeframe, stop using the timer. The system should now have a
> decent response time when the network is idle, and when the network is
> busy, nobody will complain about the latency.-)
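
To make the n/m timer idea above concrete, here is a rough user-space sketch in Python. It is not kernel code and not anything from the eHEA driver; the class name, the method name, and the values n_on=8 and m_off=2 are made up purely for illustration.

```python
class CoalesceTimer:
    """Hysteresis switch: use the coalescing timer only while traffic is busy."""

    def __init__(self, n_on, m_off):
        if m_off > n_on:
            raise ValueError("m_off must not exceed n_on")
        self.n_on = n_on      # packets per timeframe needed to start using the timer
        self.m_off = m_off    # fewer packets than this per timeframe stops it
        self.timer_on = False

    def end_of_window(self, packets_seen):
        """Call once per timeframe with the number of packets seen in it."""
        if not self.timer_on and packets_seen >= self.n_on:
            self.timer_on = True
        elif self.timer_on and packets_seen < self.m_off:
            self.timer_on = False
        return self.timer_on


# Hypothetical thresholds and traffic, chosen only to show the switching.
ctl = CoalesceTimer(n_on=8, m_off=2)
modes = [ctl.end_of_window(count) for count in [1, 9, 5, 3, 1, 0]]
```

Keeping m below n gives the scheme some hysteresis, so a link hovering around one threshold does not flip the timer on and off every timeframe.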

Ohh, that was inspirational. Let me free-associate some wild ideas.

Suppose we keep a running average of the recent packet arrival rate.
Let's say it's 10 per millisecond ("typical" for a gigabit eth running
flat-out). If we could poll the driver at a rate of 10-20 per
millisecond (i.e. letting the OS do other useful work for 0.05 millisec),
then we could potentially service the card without ever having to enable
interrupts on the card, and without hurting latency.

If the packet arrival rate becomes slow enough, we go back to an
interrupt-driven scheme (to keep latency down).
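
A quick sketch of what I mean, as a user-space Python model rather than driver code. The exponentially weighted average, the 5 packets/ms threshold, and all the names here are assumptions picked for illustration; nothing above fixes how the average is kept or where the cutoff sits.

```python
class RateTracker:
    """Running average of the packet arrival rate, in packets per millisecond."""

    def __init__(self, alpha=0.5, poll_threshold=5.0):
        self.alpha = alpha                    # weight given to the newest sample
        self.poll_threshold = poll_threshold  # hypothetical cutoff, packets/ms
        self.rate = 0.0

    def update(self, packets, elapsed_ms):
        """Fold one measurement interval into the running average."""
        sample = packets / elapsed_ms
        self.rate += self.alpha * (sample - self.rate)
        return self.rate

    def mode(self):
        """Poll while the link is busy, fall back to interrupts when it is not."""
        return "poll" if self.rate >= self.poll_threshold else "interrupt"


def poll_interval_ms(rate_per_ms, oversample=2.0):
    """Time between polls when polling at `oversample` times the arrival rate."""
    if rate_per_ms <= 0:
        return None  # no traffic: nothing sensible to poll for
    return 1.0 / (rate_per_ms * oversample)


tracker = RateTracker()
tracker.update(packets=10, elapsed_ms=1.0)   # busy interval
busy_mode = tracker.mode()
tracker.update(packets=0, elapsed_ms=1.0)    # link goes quiet
quiet_mode = tracker.mode()
```

With 10 packets per millisecond and polling at twice that rate, poll_interval_ms(10.0) comes out to the 0.05 millisec mentioned above.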

The main problem here is that, even for HZ=1000 machines, this amounts
to 10-20 polls per jiffy, which, if implemented in the kernel, requires
using the high-resolution timers. And, umm, don't the HR timers require
a CPU timer interrupt to make them go? So it's not clear that this is much
of a win.
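
The arithmetic behind that, spelled out in a few lines of Python. The function names are mine, and the only inputs are the 10-20 polls per millisecond and the HZ values discussed here.

```python
def polls_per_jiffy(polls_per_ms, hz):
    """Number of polls that must fit into one jiffy at a given HZ."""
    jiffy_ms = 1000.0 / hz          # length of one jiffy in milliseconds
    return polls_per_ms * jiffy_ms


def hz_for_one_poll_per_jiffy(polls_per_ms):
    """HZ needed if the regular tick alone were to drive the polling."""
    return polls_per_ms * 1000


low = polls_per_jiffy(10, hz=1000)    # 10 polls in every jiffy
high = polls_per_jiffy(20, hz=1000)   # 20 polls in every jiffy
```

Turned around, driving 10-20 polls per millisecond from the tick alone would take an HZ of 10000 to 20000.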

The eHEA is a 10 gigabit device, so it can expect 80-100 packets per
millisecond for large packets, and even more, say 1K packets per
millisec, for small packets. (Even the spec for my 1Gb spidernet card
claims its internal rate is 1M packets/sec.)

Another possibility is to set HZ to 5000 or 20000 or something humongous
... after all, CPUs are now faster! But, since this might be wasteful,
maybe we could make HZ be dynamically variable: have high HZ rates when
there's lots of network/disk activity, and low HZ rates when not. That
means a non-constant jiffy.

If all drivers used interrupt mitigation, then the variable, high-frequency
jiffy could take their place, and be more "fair" to everyone.
Most drivers would be polled most of the time when they're busy, and
only use interrupts when they're not.

--linas


