Re: [PATCH 2/2] virtio_net: Improve the recv buffer allocation scheme



On Thu, Oct 09, 2008 at 11:55:59AM +1100, Rusty Russell wrote:

> There are three approaches we should investigate before adding yet another
> feature. Obviously, we can simply increase the number of ring entries.

That's not going to work so well as you need to increase the ring
size by MAX_SKB_FRAGS times to achieve the same level of effect.

Basically the current scheme is either going to suck at non-TSO
traffic or it's going to chew up too many resources.

> Secondly, we can put the virtio_net_hdr at the head of the skb data (this is
> also worth considering for xmit I think if we have headroom) and drop
> MAX_SKB_FRAGS which contains a gratuitous +2.

That's fine but having skb->data in the ring still means two
different kinds of memory in there and it sucks when you only
have 1500-byte packets.

> Thirdly, we can try to coalesce contiguous buffers. The page caching scheme
> we have might help here, I don't know. Maybe we should be explicitly trying
> to allocate higher orders.

That's not really the key problem here. The problem here is
that the scheme we're currently using in virtio-net is simply
broken when it comes to 1500-byte sized packets. Most of the
entries on the ring buffer go to waste.

We need a scheme that handles both 1500-byte packets and
64K-byte ones, without holding down 16M of memory per guest.
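
A rough way to see the trade-off is to compute how much memory the ring pins down under each policy. The sketch below is illustrative only: the 4 KiB page size, the 16 pages per 64K packet, and the 256-entry ring used in the note are assumptions, not values taken from the patch. They are just one combination that happens to reproduce the 16M figure.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; not taken from the patch. */
#define PAGE_SIZE_BYTES 4096u  /* assumption: 4 KiB pages */
#define PAGES_PER_64K   16u    /* assumption: pages needed for one 64 KiB packet */

/* Memory pinned when every ring entry is sized for the largest packet. */
size_t pinned_bytes_worst_case(size_t ring_entries)
{
    return ring_entries * PAGES_PER_64K * PAGE_SIZE_BYTES;
}

/* Memory pinned when every ring entry is a single page. */
size_t pinned_bytes_single_page(size_t ring_entries)
{
    return ring_entries * PAGE_SIZE_BYTES;
}

/*
 * Whole-number percentage of one entry's memory that carries payload
 * when a packet of payload_bytes occupies an entry of entry_bytes.
 */
unsigned utilisation_percent(size_t payload_bytes, size_t entry_bytes)
{
    assert(entry_bytes > 0);
    return (unsigned)(payload_bytes * 100 / entry_bytes);
}
```

With a hypothetical 256-entry ring, pinned_bytes_worst_case(256) comes to 16 MiB while pinned_bytes_single_page(256) is 1 MiB, and a 1500-byte packet uses only a few percent of a 64K-sized entry.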

> > The size of the logical buffer is
> > returned to the guest rather than the size of the individual smaller
> > buffers.
>
> That's a virtio transport breakage: can you use the standard virtio mechanism,
> just put the extended length or number of extra buffers inside the
> virtio_net_hdr?

Sure that sounds reasonable.
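
To make the suggestion concrete, here is a minimal sketch of what carrying the buffer count in the header could look like. Everything in it is hypothetical: the struct names, the num_extra_buffers field, and the logical_len() helper are invented for illustration and are not from the patch. The inner struct only mirrors the general shape of struct virtio_net_hdr. The point is that the transport keeps reporting each buffer's own length, and the receiver sums them using the count from the header.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in modeled on struct virtio_net_hdr. */
struct vnet_hdr_sketch {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
};

/* Hypothetical extension: the header says how many more buffers follow. */
struct vnet_hdr_multi_sketch {
    struct vnet_hdr_sketch hdr;
    uint16_t num_extra_buffers;
};

/*
 * Sum the per-buffer lengths reported by the transport for one packet.
 * used_lens[0] is the first buffer; the next num_extra_buffers entries
 * belong to the same packet. Returns 0 if the caller did not supply
 * enough lengths.
 */
size_t logical_len(const struct vnet_hdr_multi_sketch *h,
                   const uint32_t *used_lens, size_t n_lens)
{
    size_t needed = (size_t)h->num_extra_buffers + 1;
    size_t total = 0;
    size_t i;

    if (needed > n_lens)
        return 0;
    for (i = 0; i < needed; i++)
        total += used_lens[i];
    return total;
}
```

This keeps the transport's meaning of "length" unchanged, which is the property Rusty is asking for.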

> > Make use of this support by supplying single page receive buffers to
> > the host. On receive, we extract the virtio_net_hdr, copy 128 bytes of
> > the payload to the skb's linear data buffer and adjust the fragment
> > offset to point to the remaining data. This ensures proper alignment
> > and allows us to not use any paged data for small packets. If the
> > payload occupies multiple pages, we simply append those pages as
> > fragments and free the associated skbs.
> >
> > + char *p = page_address(skb_shinfo(skb)->frags[0].page);
> > ...
> > + memcpy(hdr, p, sizeof(*hdr));
> > + p += sizeof(*hdr);
>
> I think you need kmap_atomic() here to access the page. And yes, that will
> affect performance :(

No we don't. kmap would only be necessary for highmem which we
did not request.
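
For readers following the receive path in the quoted patch description, here is a userspace sketch of the first-page split: pull the header off the front, copy up to 128 bytes of payload into a linear area, and describe what is left as a fragment offset and length. It is a simulation under stated assumptions, not the driver code. The struct rx_result type and rx_split_first_page() are hypothetical names, and HDR_LEN is a stand-in for sizeof(struct virtio_net_hdr).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COPY_LEN 128u  /* from the patch description: bytes copied to linear data */
#define HDR_LEN  10u   /* assumption: stand-in for sizeof(struct virtio_net_hdr) */

/* Hypothetical result of splitting the first receive page. */
struct rx_result {
    uint8_t hdr[HDR_LEN];     /* copy of the header bytes */
    uint8_t linear[COPY_LEN]; /* payload copied into the linear area */
    size_t  linear_len;       /* number of valid bytes in linear[] */
    size_t  frag_offset;      /* where the remaining payload starts in the page */
    size_t  frag_len;         /* 0 means the page is not needed as a fragment */
};

/*
 * page points at the first receive buffer; used_len is the number of
 * bytes the host wrote into it (header plus payload in this page).
 */
struct rx_result rx_split_first_page(const uint8_t *page, size_t used_len)
{
    struct rx_result r;
    size_t payload_len, copy;

    assert(used_len >= HDR_LEN);
    memset(&r, 0, sizeof(r));

    /* Extract the header from the front of the page. */
    memcpy(r.hdr, page, HDR_LEN);

    /* Copy at most COPY_LEN bytes of payload into the linear area. */
    payload_len = used_len - HDR_LEN;
    copy = payload_len < COPY_LEN ? payload_len : COPY_LEN;
    memcpy(r.linear, page + HDR_LEN, copy);
    r.linear_len = copy;

    /* Whatever remains stays in the page and is referenced as a fragment. */
    r.frag_offset = HDR_LEN + copy;
    r.frag_len = payload_len - copy;
    return r;
}
```

A packet with 60 bytes of payload ends up entirely in the linear area with frag_len of 0, so no paged data is used. A 1500-byte payload leaves 1372 bytes in the page starting at offset 138.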

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@xxxxxxxxxxxxxxxxxxx>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt