Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD



Peter Zijlstra wrote:

You say "critical resource isolation", but that is not the case. Consider
NFS over UDP: the remote side will not stop sending just because the
receiving socket code drops data due to OOM. Or consider IPsec or
compression, which can require reallocation. There is no "critical
resource isolation", since the reserved pool _must_ be used by everyone
in the kernel network stack.

The idea is to drop all !NFS packets (or, even more specifically, to keep
only those NFS packets that belong to the critical mount). Everybody
doing critical IO over layered networks like IPSec or other tunnel
constructs is asking for trouble - just DON'T do that.

The only problem with things like IPSec is renegotiation, which
can take up memory right at the time you don't have any extra
memory available.

Decrypting individual IPSec packets during normal operation and
then dropping the ones for non-critical sockets should work just
fine.

The problem is layered networks over TCP, where you have to
process the packets in-order and may have no choice but to hold
onto data for non-critical sockets, at least for a while.

Dropping these non-essential packets makes sure the reserve memory
doesn't get stuck in some random blocked user-space process, hence
you can make progress.

In short:
- every incoming packet needs to be received at the packet level
- when memory is low, we only deliver data to memory-critical sockets
- packets to other sockets get dropped, so the memory can be reused
  for receiving other packets, including the packets needed for the
  memory-critical sockets to make progress
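
The three points above can be sketched as a small user-space model. This
is only an illustration of the policy, not kernel code: the Socket class,
the memory_critical flag and the receive() function are hypothetical
names made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Socket:
    """Hypothetical stand-in for a socket with a per-socket critical flag."""
    name: str
    memory_critical: bool = False
    queue: list = field(default_factory=list)


def receive(packets, sockets, low_memory):
    """Process (dest, payload) packets; return the packets that were dropped."""
    dropped = []
    for dest, payload in packets:
        # Every packet is received and inspected first, so that we can
        # tell which socket it belongs to.
        sock = sockets.get(dest)
        if sock is None:
            dropped.append((dest, payload))
            continue
        # Under memory pressure, only memory-critical sockets get data.
        if low_memory and not sock.memory_critical:
            # Dropping here models freeing the buffer for the next packet.
            dropped.append((dest, payload))
            continue
        sock.queue.append(payload)
    return dropped
```

With low_memory=False every packet for a known socket is queued; with
low_memory=True only the sockets flagged as memory critical see any data,
and everything else is returned as dropped.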

Forwarding packets while in low memory mode should not be a problem
at all, since forwarded packets get freed quickly.

The memory pool for receiving packets does not need much accounting
of any kind, since every packet will end up coming from that pool
when normal allocations start failing. Maybe Evgeniy's allocator
can do something smarter internally, and mark skbuffs as MEMALLOC
when the number of available skbuffs is getting low?
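
One way to picture the "mark skbuffs as MEMALLOC when running low" idea
is the toy pool below. It is a sketch under my own assumptions, not a
description of Evgeniy's allocator: SkbPool, Skb, the reserve threshold
and the memalloc field are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Skb:
    """Hypothetical buffer; memalloc is set when it came from the reserve."""
    memalloc: bool


class SkbPool:
    """Toy fixed-size pool that flags buffers taken from its last `reserve` slots."""

    def __init__(self, total, reserve):
        self.free = total
        self.reserve = reserve

    def alloc(self):
        if self.free == 0:
            return None  # pool exhausted, the caller has to drop the packet
        # A buffer counts as reserve memory once the number of free
        # buffers has fallen to the reserve threshold or below.
        from_reserve = self.free <= self.reserve
        self.free -= 1
        return Skb(memalloc=from_reserve)

    def release(self, skb):
        self.free += 1
```

A receive path could then deliver a buffer with memalloc set only to a
memory critical socket and release it straight away otherwise.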

Part (most?) of the problem space is explained here:

http://linux-mm.org/NetworkStorageDeadlock

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan


