Re: [PATCH 00/28] Swap over NFS -v16



On Friday February 29, a.p.zijlstra@xxxxxxxxx wrote:

The tx path is a bit fuzzed. I assume it has an upper limit, take a stab
at that upper limit and leave it at that.

It should be full of holes, and there is some work on writeout
throttling to fill some of them - but I haven't seen any lockups in this
area for a long long while.

I think this is very interesting and useful.

You seem to be saying that the write-throttling is enough to avoid any
starvation on the transmit path. i.e. the VM is limiting the amount
of dirty memory so that when we desperately need memory on the
writeout path we can always get it, without lots of careful
accounting.

So why doesn't this work on for the receive side? What - exactly - is
the difference.

I think the difference is that on the receive side we have to hand
out memory before we know how it will be used (i.e. whether it is for
a SK_MEMALLOC socket or not) and so emergency memory could get stuck
in some non-emergency usage.

So suppose we forgot about all the allocation tracking (that doesn't
seem to be needed on the send side so maybe isn't on the receive side)
and just focus on not letting emergency memory get used for the wrong
thing.

So: Have some global flag that says "we are into the emergency pool"
which gets set the first time an allocation has to dip below the low
water mark, and cleared when an allocation succeeds without needing to
dip that low.

Then whenever we have memory that might have been allocated from below
the watermark (i.e. an incoming packet) and we find out that it isn't
required for write-out (i.e. it gets attached to a socket for which
SK_MEMALLOC is not set) we test the global flag and if it is set, we
drop the packet and free the memory.

To clarify:
no new accounting
no new reservations
just "drop non-writeout packets when we are low on memory"

Is there any chance that could work reliably?

I call this the "Linus" approach because I remember reading somewhere
(that google cannot find for me) where Linus said he didn't think a
provably correct implementation was the way to go - just something
that made it very likely that we won't run out of memory at an awkward
time.

I guess my position is that any accounting that we do needs to have a
clear theoretical model underneath it so we can reason about it.
I cannot see the clear model in beneath the current code so I'm trying
to come up with models that seem to capture the important elements of
the code, and to explore them.

NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • [patch 2.6.12-rc3] dell_rbu: Resubmitting patch for new Dell BIOS update driver
    ... +The BIOS update is done by writing the new BIOS image in to contiguous physical ... +memory addressable by the BIOS. ... +The disadvantage of contiguous allocation is that it may not be always possible ... * break the BIOS image in to fixed sized packet chunks and each packet is written ...
    (Linux-Kernel)
  • Re: How to release heap memory that is marked as free
    ... As I said, fragmentation is a very serious problem, and one of the most serious problems ... my allocator was accused of using massive amounts of memory. ... I'm going to have to re-think the memory allocation that I'm ... process's 'working set'. ...
    (microsoft.public.vc.mfc)
  • Re: [PATCH 00/28] Swap over NFS -v16
    ... memory they can consume. ... So we need the extra (skb) ... included in the reserve? ... if the allocation had to dip into emergency reserves, ...
    (Linux-Kernel)
  • Re: Memory leak with CAsyncSocket::Create
    ... read my essay on how storage allocators work. ... Create method is consuming system memory that is not released back to ... The memory consumption is either shown as "Mem Usage" on the Task ... many levels of allocation going ...
    (microsoft.public.vc.mfc)
  • Re: OT: C++ overloading operators
    ... dynamic allocation, no matter how many "clever tricks" are used... ... though there's enough memory in the system, ... all these "flexible data types" map into CPU command ... The computing environment is completely ...
    (comp.dsp)