Re: running Linux with no swap space (but lots of RAM)



On Fri, 30 Nov 2007 16:13:46 +0100 Rainer Weikusat <rweikusat@xxxxxxxxxxx> wrote:
| phil-news-nospam@xxxxxxxx writes:
|> On Thu, 29 Nov 2007 15:55:33 +0100 Rainer Weikusat <rweikusat@xxxxxxxxxxx> wrote:
|> | phil-news-nospam@xxxxxxxx writes:
|> |
|> | [...]
|> |
|> |> | e> In that situation it is quite likely that you don't have "some
|> |> | e> other device" to swap to. That's probably why you are booting
|> |> | e> from flash in the first place.
|> |> |
|> |> | The problem in your new hypothetical is then that you have no device
|> |> | capable of tolerating paging I/O, not that the system is paging to
|> |> | disc.
|> |>
|> |> Please explain what you mean by "tolerating paging I/O".
|> |
|> | In plain words, this means 'only devices whose actual behaviour and
|> | limitations the OP understands even less well than he believes to
|> | understand disks'.
|> |
|> | [...]
|> |
|> |> A frequent scenario I see happens when the memory needed by user space
|> |> programs is below the capacity available, and would not have even begun
|> |> to swap anything. A program is run that will be doing a large amount of
|> |> I/O output, such as copying 100+ GB of files between filesystems. The I/O
|> |> buffering goes beyond just the pages that are free. The buffering logic
|> |> tries to buffer far more of those 100+ GB than needed to keep the writing
|> |> drive continuously busy or even to minimize head seeks.
|> |
|> | The purpose of the page cache is neither 'to keep the drive
|> | continuously busy' nor 'to minimize head seeks'. Both would be tasks the
|> | elevator (or I/O-scheduler) is supposed to accomplish.
|>
|> Some amount of caching is necessary to achieve such I/O scheduling.
|
| For fairly obvious reasons, I/O-scheduling can only take place if
| there is actually something to schedule.

And your point in this statement is what? That I should avoid doing any
I/O operations?


|> | Just for an informal test, I have just created an archive of all of my
|> | filesystem to /dev/null (~16G). The amount of memory allocated to
|> | 'buffers' and 'cached' (vmstat) peaked well below 95000K and 45000K,
|> | respectively. No paging activities occurred during creation of the
|> | archive.
|>
|> Well of course no paging activities occurred during creation of the archive.
|> This is a bogus test.
|
| It is a test which should result in the amount of buffered disk
| contents growing 'without bounds', because all those files need to be read
| into memory.

Reading does not create the same situation as writing out. What the
kernel might buffer for reading is not dirty, and is trivial to abandon.
What gets read into the process buffer is just swappable VM space.

A write operation to a real device is where things get hard on the system.
The write buffering is dirty. That is, the kernel can't just abandon it
under the premise it could read it in again later if needed. It must be
kept until it is written. At some point, the buffering of these writes
starts to take non-dirty pages away from other things. Later it adds more
pressure to force other dirty pages to be written out, even while these
very I/O pages need to be written out as well. Eventually the process
doing the write calls gets blocked. By that time a huge amount of RAM is
taken up by all these write buffers. And it seems a lot of them stay even
after they've been written to the device.

Writing to /dev/null doesn't incur this problem. Those writes complete
instantly.
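
To watch this happen during a large copy to a real device, one can sample
the Dirty and Writeback lines of /proc/meminfo. The following is only a
minimal sketch: the function name dirty_and_writeback_kb is made up for
illustration, and it assumes the usual "Name:   value kB" layout of that
file.

```python
def dirty_and_writeback_kb(meminfo_text):
    """Return (Dirty, Writeback) in kB from text in /proc/meminfo layout.

    Hypothetical helper for illustration; a field that is absent comes
    back as None.
    """
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields.get("Dirty"), fields.get("Writeback")


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            print(dirty_and_writeback_kb(f.read()))
    except OSError:
        print("no /proc/meminfo on this system")
```

Run in a loop while the copy is in progress, this should show how much
dirty data is waiting to be written at any moment. With /dev/null as the
target I would expect both numbers to stay small.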


|> | BTW, the by-and-far easiest path to personal happiness for you in this
|> | respect is to just misconfigure your system to your heart's content
|> | (it's yours, after all) instead of talking about hypotheses you have
|> | about situations which - for some strange reason - are not as
|> | generally reproducible as the generality of your inferences would
|> | require.
|>
|> Someone who thinks writing to /dev/null would result in lots of data being
|> queued for writing to the physical device is not someone I would care to
|> consider technical advice from.
|
| I didn't write that I assumed reading lots of disk files and writing
| their contents to /dev/null would result in lots of disk writes. From
| the point of view of the cache involved here, 'reading' and 'writing'
| make no difference: both cause memory pages to be filled with the
| contents of files which could later on be used to serve other read
| requests from memory or to collapse multiple writes to the same area
| of a file into a single disk write, because all except the last only
| wrote to the cached data.

Then you seem to be off track and talking about some other issue and not the
one I'm raising. I'm talking about what gets buffered in the kernel during
write operations originating at user processes going to real physical devices.


|> | It would still be more sensible to add RAM until you don't experience
|> | regular paging activities occurring for some unknown reason on your
|> | system and leave the virtual memory configuration as-is to deal with
|> | non-regular situations. Or consider reducing your working set.
|>
|> I already know that the amount of RAM to avoid the problem is too radical
|> to consider. And I have yet to find a mainboard that supports 2 CPUs and
|> 2 TB of RAM and fits in an ATX case using no more than 550 watts of power.
|>
|> Despite your bogus test, the reality is that even with a working set well
|> below the amount of actual RAM present, I/O writing (to real devices) will
|> gradually force out pages.
|
| Writing a 6.7G file on my 512M development machine did not do so. Not
| that a difference between reading and writing was to be expected in
| this respect.

Try for a 67G file.


|> A swapless system is a partial answer to this problem. My other
|> idea is to have even more RAM, set up a big RAMFS, and at system
|> initialization, copy /opt and /usr into there, and bind mount
|> that over the real /opt and /usr, in read-only mode. That would be another
|> 8GB of RAM, bringing my system that would be well configured at 4GB in the
|> usual way, up to 16GB (4GB for the original RAM need, 4GB more for swapless
|> RAM space, and 8GB for /opt and /usr).
|>
|> I really shouldn't have to structure the system to avoid this issue. There
|> should be tuning knobs that allow me to reserve some portion of RAM (not a
|> classification of particular RAM locations) to be used for process pages
|> (e.g. program and library text pages), whether copied-on-write or not, that
|> cannot under any circumstances ever be forced out due to lots of memory
|> being used for write() buffering ...
|
| The memory is used for content-caching under the assumption that this
| content will again be needed in future.

That assumption is reasonable within limits. But certain things clearly
are not going to match that assumption. And these limits need to also be
tunable. Suppose you do need to read that 6.7G file back in. On a 512M
system, you can't expect to have it all cached no matter what. More likely
the _last_ part of the file is what is cached given the order of writing.
If you then read the file back, you start at the beginning and have to read
it from the drive. That reading is likely to steal the non-dirty
pages of the end of the file before it gets to that end. So what has been
gained by caching _part_ of a file that was written?
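
The effect can be sketched with a toy model. This assumes a strict LRU
cache, which is a simplification and not how the kernel actually manages
its page lists; the function name and the page counts are made up for
illustration (think of one "page" as 100M, so 67 pages written through a
5 page cache roughly matches the 6.7G file on a 512M machine).

```python
from collections import OrderedDict


def sequential_reread_hits(file_pages, cache_pages):
    """Write pages 0..file_pages-1 in order, then read them back in order,
    through a strict-LRU cache holding cache_pages entries.
    Return how many of the reads were served from the cache."""
    cache = OrderedDict()

    def touch(page):
        hit = page in cache
        if hit:
            cache.move_to_end(page)        # mark as most recently used
        else:
            cache[page] = True
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict the least recently used
        return hit

    for page in range(file_pages):         # the write pass
        touch(page)
    return sum(touch(page) for page in range(file_pages))  # the read pass


if __name__ == "__main__":
    print(sequential_reread_hits(67, 5))   # file much larger than the cache
    print(sequential_reread_hits(4, 5))    # file fits in the cache
```

Under that assumption, once the file is larger than the cache the read
pass evicts the cached tail before it ever reaches it, so none of the
reads are served from memory. Only a file that fits entirely gets any
benefit.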

--
|---------------------------------------/----------------------------------|
| Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
| first name lower case at ipal.net / spamtrap-2007-12-02-1436@xxxxxxxx |
|------------------------------------/-------------------------------------|