huge bss segment and sbrk failure with rhel 32bit hugemem

From: Eric Taylor (et1_at_rocketship1.com)
Date: 08/11/05


Date: Thu, 11 Aug 2005 08:29:14 -0700

I am trying to push 32-bit Linux (Red Hat hugemem kernel) almost to its
limit for virtual memory use within a single process.

Sometimes it seems that sbrk is failing, but not consistently.

Here are the last two lines of /proc/pid/maps from a run where it works
and sbrk(0) ends up returning a value like fc7866a0:

08749000-fc787000 rw-p 00000000 00:00 0
fefea000-ff000000 rwxp fffeb000 00:00 0

I have a huge (really huge, approx. 3.8 GB) bss segment, created with the
following code:

 asm(" .comm hugebss,0xf4000000,0x1000" );
 asm(" .comm hugebssz,0x1000,0x1000");

which accounts for the segment beginning at 08749000.

[ I am doing this because gcc will not let me declare a single static
array over 2 GB. The directives above are equivalent to what gcc
generates for smaller static arrays. This part appears to work just fine. ]
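
As a sanity check on the numbers in the two map lines, here is a minimal
sketch (not part of the original program) that parses the "start-end"
prefix of a /proc/pid/maps line and computes the size of a mapping and the
gap up to the next one. The helper names parse_map_range and gap_between
are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: parse the "start-end" prefix of one line of
 * /proc/<pid>/maps. Returns 0 on success, -1 if the line does not parse. */
int parse_map_range(const char *line,
                    unsigned long long *start, unsigned long long *end)
{
    return sscanf(line, "%llx-%llx", start, end) == 2 ? 0 : -1;
}

/* Hypothetical helper: bytes between the end of the lower mapping and the
 * start of the upper one. Returns 0 if either line fails to parse or the
 * ranges touch or overlap. */
unsigned long long gap_between(const char *lower, const char *upper)
{
    unsigned long long s1, e1, s2, e2;

    if (parse_map_range(lower, &s1, &e1) != 0 ||
        parse_map_range(upper, &s2, &e2) != 0)
        return 0;
    return s2 > e1 ? s2 - e1 : 0;
}
```

For the two lines quoted above, the first mapping spans 0xf403e000 bytes
(0x3e000 more than the size of hugebss), and the gap up to the mapping at
fefea000 is 0x02863000 bytes, roughly 40 MB.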

I also have code that calls sbrk(0), then sbrk(100000), then sbrk(0) in
succession, and sometimes the third call returns the same value as the first.
(When it works: orig = fc76e000, after expand = fc7866a0.)
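
The three-call sequence can be sketched roughly as follows. This is a
minimal reconstruction, not the original program, and the helper name
probe_sbrk is hypothetical. It also checks the return value of the middle
call and prints errno, since sbrk signals failure by returning (void *)-1
and setting errno.

```c
#define _DEFAULT_SOURCE   /* expose sbrk() in glibc headers */
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: try to move the program break by incr bytes and
 * report what happened. Returns 0 on success, -1 if sbrk() failed. */
int probe_sbrk(intptr_t incr)
{
    void *before = sbrk(0);            /* current program break */

    errno = 0;
    void *ret = sbrk(incr);            /* old break, or (void *)-1 on error */
    if (ret == (void *)-1) {
        int saved = errno;             /* keep errno across fprintf */
        fprintf(stderr, "sbrk(%ld) failed: %s (break stays at %p)\n",
                (long)incr, strerror(saved), before);
        return -1;
    }

    void *after = sbrk(0);             /* break after the expansion */
    printf("orig = %p, after expand = %p\n", before, after);
    return 0;
}
```

Calling probe_sbrk(100000) at the start of main would correspond to the
working case above when the two printed values differ by 100000 (0x186a0),
as fc76e000 and fc7866a0 do.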

According to the docs, sbrk is just a wrapper for brk, and it should fail
only if we are out of memory, if brk is given a bad value, or if ulimit -d
is getting in the way. Apparently the same single error value is returned
for all three causes.

Since sbrk is called with 100000, I don't think a bad value applies, and
ulimit shows the limit as unlimited, so the only remaining reason would be
that the system does not have enough memory.
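
The limit that ulimit -d reports corresponds to RLIMIT_DATA, and it can
also be read from inside the failing process with getrlimit, which rules
out any difference between the shell and the program's environment. A
minimal sketch, assuming a helper named print_limit (hypothetical):

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: print the soft limit for one resource.
 * Returns 0 on success, -1 if getrlimit() fails. */
int print_limit(const char *name, int resource)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("%s: soft limit unlimited\n", name);
    else
        printf("%s: soft limit %llu bytes\n", name,
               (unsigned long long)rl.rlim_cur);
    return 0;
}
```

Calling print_limit("RLIMIT_DATA", RLIMIT_DATA) and
print_limit("RLIMIT_AS", RLIMIT_AS) right before the sbrk calls would show
the data-segment and address-space limits the process actually runs under.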

But this failure of sbrk occurs at the beginning of the program, so
the huge bss has not been touched yet and shouldn't have any physical
memory allocated for it. Isn't this a sort of on-demand segment that does
not need any physical pages until the memory is first touched, at which
point a page is allocated and zeroed out?
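
One way to check the on-demand assumption is to ask the kernel which pages
of a bss array are resident, using mincore. The sketch below is only an
illustration on a much smaller (64 MB) array; the names demo_bss,
count_resident_pages and touch_pages are hypothetical, and it assumes a
Linux system where mincore is available.

```c
#define _DEFAULT_SOURCE   /* expose mincore() in glibc headers */
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define DEMO_BYTES (64UL * 1024 * 1024)
static char demo_bss[DEMO_BYTES];      /* zero-initialised, so it is in .bss */

/* First page-aligned address inside demo_bss (mincore needs alignment). */
static char *demo_base(size_t page)
{
    uintptr_t p = (uintptr_t)demo_bss;
    return (char *)((p + page - 1) & ~(uintptr_t)(page - 1));
}

/* Number of resident pages in the aligned part of demo_bss, or -1 on error. */
long count_resident_pages(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *base = demo_base(page);
    size_t npages = (DEMO_BYTES - (size_t)(base - demo_bss)) / page;
    unsigned char *vec = malloc(npages);
    long resident = -1;

    if (vec == NULL)
        return -1;
    if (mincore(base, npages * page, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;    /* low bit set: page is resident */
    }
    free(vec);
    return resident;
}

/* Write one byte into each of the first n aligned pages (n is clamped). */
void touch_pages(size_t n)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    volatile char *p = demo_base(page);
    size_t max = DEMO_BYTES / page - 1;

    if (n > max)
        n = max;
    for (size_t i = 0; i < n; i++)
        p[i * page] = 1;
}
```

If the segment really is demand-paged, count_resident_pages should report
few or no pages before touch_pages is called and at least the touched pages
afterwards.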

(My systems all have 8 GB of RAM and 8 GB of swap, and nothing besides the
normal set of daemons was running when this failure occurred.)

I am running with /proc/version showing these settings:

Linux version 2.4.21-20.ELhugemem
(Red Hat Linux 3.2.3-20))
#2 SMP Mon Jan 3 1 09:28:50 PST 2005

Any ideas would be most appreciated.

Eric