Re: Generate NMI to crash a hung system...



spike1@xxxxxxxxxxxxx wrote:
The Natural Philosopher <a@xxx> did eloquently scribble:
Have I missed anything out here? I did this on another couple of RHEL3
test boxes and got a lovely big vmcore file of about 4 gig on my
netdump server, but I'm getting nothing on the server I actually WANT a
crashdump from.

Thanks in advance - Lee


If it's crashed that badly, it may well no longer be able to dump anything to a file system.

Indeed...
In fact, the only time I've seen that kind of crash (able to type a username and
password, which then locks) is when it's the filesystem or hard disk driver
itself that's crashed. The machine continues to perform functions that are
currently running in memory without needing to access the hard disk, but any
attempt to read or write to the disk results in that process entering a D
state (as shown in top/ps).
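
For anyone wanting to check for that symptom without eyeballing top, here is a minimal sketch that lists processes currently in D state by reading /proc/<pid>/stat. It assumes a Linux /proc and Python 3; the function names parse_state and d_state_processes are made up for illustration. Bear in mind that on a box whose disk driver is already wedged, starting the interpreter may itself block, so this is more useful on a machine that is degrading than on one that has fully hung.

```python
import os


def parse_state(stat_line):
    """Return (pid, comm, state) from the text of one /proc/<pid>/stat file."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' rather than on whitespace.
    left, right = stat_line.rsplit(")", 1)
    pid_str, comm = left.split("(", 1)
    return int(pid_str), comm, right.split()[0]


def d_state_processes(proc_root="/proc"):
    """Return a list of (pid, comm) for processes in uninterruptible sleep."""
    found = []
    if not os.path.isdir(proc_root):
        return found
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_state(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited (or stat was unreadable) mid-scan
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)
```

A long and growing list here, especially of processes that never leave D state, would be consistent with the filesystem or disk driver theory.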

If you can connect a terminal emulator to the serial port and log in there, I
think you can get any kernel oopses and panics directed to it rather than a
console you can't read or a file it can't write.

If this is the result of a filesystem/hard disk driver fault, I'd run your
hard disks through a disk tester too; it could be a hardware fault.

I think the fact that it's doing it on more than one server negates that possibility - or renders it unlikely in the extreme.

As I said, my experience is usually that the process cannot FORK, usually due to insufficient reserved kernel memory to support the header tables for a new process.

In the bad old days these were boot time parameters set up in config files. Since then a lot has been done to automagically remove the need for manual memory tuning. I suspect the OP is running something like database applications, or large server apps, which can result in huge process tables. However, how one tunes a modern Linux setup I do not know. Later developments seem to be more about limiting processes to avoid DoS attacks (fork bombs) rather than ensuring a large multiuser app can actually run effectively.
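
For what it's worth, a rough way to see the process-count limits on a current Linux box is sketched below. It assumes Python 3 and a kernel that exposes /proc/sys/kernel/threads-max and /proc/sys/kernel/pid_max (older kernels may lack the latter); the helper names read_int, fmt_limit and process_limits are invented for this example. It only reports the limits. It does not show whether a failing fork is actually hitting one of them.

```python
import resource


def read_int(path):
    """Read the first integer from a file, or return None if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def fmt_limit(value):
    """Render an rlimit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def process_limits():
    """Collect the per-user and system-wide limits on process creation."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        # Per-user cap on processes (what 'ulimit -u' reports).
        "nproc_soft": fmt_limit(soft),
        "nproc_hard": fmt_limit(hard),
        # System-wide cap on the number of tasks the kernel will create.
        "threads_max": read_int("/proc/sys/kernel/threads-max"),
        # Largest PID value the kernel will hand out before wrapping.
        "pid_max": read_int("/proc/sys/kernel/pid_max"),
    }


if __name__ == "__main__":
    for name, value in process_limits().items():
        print(name, value)
```

If the per-user figure is small next to the number of processes a big database or server app spawns, that would fit the "cannot fork" symptom.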


