Re: Unable to start new processes
- From: Chris MacDonald <chris@xxxxxxxxxxxxxxxxx>
- Date: Thu, 4 Nov 2010 17:20:42 -0700
Ok, sorry for the absolutely horrendous delay, but I've finally got a
few test machines set up to hopefully solve this.
See my original post for a description of the problem; it still
applies. I'm seeing this problem on multiple instances of the same
hardware. The machines are all D510MOs with 1GB of RAM and a 4GB USB
flash drive that hosts Ubuntu, previously 10.04 but now 10.10,
and the errors persist. I captured the error (pasted below) over the
serial port. I had seen it once before, and that time it occurred at a
different sector. I've restarted the machine and I'm sure it will
crash again within a day or two. I'm also setting up another machine
with the exact same hardware to see whether that fails too.
I'm starting to think this is a systemic hardware fault somewhere, but
if anyone knows their kernel debug-fu I'd be happy to give something a
try at my end to hopefully narrow the focus a bit.
[266929.048995] end_request: I/O error, dev sda, sector 776208
[266929.065740] Buffer I/O error on device sda1, logical block 96770
[266929.084033] Buffer I/O error on device sda1, logical block 96771
[266929.102321] Buffer I/O error on device sda1, logical block 96772
[266929.120692] end_request: I/O error, dev sda, sector 3490352
[266929.137686] Aborting journal on device sda1-8.
[266929.137744] EXT4-fs (sda1): ext4_da_writepages: jbd2_start: 8189
pages, ino 27933; err -30
[266929.137760] EXT4-fs (sda1): ext4_da_writepages: jbd2_start: 8168
pages, ino 27845; err -30
[266929.137770] EXT4-fs (sda1): ext4_da_writepages: jbd2_start: 8168
pages, ino 27951; err -30
[266929.137779] EXT4-fs (sda1): ext4_da_writepages: jbd2_start: 8168
pages, ino 27953; err -30
[266929.137787] EXT4-fs (sda1): ext4_da_writepages: jbd2_start: 8168
pages, ino 27956; err -30
[266929.276439] JBD2: I/O error detected when updating journal
superblock for sda1-8.
[266929.276542] EXT4-fs error (device sda1): ext4_journal_start_sb:
Detected aborted journal
[266929.276555] EXT4-fs (sda1): Remounting filesystem read-only
[266929.340709] journal commit I/O error
[266930.839068] EXT4-fs error (device sda1): ext4_find_entry: inode
#32023: (comm java) reading directory lblock 0
[266933.157820] sd 3:0:0:0: [sdb] Assuming drive cache: write through
[266933.176765] EXT4-fs error (device sda1): ext4_find_entry: inode
#312: (comm rsyslogd)
[266933.178520] sd 3:0:0:0: [sdb] Assuming drive cache: write through
[266933.222163] sd 3:0:0:0: [sdb] Assuming drive cache: write through
[266933.223117] EXT4-fs error (device sda1): ext4_find_entry: inode
#338: (comm udevd) reading directory lblock 0
[266971.032873] EXT4-fs error (device sda1): ext4_find_entry: inode
#32259: (comm postmaster) reading directory lblock 0
[266971.067016] EXT4-fs error (device sda1): ext4_find_entry: inode
#32259: (comm postmaster)
[266971.068947] EXT4-fs error (device sda1): ext4_find_entry: inode
#312: (comm rsyslogd) reading directory lblock 0
[266971.069077] EXT4-fs error (device sda1): ext4_find_entry: inode
#312: (comm rsyslogd) reading directory lblock 0
[266971.156148] EXT4-fs error (device sda1): ext4_find_entry: inode
#32018: (comm postmaster)
[266971.157334] EXT4-fs error (device sda1): ext4_find_entry: inode
#312: (comm rsyslogd) reading directory lblock 0
[266971.212704] EXT4-fs error (device sda1): ext4_find_entry: inode
#32259: (comm postmaster) reading directory lblock 0
[266973.229713] EXT4-fs error (device sda1): ext4_find_entry: inode
#2: (comm cron) reading directory lblock 0
[266973.259137] EXT4-fs error (device sda1): ext4_find_entry: inode
#6090: (comm cron) reading directory lblock 0
[267024.440346] EXT4-fs error (device sda1): ext4_find_entry: inode
#1611: (comm java) reading directory lblock 0
[267024.470643] EXT4-fs error (device sda1): ext4_find_entry: inode
#31906: (comm java) reading directory lblock 0
[267084.518307] EXT4-fs error (device sda1): ext4_find_entry: inode
#144807: (comm java) reading directory lblock 0
[267084.548999] EXT4-fs error (device sda1): ext4_find_entry: inode
#144802: (comm java) reading directory lblock 0
[267084.579665] EXT4-fs error (device sda1): ext4_find_entry: inode
#144801: (comm java) reading directory lblock 0
[267525.944731] EXT4-fs error (device sda1): ext4_find_entry: inode
#12: (comm ntpd) reading directory lblock 0
[270010.787208] EXT4-fs error (device sda1): ext4_find_entry: inode
#12: (comm ntpd) reading directory lblock 0
[270811.644198] EXT4-fs error (device sda1): ext4_find_entry: inode
#32245: (comm java) reading directory lblock 0
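For anyone who wants to tally a capture like the one above, here is a minimal Python sketch. It only assumes the two message shapes visible in the paste ("end_request: I/O error, dev X, sector N" and "(comm NAME)"); the function name summarize_log and both regexes are illustrative, not a general kernel log parser.

```python
import re
from collections import Counter

# Patterns modelled on the message shapes in the pasted log (assumption:
# other kernel versions may word these messages differently).
IO_ERROR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")
COMM = re.compile(r"\(comm ([^)]+)\)")


def summarize_log(text):
    """Return (failing sectors per device, error count per process name)."""
    sectors = {}
    for dev, sector in IO_ERROR.findall(text):
        sectors.setdefault(dev, []).append(int(sector))
    # Searching the whole text (not line by line) tolerates the mail
    # client's line wrapping, where "(comm ...)" lands on the next line.
    procs = Counter(COMM.findall(text))
    return sectors, procs
```

If the failing sectors differ from crash to crash and from machine to machine, that would be consistent with the suspicion of a systemic fault rather than one bad spot on one flash drive, though the sketch itself cannot prove either.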
On Tue, Aug 24, 2010 at 5:01 AM, Hakan Koseoglu <hakan@xxxxxxxxxxxx> wrote:
Chris,
On 24 August 2010 12:09, Karl Larsen <klarsen1@xxxxxxxxx> wrote:
> First let's take care of this. It appears you have a problem with
> ssh. Please give details on how you have set up ssh. You should have
> zero problems using Tomcat on the remote machine.
Wrong, wrong, so wrong it's stupid.
On 08/23/2010 09:31 PM, Chris MacDonald wrote:
> From my machine the problem manifests itself as an inability to
> request much in the way of data from the remote machine. For instance,
> when I SSH in (ssh -v) it opens a connection and attempts to negotiate
> a session (I get a response from the remote machine), but then the
> remote end promptly closes the connection before I get prompted for a
> password. Likewise for the running instance of Tomcat: I'll connect to
> the HTTP port and it will accept my connection, but before I get
> anything back it closes the connection on me. I can ping the remote
> machine and it shows ports as open; I just can't seem to get any data.
It looks like you cannot spawn any new processes. This can happen
for a couple of main reasons. The first is ulimits being
reached. A typical Ubuntu installation does not have any limits on the
amount of memory and processes a user can consume. You can check the
limits by executing "ulimit -a". With the information given, this
sounds like a memory leak where the server is starved and any new
processes are being killed. One other possibility is exceeding the
maximum number of open files. You can use various tools to check these.
My favourite is nmon; you can also use sar for checking CPU usage stats.
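As a rough companion to "ulimit -a", the same limits can be read from inside a process with Python's standard resource module. This is only a sketch: the function names current_limits and show are made up for illustration, and it reports the limits of the Python process itself, which may differ from those of the Tomcat user.

```python
import resource


def current_limits():
    """Return {label: (soft, hard)} for a few limits of this process."""
    names = {
        "max user processes": resource.RLIMIT_NPROC,
        "open files": resource.RLIMIT_NOFILE,
        "address space": resource.RLIMIT_AS,
    }
    return {label: resource.getrlimit(rlim) for label, rlim in names.items()}


def show(value):
    """Render a limit value the way ulimit does for 'no limit'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    for label, (soft, hard) in current_limits().items():
        print(f"{label}: soft={show(soft)} hard={show(hard)}")
```

To see the limits that actually apply to the service, it would have to run as the same user and from the same kind of session that starts Tomcat.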
The best course of action is to figure out what's running on your
server and how those processes behave over time. Nmon's capacity
planning will give you the necessary overview, although you might like
to collect more data.
One other thing to check is whether your applications are consuming too
many ports. You might like to have a look at your
net.ipv4.ip_local_port_range configuration. This is usually quite a
wide range, so if you are running out of ports you have another
problem, such as your processes not closing their ports after use.
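On Linux the net.ipv4.ip_local_port_range setting is exposed as /proc/sys/net/ipv4/ip_local_port_range, a single line holding the low and high port. A small sketch for turning that into a port count follows; the function names are illustrative, and the value used in the usage note is only an example, not a reading from the machine in question.

```python
PORT_RANGE_PATH = "/proc/sys/net/ipv4/ip_local_port_range"


def parse_port_range(text):
    """Parse 'low<whitespace>high' and return (low, high, number of ports)."""
    low, high = (int(field) for field in text.split())
    return low, high, high - low + 1


def read_port_range(path=PORT_RANGE_PATH):
    """Read and parse the current setting (Linux only)."""
    with open(path) as handle:
        return parse_port_range(handle.read())
```

For example, parse_port_range("32768\t61000") gives (32768, 61000, 28233), which would have to be compared against the number of sockets the applications actually hold open.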
Reducing the amount of memory allocated to Tomcat might be a starting
point, since that's the process most likely to be ballooning and
leaking. Also look for the OOM killer in the message files.
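A minimal sketch of that last check in Python, assuming the kernel reports OOM activity with text containing "oom-killer" or "Out of memory". The exact wording varies between kernel versions, so treat the pattern and the name find_oom_lines as assumptions to adjust.

```python
import re

# Assumed markers for OOM killer activity; adjust for your kernel version.
OOM_PATTERN = re.compile(r"oom-killer|Out of memory", re.IGNORECASE)


def find_oom_lines(lines):
    """Return the lines (without trailing newline) that mention the OOM killer."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]
```

It accepts any iterable of lines, so an open log file can be passed directly, for instance /var/log/kern.log on Ubuntu (the location may differ on other systems).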
--
Hakan (m1fcj) - http://www.hititgunesi.org
--
ubuntu-users mailing list
ubuntu-users@xxxxxxxxxxxxxxxx
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-users