Re: [opensuse] Help! RO File System Lock down OpenSUSE 10.2



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

LLLActive@xxxxxxx wrote:
On Wed 20 June 2007 00:56, Darryl Gregorash wrote:
On 06/19/2007 03:11 PM, LLLActive@xxxxxxx wrote:
Hi all,

I must be quick, before the system locks down again.
I recalled this problem was reported last month -- thread "File system
becomes read-only", to which you contributed. No solution was posted to
the list, no one mentioned anything about a bug report, and I cannot
find any bug report summaries that look even remotely close to the problem.

Yes, you're right. I have no solution yet, and the problem occurs on three
very different systems. I mentioned them in that thread as well. It is
becoming alarming now.

<snip>

You'll need to give us a lot more information about your system hardware
(including the modules that are loaded for hard drive i/o),

Here are some HW details:
1. A 5-year-old system that had Win2K on it for 3 years and since then
SuSE 9.3 and all the following releases. Here the problem occurs with
OpenSUSE 10.2. It has an ASUS mobo with an ATI Radeon graphics card.
2. A 2-year-old system that only had SuSE 10.0, which had the problem, and
now has WinXP without any problems. It's a Gigabyte mobo with an ATI Radeon
graphics card. I noticed that intensive file access by Evolution caused a
system lockup many times.
3. The latest, 1-year-old system showed the problem mainly with SUSE 10.0
and now with OpenSUSE 10.2. An identical system has SuSE 10.1, where the
problem has not occurred so far. It is a Gigabyte GA-K8N-SLi mobo with an
nVidia GeForce 7600 GS.

The difference between the lockups of SUSE 10.0 and OpenSUSE 10.2 is that
10.0 did not allow any access to the system at all: a complete
lockdown - dead - and only a reset got it unlocked. With OpenSUSE 10.2, all
applications report RO FS errors, but the system can still be rebooted or
shut down normally.
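
To confirm from a terminal which filesystems have actually gone read-only, something like the following could be used. It is only a sketch: it assumes the usual Linux /proc/mounts layout (device, mount point, type, comma-separated options), and the function name read_only_mounts is made up for illustration.

```python
def read_only_mounts(mounts_text):
    """Return the mount points whose option list contains a bare 'ro'.

    mounts_text is expected to look like the contents of /proc/mounts:
    device, mount point, fs type, comma-separated options, then two numbers.
    """
    ro = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        # Compare whole options so 'errors=remount-ro' is not a false hit.
        if "ro" in options:
            ro.append(mount_point)
    return ro
```

On an affected machine, read_only_mounts(open("/proc/mounts").read()) would list the mount points the kernel currently has mounted read-only.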

plus
information from /var/log/messages about what is happening when the
filesystem goes RO

I noticed this sort of message on two different systems
in /var/log/messages:

Jun 2 22:15:03 kakalapap kernel: hda: task_no_data_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 2 22:15:03 kakalapap kernel: hda: task_no_data_intr: error=0x04 { DriveStatusError }
Jun 2 22:15:03 kakalapap kernel: ide: failed opcode was: 0xef
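
For anyone wanting to compare logs across machines, here is a rough sketch that pulls IDE status/error lines of the kind shown above out of a syslog file. The regular expression is an assumption based only on these three sample lines, and the name find_ide_errors is hypothetical; other kernels or drivers may format their messages differently.

```python
import re

# Matches lines such as:
#   "... kernel: hda: task_no_data_intr: status=0x51 ..."
# Pattern inferred from the sample log lines only (an assumption).
IDE_ERR = re.compile(
    r"kernel: (hd[a-z]): (\w+): (status|error)=(0x[0-9a-fA-F]+)"
)

def find_ide_errors(lines):
    """Return (device, handler, field, value) for each matching log line."""
    hits = []
    for line in lines:
        m = IDE_ERR.search(line)
        if m:
            device, handler, field, value = m.groups()
            hits.append((device, handler, field, int(value, 16)))
    return hits
```

Passing an open file works, e.g. find_ide_errors(open("/var/log/messages", errors="replace")), and the timestamps of the hits can then be compared with the time the filesystem went read-only.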

For now, I need stability. I will follow Carl's suggestion and get all
partitions onto one FS type. I'm a little apprehensive about using ext3 or
XFS. Need I be? Patrick seems happy with ext3. Not sure what Carl uses.





I have had an on and off experience of this particular issue...

There seem to be three common elements for me...

i) Before the problem kicks in, network connections start going into
CLOSE_WAIT states...

ii) famd starts becoming a CPU hog. Files start being reported as in
use when they are not.

iii) Commands and applications start reporting segmentation faults.

The last two are detectable if one is lucky enough to have an active
terminal session available. These are probably symptoms of something at
a lower level doing something it should not, but because of the I/O
lockdown it is nearly impossible to monitor system status to identify
what is at fault.
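
Since the CLOSE_WAIT build-up shows up before the lockdown, it might be worth logging it while the box is still healthy. A minimal sketch, assuming the standard Linux /proc/net/tcp layout in which the fourth column is the socket state in hex (08 being CLOSE_WAIT); the function name count_close_wait is just for illustration.

```python
CLOSE_WAIT = 0x08  # TCP state code used in /proc/net/tcp

def count_close_wait(proc_net_tcp_text):
    """Count sockets in CLOSE_WAIT in text shaped like /proc/net/tcp."""
    count = 0
    # The first line is the column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        # fields: slot, local address, remote address, state, ...
        if len(fields) > 3 and int(fields[3], 16) == CLOSE_WAIT:
            count += 1
    return count
```

Running count_close_wait(open("/proc/net/tcp").read()) from cron every minute or so and appending the result to a file on a different disk would at least show whether the count climbs before the filesystem trouble starts.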

Oddly, these conditions were most recently associated with a couple of
server-side IMAP mail folders hitting about 7000 messages (laziness on my
part). Thunderbird started timing out. Restarting the IMAP server cleared
the problem for a short while, but after two or three restarts the system
became unstable with the above symptoms. Even after a full reboot it was
only a matter of time before the whole thing locked up.

Since I started keeping folder sizes under control (i.e. under 7000 or so)
this has not happened (looking for forest to touch :-)), which also proves
nothing BTW (except possibly that something is failing to recover
from an error condition).

The problem first occurred soon after upgrading to 10.2 and seemed to be
initially associated with a dud tape drive. After disconnecting the tape
drive I got stability for about 2-3 weeks before the problem returned.

Some other things suggested to me that the tape drive may not have been
at fault initially, and at some point I intend to conduct some
tests on the drive to determine whether it is really faulty (I don't have
a suitable test config at the moment).

The problem occurs only on a dual Opteron box with a 64-bit OS, not on my
32-bit machine. There is no SCSI on the box...

BTW, this is a ReiserFS-only box.


- --
==============================================================================
I have always wished that my computer would be as easy to use as my
telephone.
My wish has come true. I no longer know how to use my telephone.

Bjarne Stroustrup
==============================================================================
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFGeOYDasN0sSnLmgIRAs12AKDa5XzS4nImxlP5CVBGyduIMItFUwCgvsdT
sbBYlv1n0ooPF6+/rSrKL70=
=RcZc
-----END PGP SIGNATURE-----