XFS bug or hard disk error?

From: Rui Pedro Mendes Salgueiro (rps_at_rena.mat.uc.pt)
Date: 01/26/05


Date: 26 Jan 2005 16:25:39 -0000


One month ago (Dec 21 12:26:05 2004) I wrote:
> I have a server with Suse Linux 9.1 and a couple of disks with XFS
> filesystems. Lately it has been crashing with this or a similar error:

> Message from syslogd@xxx at Mon Dec 20 20:47:20 2004 ...
> xxx kernel: xfs_iget_core: ambiguous vns: vp/0xdedb9080, invp/0xf67c5980

> This error happens when a backup is being made, that is, when the
> whole disk tree is (probably) being traversed.

> It doesn't show any disk error on the console (it just freezes), but
> IIRC, the last time this happened the disk failed completely soon after.

> I am thinking of replacing the disk tonight, but anyway I would like
> someone to confirm whether this error is due to a hardware error.

I replaced the disk, but the problems continued.
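
For anyone who wants to check their own logs for the same message, here is
a minimal Python sketch. It assumes the log lines look like the single
sample quoted above; the function name find_ambiguous_vns and the regular
expression are hypothetical illustrations, not part of XFS or any tool.

```python
import re

# Pattern for the kernel message quoted above. The two hex values are the
# pointers printed after "vp/" and "invp/". The exact layout is an
# assumption based on that one sample line.
AMBIGUOUS_VNS = re.compile(
    r"xfs_iget_core: ambiguous vns: vp/(0x[0-9a-fA-F]+), invp/(0x[0-9a-fA-F]+)"
)

def find_ambiguous_vns(lines):
    """Return (line_number, vp, invp) for every matching log line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = AMBIGUOUS_VNS.search(line)
        if match:
            hits.append((number, match.group(1), match.group(2)))
    return hits

if __name__ == "__main__":
    sample = [
        "xxx kernel: some unrelated message",
        "xxx kernel: xfs_iget_core: ambiguous vns: vp/0xdedb9080, invp/0xf67c5980",
    ]
    for number, vp, invp in find_ambiguous_vns(sample):
        print(f"line {number}: vp={vp} invp={invp}")
```

Feeding it the lines of the system log (wherever syslogd writes kernel
messages on the machine in question) would show whether the message keeps
appearing after a hardware change.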

At the time I had googled for this problem but didn't find anything
relevant. Today I found the XFS web page
        http://linux-xfs.sgi.com/projects/xfs/

and searching on that page I found several people with the same
problem. I concluded that this is an old bug that seems to be
caused by an interaction between XFS and NFS in an SMP kernel.
I suppose it is hard to track down and so is still not solved.
Or maybe it was solved only recently (I am using Suse 9.1).

On the XFS page I found this recipe to reproduce the problem
(I haven't tried it yet):

http://bugme.osdl.org/show_bug.cgi?id=870

------- Additional Comment #4 From Robbie Williamson 2003-12-11 12:14 -------

How To Reproduce:
1) Download the latest LTP testsuite
        http://ltp.sf.net/nfs

2) Build and install the testsuite
3) Make sure the NFS server daemons are running
4) Export an XFS filesystem to be used for testing, globally, with root allowed.
   ex: /mnt/xfs *(sync,rw,no_root_squash)
5) Change directory to where the LTP is installed
6) Change directory to testcases/bin/
7) Execute 'nfs_fsstress.sh' and follow the prompts.
   a) Enter your hostname as the server
   b) Enter the export filesystem name, i.e. /mnt/xfs
   c) Enter "1" for the number of hours to execute.

The oops should occur within 30 minutes or so.

------- -------
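
Step 4 of the recipe is the easiest one to get wrong, since the option list
in /etc/exports must not contain whitespace. A minimal Python sketch for
formatting such an entry follows; the helper name exports_line is
hypothetical, and the defaults are simply the values from the recipe above.

```python
def exports_line(path, client="*", options=("sync", "rw", "no_root_squash")):
    """Format one /etc/exports entry; options are comma-joined with no spaces."""
    for opt in options:
        # Reject empty options and options containing whitespace, since
        # whitespace inside the parentheses is not allowed in the exports file.
        if not opt or any(ch.isspace() for ch in opt):
            raise ValueError(f"invalid export option: {opt!r}")
    return f"{path} {client}({','.join(options)})"

if __name__ == "__main__":
    # Entry for the test filesystem named in the recipe.
    print(exports_line("/mnt/xfs"))
```

The resulting line would still have to be added to /etc/exports and the
export table reloaded by hand; the sketch only builds the text.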

> I will probably still use XFS, because there is no dump for reiserfs.

I now think that I will change to ext3. It is a pity because xfsdump
and xfsrestore are quite sophisticated.

-- 
http://www.mat.uc.pt/~rps/
.pt is Portugal| `Whom the gods love die young'-Menander (342-292 BC)
        Europe |    Villeneuve 50-82, Toivonen 56-86, Senna 60-94

