Ubuntu ext4 file system hangs/slow at random times



Hello folks -

I have an unusual problem on some Ubuntu VMs that seems to happen randomly.
I've seen it on maybe 3-4 different VMs at this point, on about 8 occurrences
in the past couple of months. The only way out of the issue is to power
cycle the VM.

One of the symptoms is very high system cpu usage with low I/O usage.

Software: Ubuntu 10.04.3 64-bit (2.6.32-37-server)
File system: ext4 w/LVM

Hypervisor: ESX 4.1 Update 2
Storage: the volume can either be a VMFS volume or a raw device mapped LUN
from our fibre channel SAN (3PAR F200); all storage is thin provisioned.

Drivers: Using the paravirtualized SCSI adapters in ESX

The behavior is that at a random time the ext4 file system seems to get stuck.
Any process accessing the file system gets really slow access, and gets stuck
in a 'D' state. Underlying I/O performance is good with both service times
and average wait times under 1 millisecond. In one situation at least I tried
to do a tail on a sub 100 byte file during this sort of behavior and it hung
immediately.
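
For anyone wanting to check for the same symptom, here is a minimal sketch of how the processes stuck in 'D' state could be listed by reading /proc directly. The function names are made up for illustration, and it assumes the standard Linux /proc/<pid>/stat layout (pid, command name in parentheses, then the state letter).

```python
import os

def parse_stat_state(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split around the first '(' and the last ')'.
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    pid = int(stat_text[:left].strip())
    comm = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return pid, comm, state

def d_state_processes(proc_root="/proc"):
    """List (pid, comm) for every process in uninterruptible sleep ('D')."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_state(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited mid-scan, or the entry is unreadable
        if state == "D":
            stuck.append((pid, comm))
    return stuck
```

Calling d_state_processes() a few times, some seconds apart, would show whether the same processes stay stuck or whether they are only passing through 'D' briefly.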

Kernel dumps out tons of messages saying things were waiting longer than 120
seconds to run.
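
A rough sketch of how those kernel messages could be pulled out of dmesg or kern.log output and summarized follows. It assumes the messages have the usual hung-task form "INFO: task NAME:PID blocked for more than N seconds."; the exact wording on a given kernel may differ, and the function name is hypothetical.

```python
import re

# Assumed message format; adjust the pattern if the kernel words it differently.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)

def hung_tasks(log_lines):
    """Return (comm, pid, seconds) for each hung-task warning in log_lines."""
    found = []
    for line in log_lines:
        match = HUNG_TASK_RE.search(line)
        if match:
            found.append(
                (match.group("comm"),
                 int(match.group("pid")),
                 int(match.group("secs")))
            )
    return found
```

Feeding it the lines of a saved dmesg capture gives a quick list of which tasks were reported as blocked, which makes it easier to compare one occurrence with another.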

I have spent at least a couple of hours searching on this topic but have not found
much information. I enabled ext4 event tracing via
/sys/kernel/debug/tracing/set_event and those events are here:

http://yehat.aphroland.org/files/ubuntu/

I also put a snapshot of iostat output there.
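
For reference, enabling the ext4 trace events mentioned above amounts to writing an event pattern into the set_event file. The sketch below is one way to do that from Python; it assumes debugfs is mounted at /sys/kernel/debug, that the script runs as root, and that the pattern "ext4:*" selects all ext4 events. The function name is made up for illustration.

```python
import os

TRACING_DIR = "/sys/kernel/debug/tracing"  # assumes debugfs is mounted here

def enable_events(pattern="ext4:*", tracing_dir=TRACING_DIR):
    """Write an event pattern to set_event, replacing any earlier selection."""
    # Opening in "w" mode behaves like shell '>' redirection: the previous
    # event selection is cleared before the new pattern is written.
    with open(os.path.join(tracing_dir, "set_event"), "w") as f:
        f.write(pattern + "\n")
```

After calling enable_events(), the captured events should appear in the trace file in the same directory.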

I/O activity that has triggered this has varied:

- First time I saw it was when I was doing a parallel rsync of a few hundred gigs
of data. I thought I had worked around it by disabling file system barriers.
I noticed that at least in the RHEL 6.0 technical notes they require barriers
to be disabled when running on enterprise storage (which we are). Our storage has
mirrored cache and is battery backed.
This volume was a raw device map.
- Basic log rotation of medium sized log files from a VMFS-based ext4 volume to
a NFS volume
- RRDtool activity from cacti to a raw device mapped volume
- Basic Splunk search/indexer activity (what I saw today).

That's 3 different systems that I can think of off the top of my head. In every case the
VM in question has at least two different virtual disks, and only one of the
virtual disks is affected. The other one (root partition) is not.

Now that I know disabling barriers doesn't help, I have moved two of the three
systems to ext3 instead (didn't re-format, just remounted). I'm not sure whether
or not Ubuntu uses barriers by default on ext3 as well or just ext4.
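
One way to see which barrier option a mounted ext3/ext4 file system reports is to look at its options in /proc/mounts. The sketch below parses that format; the function name is hypothetical, and whether a kernel lists the barrier setting at all when it is left at the default is an assumption to verify, so a result of None only means the option is not shown.

```python
def barrier_settings(mounts_text):
    """Map mount point -> (fstype, barrier option or None) for ext3/ext4."""
    result = {}
    for line in mounts_text.splitlines():
        # /proc/mounts fields: device, mount point, fstype, options, dump, pass
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("ext3", "ext4"):
            continue
        barrier = None
        for opt in fields[3].split(","):
            if opt in ("barrier", "nobarrier") or opt.startswith("barrier="):
                barrier = opt
        result[fields[1]] = (fields[2], barrier)
    return result
```

It can be run against the live table with barrier_settings(open("/proc/mounts").read()).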

If the solution is to stick to ext3 I have no problem with that - nothing in
ext4 that I need that I can think of really.

Though it would be nice if there was some fix to the issue.

I have about 150 VMs spread over 8 hosts connected to the same storage. All of
the VMs are managed the same way so they all have the same software and stuff.

If there is any other debug data I could gather that would be useful next
time this happens (assuming ext3 didn't fix it), please let me know.

This is my first experience with Ubuntu + ext4 + LVM on an enterprise storage
array. Not that I expected much of a different experience from CentOS/RHEL
(v4 and v5) + ext3 + LVM on the same array technology which I had been using
for years w/o issue.

thanks

nate


--
ubuntu-users mailing list
ubuntu-users@xxxxxxxxxxxxxxxx
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-users


