Re: FIO: kjournald blocked for more than 120 seconds



On Tue, Jun 17 2008, Lin Ming wrote:

On Tue, 2008-06-17 at 10:36 +0200, Jens Axboe wrote:
On Tue, Jun 17 2008, Zhang, Yanmin wrote:
-----Original Message-----
From: Jens Axboe [mailto:jens.axboe@xxxxxxxxxx]
Sent: Tuesday, June 17, 2008 3:30 AM
To: Lin, Ming M
Cc: Zhang, Yanmin; Linux Kernel Mailing List
Subject: Re: FIO: kjournald blocked for more than 120 seconds

On Mon, Jun 16 2008, Lin Ming wrote:
Hi, Jens

When running the FIO benchmark, kjournald is blocked for more than 120
seconds. A detailed root cause analysis and proposed solutions are below.

Any comments are appreciated.

Hardware Environment
---------------------
13 SEAGATE ST373307FC disks in a JBOD, connected by a QLogic ISP2312
Fibre Channel HBA.

Bug description
----------------
fio vsync random read, 4K, on 13 disks with 4 processes per disk, fio
global parameters as below,
Tested 4 IO schedulers; the issue is only seen with CFQ.

INFO: task kjournald:20558 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kjournald D ffff810010820978 6712 20558 2
ffff81022ddb1d10 0000000000000046 ffff81022e7baa10 ffffffff803ba6f2
ffff81022ecd0000 ffff8101e6dc9160 ffff81022ecd0348 000000008048b6cb
0000000000000086 ffff81022c4e8d30 0000000000000000 ffffffff80247537
Call Trace:
[<ffffffff803ba6f2>] kobject_get+0x12/0x17
The disks of my testing machine are tagged devices, so the CFQ idle
window is disabled. In other words, the active queue of a tagged
device (cfqd->hw_tag=1) never idles waiting for a new request.

This causes the active queue to be expired immediately when it is empty,
even though it has not used up its time slice. CFQ then selects the next
queue as the active queue.
In this test case, there are thousands of FIO read requests in the sync
queues, but only a few write requests from journal_write_commit_record
in the async queues.

On the other hand, all processes use the default IO class and priority.
They share the async queue for the same device, but each has its own
sync queue, so for each device there are 4 sync queues and just 1 async
queue.

So a sync queue has a much higher chance of being selected as the new
active queue than the async queue.

Sync queues do not idle and they are dispatched all the time. This leads
to many unfinished requests in the external queue, namely
cfqd->sync_flight > 0.

static int cfq_dispatch_requests (...) {
	....
	while ((cfqq = cfq_select_queue(cfqd)) != NULL) {
		....
		if (cfqd->sync_flight && !cfq_cfqq_sync(cfqq))
			break;
		....
		__cfq_dispatch_requests(cfqq);
	}
	....
}

When cfq_select_queue selects the async queue that holds kjournald's
write request, this async queue is never dispatched because
cfqd->sync_flight > 0, so kjournald is blocked.
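
The starvation described above can be reproduced in a small userspace
model. This is only a sketch under stated assumptions, not kernel code:
the names toy_sched, toy_dispatch_pass and toy_async_dispatched are
hypothetical, queue selection is reduced to plain round robin over 4 sync
queues and 1 shared async queue, every queue always has a request
pending, and the device is assumed to complete fewer sync requests per
pass than are dispatched, so sync_flight never drops to 0.

```c
#include <stdbool.h>

#define NR_SYNC_QUEUES       4                     /* one sync queue per fio process */
#define NR_QUEUES            (NR_SYNC_QUEUES + 1)  /* plus one shared async queue */
#define COMPLETIONS_PER_PASS 3                     /* assumption: device lags behind dispatch */

/* Hypothetical toy state; sync_flight loosely mirrors cfqd->sync_flight. */
struct toy_sched {
	int sync_flight;   /* sync requests dispatched but not yet completed */
	int next;          /* round-robin cursor, stands in for cfq_select_queue() */
	long async_done;   /* async requests that actually reached the device */
};

static bool toy_queue_is_sync(int q)
{
	return q < NR_SYNC_QUEUES;  /* the last queue index is the async queue */
}

/* One dispatch pass over the queues. */
static void toy_dispatch_pass(struct toy_sched *s, bool check_sync_flight)
{
	for (int i = 0; i < NR_QUEUES; i++) {
		int q = s->next;

		s->next = (s->next + 1) % NR_QUEUES;

		/* Same shape as the quoted check: hold async while sync is in flight. */
		if (check_sync_flight && s->sync_flight && !toy_queue_is_sync(q))
			break;

		if (toy_queue_is_sync(q))
			s->sync_flight++;
		else
			s->async_done++;
	}

	/* The device completes some, but not all, outstanding sync requests. */
	for (int i = 0; i < COMPLETIONS_PER_PASS && s->sync_flight > 0; i++)
		s->sync_flight--;
}

/* Number of async requests dispatched over the given number of passes. */
static long toy_async_dispatched(bool check_sync_flight, int passes)
{
	struct toy_sched s = { 0 };

	for (int p = 0; p < passes; p++)
		toy_dispatch_pass(&s, check_sync_flight);
	return s.async_done;
}
```

In this model toy_async_dispatched(true, 1000) returns 0, while
toy_async_dispatched(false, 1000) returns 1000: with the check enabled
the async queue is selected once per pass but never dispatched.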

Three proposed solutions
------------------------
1. Do not check cfqd->sync_flight

-		if (cfqd->sync_flight && !cfq_cfqq_sync(cfqq))
-			break;

2. If we do need to check cfqd->sync_flight, then for tagged devices we
should give the async queue more chances to be dispatched.

@@ -1102,7 +1102,7 @@ static int cfq_dispatch_requests(struct request_queue *q, int force)
 			break;
 		}
 
-		if (cfqd->sync_flight && !cfq_cfqq_sync(cfqq))
+		if (cfqd->sync_flight && !cfq_cfqq_sync(cfqq) && !cfqd->hw_tag)
 			break;

3. Force the write request issued by journal_write_commit_record to be a
sync request. As a matter of fact, it looks like most write requests
submitted by kjournald are async requests. We need to convert them to
sync requests.
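
The three options can be compared side by side as a single predicate.
This is a sketch only: enum toy_policy and toy_holds_back are
hypothetical names made up for illustration, and the function models
nothing beyond the one condition that decides whether the dispatch loop
breaks. Option 3 needs no policy of its own here, since it corresponds to
calling the existing check with queue_is_sync set to true for the journal
write.

```c
#include <stdbool.h>

/* Hypothetical policy names for this sketch; they do not exist in the kernel. */
enum toy_policy {
	TOY_CURRENT,      /* existing check */
	TOY_NO_CHECK,     /* option 1: drop the check entirely */
	TOY_SKIP_TAGGED,  /* option 2: skip the check when hw_tag is set */
};

/*
 * Returns true when the dispatch loop would break instead of
 * dispatching the selected queue.
 */
static bool toy_holds_back(enum toy_policy policy, int sync_flight,
			   bool queue_is_sync, bool hw_tag)
{
	bool blocked = sync_flight > 0 && !queue_is_sync;

	switch (policy) {
	case TOY_NO_CHECK:
		return false;
	case TOY_SKIP_TAGGED:
		return blocked && !hw_tag;
	case TOY_CURRENT:
	default:
		return blocked;
	}
}
```

For the reported setup (sync requests in flight, a tagged device, the
journal write in the async queue), only TOY_CURRENT holds the queue back.
Option 2 still holds async queues back on untagged devices.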

Thanks for the very detailed analysis of the problem, complete with
suggestions. While I think that any code that does:

	submit async io
	wait for it

should be issuing sync IO (or, better, automatically upgrade the request
from async -> sync), we cannot rely on that.
[YM] We can discuss this case by case. We could at least convert some
important async IO code paths to sync IO. For example, kjournald calls
sync_dirty_buffer, which is what we captured in this case.

I agree, we should fix the obvious cases. My point was merely that there
will probably always be missed cases, so we should attempt to handle it
in the scheduler as well. Does the below buffer patch make it any
better?

Yes, the kjournald blocked issue is gone with the below patch applied.

I think it's obviously the right thing to do, but I'm also a bit worried
about applying it so close to 2.6.26 release. OTOH, we need to do
SOMETHING for 2.6.26 release, so...


Lin Ming


Another case is writeback. Processes doing mmapped I/O might stall in a
page fault waiting for writeback to finish, or a buffered write might
trigger dirty page balancing. As the latest kernel is more aggressive
about starting writeback, it might be an issue now.

Sync process getting stuck in async writeout is another problem of the
same variety.

diff --git a/fs/buffer.c b/fs/buffer.c
index a073f3f..1957a8f 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2978,7 +2978,7 @@ int sync_dirty_buffer(struct buffer_head *bh)
 	if (test_clear_buffer_dirty(bh)) {
 		get_bh(bh);
 		bh->b_end_io = end_buffer_write_sync;
-		ret = submit_bh(WRITE, bh);
+		ret = submit_bh(WRITE_SYNC, bh);
 		wait_on_buffer(bh);
 		if (buffer_eopnotsupp(bh)) {
 			clear_buffer_eopnotsupp(bh);
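
A rough way to see why the one-line change helps, assuming the scheduler
treats reads and sync-flagged writes as sync and all other writes as
async. The TOY_* flag values and toy_rw_is_sync below are hypothetical
stand-ins for illustration, not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical flag values for this sketch; not the kernel's definitions. */
#define TOY_READ        0u
#define TOY_WRITE       1u
#define TOY_SYNC_BIT    (1u << 4)
#define TOY_WRITE_SYNC  (TOY_WRITE | TOY_SYNC_BIT)

/* Assumed rule: reads are always sync, writes only when flagged sync. */
static bool toy_rw_is_sync(unsigned int rw)
{
	bool is_write = (rw & TOY_WRITE) != 0;

	return !is_write || (rw & TOY_SYNC_BIT) != 0;
}
```

Under that assumption a buffer submitted with TOY_WRITE lands in the
shared async queue and is subject to the sync_flight check, while the
same buffer submitted with TOY_WRITE_SYNC is classified as sync and is
not held back by it.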



--
Jens Axboe



