Re: [RFC] JBD ordered mode rewrite
- From: Mingming Cao <cmm@xxxxxxxxxx>
- Date: Fri, 07 Mar 2008 02:55:01 -0800
On Thu, 2008-03-06 at 18:42 +0100, Jan Kara wrote:
Hi,Hi Jan,
Below is my rewrite of ordered mode in JBD. Now we don't have a list of
data buffers that need syncing on transaction commit but a list of inodes
that need writeout during commit. This brings all sorts of advantages such
as possibility to get rid of journal heads and buffer heads for data
buffers in ordered mode, better ordering of writes on transaction commit,
simplification of some JBD code, no more anonymous pages when truncate of
data being committed happens. The patch has survived some light testing
but it still has some potential of eating your data so beware :) I've run
dbench to see whether we didn't decrease performance by different handling
of truncate and the throughput I'm getting on my machine is the same (OK,
is lower by 0.5%) if I disable the code in truncate waiting for commit to
finish... Also the throughput of dbench is about 2% better with my patch
than with current JBD.
I know ext4 is keep changing that it's a bit hard to create patch
against ext4, but I feel features like especially rewrite the default
ordered mode should done in ext4/jbd2. I could port to current ext4 and
JBD2 if you agrees with this.
Also, would it make sense to create a new ordered mode writepage
routines, and keep the old ordered mode code there for a while, to allow
easy comparison? This could a good transition for people to start
experiment this ordered mode without worrying about put data in danger
by default.
Any comments or testing most welcom
[...]
/*
* Note that we always start a transaction even if we're not journalling
* data. This is to preserve ordering: any hole instantiation within
@@ -1465,15 +1444,11 @@ static int journal_dirty_data_fn(handle_t *handle, struct buffer_head *bh)
* We don't honour synchronous mounts for writepage(). That would be
* disastrous. Any write() or metadata operation will sync the fs for
* us.
- *
- * AKPM2: if all the page's buffers are mapped to disk and !data=journal,
- * we don't need to open a transaction here.
*/
static int ext3_ordered_writepage(struct page *page,
struct writeback_control *wbc)
{
struct inode *inode = page->mapping->host;
- struct buffer_head *page_bufs;
handle_t *handle = NULL;
int ret = 0;
int err;
@@ -1487,46 +1462,49 @@ static int ext3_ordered_writepage(struct page *page,
if (ext3_journal_current_handle())
goto out_fail;
- handle = ext3_journal_start(inode, ext3_writepage_trans_blocks(inode));
-
- if (IS_ERR(handle)) {
- ret = PTR_ERR(handle);
- goto out_fail;
+ /*
+ * Now there are two different reasons why we can be called:
+ * 1) write out during commit
+ * 2) fsync / writeout to free memory
+ *
+ * In the first case, we just need to write the buffer to disk, in the
+ * second case we may need to do hole filling and attach the inode to
+ * the transaction. Note that even in the first case, we may get an
+ * unmapped buffer (hole fill with data via mmap) but we don't have to
+ * write it - actually, we can't because from a transaction commit we
+ * cannot start a new transaction or we could deadlock.
+ */
Any thoughts how to handle the unmapped page under case 1)? Right now I
see it fails. Your comments here saying that we still have the issue
that "can't start a new transaction while commiting", but likely, with
delayed allocation, starting a new transaction could to happen a lot to
do defered block allocation.
I really hope this new mode could be easy to add delayed allocation
support. Any thoughts that we could workaround the locking in the JBD2
layer?
+ if (!wbc->skip_unmapped) {
+ handle = ext3_journal_start(inode,
+ ext3_writepage_trans_blocks(inode));
+ if (IS_ERR(handle)) {
+ ret = PTR_ERR(handle);
+ goto out_fail;
+ }
}
+ else if (!PageMappedToDisk(page))
+ goto out_fail;
Thanks & Regards,
Mingming
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
- Follow-Ups:
- Re: [RFC] JBD ordered mode rewrite
- From: Jan Kara
- Re: [RFC] JBD ordered mode rewrite
- References:
- [RFC] JBD ordered mode rewrite
- From: Jan Kara
- [RFC] JBD ordered mode rewrite
- Prev by Date: Re: [PATCH] [6/13] Core maskable allocator
- Next by Date: Re: [BUG 2.6.25-rc3] scheduler/hotplug: some processes are dealocked when cpu is set to offline
- Previous by thread: Re: [RFC] JBD ordered mode rewrite
- Next by thread: Re: [RFC] JBD ordered mode rewrite
- Index(es):
Relevant Pages
|