Re: 2.6.19 file content corruption on ext3



On Sun, 2006-12-17 at 15:40 -0800, Andrew Morton wrote:
On Sun, 17 Dec 2006 15:39:32 +0200
Andrei Popa <andrei.popa@xxxxxxxx> wrote:

I was mistaken, I'm still having file corruption with rtorrent.


Well I'm not very optimistic, but if people could try this, please...



From: Andrew Morton <akpm@xxxxxxxx>

try_to_free_buffers() clears the page's dirty state if it successfully removed
the page's buffers.

Background for this:

- a process does a one-byte-write to a file on a 64k pagesize, 4k
blocksize ext3 filesystem. The page is now PageDirty, !PageUptodate and
has one dirty buffer and 15 not uptodate buffers.

- kjournald writes the dirty buffer. The page is now PageDirty,
!PageUptodate and has a mix of clean and not uptodate buffers.

- try_to_free_buffers() removes the page's buffers. It MUST now clear
PageDirty. If we were to leave the page dirty then we'd have a dirty, not
uptodate page with no buffer_heads.

We're screwed: we cannot write the page because we don't know which
sections of it contain garbage. We cannot read the page because we don't
know which sections of it contain modified data. We cannot free the page
because it is dirty.
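
The three steps above can be followed in a small userspace model. This is
only an illustrative sketch, not kernel code: struct toy_page, struct
toy_buffer and the toy_* helpers are hypothetical names made up for the
example, and the model tracks nothing but the dirty/uptodate flags of one
64k page with 4k blocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE     (64 * 1024)
#define TOY_BLOCK_SIZE    (4 * 1024)
#define TOY_BUFS_PER_PAGE (TOY_PAGE_SIZE / TOY_BLOCK_SIZE)	/* 16 */

/* Hypothetical stand-in for a buffer_head: flags only. */
struct toy_buffer {
	bool uptodate;
	bool dirty;
};

/* Hypothetical stand-in for struct page: flags plus its buffers. */
struct toy_page {
	bool dirty;
	bool uptodate;
	bool has_buffers;
	struct toy_buffer bufs[TOY_BUFS_PER_PAGE];
};

/* Step 1: a one-byte write dirties the page and exactly one buffer. */
static void toy_write_byte(struct toy_page *page, size_t offset)
{
	size_t i;

	assert(offset < TOY_PAGE_SIZE);
	if (!page->has_buffers) {
		/* Freshly attached buffers are neither uptodate nor dirty. */
		for (i = 0; i < TOY_BUFS_PER_PAGE; i++) {
			page->bufs[i].uptodate = false;
			page->bufs[i].dirty = false;
		}
		page->has_buffers = true;
	}
	page->bufs[offset / TOY_BLOCK_SIZE].uptodate = true;
	page->bufs[offset / TOY_BLOCK_SIZE].dirty = true;
	page->dirty = true;	/* the page itself stays !uptodate */
}

/* Count the dirty buffers still attached to the page. */
static int toy_dirty_buffers(const struct toy_page *page)
{
	int n = 0;
	size_t i;

	if (!page->has_buffers)
		return 0;
	for (i = 0; i < TOY_BUFS_PER_PAGE; i++)
		if (page->bufs[i].dirty)
			n++;
	return n;
}

/* Step 2: the journal writes the dirty buffer; the page flag is untouched. */
static void toy_journal_writeout(struct toy_page *page)
{
	size_t i;

	for (i = 0; i < TOY_BUFS_PER_PAGE; i++)
		page->bufs[i].dirty = false;
}

/*
 * Step 3: drop the buffers if none is dirty. Once they are gone, nothing
 * records which blocks hold valid data, so the page dirty flag is cleared
 * in the same step.
 */
static bool toy_try_to_free_buffers(struct toy_page *page)
{
	if (!page->has_buffers || toy_dirty_buffers(page) > 0)
		return false;
	page->has_buffers = false;
	page->dirty = false;
	return true;
}
```

Calling toy_write_byte(), toy_journal_writeout() and
toy_try_to_free_buffers() in that order leaves the model page clean with no
buffers. Skipping the final dirty-flag clear would leave it dirty, not
uptodate and without buffers, which is the unrecoverable state described
above.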


How about we stick something like this on top of that patch? It should
preserve the dirty state as required.

I tried to tinker with avoiding the clear/set thing but could not
convince myself it was close to safe.

This should be safe: page_mkclean() walks the rmap, flips the ptes
under the pte lock, and records the dirty state while iterating.
Concurrent faults will either do set_page_dirty() before we get around
to doing it or vice versa, but dirty state is not lost.
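
The intended behaviour can be modelled in userspace roughly as follows. This
is a sketch under stated assumptions, not the kernel implementation: struct
sim_page, struct sim_pte and the sim_* functions are hypothetical names, the
rmap walk is reduced to a fixed array, locking is not modelled, and
sim_page_mkclean() is assumed to return nonzero when it found a dirty pte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_MAX_PTES 4

/* Hypothetical stand-in for one pte mapping the page. */
struct sim_pte {
	bool present;
	bool dirty;
	bool writable;
};

/* Hypothetical stand-in for a page plus its reverse mappings. */
struct sim_page {
	bool dirty;
	struct sim_pte ptes[SIM_MAX_PTES];
};

/*
 * Model of page_mkclean(): clean and write-protect every present pte,
 * and report whether any of them was dirty.
 */
static int sim_page_mkclean(struct sim_page *page)
{
	int cleaned = 0;
	size_t i;

	assert(page != NULL);
	for (i = 0; i < SIM_MAX_PTES; i++) {
		struct sim_pte *pte = &page->ptes[i];

		if (!pte->present)
			continue;
		if (pte->dirty)
			cleaned = 1;
		pte->dirty = false;
		pte->writable = false;
	}
	return cleaned;
}

/* Model of set_page_dirty() with dirty-page accounting. */
static void sim_set_page_dirty(struct sim_page *page, long *nr_file_dirty)
{
	assert(page != NULL && nr_file_dirty != NULL);
	if (!page->dirty) {
		page->dirty = true;
		(*nr_file_dirty)++;
	}
}

/*
 * Model of the patched test_clear_page_dirty(): always clean the ptes,
 * but when the caller did not ask for that and a pte was dirty, move the
 * dirty state back onto the page so it is not lost.
 */
static int sim_test_clear_page_dirty(struct sim_page *page,
				     bool must_clean_ptes,
				     long *nr_file_dirty)
{
	int cleaned;

	assert(page != NULL && nr_file_dirty != NULL);
	if (!page->dirty)
		return 0;
	page->dirty = false;
	cleaned = sim_page_mkclean(page);
	if (!must_clean_ptes && cleaned)
		sim_set_page_dirty(page, nr_file_dirty);
	(*nr_file_dirty)--;
	return 1;
}
```

In this model, a page with one dirty pte that goes through
sim_test_clear_page_dirty() with must_clean_ptes false ends up with clean,
write-protected ptes, the page flag set again, and the dirty counter
unchanged.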

---
mm/page-writeback.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

Index: linux-2.6-git/mm/page-writeback.c
===================================================================
--- linux-2.6-git.orig/mm/page-writeback.c 2006-12-18 17:24:41.000000000 +0100
+++ linux-2.6-git/mm/page-writeback.c 2006-12-18 17:26:56.000000000 +0100
@@ -872,8 +872,9 @@ int test_clear_page_dirty(struct page *p
 			 * page is locked, which pins the address_space
 			 */
 			if (mapping_cap_account_dirty(mapping)) {
-				if (must_clean_ptes)
-					page_mkclean(page);
+				int cleaned = page_mkclean(page);
+				if (!must_clean_ptes && cleaned)
+					set_page_dirty(page);
 				dec_zone_page_state(page, NR_FILE_DIRTY);
 			}
 			return 1;




