Re: [PATCH] 2.6.10 - direct-io async short read bug

From: Badari Pulavarty (pbadari_at_us.ibm.com)
Date: 03/09/05

  • Next message: Greg KH: "[PATCH] USB: move usb core to use class_simple instead of it's own class functions."
    To: Andrew Morton <akpm@osdl.org>
    Date:	09 Mar 2005 14:45:57 -0800
    
    

    On Wed, 2005-03-09 at 14:39, Andrew Morton wrote:
    > Badari Pulavarty <pbadari@us.ibm.com> wrote:
    > >
    > > On Wed, 2005-03-09 at 11:53, Andrew Morton wrote:
    > > > Suparna Bhattacharya <suparna@in.ibm.com> wrote:
    > > > >
    > > > > > Solaris, which does forcedirectio as a mount option, actually
    > > > > > will do buffered I/O on the trailing part. Consider it like a bounce
    > > > > > buffer. That way they don't DMA the trailing data and succeed the I/O.
    > > > > > The I/O returns actual bytes till EOF, just like read(2) is supposed to.
    > > > > > Either this or a fully DMA'd number 4 is really what we should
    > > > > > do. If security can only be solved via a bounce buffer, who cares? If
    > > > > > the user created themselves a non-aligned file to open O_DIRECT, that's
    > > > > > their problem if the last part-sector is negligably slower.
    > > > >
    > > > > If writes/truncates take care of zeroing out the rest of the sector
    > > > > on disk, might we still be OK without having to do the bounce buffer
    > > > > thing ?
    > > >
    > > > We can probably rely on the rest of the sector outside i_size being zeroed
    > > > anyway. Because if it contains non-zero gunk then the fs already has a
    > > > problem, and the user can get at that gunk with an expanding truncate and
    > > > mmap() anyway.
    > > >
    > >
    > > Rest of the sector or rest of the block ?
    >
    > The filesystem-sized block (1<<i_blkbits) which straddles i_size should
    > have zeroes outside i_size.
    >
    > There's one situation where it might not be zeroed out, and that's when the
    > final page is mapped MAP_SHARED and the application modifies that page
    > outside i_size while writeout is actually in flight. We can't do much about
    > that.
    >
    > > Are you implying that, we
    > > already do this, so there is no problem reading beyond EOF to user
    > > buffer ? Or we need to zero out the userbuffer beyond EOF ?
    >
    > It should be acceptable to assume that the final (1<<i_blkbits) block of
    > the file contains zeroes outside i_size.
    >
    > And if it doesn't contain those zeroes, well, applications are able to read
    > that data already. Although I wouldn't count that as a security hole: that
    > data is something which an application wrote there while writing the file,
    > rather than being left-over uncontrolled stuff.

    Well, in that case - the original patch sent out is good enough to fix
    the problem. All the original patch did was after completing the IO,
    truncated the size to filesize. The problem is only with the last
    block in the file. If the file ends in the middle of the block, we
    go ahead and read till the end of the block. I was trying to address
    that issue. But, if the block is already zeroed, just truncating the
    size after the IO is complete should be good enough.

    Thanks,
    Badari

    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/


  • Next message: Greg KH: "[PATCH] USB: move usb core to use class_simple instead of it's own class functions."

    Relevant Pages

    • Re: Uncertainty in mathematics
      ... truncating the tail of zeroes ... ... because that is how "decimals" were defined ... when one truncates the infinite tail of zeroes. ...
      (sci.math)
    • Re: Is there a "make hole" (truncate in middle) syscall?
      ... Don't ever write zeroes, seek ... Of course cat is sometimes used to write to things that aren't regular ... so make sure seek is supported and fall back to ordinary writing ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • /proc/<pid>/maps syntax
      ... Encourage better documentation of such formats ... We ended up with an inferior format in the long run. ... address and offset fields may or may not have leading zeroes, ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: md+ext3 badness in 2.6.0-test2
      ... > It looks like we had a block full of zeroes come back from the device ... > driver. ... I find it distinctly fishy how this happens so much with ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • [PATCH 20/25] NTFS: Optimize fs/ntfs/aops.c::ntfs_write_block() by extending the page
      ... only zeroes. ... Fixup handling of sparse, compressed, and encrypted attributes in ... } while (bh!= head); ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)