[PATCH] fsstack: fsstack_copy_inode_size locking



LTP's iogen01 doio tests used to hang nicely on 32-bit SMP when /tmp was a
unionfs mount of a tmpfs, i_size_read spinning forever, waiting for a lost
seqcount update: fixed by taking i_lock around i_size_write when 32-bit SMP.

But akpm was dissatisfied with the resulting patch: its lack of commentary,
the #ifs, the nesting around i_size_read, the lack of attention to i_blocks.
I promised to redo it with the general spin_lock_32bit() he proposed; but
disliked the result, partly because "32bit" obscures the real constraints,
which are best commented within fsstack_copy_inode_size itself.

This version adds those comments, and uses sizeof comparisons which the
compiler can optimize out, instead of CONFIG_SMP, CONFIG_LSF. BITS_PER_LONG.

Signed-off-by: Hugh Dickins <hugh@xxxxxxxxxxx>
---
Should follow mmotm's git-unionfs-fixup.patch

fs/stack.c | 58 +++++++++++++++++++++++++++++++------
include/linux/fs_stack.h | 3 -
2 files changed, 50 insertions(+), 11 deletions(-)

--- mmotm/fs/stack.c 2008-06-27 13:39:18.000000000 +0100
+++ linux/fs/stack.c 2008-06-27 18:24:19.000000000 +0100
@@ -19,16 +19,56 @@
* This function cannot be inlined since i_size_{read,write} is rather
* heavy-weight on 32-bit systems
*/
-void fsstack_copy_inode_size(struct inode *dst, const struct inode *src)
+void fsstack_copy_inode_size(struct inode *dst, struct inode *src)
{
-#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
- spin_lock(&dst->i_lock);
-#endif
- i_size_write(dst, i_size_read(src));
- dst->i_blocks = src->i_blocks;
-#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
- spin_unlock(&dst->i_lock);
-#endif
+ loff_t i_size;
+ blkcnt_t i_blocks;
+
+ /*
+ * i_size_read() includes its own seqlocking and protection from
+ * preemption (see include/linux/fs.h): we need nothing extra for
+ * that here, and prefer to avoid nesting locks than attempt to
+ * keep i_size and i_blocks in synch together.
+ */
+ i_size = i_size_read(src);
+
+ /*
+ * But if CONFIG_LSF (on 32-bit), we ought to make an effort to keep
+ * the two halves of i_blocks in synch despite SMP or PREEMPT - though
+ * stat's generic_fillattr() doesn't bother, and we won't be applying
+ * quotas (where i_blocks does become important) at the upper level.
+ *
+ * We don't actually know what locking is used at the lower level; but
+ * if it's a filesystem that supports quotas, it will be using i_lock
+ * as in inode_add_bytes(). tmpfs uses other locking, and its 32-bit
+ * is (just) able to exceed 2TB i_size with the aid of holes; but its
+ * i_blocks cannot carry into the upper long without almost 2TB swap -
+ * let's ignore that case.
+ */
+ if (sizeof(i_blocks) > sizeof(long))
+ spin_lock(&src->i_lock);
+ i_blocks = src->i_blocks;
+ if (sizeof(i_blocks) > sizeof(long))
+ spin_unlock(&src->i_lock);
+
+ /*
+ * If CONFIG_SMP on 32-bit, it's vital for fsstack_copy_inode_size()
+ * to hold some lock around i_size_write(), otherwise i_size_read()
+ * may spin forever (see include/linux/fs.h). We don't necessarily
+ * hold i_mutex when this is called, so take i_lock for that case.
+ *
+ * And if CONFIG_LSF (on 32-bit), continue our effort to keep the
+ * two halves of i_blocks in synch despite SMP or PREEMPT: use i_lock
+ * for that case too, and do both at once by combining the tests.
+ *
+ * There is none of this locking overhead in the 64-bit case.
+ */
+ if (sizeof(i_size) > sizeof(long) || sizeof(i_blocks) > sizeof(long))
+ spin_lock(&dst->i_lock);
+ i_size_write(dst, i_size);
+ dst->i_blocks = i_blocks;
+ if (sizeof(i_size) > sizeof(long) || sizeof(i_blocks) > sizeof(long))
+ spin_unlock(&dst->i_lock);
}
EXPORT_SYMBOL_GPL(fsstack_copy_inode_size);

--- mmotm/include/linux/fs_stack.h 2008-06-27 13:39:19.000000000 +0100
+++ linux/include/linux/fs_stack.h 2008-06-27 14:08:00.000000000 +0100
@@ -21,8 +21,7 @@

/* externs for fs/stack.c */
extern void fsstack_copy_attr_all(struct inode *dest, const struct inode *src);
-extern void fsstack_copy_inode_size(struct inode *dst,
- const struct inode *src);
+extern void fsstack_copy_inode_size(struct inode *dst, struct inode *src);

/* inlines */
static inline void fsstack_copy_attr_atime(struct inode *dest,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • Re: Breaking Bad--Aw, poor, poor private domicile! ("Sunset" spoilers)
    ... What few cops they have in that small ... show up instantly while downplaying the waiting. ... they ratchet up the tension, the tension, the tension. ... just like LOST, the fans come out of the woodwork to try and defend ...
    (rec.arts.tv)
  • Re: Still waiting for Bold for Delphi 200x
    ... I'm happy with the overall direction of Delphi right now. ... and Win32 in the same environment. ... Soon it appears C++ if I lost ... I'm not waiting for a Win64 compiler ...
    (borland.public.delphi.non-technical)
  • Re: VA Care for Chronic Pain
    ... waiting for ss disability 30 months now va sucks but it s free ... the third attempt I'm using a lawyer. ... You've been waiting a bit longer than I have I'm at ... they've lost homes and just so much just waiting. ...
    (alt.support.chronic-pain)
  • Re: NiGHTS needs help!
    ... now the first season of Lost is over on British TV I've ... > decided bollocks to waiting for the next season to come around and so ...
    (rec.sport.pro-wrestling)
  • Re: Avoiding LVM -
    ... and wonder that it is gone after tmpwatch has cleand /tmp ... or with F18 defaults you have it even as tmpfs nd lost at ...
    (Fedora)