Re: [PATCH] prune_icache_sb



On 11/30/06, Wendy Cheng <wcheng@xxxxxxxxxx> wrote:
How about a simple and plain change with this uploaded patch ....

The idea is, instead of unconditionally dropping every buffer associated
with the particular mount point (that defeats the purpose of page
caching), the base kernel exports the "drop_pagecache_sb()" call that
allows page cache to be trimmed. More importantly, it is changed to offer
the choice of purging not arbitrary buffers but only the ones that seem
to be unused (i_state is zero and i_count is zero). This will encourage
filesystem(s) to proactively respond to VM memory shortage if they so
choose.
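
[A minimal userspace sketch of the selection rule described above. It is
not the patch itself: "struct model_inode" and "trim_unused()" are
hypothetical names, and the struct only models the three fields the
paragraph mentions. Only inodes with no state bits set and no active
references have their cached pages dropped.]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the few struct inode fields used here. */
struct model_inode {
    unsigned long i_state;  /* nonzero: dirty, locked, being freed, ... */
    int           i_count;  /* active references to the inode */
    unsigned long nrpages;  /* cached pages (i_mapping->nrpages in the kernel) */
};

/* Drop cached pages of idle inodes only; return how many pages went away. */
unsigned long trim_unused(struct model_inode *inodes, size_t n)
{
    unsigned long dropped = 0;

    assert(inodes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        struct model_inode *ino = &inodes[i];

        /* Skip anything that looks in use: state bits set or still referenced. */
        if (ino->i_state != 0 || ino->i_count != 0)
            continue;

        dropped += ino->nrpages;
        ino->nrpages = 0;
    }
    return dropped;
}
```

[In the kernel the equivalent walk would have to run under inode_lock,
which is the access problem raised later in this message.]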

From our end (cluster locks are expensive - that's why we cache them),
one of our kernel daemons will invoke this newly exported call based on
a set of pre-defined tunables. It is then followed by lock reclaim logic
that trims the locks by checking the page cache associated with the
inode (the one this cluster lock was created for). If nothing is attached
to the inode (based on i_mapping->nrpages count), we know it is a good
candidate for trimming and will subsequently drop this lock (instead of
waiting until the end of the VFS inode life cycle).
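
[The reclaim pass described above could be modeled roughly as follows.
This is a userspace sketch under assumptions: "struct cached_inode",
"struct cluster_lock" and "reclaim_idle_locks()" are hypothetical names,
not taken from the patch or from any cluster filesystem. A held lock is
released only when its inode has no cached pages left.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: only the page count matters for this check. */
struct cached_inode {
    unsigned long nrpages;  /* stands in for i_mapping->nrpages */
};

/* Hypothetical cached cluster lock tied to one inode. */
struct cluster_lock {
    struct cached_inode *inode;
    bool held;
};

/* Release every held lock whose inode has an empty page cache. */
size_t reclaim_idle_locks(struct cluster_lock *locks, size_t n)
{
    size_t dropped = 0;

    assert(locks != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        struct cluster_lock *lk = &locks[i];

        /* Keep locks that are already released or still cover cached data. */
        if (!lk->held || lk->inode->nrpages != 0)
            continue;

        lk->held = false;
        dropped++;
    }
    return dropped;
}
```

[Running the trim first and this pass second is what lets the page count
serve as the "is this lock still worth caching" signal.]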

I have a patch that is a more comprehensive version of this idea, but
it is not fully debugged, and has suffered some bitrot in the past
couple of months. This turns out to be a good performance improvement in
the general case too, but is more complex than your idea because there
are real locking changes needed to avoid deadlocks. I can send you a
copy of the patch if you are interested.

Note that I could do invalidate_inode_pages() within our kernel modules
to accomplish what drop_pagecache_sb() does (without coming here to bug
people) but I don't have access to inode_lock as an external kernel
module. So either EXPORT_SYMBOL(inode_lock) or this patch?

Like I said above, you have to be careful when touching inode_lock,
dcache_lock, and the mapping's tree_lock, because of potential
deadlocks. The mapping's lock can be taken from softirq context, but
the inode and dcache locks cannot.

NATE


