Re: RFC: Cleanup / small fixes to hugetlb fault handling

From: Adam Litke (agl_at_us.ibm.com)
Date: 10/26/05

  • Next message: Richard Knutsson: "[PATCH 2.6.14-rc5] drivers/net/dgrs.c: Fix potential "unused variable"-warning."
    To: David Gibson <david@gibson.dropbear.id.au>
    Date:	Wed, 26 Oct 2005 16:46:16 -0500
    
    

    On Wed, 2005-10-26 at 12:48 +1000, David Gibson wrote:
    > On Wed, Oct 26, 2005 at 12:00:55PM +1000, David Gibson wrote:
    > > Hi, Adam, Bill, Hugh,
    > >
    > > Does this look like a reasonable patch to send to akpm for -mm.
    >
    > Ahem. Or rather this version, which actually compiles.
    >
    > This patch makes some slight tweaks / cleanups to the fault handling
    > path for huge pages in -mm. My main motivation is to make it simpler
    > to fit COW in, but along the way it addresses a few minor problems
    > with the existing code:
    >
    > - The check against i_size was duplicated: once in
    > find_lock_huge_page() and again in hugetlb_fault() after taking the
    > page_table_lock. We only really need the locked one, so remove the
    > other.

    Fair enough.

    > - find_lock_huge_page() didn't, in fact, lock the page if it newly
    > allocated one, rather than finding it in the page cache already. As
    > far as I can tell this is a bug, so the patch corrects it.

    Thanks. I was about to post a fix for this too. It is reproducible in
    the case where two threads race in the fault handler and both do
    alloc_huge_page(). In that case, the loser will fail to insert his page
    into the page cache and will call put_page() which has a
    BUG_ON(page_count(page) == 0).

    > - find_lock_huge_page() isn't a great name, since it does extra things
    > not analagous to find_lock_page(). Rename it
    > find_or_alloc_huge_page() which is closer to the mark.

    I'll agree with the above. I am not all that committed to the current
    layout and what you have here is a little closer to the thinking in my
    original patch ;)

    <snip>

    > +int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
    > + unsigned long address, int write_access)
    > +{
    > + pte_t *ptep;
    > + pte_t entry;
    > +
    > + ptep = huge_pte_alloc(mm, address);
    > + if (! ptep)
    > + /* OOM */
    > + return VM_FAULT_SIGBUS;
    > +
    > + entry = *ptep;
    > +
    > + if (pte_none(entry))
    > + return hugetlb_no_page(mm, vma, address, ptep);
    > +
    > + /* we could get here if another thread instantiated the pte
    > + * before the test above */
    > +
    > + return VM_FAULT_SIGBUS;
    > }

    I'll agree with Ken that the last return should probably still be
    VM_FAULT_MINOR.

    -- 
    Adam Litke - (agl at us.ibm.com)
    IBM Linux Technology Center
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at  http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at  http://www.tux.org/lkml/
    

  • Next message: Richard Knutsson: "[PATCH 2.6.14-rc5] drivers/net/dgrs.c: Fix potential "unused variable"-warning."

    Relevant Pages

    • Re: Prezeroing V2 [0/3]: Why and When it works
      ... Clear_page is used both in the cache hot and no cache wanted case now. ... > more memory usage to specific applications that do very sparse ... Gang faulting is not part of this patch. ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: [patch] sched: auto-tuning task-migration
      ... >>of the experiments are done with Ingo's patch posted earlier. ... > it measures migration costs and sets up cache_hot value accordingly. ... because a 2MB cache probably doesn't ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: [PATCH] M68k cache
      ... changes to binfmt_.c are part of this patch. ... binfmt_.c it's used to flush data from the user cache and in ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: [PATCH 2.6.14] Adaptive read-ahead V6
      ... > I just replayed a benchmark with your new patch. ... > new bench output. ... The main delaysoccur in context method, which are triggered by cache misses. ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • [PATCH] allocate page caches pages in round robin fasion
      ... page cache pages should be spread out across the system ... The patch works by adding an alloc_page_round_robin routine that simply ... per-cpu variable modulo the number of nodes. ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)