Re: Mercurial 0.3 vs git benchmarks

From: Magnus Damm (magnus.damm_at_gmail.com)
Date: 04/26/05

    Date:	Tue, 26 Apr 2005 18:23:11 +0200
    To: Chris Mason <mason@suse.com>
    
    

    On 4/26/05, Chris Mason <mason@suse.com> wrote:
    > On Tuesday 26 April 2005 11:09, Magnus Damm wrote:
    > > On 4/26/05, Chris Mason <mason@suse.com> wrote:
    > > > This agrees with my tests here, the time to apply patches is somewhat
    > > > disk bound, even for the small 100 or 200 patch series. The io should be
    > > > coming from data=ordered, since the commits are still every 5 seconds or
    > > > so.
    > >
    > > Yes, as long as you apply the patches to disk that is. I've hacked up
    > > a small backend tool that applies patches to files kept in memory and
    > > uses a modified Rabin-Karp search to match hunks. So you basically read
    > > once and write once per file instead of moving data around for each
    > > applied patch. But it needs two passes.
    > >
    > > And no, the source code for the entire Linux kernel is not kept in
    > > memory - you need a smart frontend to manage the file cache. Drop me a
    > > line if you are interested.
    >
    > Sorry, you've lost me. Right now the cycle goes like this:

    Ehrm, maybe I'm way off. =)

    > 1) patch reads patch file, reads source file, writes source file
    > 2) update-cache reads source file, writes git file

    Ok.

    > Which of those writes are you avoiding? We have a smart way to manage the
    > cache already for the source files...the vm does pretty well. There's
    > nothing to manage for the git files. For the apply a bunch of patches
    > workload, they are write once, read never (except for the index).

    Well, maybe I misunderstood everything, but I thought you were
    applying a lot of patches and complained that it took a lot of time
    because of data=ordered.

    When I applied a lot of patches to the kernel recently, the CPU load
    dropped to zero after a while, the HD worked hard for a second or
    two, and then things came back again. My primitive guess is that the
    ext3 journal became full. To work around this I started hacking on
    this in-memory patcher.

    In the cycle above, I'm trying to speed up step 1. If the patch
    series modifies a source file multiple times (through multiple hunks
    or multiple ---/+++ file headers), then the lines below each hunk in
    that file are moved multiple times. And if the source file is written
    to disk after each hunk or ---/+++ section is applied, that generates
    a lot of writes. Those writes can be avoided by splitting the
    procedure into a first pass that analyzes the patches and a second
    pass that applies them while keeping the source files in memory.
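
    The two-pass idea can be sketched roughly as below. This is only an
    illustration in Python, not the actual tool: the names collect_hunks,
    find_hunk and apply_series are made up, the read_file/write_file
    callbacks stand in for the frontend that manages the file cache, a
    plain linear scan stands in for the modified Rabin-Karp search, and
    paths are taken from the +++ line as-is (no -p prefix stripping).

```python
import re
from collections import defaultdict

# Matches a unified-diff hunk header such as "@@ -12,7 +12,8 @@".
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def collect_hunks(patches):
    """Pass 1: group the hunks of all patches by target file, in series order."""
    per_file = defaultdict(list)
    for text in patches:
        path, old_left, new_left = None, 0, 0
        for line in text.splitlines():
            if old_left > 0 or new_left > 0:       # inside a hunk body
                tag, body = line[:1] or " ", line[1:]
                if tag in (" ", "-"):              # line present before the patch
                    old.append(body)
                    old_left -= 1
                if tag in (" ", "+"):              # line present after the patch
                    new.append(body)
                    new_left -= 1
            elif line.startswith("+++ "):
                path = line[4:].split("\t")[0]     # drop an optional timestamp
            else:
                m = HUNK_RE.match(line)
                if m:
                    old_left = int(m.group(1) or 1)
                    new_left = int(m.group(2) or 1)
                    old, new = [], []
                    per_file[path].append((old, new))
    return per_file


def find_hunk(lines, old):
    """Linear scan for the hunk's old text (assumption: stand-in for Rabin-Karp)."""
    for i in range(len(lines) - len(old) + 1):
        if lines[i:i + len(old)] == old:
            return i
    return -1


def apply_series(patches, read_file, write_file):
    """Pass 2: read each file once, apply every hunk in memory, write once."""
    for path, hunks in collect_hunks(patches).items():
        lines = read_file(path)                    # single read per file
        for old, new in hunks:
            at = find_hunk(lines, old)
            if at < 0:
                raise ValueError(f"hunk does not apply to {path}")
            lines[at:at + len(old)] = new          # data moves in memory only
        write_file(path, lines)                    # single write per file
```

    With a series that touches the same file many times, read_file and
    write_file are each called once for that file, no matter how many
    hunks were collected for it in the first pass.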

    But my rather trivial observation above is of course only suitable if
    you have a lot of patches that should be applied and you are only
    interested in the final version of the patched source files. If you
    apply one patch at a time and import each source file as a new
    revision then my little hack is probably not for you.

    / magnus
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

