Re: [openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation

From: Caitlin Bestler (caitlin.bestler_at_gmail.com)
Date: 04/29/05

  • Next message: Linus Torvalds: "Re: Mercurial 0.4b vs git patchbomb benchmark"
    Date:	Fri, 29 Apr 2005 08:56:20 -0700
    To: Bill Jordan <woodennickel@gmail.com>
    
    

    On 4/29/05, Bill Jordan <woodennickel@gmail.com> wrote:
    > On 4/26/05, Andrew Morton <akpm@osdl.org> wrote:
    >
    > > Our point is that contemporary microprocessors cannot electrically do what
    > > you want them to do!
    > >
    > > Now, conceeeeeeiveably the kernel could keep track of the state of the
    > > pages down to the byte level, and could keep track of all COWed pages and
    > > could look at faulting addresses at the byte level and could copy sub-page
    > > ranges by hand from one process's address space into another process's
    > > after I/O completion. I don't think we want to do that.
    > >
    > > Methinks your specification is busted.
    >
    > I agree in principal. However, I expect this issue will come up with
    > more and more new specifications, and if it isn't addressed once in
    > the linux kernel, it will be kludged and broken many times in many
    > drivers.
    >
    > I believe we need an kernel level interface that will pin user pages,
    > and lock the user vma in a single step. The interface should be used
    > by drivers when the hardware mappings are done. If the process is
    > split into a user operation to lock the memory, and a driver operation
    > to map the hardware, there will always be opportunity for abuse.
    >
    > Reference counting needs to be done by this interface to allow
    > different hardware to interoperate.
    >
    > The interface can't overload the VM_LOCKED flag, or rely on any other
    > attributes that the user can tinker with via any other interface.
    >
    > And as much as I hate to admit it, I think on a fork, we will need to
    > copy parts of pages at the beginning or end of user I/O buffers.
    >

    I agree with all but the last part, in my opinion there is no need to deal
    with fork issues as long as solutions do not result in failures. There is
    *no* basis for a child process to expect that it will inherit RDMA resources.
    A child process that uses such resources will get undefined results, nothing
    further needs to be stated, and no heroic efforts are required to avoid them.

    What is definitely needed is kernel counting of locks on user pages.
    Finer granularity is not expected, it is the RDMA hardware that works
    at finer granularity. All it needs is to know what bus address a given
    virtual page maps to -- and it needs to know that said mapping will
    not change without advance notice.

    Further, any revocation of an existing mapping (to deal with hot page
    swapping or whatever) cannot expect the RDMA hardware to respond
    any faster than it would to invalidating a memory region.

    The RDMA hardware has an inherent need to cache translations.
    That is why it cannot guarantee that it will cease updating a memory
    region the nanosecond that a request is made to invalidate an STag.
    Instead it is allowed to block on such a request and only guarantees
    to have ceased access when the invalidate request completes.

    The same need for a delay exists for any interface that moves memory
    around, or requests to reclaim memory from the application.

    This also applies on process death. The hardware cannot stop on a dime.
    The best it can do is stop promptly, and given an unambiguous indication
    to the OS as to when it has stopped.
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/


  • Next message: Linus Torvalds: "Re: Mercurial 0.4b vs git patchbomb benchmark"

    Relevant Pages

    • Re: open-sourced FPGA (vhdl, verilog, C variants) design libraries, working toward a GNU (for ha
      ... Software as bloat (which covers a lot of ground ... The issue in hardware is bloat, by ... Taking a fairly standard for instance, let's implement a simple memory ... interface, but with a wrinkle; the bus has a *minimum* speed (such as ...
      (comp.arch.fpga)
    • Application Error
      ... The instruction 0x10017116 which referred to memory 0x0000000000 the ... Posted using the http://www.windowsforumz.com interface, at author's request ...
      (microsoft.public.windowsxp.help_and_support)
    • Re: High memory
      ... memory and then copy it up above 1MB...but if you want to put ... outside the CPU, memory is seen in a completely different light...this ... 1MB with "real mode memory" labelled on it or anything...the actual memory ... the system bus to actual hardware devices...hence, ...
      (alt.lang.asm)
    • Re: Crashes and the Blue Screen of Death!
      ... but the most common cause is hardware failure. ... The most common cause of this is hardware memory corruption. ... are listed on the Windows 2000 Hardware Compatibility List. ... recommended that all users install them as they become available. ...
      (microsoft.public.win2000.general)
    • Re: The variable bit cpu
      ... > However looking at the big picture the effort to scale fixed bit hardware ... those bits are kept in the opcode rather than in each memory ... And while a *few* applications might need to scale beyond 256 bits, ... And one *big* thing you're missing here is that have a variable-bit ...
      (alt.lang.asm)