RE: PATCH - InfiniBand Access Layer (IBAL)

From: Woodruff, Robert J (woody_at_co.intel.com)
Date: 03/15/04

  • Next message: Christoph Hellwig: "Re: dynamic sched timeslices"
    Date:	Mon, 15 Mar 2004 14:52:44 -0800
    To: "Greg KH" <greg@kroah.com>, "Woodruff, Robert J" <woody@jf.intel.com>
    
    

    On Sun, Mar 14, Greg KH wrote:

    First, as my boss reminded me this morning,
    let me make sure I was clear, there are not 2 different efforts now,
    only one, openib.org.

    1) OpenIB represents a number of companies coming together with lots of
    InfiniBand source code, with duplicate code for the access layer and
    most of the ULPs
    2) the SourceForge work is already part of this
    3) the foundation of InfiniBand support will be the Access Layer, so it
    needs the community's feedback first
    4) we are looking for feedback on both the access layer code in the
    current openib snapshot and the access layer code that we submitted a
    few weeks ago, to learn which is more acceptable to the community.

    Now to answer a couple specific questions.

    >Hm, without open source drivers, the Intel stack doesn't seem very
    >viable, correct?

    Correct, that is why we hope that Mellanox contributes their driver for
    IBAL to open source.

    >> The comments you have given on IBAL would probably only take a few
    >> weeks to change.
    >Is that work already underway? Finished? If neither, why not?

    Work is, or at least was underway, but
    we put it aside last week to review the rest of the code now in
    openib.org.
    We also need an open source driver.

    >What are the issues with the OpenIB stack?

    As I stated above, we are part of the openib.org collaboration and
    will be working on helping develop a stack that is "best of
    breed" from all of the available code. Starting from the bottom up,
    we first need to review the various proposals for the
    Access Layer and determine which code base to start with.
    The initial agreement was to use the
    TopSpin code for an access layer. This agreement was made before anyone
    got to see any code.
    After review of this code, we think it needs a lot of work. We were
    waiting for the openib.org email lists to open so that we could send
    in comments there.
    That way we could work out a lot of details offline from lkml, since
    lots of discussion will be needed.

    But since you asked, here are a few:

    1.) The tsapi APIs look like Windows APIs (at least in the original
    drop)
    2.) Looking at the API specification document, it is missing lots of
    verbs required by the InfiniBand Specification:
    CloseCA, ModifyAV, QueryCQ, CreateEEC, ModifyEEC, QueryEEC,
    DestroyEEC, QueryMR, ReregisterMR, ReregisterPhysMR, RegisterSharedMR,
    AllocMW, QueryMW, BindMW, and FreeMW.
    3.) The code is not compliant with the InfiniBand specification and has
    proprietary implementations of things like "path records", so it will
    only work with the TopSpin subnet manager, which requires you to buy a
    TopSpin switch.
    4.) Not sure if they have fixed this yet in the 2.6 code, but the 2.4
    code has like 18 different loadable modules. This could probably be
    collapsed into 5: one for the HCA driver, one for the access layer, one
    for the IPoIB driver, one for the SRP driver and one for the SDP driver.
    5.) There is no user-mode access layer, requiring ULPs to code to the
    HCA user-mode driver APIs directly.
    This will mean that new user-mode ULPs will need to be
    developed for each new HCA that comes along.
    6.) The VAPI code has extra proprietary verbs that are not specified by
    the InfiniBand Specification.
    7.) The implementation is deficient in its support for InfiniBand
    management services, like the required RMPP protocol, MAD services, and
    SA query helper functions.
    8.) Some of the message fields of the CM are hard coded.
    9.) The CM does not support reliable datagrams.
    10.) There is no built-in support for plug and play events: port
    up/down, LID change, SM change.
    11.) The VAPI call stack is deep and puts a lot of big data structures
    on the stack.
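
To make item 11 concrete, here is a minimal user-space sketch of the two styles. It is not taken from VAPI or any real stack: `struct big_attr`, its 4096-byte size, and both function names are hypothetical, and `malloc`/`free` stand in for whatever allocator kernel code would actually use. The point is only that a large local structure costs its full size in stack space at every call level that declares one, while a heap allocation leaves just a pointer on the stack.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a large attribute structure (size is assumed). */
struct big_attr {
    unsigned char raw[4096];
};

/* Stack-heavy style: every call reserves sizeof(struct big_attr) bytes of
 * stack, which adds up quickly when the call chain is deep. */
int query_on_stack(unsigned char fill)
{
    struct big_attr attr;

    memset(&attr, fill, sizeof(attr));
    return attr.raw[sizeof(attr.raw) - 1];
}

/* Heap style: only a pointer and an int live on the stack; the structure
 * itself is allocated and released explicitly. Returns -1 on allocation
 * failure. */
int query_on_heap(unsigned char fill)
{
    struct big_attr *attr = malloc(sizeof(*attr));
    int last;

    if (!attr)
        return -1;
    memset(attr, fill, sizeof(*attr));
    last = attr->raw[sizeof(attr->raw) - 1];
    free(attr);
    return last;
}
```

Both functions return the same value for the same input; they differ only in where the 4096 bytes live while the call is in progress.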

    There is more, but as I stated before, we suggest discussing most of
    these issues within openib.org first, trying to come to agreement on
    what is best, and then reviewing our suggestions with lkml to make sure
    we are on track.

    >If there are any, how does the Intel stack solve those
    >issues?

    The SourceForge IBAL code (not developed by Intel alone; it has
    contributions from several companies, including InfiniCon, Mellanox,
    Fujitsu and Intel) is feature complete and compliant with the
    InfiniBand specification. It may not be quite as hardened as the
    TopSpin stack, but that gap is rapidly closing.
    We'd also like to know from the other openib.org people:
    what are the issues with the SourceForge IBAL?

    We know the issues raised by lkml and think these can be fixed.

    The biggest problem I see is that we do not have an open source HCA
    driver, and that could be fixed too, if Mellanox wanted to, or someone
    could take the VAPI code they open sourced and port it to IBAL.

    >Could the Intel solutions be merged
    >into the OpenIB stack to solve these issues?

    Given there are so many issues with the TSAPI, would it be easier to
    fix the issues lkml raised with IBAL and port the "best of breed" ULPs
    to it?
    Since all the tsAPIs will have to change to non-Windows-ize them,
    all the ULPs will need to be ported again anyway.


