Re: + nbd-fix-tx-rx-race-condition.patch added to -mm tree

From: Herbert Xu (herbert_at_gondor.apana.org.au)
Date: 11/19/05

  • Next message: Dominik Brodowski: "Re: 2.6.15-rc1-mm2 0x414 Bad page states"
    Date:	Sun, 20 Nov 2005 09:34:19 +1100
    To: Paul Clements <paul.clements@steeleye.com>
    
    
    

    On Sat, Nov 19, 2005 at 11:02:13AM -0500, Paul Clements wrote:
    >
    > Thanks for this. It looks correct, for the most part, and seems to deal
    > with the various race conditions well (probably more cleanly than we
    > were doing before). There is one problem with this patch, though. You
    > have removed the acquiring of queue_lock from nbd_clear_que. I don't
    > know if that was intentional or not, but the queue_lock must be taken
    > when the queue is being modified. For instance, if nbd-client -d
    > (disconnect) gets called on an active nbd device, then you'll have a
    > race between requests being completed and pulled off the queue
    > (nbd_find_request) and nbd-client -d (which calls nbd_clear_que).

    This is intentional actually. nbd_clear_queue never races against
    nbd_find_request because the ioctl is protected by the BKL. If it
    weren't, then we have much worse problems to worry about (e.g.,
    while you're clearing the queue someone else could have set the
    socket again and started queueing requests).

    > >+ /*
    > >+ * lo->sock == NULL guarantees that the queue will not grow beyond
    > >the
    > >+ * current active request.
    > >+ */

    This comment is why we can do without the locking. The only other
    function that modifies the list (apart from nbd_find_request and
    nbd_clear_queueis) is nbd_do_request.

    When we enter nbd_clear_queue, lo->sock must have been set to NULL.
    In fact, we always take the tx_lock while setting lo->sock. So we
    have two possibilities:

    CPU0 CPU1
                                                    nbd_do_request
    down(tx_lock)
    lo->sock = NULL
    up(tx_lock)
                                                            down(tx_lock)
                                                            bail as lo->sock is
                                                                    NULL
                                                            up(tx_lock)
            nbd_clear_queue

    If any request occurs after the setting, then clearly the list will not
    be modified.

    CPU0 CPU1
                                                    nbd_do_request
                                                            down(tx_lock)
                                                            add req to list
                                                            up(tx_lock)
    down(tx_lock)
    lo->sock = NULL
    up(tx_lock)
            nbd_clear_queue

    If any request occurs before the setting, the list has been modified
    already. The up/down also gives us the required memory barriers so
    that we can operate on the list safely.

    In fact, I just realised that for the same reason active_req must be
    NULL as well. So here is an updated patch which removes the active_req
    wait in nbd_clear_queue and the associated memory barrier.

    I've also clarified this in the comment.

    [NBD] Fix TX/RX race condition

    Janos Haar of First NetCenter Bt. reported numerous crashes involving the
    NBD driver. With his help, this was tracked down to bogus bio vectors
    which in turn was the result of a race condition between the
    receive/transmit routines in the NBD driver.

    The bug manifests itself like this:

    CPU0 CPU1
    do_nbd_request
            add req to queuelist
            nbd_send_request
                    send req head
                    for each bio
                            kmap
                            send
                                    nbd_read_stat
                                            nbd_find_request
                                            nbd_end_request
                            kunmap

    When CPU1 finishes nbd_end_request, the request and all its associated
    bio's are freed. So when CPU0 calls kunmap whose argument is derived from
    the last bio, it may crash.

    Under normal circumstances, the race occurs only on the last bio. However,
    if an error is encountered on the remote NBD server (such as an incorrect
    magic number in the request), or if there were a bug in the server, it is
    possible for the nbd_end_request to occur any time after the request's
    addition to the queuelist.

    The following patch fixes this problem by making sure that requests are not
    added to the queuelist until after they have been completed transmission.

    In order for the receiving side to be ready for responses involving
    requests still being transmitted, the patch introduces the concept of the
    active request.

    When a response matches the current active request, its processing is
    delayed until after the tranmission has come to a stop.

    This has been tested by Janos and it has been successful in curing this
    race condition.

    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

    Cheers,

    -- 
    Visit Openswan at http://www.openswan.org/
    Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
    Home Page: http://gondor.apana.org.au/~herbert/
    PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
    
    

    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/



  • Next message: Dominik Brodowski: "Re: 2.6.15-rc1-mm2 0x414 Bad page states"

    Relevant Pages