Re: [patch 4/10] s390: network driver.

From: Paul Jakma (paul_at_clubi.ie)
Date: 12/06/04

    Date:	Mon, 6 Dec 2004 18:44:09 +0000 (GMT)
    To: jamal <hadi@cyberus.ca>
    
    

    On Mon, 6 Dec 2004, jamal wrote:

    > Dont post networking related patches on other lists. I havent seen said
    > patch, but it seems someone is complaining about some behavior changing?

    I missed the beginning of the thread too, but saw Jeff's reply to
    Thomas on netdev. It appears the original patch was to make the s390
    network driver discard packets on link-down.

    Jeff had replied to say this was bad, that queues are meant to fill
    and that this was what other drivers (e1000, tg3) did.

    > In regards to link down and packets being queued. Agreed this is a
    > little problematic for some apps/transports.

    Tis yes. Particularly for apps using raw and UDP+IP_HDRINCL sockets.

    This problem came to light when we got reports of ospfd blocking
    because a link was down, late in 2.4 with a certain version of the
    (iirc) e100 driver. ospfd uses a single socket for all interfaces,
    and relies on IP_HDRINCL to have each packet routed out the right
    interface. However, this approach doesn't play well if the socket can
    be blocked completely because /one/ interface has its link
    down. The behaviour we expected (and got up until now) is either to
    receive ENOBUFS or, if the kernel accepts the packet write, for
    the kernel to drop the packet if it cannot be sent.
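
A minimal sketch of the application-side behaviour described above, assuming a non-blocking socket: a send either succeeds or is dropped on ENOBUFS/EAGAIN, and never stalls the caller. The helper name `send_or_drop` and the set of errno values treated as "drop" are illustrative assumptions, not taken from ospfd.

```python
import errno

# Errors treated as "the kernel could not take the packet right now".
# This particular set is an assumption made for the sketch.
DROP_ERRNOS = {errno.ENOBUFS, errno.EAGAIN, errno.EWOULDBLOCK}


def send_or_drop(sock, packet, dest):
    """Try to send one datagram; return True if the kernel accepted it.

    Assumes `sock` is in non-blocking mode (sock.setblocking(False)),
    so a full queue raises an error instead of blocking the caller.
    """
    try:
        sock.sendto(packet, dest)
        return True
    except OSError as exc:
        if exc.errno in DROP_ERRNOS:
            # Drop the packet: the protocol will send fresh state later.
            return False
        raise  # anything else is a real error for the caller to handle
```

Returning a boolean rather than raising keeps the caller's loop over interfaces simple; whether a drop should also be logged or counted is left open here.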

    We can work around that by moving to one socket per interface. However,
    that still leaves the problem of packets being queued indefinitely while
    the link is down and then being sent when the link comes back. This is
    *not* good for RIP, IPv4 IRDP and IPv6 RA.
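
One way the socket-per-interface workaround could look, sketched under assumptions: each raw socket is tied to a single interface with Linux's `SO_BINDTODEVICE` and set non-blocking, so a stalled link can only affect its own socket. The use of `SO_BINDTODEVICE`, the function names, and the injectable `make_socket` factory are choices made for this sketch; the message does not say how the per-interface sockets would be set up. It does nothing about packets that are already queued below the socket.

```python
import socket

# Linux value of SO_BINDTODEVICE, used if this Python build lacks the constant.
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)
IPPROTO_OSPF = 89  # IP protocol number assigned to OSPF


def bind_to_interface(sock, ifname):
    """Restrict `sock` to one interface and make it non-blocking."""
    sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE, ifname.encode() + b"\0")
    sock.setblocking(False)
    return sock


def open_ospf_sockets(ifnames, make_socket=None):
    """Return {ifname: socket}, with one raw OSPF socket per interface."""
    if make_socket is None:
        # Creating a raw socket needs CAP_NET_RAW (typically root) on Linux.
        def make_socket():
            return socket.socket(socket.AF_INET, socket.SOCK_RAW, IPPROTO_OSPF)
    return {name: bind_to_interface(make_socket(), name) for name in ifnames}
```

The factory argument exists only so the wiring can be exercised without privileges; a real daemon would call `open_ospf_sockets(["eth0", "eth1"])` as root.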

    > In the case the netdevice is administratively downed both the qdisc
    > and DMA ring packets are flushed.

    What about any packets remaining in the socket buffer? (If that makes
    sense; I don't know enough about the internals, sadly.) Are those queued?

    > Newer packets will never be queued

    That no longer appears to be the case though. The socket blocks, and
    /some/ packets are queued (presumably those which were still in the
    socket buffer? I don't know exactly).

    > and you should quickly be able to find from your app that
    > the device is down.

    We can, yes, via rtnetlink, but it is impossible to guarantee we'll know
    the link is down before we try to write a packet.
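
For illustration, a sketch of how an application might watch link state over rtnetlink: subscribe to the `RTMGRP_LINK` group and read the `IFF_RUNNING` bit from each `RTM_NEWLINK` message. The function names are hypothetical, and treating `IFF_RUNNING` as the carrier indicator is an assumption about how the kernel reports link state. The race described above remains: a write can still happen before the notification has been read.

```python
import socket
import struct

RTM_NEWLINK = 16   # message type, from <linux/rtnetlink.h>
RTMGRP_LINK = 1    # multicast group for link notifications
IFF_RUNNING = 0x40  # interface flag, from <linux/if.h>

NLMSG_HDR = struct.Struct("=IHHII")   # nlmsghdr: len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BxHiII")  # ifinfomsg: family, pad, type, index, flags, change


def link_events(data):
    """Yield (ifindex, running) for each RTM_NEWLINK message in `data`."""
    offset = 0
    while offset + NLMSG_HDR.size <= len(data):
        length, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(data, offset)
        if length < NLMSG_HDR.size or offset + length > len(data):
            break  # truncated or malformed message
        if msg_type == RTM_NEWLINK and length >= NLMSG_HDR.size + IFINFOMSG.size:
            _family, _type, index, flags, _change = IFINFOMSG.unpack_from(
                data, offset + NLMSG_HDR.size)
            yield index, bool(flags & IFF_RUNNING)
        offset += (length + 3) & ~3  # netlink messages are 4-byte aligned


def open_link_monitor():
    """Open a netlink socket subscribed to link notifications (Linux only)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    sock.bind((0, RTMGRP_LINK))  # address is (pid, groups); pid 0 lets the kernel pick
    return sock
```

A caller would loop on `link_events(sock.recv(65535))` and mark the interface unusable whenever `running` is false. Attributes following the `ifinfomsg` header are skipped here because each message's own length field is used to step to the next one.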

    > In the case of netdevice being operationally down

    ?

    As in 'ip link set dev ... down'?

    > - I am hoping this is what the discussion is, having jumped on it -

    No, it's for link-down, AIUI.

    > both queues stay intact. What you can do is certainly from user
    > space admin down/up the device when you receive a netlink carrier
    > off notification.

    That seems possible, but it is quite a hack. Something that works at the
    socket level would possibly be nicer (the socket being the primary handle
    our application has).

    > I am struggling to see whether dropping the packet inside the tx
    > path once it is operationaly down is so blasphemous ... need to get
    > caffeine first.

    As long as reliable transports have some other transport-specific
    queue, it shouldn't be a problem. For UDP and raw sockets, applications
    expect no reliability or guarantees (at least they shouldn't), and
    queueing some packets on link-down interferes with application-layer
    expectations.

    > cheers,
    > jamal

    regards,

    -- 
    Paul Jakma	paul@clubi.ie	paul@jakma.org	Key ID: 64A2FF6A
    Fortune:
    The UPS doesn't have a battery backup.
    
