Re: TCP Keepalives Problem on Linux



Rick Jones wrote:
By definition, from the perspective of HostA when that single byte is
sent and remains unACKed, the connection remains "active" rather than
idle so the keepalives should not start. The connection will be
timed-out based on the "normal" retransmission mechanisms.

Only when/if there is no outstanding data on a connection should a
keepalive timer fire.

Rick,
Thanks for this information. It is useful for understanding what Linux
is doing under the covers.

It seems I will need to write some additional code. Fortunately, the
existing application protocol I am using has a message that requires a
response. I can time out that response with some extra code and
fix this specific problem. Unfortunately, there are lots of other
protocols that are also broken given this information.
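
A minimal sketch of the application-level timeout described above,
assuming a simple request/response exchange over a connected stream
socket. The names request_with_deadline and PeerUnresponsive are
hypothetical and not part of the application in question:

```python
import socket


class PeerUnresponsive(Exception):
    """Raised when the peer does not answer within the deadline."""


def request_with_deadline(sock, request, deadline_s, max_reply=4096):
    # Hypothetical helper: send a message that requires a response and
    # give up if nothing arrives within deadline_s seconds. This does not
    # depend on TCP keepalives, so it still works while sent data is
    # outstanding and unACKed.
    sock.settimeout(deadline_s)
    try:
        sock.sendall(request)
        reply = sock.recv(max_reply)
    except socket.timeout:
        raise PeerUnresponsive("no response within %.1f s" % deadline_s)
    if not reply:
        # recv() returning b"" means the peer closed the connection.
        raise PeerUnresponsive("peer closed the connection")
    return reply
```

The caller would typically close the socket when PeerUnresponsive is
raised, rather than waiting for the retransmission timer to give up.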

The code I'm working with was originally developed on Tru64, and Tru64
doesn't implement its keepalive processing in the same way as Linux.
If data is not received on Tru64, the connection is considered idle and
keepalives begin. If data is not received after a given number of
keepalive probes, the connection is closed. This seems to make sense
and I wonder what it hurts for Linux to do something similar... unless
there are legitimate cases where this logic fails.
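
For reference, a sketch of how the keepalive idle time, probe interval,
and probe count can be set per socket on Linux. The helper names are
hypothetical and the values used in any call are illustrative only. The
socket options themselves (SO_KEEPALIVE, TCP_KEEPIDLE, TCP_KEEPINTVL,
TCP_KEEPCNT) are the Linux ones; other systems may name them differently:

```python
import socket


def enable_keepalive(sock, idle_s, interval_s, probes):
    # Hypothetical helper, Linux-specific option names.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Seconds the connection must be idle before the first probe.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    # Seconds between successive probes.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    # Unanswered probes allowed before the connection is dropped.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


def worst_case_detection_s(idle_s, interval_s, probes):
    # Rough time to detect a dead peer on an idle connection:
    # the idle period plus one interval per unanswered probe.
    return idle_s + interval_s * probes
```

With the usual Linux defaults (7200 s idle, 75 s interval, 9 probes) the
estimate comes to 7200 + 75 * 9 = 7875 seconds, and only for a connection
that has no unACKed data outstanding.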

It seems like keepalive processing on Linux should be based on data
received, from the perspective of the host. If data is never received,
the connection should be closed after the timeout periods (assuming
keepalive is set and probes have been sent). Otherwise a host can
inadvertently prevent keepalive processing from taking place simply by
attempting to send data to the other host (the one that isn't
responding).
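
One possible way to bound the case where sent data stays unACKed,
assuming a Linux kernel new enough to have the TCP_USER_TIMEOUT socket
option (2.6.37 or later), is sketched below. The helper name is
hypothetical, and the fallback constant 18 is the Linux value of the
option, used only if the Python socket module does not export it:

```python
import socket

# Linux-specific; 18 is the value from the Linux headers.
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", 18)


def set_user_timeout(sock, timeout_ms):
    # Hypothetical helper: limit how long transmitted data may remain
    # unacknowledged before the kernel closes the connection and
    # reports ETIMEDOUT to the application.
    sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, timeout_ms)
```

This is not the same as the Tru64 behaviour described above. It only
shortens the retransmission-based timeout for a connection that is
blocked on unACKed data.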

Thanks again,
Sten
