Re: TCP SACK issue, hung connection, tcpdump included



On Sun, 29 Jul 2007, Willy Tarreau wrote:

On Sun, Jul 29, 2007 at 11:26:00AM +0300, Ilpo Järvinen wrote:
On Sun, 29 Jul 2007, Willy Tarreau wrote:

On Sun, Jul 29, 2007 at 06:59:26AM +0100, Darryl L. Miles wrote:
CLIENT = Linux 2.6.20.1-smp [Customer build]
SERVER = Linux 2.6.9-55.ELsmp [Red Hat Enterprise Linux AS release 4
(Nahant Update 5)]

The problems start around time index 09:21:39.860302 when the CLIENT issues
a TCP packet with SACK option set (seemingly for a data segment which has
already been seen) from that point on the connection hangs.

...That's DSACK and it's being correctly sent. To me, it seems unlikely to
be the cause for this breakage...

Where was the capture taken? On CLIENT or on SERVER (I suspect client from
the timers)?

...I would guess the same based on SYN timestamps (and from the DSACK
timestamps)...
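
One rough way to sanity-check that guess, using two data/ACK pairs from the
trace quoted further down: at the capture point, CLIENT's ACKs leave only
tens of microseconds after SERVER's segments show up. That is what one would
expect when capturing on CLIENT itself, since on SERVER the two would be
separated by roughly one round trip. This is only a sketch; the helper name
gap_us is made up for illustration.

```python
from datetime import datetime, timedelta

def gap_us(t_first: str, t_second: str) -> int:
    """Microseconds between two tcpdump wall-clock stamps (HH:MM:SS.ffffff).

    Hypothetical helper; assumes both stamps fall on the same day.
    """
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(t_second, fmt) - datetime.strptime(t_first, fmt)
    return delta // timedelta(microseconds=1)

# Data 18200:18464 seen, then CLIENT's "ack 18464" seen.
print(gap_us("09:21:39.490740", "09:21:39.490775"))  # 35

# Duplicate 12408:13856 seen, then CLIENT's DSACK seen.
print(gap_us("09:21:39.860245", "09:21:39.860302"))  # 57
```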

A possible, but very unlikely reason would be an MTU limitation
somewhere, because the segment which never gets correctly ACKed is also the
largest one in this trace.

Limitation for 48 byte segments? You have to be kidding... :-) But yes,
it seems that one of the directions is dropping packets for some reason
though I would not assume MTU limitation... Or did you mean some other
segment?

No, I was talking about the 1448-byte segments. But in fact I don't
believe it much because the SACKs are always retransmitted just afterwards.
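
For reference, a 1448-byte segment is exactly what fills a 1500-byte IP
packet when the timestamp option is in use, which is why these are the
segments an MTU problem would hit first. A small sketch of the arithmetic,
assuming IPv4 without IP options and only the nop,nop,timestamp TCP option
shown in the trace (the function name ip_packet_size is made up here):

```python
IP_HEADER = 20    # IPv4 header, assuming no IP options
TCP_HEADER = 20   # TCP header without options
TS_OPTION = 12    # nop,nop,timestamp: 1 + 1 + 10 bytes

def ip_packet_size(payload: int, tcp_options: int = TS_OPTION) -> int:
    """Total IP packet size for a TCP segment carrying `payload` bytes."""
    return IP_HEADER + TCP_HEADER + tcp_options + payload

print(ip_packet_size(1448))  # 1500, a full-sized Ethernet packet
print(ip_packet_size(264))   # 316, well below any plausible MTU limit
```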

Ah, but it's ACKed correctly right below it...:

[...snip...]
09:21:39.490740 IP SERVER.ssh > CLIENT.50727: P 18200:18464(264) ack 2991
win 2728 <nop,nop,timestamp 7692910 800001727>
09:21:39.490775 IP CLIENT.50727 > SERVER.ssh: . ack 18464 win 378
<nop,nop,timestamp 800001755 7692910>
09:21:39.860245 IP SERVER.ssh > CLIENT.50727: . 12408:13856(1448) ack 2991
win 2728 <nop,nop,timestamp 7693293 800001749>

...segment below snd_una arrived => snd_una remains 18464, receiver
generates a duplicate ACK:

09:21:39.860302 IP CLIENT.50727 > SERVER.ssh: . ack 18464 win 378
<nop,nop,timestamp 800001847 7692910,nop,nop,sack sack 1 {12408:13856} >

The cumulative ACK field of it covers _everything_ below 18464 (i.e., it
ACKs them), including the 1448 bytes in 12408:13856... In addition, the
SACK block is DSACK information [RFC2883] explicitly reporting the sequence
range of the received duplicate block. However, if this ACK doesn't reach the
SERVER TCP, RTO is triggered and the first not yet cumulatively ACKed
segment is retransmitted (I guess cumulative ACKs up to 12408 reached
the SERVER without problems):

09:21:40.453440 IP SERVER.ssh > CLIENT.50727: . 12408:13856(1448) ack 2991
win 2728 <nop,nop,timestamp 7693887 800001749>
[...snip...]
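
To make the DSACK reading of that ACK concrete, here is a simplified sketch
of the two RFC 2883 conditions under which the first SACK block reports a
duplicate. It is not how any particular stack implements the check: it
ignores sequence number wraparound and partially covered blocks, and the
name is_dsack is made up for illustration.

```python
def is_dsack(ack: int, sack_blocks: list[tuple[int, int]]) -> bool:
    """Return True if the first SACK block looks like a DSACK (RFC 2883).

    Simplified sketch: plain integer comparison, no 32-bit wraparound.
    """
    if not sack_blocks:
        return False
    left, right = sack_blocks[0]
    # Case 1: the first block lies entirely at or below the cumulative ACK.
    if right <= ack:
        return True
    # Case 2: the first block is contained in the second SACK block.
    if len(sack_blocks) > 1:
        left2, right2 = sack_blocks[1]
        return left2 <= left and right <= right2
    return False

# The ACK at 09:21:39.860302: ack 18464, sack {12408:13856}
print(is_dsack(18464, [(12408, 13856)]))  # True
```

An ordinary SACK block, by contrast, sits above the cumulative ACK, e.g.
is_dsack(12408, [(13856, 15304)]) is False.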

BTW, some information is missing. It would have been better if the trace
had been read with tcpdump -Svv. We would have got absolute seq numbers and
ttl. Also, we do not know if there's a firewall between both sides. Sometimes,
some IDS identify attacks in encrypted traffic and kill connections. It
might have been the case here, with the connection closed one way on an
intermediate firewall.

Yeah, a firewall or some other issue. I'd say a bug in TCP is quite
unlikely, because the behavior in both directions independently indicates
a client -> sender blackhole...


--
i.
