Re: Linux TCP - unexpected retransmissions
- From: Francois <fpomerle@xxxxxx>
- Date: 29 May 2007 06:15:00 -0700
On May 28, 10:54 pm, Allen McIntosh <nos...@xxxxxxxxxxxxxxxx> wrote:
Francois wrote:
We are working on an embedded system that has a number of PowerQUICC
processors running Linux. During normal operation, the processors exchange
small messages (< 100 bytes) using TCP. We have a response time
requirement of about 100 milliseconds, and we observed that we sometimes
see a long latency (e.g., > 200 milliseconds across an Ethernet link)
in transporting messages between nodes of the system, so the
response time exceeds our requirement. This latency occurs randomly
at different places and on different interface types. We set the
TCP_NODELAY socket option, and tried different settings (the ipv4
options under /proc) and test programs to isolate the root cause of the
latency, with no success.
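For reference, setting the option mentioned above might look like the following minimal sketch. It is in Python for brevity, since the original test programs are not shown, and the helper names make_nodelay_socket and nodelay_enabled are hypothetical.

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY is set at the IPPROTO_TCP level, not SOL_SOCKET.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nodelay_enabled(sock: socket.socket) -> bool:
    """Read the option back to confirm that it took effect."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    s = make_nodelay_socket()
    print("TCP_NODELAY enabled:", nodelay_enabled(s))
    s.close()
```

Note that TCP_NODELAY only disables Nagle's algorithm on the sending side. It does not change when the receiver sends its ACKs.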
We can reproduce the latency using a small application in which two
PowerQUICC cards randomly send each other bursts of messages across an
Ethernet link. For this test, we are using the 2.6.16 kernel. We used a
sniffer to capture data across the Ethernet link and found that
sometimes, when both TCPs send each other messages at about the same
time (segments 5 and 6 below), for unknown reasons, the second TCP does
not ack the message from the first TCP and a retransmission occurs
(segment 8). We also observed that retransmissions sometimes occur
when one TCP is busy transmitting many messages (segment 38 contains
many application messages) while a message is being sent to it. Again,
for unknown reasons, that TCP does not ack the message, thus forcing a
retransmission (segment 40).
netstat reports TCP segments being retransmitted but no errors at the
interface level. We have no reason to believe that segments are
dropped at the physical layer. We suspect that segments are dropped at
the TCP layer, but we don't know why or where. Any ideas?
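On Linux, the TCP retransmission counter is also exposed in /proc/net/snmp as the RetransSegs field of the Tcp: lines, so it can be sampled before and after a test run. The following is a minimal sketch of one way to read it. The function names are hypothetical, and it assumes the usual layout of one Tcp: header line followed by one Tcp: value line.

```python
def parse_tcp_counters(snmp_text: str) -> dict:
    """Parse the 'Tcp:' header/value line pair from /proc/net/snmp text."""
    header = None
    for line in snmp_text.splitlines():
        if not line.startswith("Tcp:"):
            continue
        fields = line.split()[1:]  # drop the leading "Tcp:" label
        if header is None:
            header = fields  # first Tcp: line holds the counter names
        else:
            # second Tcp: line holds the matching integer values
            return {name: int(value) for name, value in zip(header, fields)}
    return {}


def read_tcp_counters(path: str = "/proc/net/snmp") -> dict:
    """Read and parse the TCP counters from the given proc file."""
    with open(path) as f:
        return parse_tcp_counters(f.read())


if __name__ == "__main__":
    try:
        counters = read_tcp_counters()
    except OSError:
        counters = {}  # not on Linux, or /proc is unavailable
    print("RetransSegs:", counters.get("RetransSegs"))
    print("InErrs:", counters.get("InErrs"))
```

Comparing RetransSegs before and after a burst shows how many segments were retransmitted during that interval, while InErrs counts segments received in error.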
Did you try replacing whatever was in the middle (hub/switch/crossover
cable/...)? I know you said you don't suspect the link layer, but a
little paranoia never hurts.
Did you try using well-tested network cards? The machine I'm using to
write this has a built-in NIC that started mysteriously dropping packets
when I installed FC5. Switching to a well-debugged card/driver made the
problem go away.
Our system is composed of a number of embedded PowerQUICC processors
(VME) located within a number of shelves. Processors communicate using
point-to-point Ethernet links, or through the VME backplane. There is
no hub or switch between them (except when we use a sniffer for
testing purposes). We tried different cables, cards, shelves, etc., to
isolate the root cause of this latency with no success.
After browsing the Linux code for a while (I wish I understood it
better), we realized that the TCP stack optimizes performance by
splitting the processing of events between user and kernel space. We
suspect that under certain conditions (heavy bursts of messages, or
messages arriving at the same time), the stack drops events or postpones
their processing (holding locks, buffering), causing timers to
trigger retransmissions.
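One observation that may be relevant to the "> 200 milliseconds" figure: Linux clamps the TCP retransmission timeout to a minimum of 200 ms (TCP_RTO_MIN), so on a low-latency link a single lost or unacknowledged segment costs roughly that long. The sketch below illustrates the effect with an RFC 6298 style estimator. It is not the kernel's actual code, and the class name, the 200 ms floor parameter and the 1 ms clock granularity are assumptions chosen for illustration.

```python
class RtoEstimator:
    """RFC 6298 style retransmission timeout estimator with a floor."""

    ALPHA = 1 / 8  # smoothing gain for SRTT
    BETA = 1 / 4   # smoothing gain for RTTVAR

    def __init__(self, min_rto: float = 0.2, granularity: float = 0.001):
        self.min_rto = min_rto          # assumed floor, in seconds
        self.granularity = granularity  # assumed clock granularity G
        self.srtt = None
        self.rttvar = None

    def update(self, rtt: float) -> float:
        """Feed one RTT sample (seconds) and return the resulting RTO."""
        if self.srtt is None:
            # First measurement: SRTT = R, RTTVAR = R / 2
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated first, using the previous SRTT
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        rto = self.srtt + max(self.granularity, 4 * self.rttvar)
        return max(self.min_rto, rto)


if __name__ == "__main__":
    est = RtoEstimator()
    for _ in range(10):
        rto = est.update(0.0005)  # 0.5 ms RTT, e.g. a point-to-point link
    print("RTO with 0.5 ms RTT samples: %.3f s" % rto)
```

With sub-millisecond round-trip times, the computed value stays far below the floor, so the timer fires at the floor itself. That would be consistent with a delay of a little over 200 ms whenever a retransmission is needed, whatever the cause of the missing ACK.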
Thanks
Francois