Re: tcp close problems under heavy load on 2.6.18
- From: Rick Jones <rick.jones2@xxxxxx>
- Date: Tue, 06 Mar 2007 23:20:38 GMT
Jean-Francois Smigielski <smig@xxxxxxxxxxxxxx> wrote:
I ran into a problem with TCP connections on Linux 2.6.18 (both
clients and servers).
I have an echo service that can be represented by 1, 2, 3 or 4
processes that listen on the same ip/port. This service accepts tens
of thousands of simultaneous connections. Each client process starts
thousands of connections to the service, writes some data, reads the
answer, waits, closes, and then opens, writes, ...
Both client and server sockets are non-blocking and use the SO_LINGER
option to avoid leaving a lot of sockets in the TIME_WAIT state. I
started with a linger time-out of 0.
I thought it was generally agreed that deliberately causing abortive
closes that way was a "bad thing" - for example, RSTs are not
retransmitted, so if one is lost you could leave the remote in
ESTABLISHED etc for a very long time... And TIME_WAIT is there for a
reason - to protect against the accidental acceptance of old segments
from a TCP connection of the same name (that is, the same pair of
addresses and ports).
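
For anyone following along, here is a minimal Python sketch of what a linger time-out of 0 does, run over loopback. The helper names `abortive_close` and `demo` are made up for illustration and are not from the original poster's code. It assumes a platform where `struct linger` is two C ints, as on Linux.

```python
import socket
import struct

def abortive_close(sock):
    """Close with l_onoff=1, l_linger=0: the kernel discards unsent data
    and sends an RST instead of going through the FIN handshake."""
    # Assumption: struct linger is laid out as two C ints (true on Linux).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()

def demo():
    """Open a loopback connection, abort it, and report what the peer sees."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    abortive_close(client)

    server_side.settimeout(5)
    try:
        data = server_side.recv(1024)
        result = "orderly close (FIN)" if data == b"" else "data"
    except ConnectionResetError:
        result = "reset (RST)"
    finally:
        server_side.close()
        listener.close()
    return result

if __name__ == "__main__":
    print(demo())
```

On loopback the RST is very unlikely to be lost, so the peer should notice the reset. Over a real network a lost RST is simply gone, and the peer only finds out the next time it tries to send.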
If I kill the client processes of a host, thus killing thousands of
connections at a time, I should observe many RST-flagged TCP
packets, at least one for every socket. But only some of those
packets are sent, for about half of the original number of
sockets. This happens with more than 4 thousand client sockets.
Are you certain that your packet sniffer actually saw all the packets?
Sometimes even pcap reporting zero drops doesn't necessarily mean it
did see all the traffic.
If you were tracing on the server, then back on the client a sudden
spike of 4000 RSTs going out at once might have filled the
driver/NIC's transmit queues, and so some of them may have been
dropped, never to be seen again... It is possible that if you were
tracing on the client those drops happened before the promiscuous tap
(I'm not certain of that, just speculating).
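
One way to cross-check the sniffer is to compare its RST count with the TCP stack's own counters. The sketch below is only an illustration: the function names are hypothetical, and it assumes the kernel exposes an `OutRsts` column on the `Tcp:` lines of `/proc/net/snmp`. Whether every abortive close increments that counter may depend on the kernel version, so treat the number as a hint rather than proof.

```python
def read_tcp_counters(snmp_text):
    """Parse the 'Tcp:' header line and value line of /proc/net/snmp into a dict."""
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Tcp:")]
    if len(rows) != 2:
        raise ValueError("expected one Tcp: header line and one Tcp: value line")
    names, values = rows[0][1:], rows[1][1:]
    return {name: int(value) for name, value in zip(names, values)}

def rsts_sent_between(before_text, after_text):
    """Difference in the OutRsts counter between two snapshots of /proc/net/snmp."""
    before = read_tcp_counters(before_text)
    after = read_tcp_counters(after_text)
    return after["OutRsts"] - before["OutRsts"]

def snapshot(path="/proc/net/snmp"):
    """Read the current counters file as text (Linux only)."""
    with open(path) as f:
        return f.read()
```

Take one `snapshot()` on the client before killing the processes and another afterwards. If `rsts_sent_between` is close to the number of sockets but the trace shows about half, the packets were probably lost between the stack and the capture point. The counters are system-wide, so other traffic on the box will add noise.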
The observed effect on the server is obvious: all the badly closed
sockets remain in ESTABLISHED state, since the server only answers
to received data...
Ah, so you do see then firsthand one of the reasons an abortive close
of a TCP connection is considered a Bad Thing :)
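
To put a number on the leftover connections on the server, something like the following sketch could count sockets still in ESTABLISHED. It assumes the usual Linux `/proc/net/tcp` text layout, where the fourth column holds the state as a hex code and `01` means ESTABLISHED. The function name and the optional `local_port` filter are made up for illustration.

```python
TCP_ESTABLISHED = "01"  # hex state code for ESTABLISHED in /proc/net/tcp

def count_established(proc_net_tcp_text, local_port=None):
    """Count ESTABLISHED sockets in /proc/net/tcp text, optionally
    restricted to one local port (for example the echo service's port)."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        local_address, state = fields[1], fields[3]
        # local_address looks like "0100007F:0007" (hex address, hex port)
        port = int(local_address.rsplit(":", 1)[1], 16)
        if state == TCP_ESTABLISHED and (local_port is None or port == local_port):
            count += 1
    return count
```

Feeding it the contents of `/proc/net/tcp` on the server before and after killing the clients should show how many connections never saw their RST. IPv6 sockets live in `/proc/net/tcp6` and are not covered here.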
rick jones
--
oxymoron n, commuter in a gas-guzzling luxury SUV with an American flag
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...