Dog slow TCP on 100 megabit/s LAN

From: Simon Paradis (see.end.of.message_at_for.my.email.com)
Date: 12/30/03


Date: Tue, 30 Dec 2003 20:03:59 GMT

Hi,

I'm having some TCP connectivity problems involving a WinXP box, a Linux box
and a router used to share internet access and give those two machines a LAN.
The problem is that TCP is dog slow and freezes when the two boxes talk to
each other directly on the LAN side.

Here is my setup:

Windows XP box
------------------
CPU: AMD AthlonXP 1.4 GHz
RAM: 256 MB
NIC: Onboard Realtek RTL8139/810X Family Fast Ethernet NIC
IP : 192.168.1.2

Router
------------------
Linksys BEFSR41 ver.3 with latest firmware 1.04.8 (with a 4-port 100 megabit
switch)
DHCP server disabled (I use fixed IP on WinXP & Linux)
UPnP disabled
NAT enabled (for sharing internet access)
LAN side IP: 192.168.1.1
WAN port connected to a Dlink DSL-302 DSL modem. There's no PPPoE, the
connection works directly using DHCP like a standard LAN.

Linux Mandrake 9.2
------------------
CPU: AMD Duron 600 MHz
RAM: 448 MB
NIC: AOpen AON-325 Fast Ethernet (based on RTL8139 chipset)
IP : 192.168.1.3
Kernel: 2.4.22-10mdk (stock kernel from Mandrake)
NIC driver: 8139too
ifconfig output:
eth0 Link encap:Ethernet HWaddr 00:48:54:65:7F:5D
    inet addr:192.168.1.3 Bcast:192.168.1.255 Mask:255.255.255.0
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:6271441 errors:3168383 dropped:0 overruns:0 frame:0
    *** I don't know when that huge number of errors occurred ***
    TX packets:4413195 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:100
    RX bytes:290905469 (277.4 Mb) TX bytes:2019588565 (1926.0 Mb)
    Interrupt:9 Base address:0xef00
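
For reference, the RX counters above work out to roughly one error for every
two packets received, while the TX side shows none. The short Python sketch
below pulls those ratios out of ifconfig-style text. The function name and the
sample string are illustrative only; the sample simply repeats the two counter
lines from the output above.

```python
import re

# The two counter lines from the ifconfig output above.
IFCONFIG_SAMPLE = """\
RX packets:6271441 errors:3168383 dropped:0 overruns:0 frame:0
TX packets:4413195 errors:0 dropped:0 overruns:0 carrier:0
"""


def error_ratios(ifconfig_text):
    """Return {'RX': ratio, 'TX': ratio} of errors to packets."""
    ratios = {}
    for direction, packets, errors in re.findall(
            r"(RX|TX) packets:(\d+) errors:(\d+)", ifconfig_text):
        packets, errors = int(packets), int(errors)
        # Guard against a freshly reset interface with zero packets.
        ratios[direction] = errors / packets if packets else 0.0
    return ratios


ratios = error_ratios(IFCONFIG_SAMPLE)
print(f"RX errors per packet: {ratios['RX']:.3f}")
print(f"TX errors per packet: {ratios['TX']:.3f}")
```

The counters are cumulative since the interface came up, so the ratio says
nothing about when the errors happened. Comparing two readings taken before
and after a test run would show whether they grow during the slow transfers.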

The TCP connection hangs shortly after starting. I wrote a Java utility for
sending/receiving data over TCP (like ttcp). I tried it with the same
hardware setup but running Solaris 9 x86 on the AMD Duron box and I was
getting approximately 95 Mbit/s at the TCP level (just pure TCP I/O; no disk
reads/writes are involved). Taking into account TCP, IP and Ethernet headers
and preambles, this gives a raw line speed near 100 megabit/s! I get
similar performance when running WinXP on the AMD Duron box. You definitely
cannot go over that speed. I was running Solaris CDE remotely using
XDMCP and it was almost the same as working locally. Now with Linux, it just
sucks: an xterm can take 30 sec to appear. Once it's loaded, it works fine
though...
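
The header arithmetic behind that claim can be written out as a small Python
sketch. It assumes full-size segments with the 1460-byte MSS seen in the
trace further down, 20-byte TCP and IP headers with no options (the SYN shows
no timestamp option), and standard Ethernet framing overhead. The function
name is illustrative.

```python
LINK_BPS = 100_000_000   # Fast Ethernet line rate, bits per second
MSS = 1460               # TCP payload per segment
TCP_HEADER = 20          # bytes, no options
IP_HEADER = 20           # bytes, no options
ETH_HEADER = 14          # destination MAC + source MAC + EtherType
ETH_FCS = 4              # frame check sequence
PREAMBLE = 8             # preamble + start-of-frame delimiter
INTERFRAME_GAP = 12      # mandatory idle time, expressed in byte times


def max_tcp_goodput_bps(link_bps=LINK_BPS, mss=MSS):
    """Upper bound on TCP payload throughput for back-to-back full frames."""
    wire_bytes = (mss + TCP_HEADER + IP_HEADER + ETH_HEADER
                  + ETH_FCS + PREAMBLE + INTERFRAME_GAP)
    return link_bps * mss / wire_bytes


print(f"{max_tcp_goodput_bps() / 1e6:.2f} Mbit/s")
```

Each 1460-byte payload occupies 1538 byte times on the wire, so the ceiling
is a little under 95 Mbit/s, in line with the measured figure. The bound
ignores the bandwidth used by ACKs travelling the other way, which does not
matter on a full-duplex link.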

Now with Linux, the TCP connection hangs. I ran tcpdump on the Linux box while
running the Java program described above. It was sending data from WinXP to
Linux (one way only). You can see the unusual delays. I get the same
problems when sending data from Linux to WinXP.
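
For anyone wanting to reproduce this kind of one-way test, here is a minimal
ttcp-style sketch in Python. It is not the Java utility used above, only a
stand-in under stated assumptions: the chunk size, the function names and the
loopback demo are all made up for illustration. On two machines, the receiver
would listen on a fixed port (the trace uses 5005) and the sender would
connect to it.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # bytes per send/recv call; an arbitrary choice


def receive_all(listener, result):
    """Accept one connection, discard the data, record the byte count."""
    conn, _ = listener.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:          # empty read: the sender closed the socket
                break
            total += len(data)
    result["bytes"] = total


def send_bytes(host, port, total_bytes):
    """Send total_bytes of zeros to host:port, then close the connection."""
    payload = bytes(CHUNK)
    with socket.create_connection((host, port)) as sock:
        remaining = total_bytes
        while remaining > 0:
            n = min(remaining, CHUNK)
            sock.sendall(payload[:n])
            remaining -= n


def loopback_demo(total_bytes=10_000_000):
    """Run sender and receiver in one process over 127.0.0.1."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    result = {}
    receiver = threading.Thread(target=receive_all, args=(listener, result))
    receiver.start()
    start = time.perf_counter()
    send_bytes("127.0.0.1", port, total_bytes)
    receiver.join()                   # wait until the receiver saw everything
    elapsed = time.perf_counter() - start
    listener.close()
    return result["bytes"], result["bytes"] * 8 / elapsed / 1e6
```

Calling `loopback_demo()` returns the byte count and the rate in Mbit/s. Over
loopback the number only reflects CPU and memory speed; the interesting
measurement is the same pair of functions run on two hosts across the LAN.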

... [start of tcpdump] ...

13:05:29.400937 192.168.1.2.1130 > 192.168.1.3.5005: S
3336550493:3336550493(0) win 64240 <mss 1460,nop,nop,sackOK> (DF)
13:05:29.400998 192.168.1.3.5005 > 192.168.1.2.1130: S
2165233103:2165233103(0) ack 3336550494 win 5840 <mss 1460,nop,nop,sackOK>
(DF)
13:05:29.401112 192.168.1.2.1130 > 192.168.1.3.5005: . ack 1 win 64240 (DF)
... [this is the first segment; normally several thousand of them should
pass very quickly]....
13:05:29.440189 192.168.1.2.1130 > 192.168.1.3.5005: . 1:1461(1460) ack 1
win 64240 (DF)

... [snip; only 21901 bytes sent after 3 seconds] ...

13:05:32.766509 192.168.1.3.5005 > 192.168.1.2.1130: . ack 21901 win 35040
<nop,nop,sack sack 1 {20441:21901} > (DF)
*** Very abnormal: 3 sec elapsed with no data from 192.168.1.2 ***
13:05:35.771118 192.168.1.2.1130 > 192.168.1.3.5005: . 23361:24821(1460) ack
1 win 64240 (DF)
13:05:35.771192 192.168.1.3.5005 > 192.168.1.2.1130: . ack 24821 win 40880
(DF)
*** Again, 6 second delay ***
13:05:41.778868 192.168.1.2.1130 > 192.168.1.3.5005: . 23361:24821(1460) ack
1 win 64240 (DF)

.... [snip] ...

13:06:22.820826 arp who-has 192.168.1.2 tell 192.168.1.3
13:06:22.820964 arp reply 192.168.1.2 is-at 0:a:e6:21:71:e0
13:07:05.795333 192.168.1.2.1130 > 192.168.1.3.5005: . 27741:29201(1460) ack
1 win 64240 (DF)
13:07:05.795593 192.168.1.3.5005 > 192.168.1.2.1130: . ack 29201 win 49640
<nop,nop,sack sack 1 {27741:29201} > (DF)
13:07:05.795971 192.168.1.2.1130 > 192.168.1.3.5005: . 29201:30661(1460) ack
1 win 64240 (DF)
13:07:05.796042 192.168.1.3.5005 > 192.168.1.2.1130: . ack 30661 win 52560
(DF)
13:07:10.790827 arp who-has 192.168.1.2 tell 192.168.1.3
13:07:10.790958 arp reply 192.168.1.2 is-at 0:a:e6:21:71:e0
.... [then nothing; TCP connection is frozen] ...
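
The stalls in a capture like this can be located mechanically. The Python
sketch below reads the leading timestamp of each tcpdump line and reports
every pause longer than a threshold. The sample lines are shortened copies of
four lines from the trace above; the function name and the 1-second threshold
are illustrative choices.

```python
from datetime import datetime

# Shortened copies of four lines from the trace above.
TRACE_LINES = [
    "13:05:32.766509 192.168.1.3.5005 > 192.168.1.2.1130: . ack 21901",
    "13:05:35.771118 192.168.1.2.1130 > 192.168.1.3.5005: . 23361:24821(1460)",
    "13:05:35.771192 192.168.1.3.5005 > 192.168.1.2.1130: . ack 24821",
    "13:05:41.778868 192.168.1.2.1130 > 192.168.1.3.5005: . 23361:24821(1460)",
]


def find_gaps(lines, threshold_s=1.0):
    """Return (timestamp, seconds_since_previous_line) for each long pause."""
    gaps = []
    previous = None
    for line in lines:
        stamp = line.split(" ", 1)[0]
        try:
            current = datetime.strptime(stamp, "%H:%M:%S.%f")
        except ValueError:
            continue              # continuation or annotation line, skip it
        if previous is not None:
            delta = (current - previous).total_seconds()
            if delta >= threshold_s:
                gaps.append((stamp, delta))
        previous = current
    return gaps


for stamp, delta in find_gaps(TRACE_LINES):
    print(f"{stamp}: {delta:.1f} s since previous packet")
```

On these lines it reports pauses of about 3 s and 6 s, and the segment
23361:24821 appears on both sides of the second pause. That pattern is what a
retransmission timer doubling after each timeout would produce, though this
is an interpretation of the trace, not something it proves. The sketch does
not handle captures that cross midnight.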

As expected, the LEDs on the router and the NICs didn't blink much. I
also tried uploading a big file from WinXP to Linux using sftp and got the
same result. The upload hangs as soon as it starts. However, interactive ssh
sessions work fine. Downloading small (< 5 KB) files using sftp also works
ok.

Now here is an interesting thing: when I link those two boxes using a normal
100 megabit/s layer-2 switch, the transmission works fine, but I only get 70
megabit/s instead of the 92 to 95 megabit/s I get with Solaris or WinXP.

Even stranger: when running the same test with Solaris or WinXP on the AMD
Duron box through the Linksys router, the performance was still 95 megabit/s
(the router was just as fast as my other little switch). When I download
stuff from the net using the Linux or WinXP box, I can get ~215 kb/sec, which
is the max my DSL connection supports. Internet works just fine.

If I do a netstat, it shows that very few TCP segments have been sent during
a time interval where several thousand of them should have passed. However,
if I do a ping flood with 65000-byte packets (ping -A -s 65000 192.168.1.2),
no packets are lost and the thing just runs super fast with a steady round
trip time of 11.5 ms (with ~1 ms mdev).
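
As a sanity check on that round trip time, the Python sketch below estimates
how long a 65000-byte ping spends on a 100 megabit/s wire once IP has
fragmented it. It assumes the 1500-byte MTU shown by ifconfig, 20-byte IP
headers without options, and standard Ethernet framing overhead; it ignores
switch forwarding delay and host processing. The function name is
illustrative.

```python
import math

LINK_BPS = 100_000_000
MTU = 1500
IP_HEADER = 20
ICMP_HEADER = 8
ETH_OVERHEAD = 14 + 4 + 8 + 12   # header + FCS + preamble + inter-frame gap


def fragmented_ping_wire_time(payload_bytes, link_bps=LINK_BPS):
    """Return (fragment_count, one_way_seconds) for one echo request."""
    ip_payload = payload_bytes + ICMP_HEADER
    per_fragment = MTU - IP_HEADER            # 1480 bytes of IP payload each
    fragments = math.ceil(ip_payload / per_fragment)
    # Every fragment carries its own IP header and Ethernet framing.
    wire_bytes = ip_payload + fragments * (IP_HEADER + ETH_OVERHEAD)
    return fragments, wire_bytes * 8 / link_bps


fragments, one_way = fragmented_ping_wire_time(65000)
print(f"{fragments} fragments, {one_way * 1e3:.2f} ms one way, "
      f"{2 * one_way * 1e3:.2f} ms round trip")
```

Under these assumptions the request and the reply together need about 10.8 ms
of pure transmission time, so a measured 11.5 ms means the fragmented ICMP
traffic is moving at close to line rate in both directions.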

I power-cycled the router and reset its configuration several times and
also tried it without hooking the DSL modem to it. I always get the same
problem. Do you think it's due to the router or Linux? I don't think it's a
cable problem.

It looks like the problem is TCP related, as the ICMP ping works just fine.
What can I do to fix this?

Thanks for your help,

My email is "simon [dot] paradis [at] usherbrooke [dot] ca".


