possible tunnel driver bug?



I have a small program that rewrites TCP packets for a system that I am
working on. The program has two threads, largely for historical
reasons (it is cut down from a larger, earlier program). One thread
reads packets from a tunnel interface (using the tun driver) and queues
them to be written back to the kernel later.

Some of the packets appear to be corrupted after being written back to
the kernel. The latest kernel this has been tested on is 2.6.9 (RedHat
ES4). Here is an example trace of a packet that I wrote to the kernel.
The bytes are dumped just before the writev() system call. (The first
line is a timestamp.)

21,020215 -> k:
TCP SYN ACK sport = 1202 dport = 59346 seq = 620090535 ack = 615496366
window = 5840 doff = 8 csum = 32434
IP total length = 52 ip checksum = 9918 fragoff = 16384
saddr = 10.2.0.2 daddr = 10.2.0.1
00 00 08 00 45 00 00 34 00 00 40 00 40 06 26 be ....E..4..@.@.&.
0a 02 00 02 0a 02 00 01 04 b2 e7 d2 24 f5 d4 a7 ............$...
24 af ba ae 80 12 16 d0 7e b2 00 00 02 04 05 b4 $.......~.......
01 01 04 02 01 03 03 00 ........

At the same time, tcpdump showed:

17:33:41.027397 10.2.0.2.1202 > 10.2.0.1.59346: S [bad tcp cksum 40fc!]
620090535:620090535(0) ack 615496366 win 5840 <eol> (DF) (ttl 64, id 0, len 52)
0x0000 4500 0034 0000 4000 4006 26be 0a02 0002 E..4..@.@.&.....
0x0010 0a02 0001 04b2 e7d2 24f5 d4a7 24af baae ........$...$...
0x0020 8012 16d0 7eb2 0000 0064 0001 0203 0405 ....~....d......
0x0030 0607 0809 ....
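
One way to cross-check tcpdump's "bad tcp cksum" report is to recompute
the TCP checksum over both byte dumps. The following is a minimal sketch,
not code from my program: the names written_pkt, captured_pkt, sum16 and
tcp_checksum_ok are made up for illustration, and the arrays are the two
dumps above with the 4-byte tun prefix (00 00 08 00) stripped from the
first one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* IP packet handed to writev(), minus the 4-byte tun prefix. */
const uint8_t written_pkt[52] = {
    0x45,0x00,0x00,0x34,0x00,0x00,0x40,0x00,0x40,0x06,0x26,0xbe,0x0a,0x02,0x00,0x02,
    0x0a,0x02,0x00,0x01,0x04,0xb2,0xe7,0xd2,0x24,0xf5,0xd4,0xa7,0x24,0xaf,0xba,0xae,
    0x80,0x12,0x16,0xd0,0x7e,0xb2,0x00,0x00,0x02,0x04,0x05,0xb4,0x01,0x01,0x04,0x02,
    0x01,0x03,0x03,0x00
};

/* Same packet as seen by tcpdump. */
const uint8_t captured_pkt[52] = {
    0x45,0x00,0x00,0x34,0x00,0x00,0x40,0x00,0x40,0x06,0x26,0xbe,0x0a,0x02,0x00,0x02,
    0x0a,0x02,0x00,0x01,0x04,0xb2,0xe7,0xd2,0x24,0xf5,0xd4,0xa7,0x24,0xaf,0xba,0xae,
    0x80,0x12,0x16,0xd0,0x7e,0xb2,0x00,0x00,0x00,0x64,0x00,0x01,0x02,0x03,0x04,0x05,
    0x06,0x07,0x08,0x09
};

/* Add a byte range to a running sum as big-endian 16-bit words. */
static uint32_t sum16(const uint8_t *p, size_t len, uint32_t sum)
{
    while (len > 1) {
        sum += (uint32_t)((p[0] << 8) | p[1]);
        p += 2;
        len -= 2;
    }
    if (len)                        /* odd trailing byte is zero-padded */
        sum += (uint32_t)(p[0] << 8);
    return sum;
}

/* Returns 1 if the TCP checksum of an IPv4 packet verifies, else 0.
 * pkt must point at the IP header. */
int tcp_checksum_ok(const uint8_t *pkt, size_t len)
{
    assert(len >= 20);
    size_t ihl = (size_t)(pkt[0] & 0x0f) * 4;        /* IP header length */
    size_t total = ((size_t)pkt[2] << 8) | pkt[3];   /* IP total length  */
    if (ihl < 20 || total < ihl || total > len || pkt[9] != 6)
        return 0;
    size_t tcp_len = total - ihl;

    /* Pseudo-header: source address, destination address, protocol, length. */
    uint32_t sum = sum16(pkt + 12, 8, 0);
    sum += 6;
    sum += (uint32_t)tcp_len;

    /* TCP header and payload, with the checksum field left in place. */
    sum = sum16(pkt + ihl, tcp_len, sum);
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return sum == 0xffff;           /* a valid segment sums to all ones */
}
```

With these inputs, tcp_checksum_ok(written_pkt, 52) returns 1 and
tcp_checksum_ok(captured_pkt, 52) returns 0, so the checksum was correct
when the packet left user space and only the bytes after it changed.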

The packets differ starting at byte 0x28 (in the tcpdump display). This
is the start of the TCP options section. It is also an iovec boundary.
The bytes that tcpdump shows from that point on are actually application
data from a recent data packet. It looks as if the kernel failed to copy
in all of the packet and left stale socket buffer contents exposed.
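
The offset can be checked mechanically. The sketch below is illustrative
only (first_diff, tcp_options_offset, written_pkt and captured_pkt are
names I made up here): it finds the first byte at which the two dumps
differ and computes where the TCP options start from the IP header
length. The arrays are the two dumps above, with the 4-byte tun prefix
removed from the first.

```c
#include <stddef.h>
#include <stdint.h>

/* IP packet handed to writev(), minus the 4-byte tun prefix. */
const uint8_t written_pkt[52] = {
    0x45,0x00,0x00,0x34,0x00,0x00,0x40,0x00,0x40,0x06,0x26,0xbe,0x0a,0x02,0x00,0x02,
    0x0a,0x02,0x00,0x01,0x04,0xb2,0xe7,0xd2,0x24,0xf5,0xd4,0xa7,0x24,0xaf,0xba,0xae,
    0x80,0x12,0x16,0xd0,0x7e,0xb2,0x00,0x00,0x02,0x04,0x05,0xb4,0x01,0x01,0x04,0x02,
    0x01,0x03,0x03,0x00
};

/* Same packet as seen by tcpdump. */
const uint8_t captured_pkt[52] = {
    0x45,0x00,0x00,0x34,0x00,0x00,0x40,0x00,0x40,0x06,0x26,0xbe,0x0a,0x02,0x00,0x02,
    0x0a,0x02,0x00,0x01,0x04,0xb2,0xe7,0xd2,0x24,0xf5,0xd4,0xa7,0x24,0xaf,0xba,0xae,
    0x80,0x12,0x16,0xd0,0x7e,0xb2,0x00,0x00,0x00,0x64,0x00,0x01,0x02,0x03,0x04,0x05,
    0x06,0x07,0x08,0x09
};

/* Index of the first differing byte, or n if the buffers are equal. */
size_t first_diff(const uint8_t *a, const uint8_t *b, size_t n)
{
    size_t i = 0;
    while (i < n && a[i] == b[i])
        i++;
    return i;
}

/* Offset of the TCP options from the start of the IP header:
 * IP header length (IHL * 4) plus the 20-byte fixed TCP header. */
size_t tcp_options_offset(const uint8_t *pkt)
{
    size_t ihl = (size_t)(pkt[0] & 0x0f) * 4;
    return ihl + 20;
}

/* Offset just past the TCP header, using the data offset field (doff). */
size_t tcp_header_end(const uint8_t *pkt)
{
    size_t ihl = (size_t)(pkt[0] & 0x0f) * 4;
    size_t doff = (size_t)(pkt[ihl + 12] >> 4);
    return ihl + doff * 4;
}
```

For these dumps both first_diff() and tcp_options_offset() give 0x28, and
tcp_header_end() gives 52, the full packet length: everything up to the
end of the fixed TCP header arrived intact, and the 12 option bytes were
all replaced.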

This bug is intermittent and timing related. If I slow the program down
enough, the corruption does not occur. My best guess is that the problem
is triggered when both threads in the program are trying to do I/O at
the same time, one reading and one writing. I've since rewritten the
program to use a single thread, and it all works perfectly.
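
For anyone wanting to test whether the iovec boundary matters, one
experiment is to copy the iovec segments into a single contiguous buffer
and call write() instead of writev(). This is only a sketch of that
experiment, not something I have shown to fix or avoid the problem;
write_flat and MAX_FRAME are hypothetical names, and the 2048-byte bound
is an assumption that should be sized to the tunnel MTU plus the 4-byte
prefix.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_FRAME 2048   /* assumed upper bound on one frame */

/* Copy all iovec segments into one buffer and issue a single write().
 * Returns the result of write(), or -1 if the segments do not fit
 * (errno is not set in that case). */
ssize_t write_flat(int fd, const struct iovec *iov, int iovcnt)
{
    uint8_t buf[MAX_FRAME];
    size_t len = 0;

    for (int i = 0; i < iovcnt; i++) {
        if (iov[i].iov_len > sizeof buf - len)
            return -1;                      /* frame too large for buf */
        if (iov[i].iov_len > 0) {
            memcpy(buf + len, iov[i].iov_base, iov[i].iov_len);
            len += iov[i].iov_len;
        }
    }
    return write(fd, buf, len);
}
```

If the corruption still shows up with a single contiguous write, the
iovec boundary is probably a coincidence and the concurrent read and
write are the more likely trigger.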

Has anyone seen this sort of thing before?
