Re: When did I lose packets?



Spoon wrote:

I have written small bits of code to test high-rate packet handling.
(Approximately 10,000 packets per second.)

I send UDP packets at a constant rate from one computer:

while ( 1 )
{
  sendto(sock, &seqno, sizeof seqno, 0,
         (struct sockaddr *)&addr, sizeof addr);
  ++seqno;
  busy_loop(100);
}

The only payload in the UDP packet is a 64-bit sequence number.
(Please ignore endianness issues.)

I use busy_loop(int us) to do nothing for 'us' micro-seconds.
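
For reference, here is a minimal sketch of what a busy_loop() like this could look like. It is an assumption, not the exact code used in the test: it spins on CLOCK_MONOTONIC via clock_gettime(), and now_ns() is a helper name made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Spin, without sleeping, until at least 'us' microseconds have passed. */
static void busy_loop(int us)
{
    uint64_t deadline = now_ns() + (uint64_t)us * 1000ULL;

    while (now_ns() < deadline)
        ;  /* burn CPU on purpose: no sleep, no yield */
}
```

Note that the send loop waits a fixed 100 us after each sendto(), so the real period is 100 us plus the cost of the sendto() call, and the actual rate ends up slightly below 10,000 packets per second.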

This code is run by root, on an otherwise idle system, at the default scheduling policy, with nice -n -10.



I receive the packets on a different computer:

while ( 1 )
{
  recvfrom(sock, &R_seqno, sizeof R_seqno, 0, NULL, NULL);
  while ( E_seqno != R_seqno )
  {
    ++lost; ++E_seqno;
  }
  ++E_seqno;
}

R_seqno is the _received_ sequence number.
E_seqno is the _expected_ sequence number.
lost tracks the number of packets missed.
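
One caveat with the loop above: if a packet ever arrives with a sequence number lower than E_seqno (reordered or duplicated), the inner while loop would spin for close to 2^64 iterations before the counters wrap around. A minimal sketch of a more defensive accounting routine follows. The struct loss_stats type, the account() function and the 'late' counter are hypothetical names introduced only for this illustration.

```c
#include <stdint.h>

/* Hypothetical bookkeeping structure, mirroring E_seqno and lost. */
struct loss_stats {
    uint64_t expected;  /* next sequence number we expect (E_seqno) */
    uint64_t lost;      /* sequence numbers skipped so far */
    uint64_t late;      /* packets older than expected (reordered or duplicated) */
};

/* Account for one received sequence number. */
static void account(struct loss_stats *s, uint64_t received)
{
    if (received < s->expected) {
        /* Older than expected: count it separately instead of looping. */
        ++s->late;
        return;
    }
    /* Everything between expected and received (exclusive) was skipped. */
    s->lost += received - s->expected;
    s->expected = received + 1;
}
```

In this sketch a late packet is not subtracted from 'lost', because a duplicate and a reordered packet cannot be told apart without remembering which sequence numbers were already seen.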

This code is run by root, on an otherwise idle system, as a SCHED_FIFO process, with priority 80. (Why 80? I don't know.)

param.sched_priority = 80;
if ( sched_setscheduler(0, SCHED_FIFO, &param) < 0 )
{
  perror("sched_setscheduler");
}
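
On the "why 80?" question, the valid range for a policy can be queried at run time instead of guessed. The sketch below assumes a POSIX system; fifo_priority_in_range() is a hypothetical helper name. On Linux the SCHED_FIFO range is 1 to 99, so 80 is valid, and as far as I understand the exact value only matters relative to other real-time tasks on the same machine.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>

/* Return 1 if 'prio' is a valid SCHED_FIFO priority on this system, else 0. */
static int fifo_priority_in_range(int prio)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (lo == -1 || hi == -1)
        return 0;  /* the queries themselves failed */
    return prio >= lo && prio <= hi;
}
```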

I registered a signal handler to print statistics:

static void catch(int sig)
{
  printf("RECEIVED=%llu LOST=%llu\n", R_seqno, lost);
}

signal(SIGQUIT, catch);

(AFAIU I'm not supposed to call printf() inside a signal handler?
However, I don't think it would explain why I drop packets. But I
could be wrong!)
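
Since printf() is not on the list of async-signal-safe functions, one common alternative is to have the handler only set a flag and do the printing from the main loop. This is a sketch under assumptions, not the code used in the test: want_stats, on_sigquit() and install_stats_handler() are hypothetical names, and sigaction() is used without SA_RESTART so that a blocked recvfrom() returns -1 with errno set to EINTR.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler, read and cleared by the main loop. */
static volatile sig_atomic_t want_stats = 0;

/* The handler only sets a flag, which is async-signal-safe. */
static void on_sigquit(int sig)
{
    (void)sig;
    want_stats = 1;
}

/* Install the handler for SIGQUIT; returns 0 on success, -1 on error. */
static int install_stats_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigquit;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;  /* no SA_RESTART: a blocked recvfrom() fails with EINTR */
    return sigaction(SIGQUIT, &sa, NULL);
}
```

The receive loop would then check want_stats after each recvfrom() call (including the EINTR case), print the statistics with printf() from normal context, and reset the flag to 0.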

I ran the setup overnight (1000 minutes) and here are my results:

According to top, the receive process ate 16.5 minutes of CPU time.
(i.e. 1.65% CPU occupancy on average.)
The system stays very responsive despite the SCHED_FIFO process.

RECEIVED=577.5 million packets
LOST=3225 packets

I don't understand why I lose ANY packet...

I forgot to mention: I increased the size of the socket buffer.
(That was my intention, at least.)

$ /sbin/sysctl net | grep rmem_
net.core.rmem_default = 1064960
net.core.rmem_max = 1064960
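
To check what the socket actually got, the buffer size can also be requested and read back per socket. The sketch below assumes a Linux-style sockets API; set_rcvbuf() is a hypothetical helper name. Requests made with SO_RCVBUF are capped by net.core.rmem_max, and on Linux the value read back with getsockopt() is typically double the requested one because the kernel reserves room for bookkeeping.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Ask for 'bytes' of receive buffer on 'sock' and return the size the
 * kernel reports afterwards, or -1 if either call fails.
 */
static int set_rcvbuf(int sock, int bytes)
{
    int actual = 0;
    socklen_t len = sizeof actual;

    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof bytes) < 0)
        return -1;
    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &actual, &len) < 0)
        return -1;
    return actual;
}
```

At roughly 10,000 packets per second, a buffer of about 1 MB should absorb a stall of some tens of milliseconds, depending on how much memory the kernel charges per queued packet, so comparing the reported size with the observed burst length (30-100 packets) seems worthwhile.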

The link layer does not report any problem.
(errors:0 dropped:0 overruns:0 frame:0. Can someone explain what
these numbers mean exactly?)

$ /sbin/ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:13:20:0D:1F:47
          inet addr:10.10.10.208  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::213:20ff:fe0d:1f47/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:661166427 errors:0 dropped:0 overruns:0 frame:0
          TX packets:20981 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1036860947 (988.8 Mb)  TX bytes:1686030 (1.6 Mb)
          Interrupt:9

I noticed that I lose packets in bursts of 30-100 packets, and these loss bursts are quite rare (~1 every 10-40 minutes). Someone told me another high-priority process (another SCHED_FIFO??) might be running.

I checked /var/log/messages and saw:
# cat /var/log/messages
Apr 27 04:40:01 venus syslogd 1.4.1: restart.
Apr 27 05:04:33 venus -- MARK --
Apr 27 05:24:34 venus -- MARK --
Apr 27 05:44:34 venus -- MARK --
Apr 27 06:04:34 venus -- MARK --

1 every 20 minutes... What do these log entries refer to?
Is it a high-priority process? Perhaps even a kernel thread?
Is it CPU-intensive? Could it explain why I drop packets?

I forgot to mention that the two computers are on the same LAN:

SENDER <---> ETHERNET <---> RECEIVER
              SWITCH

AFAIK, the UDP stream is the only traffic on the LAN.

I turned syslogd and klogd off. (I thought HDD access might make me drop packets. But the HDD controller performs DMA, right? So the CPU should be available to service network interrupts, even when the HDD is used?)

I'm still dropping packets (420 in 83 million).