Re: When did I lost packets?
- From: Spoon <root@xxxxxxxxx>
- Date: Thu, 27 Apr 2006 14:15:16 +0200
Spoon wrote:
I have written small bits of code to test high-rate packet handling.
(Approximately 10,000 packets per second.)
I send UDP packets at a constant rate from one computer:
while ( 1 )
{
    sendto(sock, &seqno, sizeof seqno, 0,
           (struct sockaddr *)&addr, sizeof addr);
    ++seqno;
    busy_loop(100);
}
The only payload in the UDP packet is a 64-bit sequence number.
(Please ignore endianness issues.)
I use busy_loop(int us) to do nothing for 'us' micro-seconds.
This code is run by root, on an otherwise idle system, under the default scheduling policy, with nice -n -10.
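busy_loop() itself is not shown above. Here is a minimal sketch of one way it could be written, assuming a POSIX system where clock_gettime() and CLOCK_MONOTONIC are available. The helper now_ns() is a name made up for this sketch; the real implementation may differ.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Spin (without sleeping) until at least 'us' microseconds have passed. */
static void busy_loop(int us)
{
    uint64_t deadline = now_ns() + (uint64_t)us * 1000ULL;

    while (now_ns() < deadline)
        ;  /* burn CPU on purpose */
}
```

Note that a relative delay like this adds the time spent in sendto() to every period, so the real rate ends up slightly below 10,000 packets per second. Spinning against an absolute deadline that advances by 100 us per packet would probably hold the rate more closely.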
I receive the packets on a different computer:
while ( 1 )
{
    recvfrom(sock, &R_seqno, sizeof R_seqno, 0, NULL, NULL);
    while ( E_seqno != R_seqno )
    {
        ++lost; ++E_seqno;
    }
    ++E_seqno;
}
R_seqno is the _received_ sequence number.
E_seqno is the _expected_ sequence number.
lost tracks the number of packets missed.
This code is run by root, on an otherwise idle system, as a SCHED_FIFO process, with priority 80. (Why 80? I don't know.)
param.sched_priority = 80;
if ( sched_setscheduler(0, SCHED_FIFO, &param) < 0 )
{
    perror("sched_setscheduler");
}
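One caveat about the receive loop shown earlier: it assumes datagrams always arrive in increasing order. As far as I can tell, if a datagram ever arrived with a number below E_seqno (reordered or duplicated), the inner while loop would keep counting until E_seqno wraps around. The following is a sketch of a more defensive counter, assuming unsigned 64-bit sequence numbers. The names seq_stats and seq_update are invented for illustration and are not part of the original program.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for the receiver. */
struct seq_stats {
    uint64_t expected;  /* next sequence number we expect to see */
    uint64_t lost;      /* sequence numbers skipped so far */
    uint64_t late;      /* datagrams that arrived behind 'expected' */
};

/* Account for one received sequence number. */
static void seq_update(struct seq_stats *s, uint64_t received)
{
    if (received >= s->expected) {
        /* Everything between 'expected' and 'received' was skipped. */
        s->lost += received - s->expected;
        s->expected = received + 1;
    } else {
        /* Reordered or duplicated datagram: count it, do not rewind. */
        s->late++;
    }
}
```

A late datagram may correspond to a number already counted as lost, so with this sketch the true loss is somewhere between lost - late and lost.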
I registered a signal handler to print statistics:
static void catch(int sig)
{
    printf("RECEIVED=%llu LOST=%llu\n", R_seqno, lost);
}

signal(SIGQUIT, catch);
(AFAIU I'm not supposed to call printf() inside a signal handler?
However, I don't think it would explain why I drop packets. But I
could be wrong!)
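printf() is indeed not on the POSIX list of async-signal-safe functions. A common way around this is to have the handler only set a flag and to do the printing from the main loop. This is a sketch of that pattern, assuming sigaction() is available. The names want_stats, on_sigquit and install_stats_handler are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler, read and cleared by the main loop. */
static volatile sig_atomic_t want_stats = 0;

/* The handler does nothing except record that SIGQUIT arrived. */
static void on_sigquit(int sig)
{
    (void)sig;
    want_stats = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_stats_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigquit;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;  /* no SA_RESTART: a blocked recvfrom() fails with EINTR */
    return sigaction(SIGQUIT, &sa, NULL);
}
```

The receive loop would then test want_stats after each recvfrom(), print the counters with printf() when it is set, and clear it again. Since recvfrom() can now return -1 with errno set to EINTR, its return value has to be checked before R_seqno is used.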
I ran the setup overnight (1000 minutes) and here are my results:
According to top, the receive process ate 16.5 minutes of CPU time.
(i.e. 1.65% CPU occupancy on average.)
The system stays very responsive despite the SCHED_FIFO process.
RECEIVED=577.5 million packets
LOST=3225 packets
I don't understand why I lose ANY packet...
I forgot to mention: I increased the size of the socket buffer.
(That was my intention, at least.)
$ /sbin/sysctl net | grep rmem_
net.core.rmem_default = 1064960
net.core.rmem_max = 1064960
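The sysctl values only set the default and the upper limit for new sockets. To check what a given socket actually got, the program can ask for a size with SO_RCVBUF and read it back. This is a sketch assuming Linux socket options; set_rcvbuf is a hypothetical helper name.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Request 'bytes' of receive buffer on 'sock' and return the size the
 * kernel reports afterwards, or -1 on error.
 */
static int set_rcvbuf(int sock, int bytes)
{
    int actual = 0;
    socklen_t len = sizeof actual;

    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof bytes) < 0)
        return -1;
    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &actual, &len) < 0)
        return -1;
    return actual;
}
```

On Linux the request is capped by net.core.rmem_max, and the value read back is twice what was set because the kernel reserves room for its own bookkeeping (see socket(7)). The reported number therefore does not have to match the requested one exactly.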
The link layer does not report any problem.
(errors:0 dropped:0 overruns:0 frame:0. Can someone explain what
these numbers mean exactly?)
$ /sbin/ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:13:20:0D:1F:47
inet addr:10.10.10.208 Bcast:10.10.10.255 Mask:255.255.255.0
inet6 addr: fe80::213:20ff:fe0d:1f47/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:661166427 errors:0 dropped:0 overruns:0 frame:0
TX packets:20981 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1036860947 (988.8 Mb) TX bytes:1686030 (1.6 Mb)
Interrupt:9
I noticed that I lose packets in bursts of 30-100 packets, and these loss bursts are quite rare (~1 every 10-40 minutes). Someone told me another high-priority process (another SCHED_FIFO??) might be running.
I checked /var/log/messages and saw:
# cat /var/log/messages
Apr 27 04:40:01 venus syslogd 1.4.1: restart.
Apr 27 05:04:33 venus -- MARK --
Apr 27 05:24:34 venus -- MARK --
Apr 27 05:44:34 venus -- MARK --
Apr 27 06:04:34 venus -- MARK --
1 every 20 minutes... What do these log entries refer to?
Is it a high-priority process? Perhaps even a kernel thread?
Is it CPU-intensive? Could it explain why I drop packets?
I forgot to mention that the two computers are on the same LAN:
SENDER <---> ETHERNET <---> RECEIVER
              SWITCH
AFAIK, the UDP stream is the only traffic on the LAN.
I turned syslogd and klogd off. (I thought HDD access might make me drop packets. But the HDD controller performs DMA, right? So the CPU should still be available to service network interrupts, even while the HDD is in use?)
I'm still dropping packets (420 in 83 million).