When did I lose packets?
- From: Spoon <root@xxxxxxxxx>
- Date: Thu, 27 Apr 2006 11:08:57 +0200
Hello everyone,
I have written small bits of code to test high-rate packet handling.
(Approximately 10,000 packets per second.)
I send UDP packets at a constant rate from one computer:
    while ( 1 )
    {
        sendto(sock, &seqno, sizeof seqno, 0,
               (struct sockaddr *)&addr, sizeof addr);
        ++seqno;
        busy_loop(100);
    }
The only payload in the UDP packet is a 64-bit sequence number.
(Please ignore endianness issues.)
I use busy_loop(int us) to do nothing for 'us' microseconds.
The sender is run by root, on an otherwise idle system, at the default scheduling policy, with nice -n -10.
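For reference, here is a minimal sketch of what busy_loop() might look like. The real implementation is not shown in this post; this version assumes a POSIX system with clock_gettime(CLOCK_MONOTONIC), and now_ns() is a helper name made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds (hypothetical helper, not from the post). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Spin without sleeping until at least 'us' microseconds have elapsed. */
static void busy_loop(int us)
{
    uint64_t deadline = now_ns() + (uint64_t)us * 1000ull;

    while (now_ns() < deadline)
        ;   /* burn CPU on purpose; never yields to the scheduler */
}
```

Because the delay is added on top of the time spent in sendto(), a loop built this way sends slightly fewer than 10,000 packets per second.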
I receive the packets on a different computer:
    while ( 1 )
    {
        recvfrom(sock, &R_seqno, sizeof R_seqno, 0, NULL, NULL);
        while ( E_seqno != R_seqno )
        {
            ++lost; ++E_seqno;
        }
        ++E_seqno;
    }
R_seqno is the _received_ sequence number.
E_seqno is the _expected_ sequence number.
lost tracks the number of packets missed.
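One caveat with the loop above: if a packet ever arrives late or twice (R_seqno smaller than E_seqno), the inner while loop has to wrap around the whole 64-bit range before it terminates. A possible way to do the same accounting without that risk is sketched below. The struct and function names are made up for the example, and counting late packets separately is an assumption about what is wanted.

```c
#include <stdint.h>

struct rx_stats {
    uint64_t expected;  /* next sequence number we expect (E_seqno) */
    uint64_t lost;      /* sequence numbers skipped so far */
    uint64_t late;      /* packets that arrived behind 'expected' */
};

/* Account for one received sequence number. */
static void account(struct rx_stats *s, uint64_t seqno)
{
    if (seqno < s->expected) {
        /* Reordered or duplicated packet: count it, do not touch 'lost'. */
        ++s->late;
        return;
    }
    s->lost += seqno - s->expected;   /* size of the gap, possibly 0 */
    s->expected = seqno + 1;
}
```

In the receive loop this would be called as account(&stats, R_seqno) after each recvfrom() that returned successfully.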
The receiver is run by root, on an otherwise idle system, as a SCHED_FIFO process, with priority 80. (Why 80? I don't know.)
    param.sched_priority = 80;
    if ( sched_setscheduler(0, SCHED_FIFO, &param) < 0 )
    {
        perror("sched_setscheduler");
    }
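A slightly fuller sketch of the scheduling setup, with the declaration of param and a range check added. The function names are made up for the example; the calls themselves are the standard ones from <sched.h>.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <stdio.h>

/* Return 1 if 'prio' lies inside the range the system allows for SCHED_FIFO. */
static int fifo_priority_valid(int prio)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    return lo >= 0 && hi >= 0 && prio >= lo && prio <= hi;
}

/* Switch the calling process to SCHED_FIFO; 0 on success, -1 on failure. */
static int make_fifo(int prio)
{
    struct sched_param param = { .sched_priority = prio };

    if (!fifo_priority_valid(prio)) {
        fprintf(stderr, "priority %d is out of range\n", prio);
        return -1;
    }
    if (sched_setscheduler(0, SCHED_FIFO, &param) < 0) {
        perror("sched_setscheduler");   /* typically EPERM when not root */
        return -1;
    }
    return 0;
}
```

On Linux the SCHED_FIFO range is 1 to 99, and any value in that range runs ahead of every normally scheduled process, so the exact number should only matter relative to other real-time tasks.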
I registered a signal handler to print statistics:
    static void catch(int sig)
    {
        printf("RECEIVED=%llu LOST=%llu\n", R_seqno, lost);
    }
    signal(SIGQUIT, catch);
(AFAIU I'm not supposed to call printf() inside a signal handler?
However, I don't think it would explain why I drop packets. But I
could be wrong!)
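printf() is indeed not on the list of async-signal-safe functions. A common way around it is to have the handler only set a flag and to print from the main loop. A minimal sketch, with names made up for the example:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Set by the handler, cleared by the main loop. */
static volatile sig_atomic_t want_stats = 0;

/* Handler: does nothing except record that statistics were requested. */
static void catch_quit(int sig)
{
    (void)sig;
    want_stats = 1;
}

/* Install the handler for SIGQUIT; returns 0 on success, -1 on error. */
static int install_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = catch_quit;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* let an interrupted recvfrom() resume */
    return sigaction(SIGQUIT, &sa, NULL);
}

/* Call once per loop iteration; prints and returns 1 if a request is pending. */
static int maybe_print_stats(unsigned long long received, unsigned long long lost)
{
    if (!want_stats)
        return 0;
    want_stats = 0;
    printf("RECEIVED=%llu LOST=%llu\n", received, lost);
    return 1;
}
```

With SA_RESTART the statistics are printed after the next packet arrives, which at this packet rate is a delay of roughly 100 microseconds.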
I ran the setup overnight (1000 minutes) and here are my results:
According to top, the receive process ate 16.5 minutes of CPU time.
(i.e. 1.65% CPU occupancy on average.)
The system stays very responsive despite the SCHED_FIFO process.
RECEIVED=577.5 million packets
LOST=3225 packets
I don't understand why I lose ANY packet...
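To put these numbers in perspective, here is the arithmetic as a small sketch. It assumes RECEIVED is the last sequence number seen (that is what the handler prints) and that the run lasted exactly 1000 minutes. The function names are made up for the example.

```c
/* Average packet rate over the whole run, in packets per second. */
static double avg_pps(double packets, double minutes)
{
    return packets / (minutes * 60.0);
}

/* Lost packets per million packets. */
static double loss_ppm(double lost, double total)
{
    return lost / total * 1e6;
}
```

avg_pps(577.5e6, 1000.0) works out to 9625 packets per second, i.e. about 104 microseconds per sender iteration, and loss_ppm(3225.0, 577.5e6) comes to roughly 5.6 lost packets per million.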
I forgot to mention: I increased the size of the socket buffer.
(That was my intention, at least.)
$ /sbin/sysctl net | grep rmem_
net.core.rmem_default = 1064960
net.core.rmem_max = 1064960
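One way to check what a given socket really got is to ask for a size with setsockopt() and read it back with getsockopt(). A minimal sketch follows; set_rcvbuf() is a name made up for the example.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Request 'bytes' of receive buffer on 'sock' and return the size the
 * kernel reports afterwards, or -1 on error.
 */
static int set_rcvbuf(int sock, int bytes)
{
    int granted = 0;
    socklen_t len = sizeof granted;

    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof bytes) < 0)
        return -1;
    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &granted, &len) < 0)
        return -1;
    return granted;
}
```

On Linux the requested value is capped at net.core.rmem_max and the kernel doubles it for bookkeeping overhead, so the number read back is normally twice the request. Sockets that never call setsockopt() start at net.core.rmem_default.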
The link layer does not report any problem.
(errors:0 dropped:0 overruns:0 frame:0. Can someone explain what
these numbers mean exactly?)
$ /sbin/ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:13:20:0D:1F:47
          inet addr:10.10.10.208  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::213:20ff:fe0d:1f47/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:661166427 errors:0 dropped:0 overruns:0 frame:0
          TX packets:20981 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1036860947 (988.8 Mb)  TX bytes:1686030 (1.6 Mb)
          Interrupt:9
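The ifconfig counters only cover the interface. Datagrams discarded higher up, for example because a socket's receive buffer was full, are counted in the UDP statistics instead (the InErrors column of the "Udp:" lines in /proc/net/snmp on Linux, also shown by netstat -su). Below is a sketch of reading that counter. The function names are made up for the example, and it assumes the usual two-line layout of column names followed by values.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/*
 * Given the two "Udp:" lines (column names, then values), return the
 * value in the column called 'name', or -1 if it is not there.
 */
static long long udp_counter(const char *names, const char *values,
                             const char *name)
{
    char nbuf[512], vbuf[512];
    char *np, *vp, *nsave, *vsave;

    snprintf(nbuf, sizeof nbuf, "%s", names);
    snprintf(vbuf, sizeof vbuf, "%s", values);
    np = strtok_r(nbuf, " \n", &nsave);
    vp = strtok_r(vbuf, " \n", &vsave);
    while (np && vp) {
        if (strcmp(np, name) == 0) {
            long long v;
            return sscanf(vp, "%lld", &v) == 1 ? v : -1;
        }
        np = strtok_r(NULL, " \n", &nsave);
        vp = strtok_r(NULL, " \n", &vsave);
    }
    return -1;
}

/* Read Udp InErrors from /proc/net/snmp; -1 if it cannot be read. */
static long long read_udp_inerrors(void)
{
    char names[512], values[512];
    long long v = -1;
    FILE *f = fopen("/proc/net/snmp", "r");

    if (!f)
        return -1;
    while (fgets(names, sizeof names, f)) {
        if (strncmp(names, "Udp:", 4) == 0 &&
            fgets(values, sizeof values, f)) {
            v = udp_counter(names, values, "InErrors");
            break;
        }
    }
    fclose(f);
    return v;
}
```

Sampling read_udp_inerrors() before and after a loss burst would show whether the packets were dropped at the socket rather than on the wire.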
I noticed that I lose packets in bursts of 30-100 packets, and these loss bursts are quite rare (~1 every 10-40 minutes). Someone told me another high-priority process (another SCHED_FIFO??) might be running.
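As a rough conversion from burst size to time, assuming the sender keeps its spacing of about 100 microseconds per packet (burst_ms() is a name made up for the example):

```c
/* Time covered by 'packets' consecutive packets sent every 'interval_us'. */
static double burst_ms(unsigned packets, unsigned interval_us)
{
    return (double)packets * (double)interval_us / 1000.0;
}
```

A burst of 30 to 100 lost packets therefore corresponds to about 3 to 10 milliseconds of traffic. If the loss happens at the socket buffer, the receiver would have been stalled for longer than that, since the buffer has to fill up before anything is dropped.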
I checked /var/log/messages and saw:
# cat /var/log/messages
Apr 27 04:40:01 venus syslogd 1.4.1: restart.
Apr 27 05:04:33 venus -- MARK --
Apr 27 05:24:34 venus -- MARK --
Apr 27 05:44:34 venus -- MARK --
Apr 27 06:04:34 venus -- MARK --
Apr 27 06:24:34 venus -- MARK --
Apr 27 06:44:35 venus -- MARK --
Apr 27 07:04:35 venus -- MARK --
Apr 27 07:24:35 venus -- MARK --
Apr 27 07:44:35 venus -- MARK --
Apr 27 08:04:36 venus -- MARK --
Apr 27 08:24:36 venus -- MARK --
Apr 27 08:44:36 venus -- MARK --
Apr 27 09:04:36 venus -- MARK --
Apr 27 09:24:37 venus -- MARK --
Apr 27 09:44:37 venus -- MARK --
Apr 27 10:04:37 venus -- MARK --
Apr 27 10:24:37 venus -- MARK --
1 every 20 minutes... What do these log entries refer to?
Is it a high-priority process? Perhaps even a kernel thread?
Is it CPU-intensive? Could it explain why I drop packets?
If you've read this far, THANKS! :-)
Regards,
Spoon