Re: syslog server, RH ES 4, large amounts of UDP loss. please help
- From: ibuprofin@xxxxxxxxxxxxxxxxxxxxxx (Moe Trin)
- Date: Fri, 04 Aug 2006 20:50:01 -0500
On 4 Aug 2006, in the Usenet newsgroup comp.os.linux.networking, in article
<1154694741.571928.5400@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>, guser@xxxxxxxxxxxxxxx
wrote:
With the firewall disabled I still saw the errors.
OK - that was a wild chance anyway
As for storage, I have logrotate set up to rotate and gzip the log files
once they reach 500MB, which gets the files to about 33MB on average.
Plus this is just the test system. Once we get it tuned we have another
set of systems with a lot more storage.
I hope you have an efficient log parsing program - that's an awful lot
of data. Our log server barely sees 2 Megabytes a day, which equates to
roughly 560 pages of text.
I found one problem that reduced the udp errors which gave me a lead to
try something else.
[snip several things that cascade to overwhelm the stack and daemon]
The error rate has dropped a lot, but then again this is all
overnight, so the logging rate slows down.
netstat -su
Udp:
1839508 packets received
26 packets to unknown port received.
Is this net publicly reachable? That could be windoze crap.
1510 packet receive errors
Better - but still shouldn't be happening
789415 packets sent
Over what period of time?
What I am now wondering is what netstat -su is really reporting as
errors. When I run ethtool -S eth0 I get a different set of stats
altogether. Below I see no receive errors, but netstat reports receive
errors. So is netstat looking at the errors from some kernel counter
while ethtool looks at the errors from the NIC?
I don't know - the only sure way to find out is going to be to look at the
kernel and ethtool source code. As a _guess_ I'd suggest that the NIC
stats are only looking at the Ethernet level errors in the stack. Thus,
the packets are being lifted out of the hardware fast enough to avoid
overruns, and they aren't short, partial, or failing the _overall_ CRC. I'll
come back to this below. The netstat errors relate to IP and TCP/UDP/ICMP
errors - higher in the stack. Now, if the Ethernet errors are nil, and
the UDP errors exist, then either the sending station is creating crap
data (that is cleanly being carried over the Ethernet without errors), or
the higher levels on the receiving system stack are tripping over themselves.
This _could_ be a scheduling problem - how busy is the receiving system?
What are the load averages? One presumes you aren't running X on this box -
does the error ratio improve if you either 'nice' the daemon to a higher
priority, OR reduce the number of packets coming in?
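For what it's worth, the net-tools version of netstat builds its -s output from the kernel's /proc/net/snmp counters, where "packet receive errors" corresponds to the Udp InErrors column and "packets to unknown port" to NoPorts. A rough sketch of reading those counters directly follows. The function names are made up for illustration, and the sample text is just the figures quoted above arranged in the /proc/net/snmp layout (a header row of names, then a row of values); the exact set of columns varies by kernel version.

```python
def parse_snmp(text, proto="Udp"):
    """Return {counter_name: value} for one protocol section of
    /proc/net/snmp-style text (a row of names, then a row of values)."""
    prefix = proto + ":"
    rows = [line.split()[1:] for line in text.splitlines()
            if line.startswith(prefix)]
    if len(rows) < 2:
        raise ValueError("no %s section found" % proto)
    names, values = rows[0], rows[1]
    return dict(zip(names, map(int, values)))


def read_udp_counters(path="/proc/net/snmp"):
    """Read the live UDP counters on a Linux box (hypothetical helper)."""
    with open(path) as handle:
        return parse_snmp(handle.read(), "Udp")


def error_ratio(counters):
    """Receive errors per datagram actually delivered to a socket."""
    return counters["InErrors"] / counters["InDatagrams"]


# Sample built from the netstat -su figures quoted in this thread.
SAMPLE = """\
Udp: InDatagrams NoPorts InErrors OutDatagrams
Udp: 1839508 26 1510 789415
"""

counters = parse_snmp(SAMPLE)
print("receive errors: %d (%.3f%% of delivered datagrams)"
      % (counters["InErrors"], 100 * error_ratio(counters)))
```

Sampling read_udp_counters() twice, a known interval apart, and subtracting would also answer the "over what period of time?" question, since the raw counters only say how many errors have occurred since boot.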
NIC statistics:
rx_packets: 3416841
tx_packets: 4004531
rx_bytes: 846988732
tx_bytes: 96819577
846988732 / 3416841 = 248 bytes average received
96819577 / 4004531 = 24 bytes average sent
How long? Assuming about 8 hours:
(3416841 + 4004531) / 8 hours = 258 packets/second.
(846988732 + 96819577) / 8 hours = 32771 bytes/second = 262 kilobits/second,
which should be well within a 10BaseT network, never mind a Gigabit card.
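The same back-of-the-envelope figures can be reproduced with a few lines of Python. This is only a sketch: the 8 hour window is a guess at how long the counters had been accumulating, not something the ethtool output states, so change HOURS to match the real interval.

```python
# Counters from the ethtool -S eth0 output quoted above.
rx_packets, tx_packets = 3416841, 4004531
rx_bytes, tx_bytes = 846988732, 96819577

HOURS = 8  # assumption: roughly one overnight run
seconds = HOURS * 3600

avg_rx_bytes = rx_bytes / rx_packets          # mean size of a received packet
avg_tx_bytes = tx_bytes / tx_packets          # mean size of a sent packet
packets_per_sec = (rx_packets + tx_packets) / seconds
bytes_per_sec = (rx_bytes + tx_bytes) / seconds
kbits_per_sec = bytes_per_sec * 8 / 1000
percent_of_10baset = 100 * kbits_per_sec / 10000   # 10BaseT = 10,000 kbit/s

print("average rx packet: %.0f bytes" % avg_rx_bytes)
print("average tx packet: %.0f bytes" % avg_tx_bytes)
print("packet rate:       %.0f packets/second" % packets_per_sec)
print("throughput:        %.0f kbit/s (%.1f%% of 10BaseT)"
      % (kbits_per_sec, percent_of_10baset))
```

These are averages over the whole window. A syslog burst can be far above the mean for a few seconds, which is exactly when a busy daemon would fall behind.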
I don't see an obvious network problem... I dunno, I'd look at the ps
output to see if your logging daemon is getting enough time. Like I say,
ours doesn't see anywhere near that much traffic.
Old guy