Re: Wrong network usage reported by /proc



Eric Dumazet wrote:

Matthias Saou wrote:
Hi,

I'm posting here as a last resort. I've got lots of heavily used RHEL5
servers (2.6.18 based) that are reporting all sorts of impossible
network usage values through /proc, leading to unrealistic snmp/cacti
graphs where the outgoing bandwidth used is higher than the physical
interface's maximum speed.

Some details, and a test script which compares values from /proc
with values from tcpdump, are available here:
https://bugzilla.redhat.com/show_bug.cgi?id=489541

The values collected using tcpdump always seem realistic and match the
values seen on the remote network equipment. So my obvious conclusion
(possibly wrong, given my limited knowledge) is that something is
wrong in the kernel, since it's the one exposing the /proc interface.

I've reproduced what seems to be the same problem on recent kernels,
including the 2.6.27.21-170.2.56.fc10.x86_64 I'm running right now. The
simple python script available here makes it quite easy to see:
https://www.redhat.com/archives/rhelv5-list/2009-February/msg00166.html

* I run the script on my workstation, which has an FTP server enabled
* I download a DVD ISO from a remote workstation: the values match
* I start ping floods from remote workstations: the values reported
  by /proc are much higher than the ones reported by tcpdump. I used
  "ping -s 500 -f myworkstation" from two remote workstations
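
For readers who want to repeat the /proc side of this comparison, here is
a minimal sketch of how the byte counters in /proc/net/dev can be turned
into a bandwidth figure. It is not the script linked above; the function
names (parse_proc_net_dev, rate_bits_per_sec) are made up for this
illustration, and it assumes the usual /proc/net/dev layout of two header
lines followed by one "name: 16 counters" line per interface.

```python
def parse_proc_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev content.

    Assumes the usual layout: two header lines, then one line per
    interface with 8 receive counters followed by 8 transmit counters.
    """
    counters = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        name, sep, data = line.rpartition(":")
        if not sep:
            continue  # not an interface line
        fields = data.split()
        if len(fields) < 16:
            continue  # unexpected layout, ignore the line
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def rate_bits_per_sec(bytes_before, bytes_after, seconds):
    """Average rate in bits per second between two counter samples."""
    return (bytes_after - bytes_before) * 8 / seconds
```

To use it, read /proc/net/dev twice, a known number of seconds apart,
parse both snapshots, and pass the two tx_bytes (or rx_bytes) values for
one interface to rate_bits_per_sec. The result can then be compared with
the byte total tcpdump captured over the same interval.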

If there's anything flawed in my debugging, I'd love to have someone
point it out to me. TIA to anyone willing to have a look.

Matthias


I could not reproduce this here... what kind of NIC are you using on
the affected systems? Some ethernet drivers report stats from the card
itself, and I remember seeing some strange stats on some hardware, but I
cannot remember which one it was (we were reading NULL values instead of
real ones once in a while, maybe it was a firmware issue...)

My workstation has a Broadcom BCM5752 (tg3 module). The servers which
are most affected have Intel 82571EB (e1000e). But the issue is that
with /proc, the values are a lot _higher_ than with tcpdump, and the
tcpdump values seem to be the correct ones.

Matthias

--
Clean custom Red Hat Linux rpm packages : http://freshrpms.net/
Fedora release 10 (Cambridge) - Linux kernel
2.6.27.21-170.2.56.fc10.x86_64 Load : 2.20 0.88 0.42