ARP resolution issue



Hi all,

Bit of a long one here.

We have an IBM BladeCenter with 2 blades running RHEL 3 Update 4 (plus
some AIX blades). Last week 2 of the blades disappeared off the network.
I started a remote control session and noted that the LAN cards were
up, with no errors in the messages and kernel messages files. I could
ping the card internally but couldn't ping out.

The BladeCenters have internal D-Link switches which are then connected
to our customer's LAN. The blades have 2 on-board ethernet cards, eth0
through one switch and eth1 through the other. When I moved the IP
address to eth1, all was OK. This is the same for both blades!

I placed a call with IBM, as I initially thought that with 2 servers of
the same type (HS20 blades) having the same issue, it must be hardware.
We first went through updating the BIOS, Broadcom, ethernet and
management module firmware before they agreed to send out an engineer
(nothing in any of the hardware logs, by the way). 1 blade now has a new
system board and there are 2 new switches in the center, but we still
can't ping the switch when connected through eth0.

If I run tcpdump against eth0 I can see passing traffic etc. When I
ping the switch I can see the ARP request but no reply, and the entry in
the ARP table is incomplete. If I check the ARP table on the switch I
see the MAC address of eth0 on the external port, not the internal
port, so the reply is going out of the wrong port. If I manually update
the ARP table on the server with the correct details of the
switch/server I am trying to ping, all is OK.

We moved one of the servers into another BladeCenter and we get the
same issue!! We don't seem to be able to pinpoint exactly where the
problem is. I asked the comms guy on site whether he could see anything
advertising itself with the same MAC address (unlikely) and whether any
changes had been made (confirmed not). IBM and I are now at a loss as to
what we can do next.
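
In case it helps anyone reproduce the check, here is a rough Python
sketch of how the incomplete entries can be picked out of text in the
format of /proc/net/arp on Linux. It relies on the kernel's ATF_COM flag
(0x2) being unset for an unresolved entry. The function name, the sample
addresses and the sample MAC are made up for illustration; they are not
taken from our servers.

```python
# Sketch only: find unresolved ("incomplete") ARP entries in text that
# follows the /proc/net/arp column layout:
#   IP address, HW type, Flags, HW address, Mask, Device

ATF_COM = 0x02  # kernel flag meaning the entry is complete (resolved)


def incomplete_entries(arp_text, device=None):
    """Return (ip, device) pairs whose Flags field lacks ATF_COM.

    If device is given (e.g. "eth0"), only that interface is reported.
    """
    results = []
    lines = arp_text.strip().splitlines()
    for line in lines[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 6:
            continue  # skip anything that is not a full entry
        ip, _hw_type, flags, _mac, _mask, dev = fields[:6]
        if device is not None and dev != device:
            continue
        if (int(flags, 16) & ATF_COM) == 0:
            results.append((ip, dev))
    return results


# Hypothetical sample: the same address unresolved via eth0 but
# resolved via eth1, which mirrors the symptom described above.
SAMPLE = """\
IP address       HW type     Flags       HW address            Mask     Device
192.0.2.1        0x1         0x0         00:00:00:00:00:00     *        eth0
192.0.2.1        0x1         0x2         00:11:22:33:44:55     *        eth1
"""

for ip, dev in incomplete_entries(SAMPLE):
    print(ip, "is unresolved on", dev)
```

On a live box the same function could be fed the contents of
/proc/net/arp instead of SAMPLE, assuming the kernel exposes that file.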

Any ideas?

Thanks in advance, and for taking the time to read this thread.

Steven
