Machine cannot respond to NAT



We had a weird problem here a few days ago. Please tell me if you have any idea how it could have happened.

publicnetwork---bigswitch---NATfirewall---smallswitch---internalnetwork
                    |
                    |
                    47

We have a machine called '47' attached to bigswitch (its IP is visible from the world).
At some point it stopped responding to pings (and to ssh, http, everything) from computers inside internalnetwork.
However, it was still responding to pings from computers outside the NAT.

We restarted all switches and the NAT many times, then we changed ethernet sockets in the bigswitch... No good.

I looked at the routing table of 47: there were a few extra entries (a few destination IPs routed to loopback) that blacklist some IPs from which we were receiving SSH attacks. This is a dynamic filter we have installed. However, I removed those entries manually from the routing table and the problem persisted.
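
To make the routing-table check above repeatable, here is a minimal sketch that lists every destination routed via loopback. It assumes 47 runs Linux on a little-endian machine (e.g. x86), where /proc/net/route prints addresses as little-endian hex, and that the blacklist entries show up with "lo" as their interface. The function names hex_to_ip and loopback_routes are made up for illustration.

```python
import socket
import struct


def hex_to_ip(hex_le):
    """Convert a little-endian hex address field (as printed in
    /proc/net/route on x86) to dotted-quad notation."""
    return socket.inet_ntoa(struct.pack("<L", int(hex_le, 16)))


def loopback_routes(route_text):
    """Return the destination IPs of all routes whose interface is 'lo'.

    route_text is the full content of /proc/net/route (assumed format:
    one header line, then whitespace-separated columns starting with
    Iface and Destination).
    """
    found = []
    for line in route_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 8:
            continue  # ignore blank or truncated lines
        iface, dest = fields[0], fields[1]
        if iface == "lo":
            found.append(hex_to_ip(dest))
    return found
```

On the machine itself this could be called as loopback_routes(open("/proc/net/route").read()). If the public address of the NAT ever appears in the result, replies to internalnetwork machines would be blackholed while the rest of the world still works.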

I looked at the iptables rules: the tables were empty, as they were supposed to be.

Then I started Wireshark on 47 and could see the ping requests coming in from the internalnetwork machines, as well as the ping replies to them going out towards the NAT.
So the replies were actually generated, but somehow they were not reaching internalnetwork.

I didn't know what to do anymore, so I restarted the 47.
To my surprise the pings started working again!!

I immediately checked the route table and the iptables tables: they were exactly like before the reboot.

Unfortunately I forgot to look at the arp cache of 47 before and after the restart.
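
For the next time this happens, a minimal sketch of how the ARP cache could be saved and compared before and after a reboot. It assumes a Linux machine that exposes the cache as /proc/net/arp (one header line, then the columns IP address, HW type, Flags, HW address, Mask, Device). The function names parse_arp and diff_arp are made up for illustration.

```python
def parse_arp(text):
    """Map each IP to (MAC, device) from /proc/net/arp-style text."""
    table = {}
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore blank or truncated lines
        ip, mac, dev = fields[0], fields[3], fields[5]
        table[ip] = (mac.lower(), dev)
    return table


def diff_arp(before, after):
    """Return {ip: (old_entry, new_entry)} for every entry that differs.

    An entry that exists in only one snapshot is reported with None
    on the other side.
    """
    changes = {}
    for ip in sorted(set(before) | set(after)):
        old, new = before.get(ip), after.get(ip)
        if old != new:
            changes[ip] = (old, new)
    return changes
```

The idea would be to save the content of /proc/net/arp to a file while the problem is present, save it again after the reboot, and pass both parsed snapshots to diff_arp. A different MAC address for the NAT's IP between the two snapshots would point at a stale or wrong ARP entry.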

Any idea of what could have happened!?!?

Thanks in advance