Re: dhcprelay troubleshooting, where next?

From: prg (rdgentry1_at_cablelynx.com)
Date: 05/25/05

    Date: 24 May 2005 18:48:20 -0700
    
    

    Andy Richardson wrote:
    > Hi folks,
    > after 4 days on this stupidly simple problem, I finally admit defeat.
    >
    > 1. (ted)192.168.2.102 is my dhcpd server.
    > It quite happily supplies addresses to everything on that subnet. It
    > also has a 192.168.1.0 subnet clause in the dhcpd.conf .
    >
    > 2. (mrsdoyle)The gateway 192.168.2.254(eth0): 192.168.1.254 (eth1) has
    > dhcrelay running.
    >
    > 3. The box(jack) I'm trying to assign, successfully collects an address
    > when put on the 192.168.2.0 subnet, on the 192.168.1.0 subnet.

    Thanks for the correction ;)
    So, you've successfully associated jack's MAC with a 192.168.2.z IP
    address. The server has noted this in its dhcpd.leases file. Ted will
    not be a happy camper seeing this MAC trying to get an address on
    another subnet (most likely).

    > My troubleshooting results so far:
    > A. /var/log/dhcpcd.log on jack, says that it times out waiting for a
    > valid server response.

    But what response(s), if any, does it get? Have you sniffed the wire on jack?

    > B. On mrsdoyle, tcpdump -i eth1 shows
    > 0.0.0.0.bootpc > 255.255.255.255.bootps etc.

    The "etc." parts may be useful -- or maybe not. DHCP can be tricky to
    diagnose and I almost always inspect the entire packet sequence/exchange
    _with_ payload. Ethereal is very handy if available.
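
    If Ethereal isn't available, the fixed part of the payload is easy enough
    to decode by hand. Below is a minimal sketch, assuming you already have
    the raw UDP payload of a captured packet as bytes. The field layout is
    from RFC 951/RFC 2131; the function name and the sample packet are made up
    for illustration. The field to watch is giaddr: in a relayed request it
    should hold the relay agent's address on the client side (192.168.1.254
    here), since that is what the server uses to pick the subnet.

```python
import socket
import struct

# Fixed BOOTP/DHCP header fields up to and including chaddr (RFC 951 / RFC 2131).
BOOTP_HEADER = struct.Struct("!BBBBIHH4s4s4s4s16s")


def parse_bootp_header(payload: bytes) -> dict:
    """Decode the first 44 bytes of a BOOTP/DHCP UDP payload."""
    if len(payload) < BOOTP_HEADER.size:
        raise ValueError("payload too short for a BOOTP header")
    (op, htype, hlen, hops, xid, secs, flags,
     ciaddr, yiaddr, siaddr, giaddr, chaddr) = BOOTP_HEADER.unpack_from(payload)
    return {
        "op": {1: "BOOTREQUEST", 2: "BOOTREPLY"}.get(op, op),
        "hops": hops,                        # incremented by each relay agent
        "xid": hex(xid),                     # ties requests to replies
        "broadcast": bool(flags & 0x8000),   # client wants a broadcast reply
        "ciaddr": socket.inet_ntoa(ciaddr),  # client's current address, if any
        "yiaddr": socket.inet_ntoa(yiaddr),  # address offered by the server
        "siaddr": socket.inet_ntoa(siaddr),  # next server
        "giaddr": socket.inet_ntoa(giaddr),  # relay agent address
        "chaddr": ":".join(f"{b:02x}" for b in chaddr[:hlen]),
    }


# Hypothetical relayed request, built by hand only to exercise the parser.
sample = BOOTP_HEADER.pack(
    1, 1, 6, 1, 0x12345678, 0, 0x8000,
    socket.inet_aton("0.0.0.0"), socket.inet_aton("0.0.0.0"),
    socket.inet_aton("0.0.0.0"), socket.inet_aton("192.168.1.254"),
    bytes.fromhex("001122334455") + bytes(10),
)
info = parse_bootp_header(sample)
print(info)
```

    A giaddr of 0.0.0.0 in what ted receives would mean the request did not
    come through the relay at all.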

    I assume you have Windows boxes on the network.

    > arp who-has 192.168.2.102 tell 192.168.1.254
    > This tells me that the broadcast has made it to the gateway.
    > (but why is arp asking across subnets?)

    (several reasons these might show up, not least that 192.168.1.254 is
    the IP of the relay agent.)

    > C. On mrsdoyle, output from dhcrelay -d 192.128.2.102 is
    > Listening on Socket/eth1
    > Sending on Socket/eth1
    > Listening on Socket/eth0
    > Sending on Socket/eth0
    > forwarded BOOTREQUEST for (jack mac address) to 192.168.2.102
    > This tells me that the gateways has received the broadcast from jack
    > and is sending it to ted(the dhcp server)
    > cat /proc/sys/net/ipv4/ip_forward shows returns "1"

    Hunky-dory.

    > D. tcpdump -i eth0 on mrsdoyle shows
    > arp who-has 192.168.2.102 tell 192.168.2.254
    > arp reply 192.168.2.102 is-at (ted's MAC address)
    > arp who-has 192.168.2.254 tell 192.168.2.102
    > arp reply 192.168.2.102 is-at (mrsdoyle eth0's MAC address)
    > plus lots of inaddr arp domain unreachable and PTR? stuff (I haven't
    > played with DNS yet - that's next)
    > NOTE: no mention of bootps or c.
    > This tells me that there is either ip_forwarding or a route table problem.

    mrsdoyle is not relaying on eth0. Is there any traffic at all from the
    relay agent to the server? Which options are you using with dhcrelay?
    Especially the -m option: is it set to forward?

    > If I give jack a static address I can ping across the gateway so
    > ip_forwarding is working.
    > The route table does include
    > route add 255.255.255.255 dev eth1
    > (in the same way that I need to do that to get the dhcpd server working.)

    I don't think ip_forward or route table is your problem. What about a
    firewall and UDP port 67?

    > I give up , I need help. Where next?
    > There is very little info on dhcrelay.
    > It is mostly mentioned as an afterthought in lots of howtos and
    > generally gives the impression that you just start it up and "Bob's your
    > aunties live-in lover", it all works.
    >
    > I'm sure it is embarrassingly simple so I'm typing quietly, but I just
    > hope someone can help.

    Viewing the _entire_ packet exchange and payload can often reveal the
    problem.
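
    Most of what matters in the payload sits in the options area after the
    fixed header. A rough sketch of pulling the message type (option 53) out
    of a raw payload follows, assuming the capture holds the full packet and
    was not truncated by the sniffer. The option encoding and type numbers are
    from RFC 2131/2132; the function names and the sample bytes are
    hypothetical.

```python
MAGIC_COOKIE = bytes([99, 130, 83, 99])
OPTIONS_OFFSET = 236  # fixed header, including sname (64) and file (128)

MESSAGE_TYPES = {
    1: "DHCPDISCOVER", 2: "DHCPOFFER", 3: "DHCPREQUEST", 4: "DHCPDECLINE",
    5: "DHCPACK", 6: "DHCPNAK", 7: "DHCPRELEASE", 8: "DHCPINFORM",
}


def parse_options(payload: bytes) -> dict:
    """Return {option code: raw value bytes} from a DHCP UDP payload."""
    if payload[OPTIONS_OFFSET:OPTIONS_OFFSET + 4] != MAGIC_COOKIE:
        raise ValueError("no DHCP magic cookie: plain BOOTP or truncated capture")
    options = {}
    i = OPTIONS_OFFSET + 4
    while i < len(payload):
        code = payload[i]
        if code == 255:          # end option
            break
        if code == 0:            # pad option, single byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break                # truncated option, stop quietly
        length = payload[i + 1]
        options[code] = payload[i + 2:i + 2 + length]
        i += 2 + length
    return options


def message_type(options: dict):
    """Translate option 53 into its DHCP message name, or None if absent."""
    raw = options.get(53)
    if not raw:
        return None
    return MESSAGE_TYPES.get(raw[0], f"unknown({raw[0]})")


# Hypothetical reply: empty fixed header, then options 53 (NAK) and 54 (server id).
sample_reply = (bytes(OPTIONS_OFFSET) + MAGIC_COOKIE
                + bytes([53, 1, 6])
                + bytes([54, 4, 192, 168, 2, 102])
                + bytes([255]))
opts = parse_options(sample_reply)
print(message_type(opts), ".".join(str(b) for b in opts[54]))
```

    A DHCPNAK coming back from ted, for instance, would point at the lease
    problem rather than at the relay.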

    When moving a box from one subnet to another, it's best to erase the
    client's leases file and the entry for that client from the server's
    dhcpd.leases file. There are ways to allow/correct for this, but
    usually only _after_ confirming correct simple behavior.
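
    To see which entries need to go, something like the sketch below can list
    the leases recorded for one MAC. It assumes the usual ISC dhcpd.leases
    layout ("lease <ip> { ... hardware ethernet <mac>; ... }"); the sample
    text, the MAC addresses and the function name are made up, and the real
    file location varies by distribution. Stop dhcpd before editing the file.

```python
import re

# One "lease <ip> { ... }" block, with the closing brace at the start of a line.
LEASE_BLOCK = re.compile(r"^lease\s+(\S+)\s*\{(.*?)^\}", re.MULTILINE | re.DOTALL)
HW_ADDR = re.compile(r"hardware\s+ethernet\s+([0-9A-Fa-f:]+)\s*;")


def leases_for_mac(leases_text: str, mac: str) -> list:
    """Return the IPs of every lease block recorded for the given MAC."""
    wanted = mac.lower()
    found = []
    for block in LEASE_BLOCK.finditer(leases_text):
        hw = HW_ADDR.search(block.group(2))
        if hw and hw.group(1).lower() == wanted:
            found.append(block.group(1))
    return found


# Hypothetical excerpt standing in for ted's dhcpd.leases.
sample_leases = """
lease 192.168.2.50 {
  starts 2 2005/05/24 10:00:00;
  ends 2 2005/05/24 22:00:00;
  hardware ethernet 00:11:22:33:44:55;
}
lease 192.168.2.51 {
  starts 2 2005/05/24 11:00:00;
  ends 2 2005/05/24 23:00:00;
  hardware ethernet 00:aa:bb:cc:dd:ee;
}
"""
print(leases_for_mac(sample_leases, "00:11:22:33:44:55"))
```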

    Then there is the problem of arp caches that have jack's MAC/old IP
    entries that do/will conflict with jack's MAC/new IP assignment.
    Windows makes cleaning this up a pain sometimes. Try removing the
    faulty arp entries from ted and mrsdoyle at least.

    You may have to actually look at the RFCs to get good info from the
    captured packet exchanges:

    http://www.faqs.org/rfcs/rfc951.html
    http://www.faqs.org/rfcs/rfc2131.html
    http://www.faqs.org/rfcs/rfc2132.html

    Since pinging and other goodies seem to work OK, I would look for a
    firewall/port problem and sniff the wire at _every_ nic along the
    pathway.

    hth,
    prg
    email above disabled

