DHCP relay issue after upgrade to SFOS 17.0.1 MR-1

We have been running our XG 210 on version 16 for a few months. Our network has 3 VLANs. DHCP for a couple of the VLANs is set up as a relay to a Windows domain controller, and another VLAN uses the XG Firewall itself as the DHCP server. Everything was working like a charm until the firmware upgrade.

Yesterday I upgraded the firmware to v17 and then MR-1. Since the upgrade, devices on the DHCP relay VLANs are not being allocated IP addresses. I can see a DHCP lease being created on the Windows domain controller, but the IP is not allocated to the device.

I see posts where others had similar issues with previous versions, but I am not sure whether any of those apply to my scenario. Any help in resolving this is much appreciated.

- Kamal

 

Update 01

As mentioned in other posts with similar issues, I ran the following command and can see an entry for a MAC address matching one of the devices.

console> drop-packet-capture

Date=2017-11-28 Time=10:16:08 log_id=0103021 log_type=Firewall log_component=Local_ACLs log_subtype=Denied log_status=N/A log_priority=Alert duration=N/A in_dev=Port1.300 out_dev= inzone_id=10 outzone_id=4 source_mac=38:a4:ed:67:41:25 dest_mac=ff:ff:ff:ff:ff:ff l3_protocol=IP source_ip=0.0.0.0 dest_ip=255.255.255.255 l4_protocol=UDP source_port=68 dest_port=67 fw_rule_id=0 policytype=0 live_userid=0 userid=0 user_gp=0 ips_id=0 sslvpn_id=0 web_filter_id=0 hotspot_id=0 hotspotuser_id=0 hb_src=0 hb_dst=0 dnat_done=0 proxy_flags=0 icap_id=0 app_filter_id=0 app_category_id=0 app_id=0 category_id=0 bandwidth_id=0 up_classid=0 dn_classid=0 source_nat_id=0 cluster_node=0 inmark=0x0 nfqueue=0 scanflags=0 gateway_offset=0 max_session_bytes=0 drop_fix=0 ctflags=0 connid=863434432 masterid=0 status=256 state=0 sent_pkts=N/A recv_pkts=N/A sent_bytes=N/A recv_bytes=N/A tran_src_ip=N/A tran_src_port=N/A tran_dst_ip=N/A tran_dst_port=N/A
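
For anyone reading a line like the one above: it is a flat list of key=value pairs, and the fields that matter for this problem are log_component, log_subtype, in_dev and the UDP ports (68 to 67 is a client DHCP request). Below is a minimal Python sketch for picking those out. The function names are made up for illustration, and the sample is a shortened copy of the line above.

```python
def parse_drop_log(line):
    """Split a space-separated key=value log line into a dict."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # skip anything that is not key=value
            fields[key] = value
    return fields


def is_dropped_dhcp_request(fields):
    """True if the entry is a denied client-to-server DHCP packet (UDP 68 -> 67)."""
    return (
        fields.get("log_subtype") == "Denied"
        and fields.get("l4_protocol") == "UDP"
        and fields.get("source_port") == "68"
        and fields.get("dest_port") == "67"
    )


# Shortened copy of the log line above (fields not needed here are left out).
sample = (
    "Date=2017-11-28 Time=10:16:08 log_component=Local_ACLs log_subtype=Denied "
    "in_dev=Port1.300 out_dev= source_mac=38:a4:ed:67:41:25 "
    "source_ip=0.0.0.0 dest_ip=255.255.255.255 l4_protocol=UDP "
    "source_port=68 dest_port=67"
)

fields = parse_drop_log(sample)
if is_dropped_dhcp_request(fields):
    print(fields["in_dev"], fields["source_mac"], fields["log_component"])
```

In this entry the broadcast DHCP request arriving on Port1.300 is logged as Denied under Local_ACLs.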

  • Hi,

    have you figured out a solution for this problem yet?

    I've got pretty much the same issue here, except that only the clients reside in a VLAN, whereas the DHCP server is in the untagged LAN. There's one relay for the untagged LAN port pointing to one server (which presumably works, although I cannot verify that it's routed through the relay, as the DHCP server is directly reachable) and another one for the VLAN port, which forwards to another server in the LAN zone.

    Looking at the logs of the second server, which is responsible for serving clients from the VLAN only, the DISCOVER is forwarded and an OFFER is generated and sent back to the relay. However, neither an ACK nor a NAK message is to be found afterwards. Only if I run another DHCP server directly on the VLAN are the NAKs forwarded, as the clients already get an IP from another server.

    Sincerely,

    Jonas

  • In reply to Jonas Stöhr:

    Hi Jonas,

    Not much luck with the issue yet. I have raised a ticket with Sophos support to help out with troubleshooting.

     

    -Kamal

  • In reply to Kamalakar Nellipudi:

    Hi all, I also have this. Support confirmed it as a known issue (BUG ID NC-20755). The workaround (WA) provided was to reconfigure the DHCP Relay option on every VLAN and restart the XG. I am waiting to check whether the WA works. It does not make any sense to me to release MR-1 knowing about this kind of problem in advance.
  • In reply to Kamalakar Nellipudi:

    We gave up; perhaps it was too early to try it. We got our Sophos partner to roll back the firmware to SFOS 16.05.8 MR-8 and everything works like a charm again :(

    - Kamal

  • In reply to PeterRL:

    Thanks Peter. Unfortunately, as our network was seriously crippled for a week, we decided to roll back to 16 MR-8. I will keep the workaround in mind in case we hit this issue when we upgrade again.

     

    Regards

    Kamal 

  • I was running DHCP Relay in version 16 with no issues. I upgraded to version 17 and DHCP Relay stopped working. After reviewing my config, I noticed that DHCP Relay was enabled on the interface my DHCP server was connected to. After I removed the DHCP Relay from the network interface my DHCP server was on, all the other networks' DHCP Relays to the DHCP server started working again. Since the DHCP server is in that network, it does not need a DHCP Relay agent on that interface. I tested several times to confirm this is the case.

     

    Example:

    If the DHCP server is 192.168.10.10 on network 192.168.10.1/24, adding a DHCP Relay on this network for the server will cause all DHCP Relays to the server to stop working.
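
    The check described above can be sketched in a few lines of Python: list the interfaces that have a relay configured and flag any whose subnet already contains the DHCP server. This is only an illustration. The interface names and the two extra VLAN subnets are hypothetical, and only 192.168.10.10 and 192.168.10.1/24 come from the example.

```python
import ipaddress


def relay_on_server_subnet(server_ip, relay_interfaces):
    """Return names of relay-enabled interfaces whose subnet contains the DHCP server."""
    server = ipaddress.ip_address(server_ip)
    return [
        name
        for name, cidr in relay_interfaces.items()
        if server in ipaddress.ip_interface(cidr).network
    ]


# Interfaces with a DHCP relay configured, as "interface address/prefix".
relay_interfaces = {
    "Port1": "192.168.10.1/24",     # server's own network (name is hypothetical)
    "Port1.20": "192.168.20.1/24",  # hypothetical client VLAN
    "Port1.30": "192.168.30.1/24",  # hypothetical client VLAN
}

conflicts = relay_on_server_subnet("192.168.10.10", relay_interfaces)
print(conflicts)  # relays to remove, per the description above
```

    Any interface reported here is one where the server can answer clients directly, so a relay on it is unnecessary.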

     

    I hope this helps anyone in the same situation.

     

    Alan 

  • In reply to Alan Tattersfield:

    Thank you Alan. My setup was as you described. I upgraded to SFOS 17.0.6 MR-6 and removed the DHCP relay on the subnet where the DHCP server is located, and everything is working perfectly so far.

     

    Regards

    Kamal

  • In reply to Kamalakar Nellipudi:

    Exactly the same thing happened to me. This week I decided to update the firewall from SFOS 16.05.8 MR-8 to SFOS 17.0.8 MR-8. My network is segmented into VLANs and everything worked fine until the update, after which half of the computers got on the network and the other half did not (the Windows DHCP server did not hand out IP addresses). I tried what Alan suggests (removing the DHCP relay from the subnet where I have the DHCP server), but it did not work. I have an open ticket with support, but in the end I decided to boot the firewall with the SFOS 16.05.8 MR-8 image and everything has returned to normal.

  • In reply to Jose ManuelGonzalez:

    Hi Jose,

    Could you try running the following command from the shell?

    Go to Option 5>3

    # dhcrelay -i Port1.10 <DHCP Relay> -4 -d     (for VLAN 10 on Port1)

  • In reply to Aditya Patel:

     

    Hi Aditya Patel, 

    Note that my network has several VLANs (VLAN 37-45). The DHCP server is located on VLAN 37.

    I ran the command knowing that my DHCP server is on Lag0.37, and I get the following:

    XG310_WP01_SFOS 16.05.8 MR-8# dhcrelay -i Lag0.37 192.168.37.254 -4 -d
    Internet Systems Consortium DHCP Relay Agent 4.2.4-P2
    Copyright 2004-2012 Internet Systems Consortium.
    All rights reserved.
    For info, please visit https://www.isc.org/software/dhcp/
    INFO: recv_netlink_req: 3680 bytes received
    INFO: recv_netlink_req: 3764 bytes received
    INFO: recv_netlink_req: 2536 bytes received
    INFO: recv_netlink_req: 2616 bytes received
    INFO: recv_netlink_req: 3768 bytes received
    INFO: recv_netlink_req: 3572 bytes received
    INFO: recv_netlink_req: 2572 bytes received
    INFO: recv_netlink_req: 2740 bytes received
    INFO: recv_netlink_req: 2576 bytes received
    INFO: recv_netlink_req: 2576 bytes received
    INFO: recv_netlink_req: 2576 bytes received
    INFO: recv_netlink_req: 2576 bytes received
    INFO: recv_netlink_req: 20 bytes received
    NLMSG_DONE received
    Number of interfaces: <1>
    check_relay_port: index '20' found for Lag0.37
    INFO: recv_netlink_req: 96 bytes received
    find_route: index '20' found for 192.168.37.254
    is_relay_ok: Lag0.37 is out interface for 192.168.37.254, aborting registration
    Listening on LPF/Lag0.37/00:1a:8c:5c:c2:f4
    Sending on   LPF/Lag0.37/00:1a:8c:5c:c2:f4
    Sending on   Socket/fallback
    5 bad udp checksums in 5 packets
    Packet to bogus giaddr 192.168.38.10.

    Thanks,
  • In reply to Jose ManuelGonzalez:

    Hi Jose,

    Could you run the following commands from the shell (option 5>3) and post the output of each?

    tcpdump -nei any port 67 or 68

    Output:

    tail -f /log/networkd.log

    Output:

    ip route get <IP address of relay>

    Output:

  • In reply to Aditya Patel:

    Good morning. Right now the firewall is running XG310_WP01_SFOS 16.05.8 MR-8. To perform the test I would need to run version 17, and with the firewall in production that is almost impossible. I will wait for version 17.1 to be released and, if it still fails, I will be able to run the capture. Thank you very much for your help. Regards