

DHCP relay issue after upgrade to SFOS 17.0.1 MR-1

We have been running our XG 210 on version 16 for a few months now. Our network configuration has 3 VLANs. DHCP for a couple of the VLANs is set up as a relay to a Windows domain controller, and for another VLAN the XG Firewall itself is the DHCP server. Everything was working like a charm until the firmware upgrade.

Yesterday I upgraded the firmware to v17 and then to MR-1. Since the upgrade, the devices on the DHCP relay VLANs are not allocated IP addresses. I see that a DHCP lease is created on the Windows domain controller, but the IP is not allocated to the device.

I see posts where others had similar issues with previous versions, but I am not sure whether any of those apply to my scenario. Any help in resolving this is much appreciated.

- Kamal

 

Update 01

As mentioned in other posts with similar issues, I ran the following command and can see an entry for a MAC address matching one of the devices.

console> drop-packet-capture

Date=2017-11-28 Time=10:16:08 log_id=0103021 log_type=Firewall log_component=Local_ACLs log_subtype=Denied log_status=N/A log_priority=Alert duration=N/A in_dev=Port1.300 out_dev= inzone_id=10 outzone_id=4 source_mac=38:a4:ed:67:41:25 dest_mac=ff:ff:ff:ff:ff:ff l3_protocol=IP source_ip=0.0.0.0 dest_ip=255.255.255.255 l4_protocol=UDP source_port=68 dest_port=67 fw_rule_id=0 policytype=0 live_userid=0 userid=0 user_gp=0 ips_id=0 sslvpn_id=0 web_filter_id=0 hotspot_id=0 hotspotuser_id=0 hb_src=0 hb_dst=0 dnat_done=0 proxy_flags=0 icap_id=0 app_filter_id=0 app_category_id=0 app_id=0 category_id=0 bandwidth_id=0 up_classid=0 dn_classid=0 source_nat_id=0 cluster_node=0 inmark=0x0 nfqueue=0 scanflags=0 gateway_offset=0 max_session_bytes=0 drop_fix=0 ctflags=0 connid=863434432 masterid=0 status=256 state=0 sent_pkts=N/A recv_pkts=N/A sent_bytes=N/A recv_bytes=N/A tran_src_ip=N/A tran_src_port=N/A tran_dst_ip=N/A tran_dst_port=N/A
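
The drop log line above is long, so here is a minimal sketch for pulling out the fields that matter for DHCP. It assumes a POSIX shell with `tr` and `awk` on whatever machine you paste the log into; `get_field` is a helper name made up for this example, not a Sophos command, and the sample line is shortened to a few fields copied from the output above.

```shell
# One drop-packet-capture line, shortened to the fields of interest
# (values copied from the output above).
line='Date=2017-11-28 Time=10:16:08 log_component=Local_ACLs log_subtype=Denied in_dev=Port1.300 out_dev= source_mac=38:a4:ed:67:41:25 source_ip=0.0.0.0 dest_ip=255.255.255.255 l4_protocol=UDP source_port=68 dest_port=67 fw_rule_id=0'

# get_field LINE KEY (hypothetical helper): print the value of KEY from a
# space-separated key=value line. Prints an empty line if the value is
# empty or the key is missing.
get_field() {
    printf '%s\n' "$1" | tr ' ' '\n' |
        awk -F= -v key="$2" '$1 == key { print $2; found = 1; exit }
                             END { if (!found) print "" }'
}

# Show only the fields that identify the packet and why it was dropped.
for key in log_component log_subtype in_dev source_port dest_port; do
    printf '%s=%s\n' "$key" "$(get_field "$line" "$key")"
done
```

Read this way, the entry is a client broadcast from UDP port 68 to port 67 (a DHCP request from a device without an address yet) arriving on Port1.300 and logged as Denied by Local_ACLs.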



  • Hi Jose,

    Could you run the following commands in the shell (option 5 > 3) and post the output of each?

    tcpdump -nei any port 67 or 68

    Output:

    tail -f /log/networkd.log

    Output:

    ip route get <IP address of relay>

    Output:

    Regards,

    Aditya Patel
    Global Escalation Support Engineer | Sophos Technical Support


  • Good morning. Right now the firewall is running XG310_WP01_SFOS 16.05.8 MR-8. To perform the test I would need version 17, and since the firewall is in production that is almost impossible. I will wait for version 17.1 to be released, and if it still fails I will be able to post the capture. Thank you very much for your help. Regards

  • Hi there,

     

    I am also having such issues.

    DHCP relay via a physical interface without VLAN works perfectly.

    If I am using DHCP via VLAN it either takes 5 - 10 seconds until a device gets the IP lease back or I do not get any lease.

     

    I am using the newest version of XG -> SFOS 17.1.1 MR-1

     

    Looking forward to a fix.

  •  
    Hello Mathias Mühlbacher, sorry for the delay, I have been on vacation. Here is where things stand:
    1.- I am waiting to update the firewall. Currently the graphical interface does not offer the option of downloading the latest update (I have SFOS 16.05.8 MR-8 now).
    2.- Have you found any solution for the DHCP issue? I have an open ticket, and the last option offered is to update and go through the logs that DHCP produces. Obviously I have refused to do that in a production environment; I do not consider it a serious proposal. Thank you
     
    Greetings,
  • Hello Jose Manuel Gonzalez,

     

    No. As it was not vital for me, I have deactivated the DHCP relay function on the Sophos.

    But I could do some tests on the newest Sophos XG version and get back to you.

     

    Best regards,

    Mathias

  • Hi there,
    I have tested the DHCP relay function with my newest XG version (SFOS 17.0.8 MR-8).
    The result is that DHCP relay over LAN works, while DHCP relay over VLAN does not.
    The log on my Linux server shows DHCP offers and so on, but the tested device still does not get an IP address.
    Best regards,
    Mathias
  • Hello Jonas, did support find a solution to the problem? I still have the same problem today, even after updating to the latest version (SFOS 17.1.2 MR2). Support in Spain tells me that it could be a problem with the switches. For reference, if I move a switch port into the same VLAN that my DHCP server belongs to, the client obtains an IP address. I hope you can help me, since I am a little desperate. Thanks to everyone.

  • Hello everyone, in the end the DHCP relay problem was solved thanks to Alan Tattersfield's solution. Thank you very much, Alan, for the pointer you gave us. As you explained, it is entirely logical that there should not be a DHCP relay on the same VLAN where the DHCP server resides. We removed it and clients began to acquire IP addresses. I hope your idea helps other users in the same situation. What really surprises me is that it worked all this time and only stopped working after the update, when it should never have worked in the first place. Thank you
  • Hi Everyone

    We are currently deploying 70 XG125 firewalls and have run into the same issue.

    Our setup is:

    Port1 (native): Client VL1

    Port1.10 Client VL2

    Port1.20 Client VL3

     

    Port9 ("unused with dummy IP")

    Port9.110 Uplink / Default Gateway

     

    DHCP relay is set up on Port1, Port1.10 and Port1.20, and relays to a server that is reachable over Port9.110 behind the default gateway.

    We have noticed that after a reload of the XG, DHCP relay does not work properly anymore.

    tcpdump showed us:

    Port1: Receives Discover

    Port9.110: relays Discover

    Port9.110: receives offer

    Port1: does not forward offer

     

    We removed VL10 and VL20, so that only Port1 and Port9.110 are involved; it still does not work.

    During the tests we noticed that if we re-apply the DHCP relay config in the GUI, everything works fine.

    From the log files we can see that just after the reload, the "is_relay_ok" function fails.

    If I had to guess, I would say that after the reload the XG checks the relay, but maybe because Port9.110 is not up at that time, the relay servers are marked dead and there is no periodic check to mark them alive again?

     

  • This issue is now finally fixed with Version 18 with the introduction of this feature:

    "DHCP Relay Enhancements for Dynamic Routing: Synchronizes dynamic routing updates (learned routes from OSPF) to DHCP relay, eliminating the need for manual reconfiguration."

     

    The problem in my case was that the DHCP server is behind the WAN interface (MPLS) and is reachable only once the default route is installed in the routing table. Unfortunately, during boot the DHCP relay function checks whether the DHCP server is reachable, but since the default route is not installed at that moment, the check fails. This check runs only once and marks the DHCP server dead.

     

     

    A workaround for versions before V18, to force the DHCP relay function to keep checking for the DHCP server:

     


    1. Mount the filesystem as read/write:

      mount -no remount,rw /

    2. Edit the startup scripts:

      vi /scripts/system/clientpref/customization_application_startup.sh

    3. Use vi to add the following line and save the file:

      nservice networkd:dhcprelay_up -dsnosync

    4. Write protect the filesystem again:

      mount -no remount,ro /

     

    To confirm the change has been applied, run cat on the startup script; the output should now look like the following:


    # cat /scripts/system/clientpref/customization_application_startup.sh
    #!/bin/sh
    nservice networkd:dhcprelay_up -dsnosync

    exit 0;

     

     

    You need to reapply the workaround after an upgrade within the 17.x train.

    The bug is fixed with V18.
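
Since the workaround has to be reapplied after each 17.x upgrade, the manual vi edit could be wrapped in a small re-runnable script. The following is only a sketch under assumptions: it assumes the appliance shell provides `grep` and `awk`, and `add_relay_recheck` is a function name made up for this example. The `nservice` line and the file path are copied from the steps above; nothing here is an official Sophos procedure.

```shell
#!/bin/sh
# Workaround line, copied verbatim from the steps above.
RELAY_LINE='nservice networkd:dhcprelay_up -dsnosync'

# add_relay_recheck FILE (hypothetical helper): insert RELAY_LINE before
# the first "exit 0" of FILE, unless the line is already present.
add_relay_recheck() {
    file=$1
    # Already applied: do nothing, so the script is safe to re-run.
    if grep -qxF "$RELAY_LINE" "$file"; then
        return 0
    fi
    tmp="$file.tmp.$$"
    # Print the workaround line just before the first "exit 0"; if the
    # file has no "exit 0", append the line at the end instead.
    awk -v line="$RELAY_LINE" '
        !inserted && /^exit 0/ { print line; inserted = 1 }
        { print }
        END { if (!inserted) print line }
    ' "$file" > "$tmp" && cat "$tmp" > "$file" && rm -f "$tmp"
}

# Intended use on the firewall (left commented out here; the remount
# commands and the path are the ones from the manual steps):
#   mount -no remount,rw /
#   add_relay_recheck /scripts/system/clientpref/customization_application_startup.sh
#   mount -no remount,ro /
```

Writing the result back with `cat "$tmp" > "$file"` rather than moving the temporary file over the original keeps the startup script's existing permissions, and the `grep` guard means running it twice does not add the line twice.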