DHCP relay issue after upgrade to SFOS 17.0.1 MR-1

We have been running our XG 210 on version 16 for a few months. Our network configuration has 3 VLANs. DHCP for a couple of the VLANs is set up as a relay to a Windows domain controller, and for the other VLAN the XG Firewall itself acts as the DHCP server. Everything was working like a charm until the firmware upgrade.

Yesterday I upgraded the firmware to v17 and then to MR-1. Since the upgrade, devices on the DHCP relay VLANs are not allocated an IP address. I see that a DHCP lease is created on the Windows domain controller, but the IP is not allocated to the device.

I see posts where others had similar issues with previous versions, but I am not sure whether any of those apply to my scenario. Any help in resolving this is much appreciated.

- Kamal

Update 01

As mentioned in other posts with similar issues, I ran the following command and can see an entry for a MAC address matching one of the devices.

console> drop-packet-capture

Date=2017-11-28 Time=10:16:08 log_id=0103021 log_type=Firewall log_component=Local_ACLs log_subtype=Denied log_status=N/A log_priority=Alert duration=N/A in_dev=Port1.300 out_dev= inzone_id=10 outzone_id=4 source_mac=38:a4:ed:67:41:25 dest_mac=ff:ff:ff:ff:ff:ff l3_protocol=IP source_ip=0.0.0.0 dest_ip=255.255.255.255 l4_protocol=UDP source_port=68 dest_port=67 fw_rule_id=0 policytype=0 live_userid=0 userid=0 user_gp=0 ips_id=0 sslvpn_id=0 web_filter_id=0 hotspot_id=0 hotspotuser_id=0 hb_src=0 hb_dst=0 dnat_done=0 proxy_flags=0 icap_id=0 app_filter_id=0 app_category_id=0 app_id=0 category_id=0 bandwidth_id=0 up_classid=0 dn_classid=0 source_nat_id=0 cluster_node=0 inmark=0x0 nfqueue=0 scanflags=0 gateway_offset=0 max_session_bytes=0 drop_fix=0 ctflags=0 connid=863434432 masterid=0 status=256 state=0 sent_pkts=N/A recv_pkts=N/A sent_bytes=N/A recv_bytes=N/A tran_src_ip=N/A tran_src_port=N/A tran_dst_ip=N/A tran_dst_port=N/A
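
For anyone who wants to filter these drop logs offline, here is a minimal Python sketch that splits such a line into key=value fields and checks whether it looks like a DHCP client broadcast (UDP 68 to 67, sent to 255.255.255.255). The function names are hypothetical, and the sample string is a shortened copy of the line above. It assumes values never contain spaces, which holds for this line but may not hold for every log type.

```python
# Sketch only: parse one drop-packet-capture line into a dict and test whether
# it looks like a dropped DHCP client broadcast. Function names are made up.

def parse_drop_line(line: str) -> dict:
    """Split whitespace-separated 'key=value' tokens into a dict.

    Assumes values contain no spaces. Empty values (e.g. 'out_dev=')
    are kept as empty strings.
    """
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that have no '='
            fields[key] = value
    return fields


def is_dhcp_client_broadcast(fields: dict) -> bool:
    """True for UDP source port 68 -> dest port 67 to the limited broadcast."""
    return (
        fields.get("l4_protocol") == "UDP"
        and fields.get("source_port") == "68"
        and fields.get("dest_port") == "67"
        and fields.get("dest_ip") == "255.255.255.255"
    )


# Shortened copy of the log line posted above.
sample = (
    "Date=2017-11-28 Time=10:16:08 log_component=Local_ACLs log_subtype=Denied "
    "in_dev=Port1.300 out_dev= source_mac=38:a4:ed:67:41:25 "
    "source_ip=0.0.0.0 dest_ip=255.255.255.255 l4_protocol=UDP "
    "source_port=68 dest_port=67 fw_rule_id=0"
)

fields = parse_drop_line(sample)
print(fields["in_dev"], fields["log_subtype"], is_dhcp_client_broadcast(fields))
```

Read this way, the entry says that a client broadcast arriving on Port1.300 was denied by the Local_ACLs component with fw_rule_id=0.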


Aditya Patel over 6 years ago in reply to Jose ManuelGonzalez

Hi Jose,

Could you run the following commands in the shell (console menu option 5 > 3) and post the output of each?

tcpdump -nei any port 67 or 68

Output:

tail -f /log/networkd.log

Output

ip route get <IPaddress of relay >

Output:

Regards,

Aditya Patel
Global Escalation Support Engineer | Sophos Technical Support

Jose ManuelGonzalez over 6 years ago in reply to Aditya Patel

Good morning. Right now the firewall is running XG310_WP01_SFOS 16.05.8 MR-8. To perform the test I would need to run version 17, and with the firewall in production that is almost impossible. I will wait for version 17.1 to be released and, if it still fails, I will be able to post the capture. Thank you very much for your help. Regards

Mathias Mühlbacher over 6 years ago in reply to Jose ManuelGonzalez

Hi there,

I am also having such issues.

DHCP relay via a physical interface without a VLAN works perfectly.

When I use DHCP via a VLAN, it either takes 5 to 10 seconds until a device gets its IP lease, or I do not get a lease at all.

I am using the newest version of XG -> SFOS 17.1.1 MR-1

Looking forward to a fix.

Jose ManuelGonzalez over 6 years ago in reply to Mathias Mühlbacher

Hello Mathias Mühlbacher, sorry for the delay, I have been on vacation. To update you:
1.- I am waiting to update the firewall. The graphical interface currently does not offer the latest update for download (I am on SFOS 16.05.8 MR-8).
2.- Have you found any solution for the DHCP problem? I have an open ticket, and the last option offered was to update and then examine the DHCP logs. Obviously I have refused to do that in a production environment; I do not consider it a serious proposal. Thank you

Greetings,

Mathias Mühlbacher over 6 years ago in reply to Jose ManuelGonzalez

Hello Jose Manuel Gonzalez,

No. As it was not so vital for me, I have deactivated the DHCP relay function on the Sophos.

But I could run some tests on the newest Sophos XG version and get back to you.

Best regards,

Mathias

Mathias Mühlbacher over 6 years ago in reply to Mathias Mühlbacher

Hi there,

I have tested the DHCP relay function with my newest XG version (SFOS 17.0.8 MR-8).
The result is that DHCP relay over LAN works, while DHCP relay over VLAN does not.
The log on my Linux server shows DHCP offers and so on, but the tested device still does not get an IP address.

Best regards,
Mathias

Jose ManuelGonzalez over 6 years ago in reply to Jonas Stöhr

Hello Jonas, support found a solution to the problem. I still have the same problem even today, having updated to the latest version (SFOS 17.1.2 MR2). Support in Spain tells me that it may be a problem with the switches. For reference, if I move a switch port into the same VLAN as my DHCP server, the client obtains an IP address. I hope you can help me, since I am a little desperate. Thanks to everyone.

Jose ManuelGonzalez over 6 years ago in reply to Alan Tattersfield

Hello everyone, in the end the DHCP relay problem was solved thanks to Alan Tattersfield's solution. Thank you very much, Alan, for the pointer you gave us. As you explained, it is logical that there should not be a DHCP relay on the same VLAN where the DHCP server resides. Once it was removed, clients began to acquire IP addresses. I hope your idea helps other users in the same situation. What really surprises me is that it had been working all this time and only stopped after the update, when it should never have worked in the first place. Thank you

Samuel Heinrich over 5 years ago in reply to Aditya Patel

Hi Everyone

We are currently deploying 70 XG125 firewalls and have run into the same issue.

Our setup is:

Port1 (native): Client VL1

Port1.10 Client VL2

Port1.20 Client VL3

Port9 ("unused with dummy IP")

Port9.110 Uplink / Default Gateway

DHCP relay is set up on Port1, Port1.10 and Port1.20, and relays to a server that is reachable over Port9.110 behind the default gateway.

We have noticed that after a reload of the XG, DHCP relay no longer works properly.

tcpdump showed us:

Port1: Receives Discover

Port9.110: relays Discover

Port9.110: receives offer

Port1: does not forward offer

We removed VL10 and VL20 so that only Port1 and Port9.110 are involved; it still does not work.

During the tests we noticed that if we re-apply the DHCP relay config in the GUI, everything works fine.

From the log files we can see that just after the reload, the "is_relay_ok" function fails.

If I had to guess, I would say that after the reload the XG checks the relay, but because Port9.110 may not be up at that time, the relay servers are marked dead, and there is no periodic check to mark them alive again.

Samuel Heinrich over 4 years ago in reply to Samuel Heinrich
This issue is now finally fixed in version 18 with the introduction of this feature:

"DHCP Relay Enhancements for Dynamic Routing Synchronizes dynamic routing updates (learned routes from OSPF) to DHCP relay, eliminating the need for manual reconfiguration."

The problem in my case was that the DHCP server is behind the WAN interface (MPLS) and is reachable only once the default route is installed in the routing table. Unfortunately, during boot the DHCP relay function checks whether the DHCP server is reachable, and since the default route is not yet installed at that moment, the check fails. This check runs only once and marks the DHCP server dead.
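
To make the difference between a one-time check and a repeated check concrete, here is a toy Python model of the boot-time behaviour described above. It is purely illustrative: the class and function names are made up and nothing here reflects the actual networkd implementation. Each "tick" stands for one moment at which the relay could test reachability, and the timeline assumes the default route is missing on the first tick and present afterwards.

```python
# Toy model only: names are hypothetical and this is not how networkd is written.

class RelayServerState:
    def __init__(self, recheck: bool):
        self.recheck = recheck  # True = test reachability again on later ticks
        self.alive = False      # whether the DHCP server is considered usable
        self.checked = False    # whether the first check has already run

    def tick(self, route_installed: bool) -> bool:
        """Run one cycle and return whether the relay would forward packets."""
        if not self.checked or self.recheck:
            # Reachability is modelled simply as "a route to the server exists".
            self.alive = route_installed
            self.checked = True
        return self.alive


def simulate(recheck: bool, route_timeline: list) -> list:
    """Return the forwarding state after each tick of the timeline."""
    state = RelayServerState(recheck)
    return [state.tick(route_up) for route_up in route_timeline]


# Default route missing at boot (first tick), installed from the second tick on.
timeline = [False, True, True]
print("one-shot check:", simulate(False, timeline))
print("periodic check:", simulate(True, timeline))
```

In this model the one-shot variant never recovers once the first check fails, while the periodic variant starts forwarding as soon as the route appears. Re-applying the relay config in the GUI corresponds to starting a fresh check after the route is already in place.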

A workaround for versions before V18 that forces the DHCP relay function to keep checking for the DHCP server:

Mount the filesystem as read/write:

mount -no remount,rw /

Edit the startup scripts:

vi /scripts/system/clientpref/customization_application_startup.sh

Use vi to add the following line and save the file:

nservice networkd:dhcprelay_up -dsnosync

Write-protect the filesystem again:

mount -no remount,ro /

To confirm that the changes have been applied, run cat on the startup script; the output should look like the following:

# cat /scripts/system/clientpref/customization_application_startup.sh
#!/bin/sh
nservice networkd:dhcprelay_up -dsnosync

exit 0;

You need to reapply the workaround after an upgrade within the 17.x train.

The bug is fixed with V18.
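
Since the workaround has to be reapplied after 17.x upgrades, it can help to check a saved copy of the startup script for the line. The following is a minimal Python sketch under stated assumptions: the function name is hypothetical, the "stock" script content is a guess at what an unmodified file looks like, and the check runs on text you have copied off the firewall rather than on the appliance itself.

```python
# Sketch only: checks script text for the workaround line quoted in this thread.

WORKAROUND_LINE = "nservice networkd:dhcprelay_up -dsnosync"


def has_workaround(script_text: str) -> bool:
    """True if the workaround is on its own line before the first 'exit'."""
    for raw in script_text.splitlines():
        line = raw.strip()
        if line.startswith("exit"):
            return False  # a line placed after 'exit' would never run
        if line == WORKAROUND_LINE:
            return True
    return False


# Content as shown in the cat output above.
patched = "#!/bin/sh\nnservice networkd:dhcprelay_up -dsnosync\n\nexit 0;\n"
# Assumed content of an unmodified script (hypothetical).
stock = "#!/bin/sh\n\nexit 0;\n"

print(has_workaround(patched), has_workaround(stock))
```

The check also returns False when the line was added after "exit 0;", which is an easy mistake to make when editing the file in vi.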