Hello everyone, I'm a new user of Sophos XG v16 and I'm having a little bit of trouble when it comes to communication across VLANs. I actually think I'm 90% of the way there, but there's something not quite right about my configuration and I'm looking for some help.
First, a bit about my network topology. I have the XG Firewall at the top, connected to the WAN on port 2 and to my core LAN switch on port 1. My core LAN switch (28-port TP-Link, Layer 2) is where all the pre-wired CAT6 around the house home-runs back to. One of the devices hanging off that switch is my primary workstation. I have a second switch (16-port TP-Link, Layer 2) in a server closet, with an LACP LAG running between the two switches. My server is connected to the second switch, also via an LACP LAG. All of the above is working fine (with several other devices hanging off it, WAP, etc.). I previously tested it as a flat network, 10.0.0.0/8, with everything on the default VLAN of 1.
Last night I backed up my config and started implementing my VLAN design, which involves putting my PCs, servers and peripherals on VLAN 10, then separating other kinds of devices onto other VLANs. For example, IP cameras on 20, VoIP/intercom on 30, etc. However, my primary workstation needs to be able to communicate on multiple VLANs. For example, it needs to talk to the servers and printers on VLAN 10, but also be able to configure the switches and routers on VLAN 1. Because all of my managed switches are Layer 2 only, the inter-VLAN routing needs to be done in the XG firewall.
I started out slow last night, only changing the one server over to VLAN 10 and leaving everything else alone. To do this, I first changed port 1 on the XG to 10.1.1.1/16 instead of the original /8. Then I configured the LAG ports between the server and its switch as untagged with PVID 10, and the LAG ports between the switches and the port between the core switch and the router as tagged for VLAN 10. The workstation remained on an untagged port with PVID 1. This was to ensure that no matter how badly I screwed things up, the workstation could still get to the switches and router to fix things :) In XG, I then created VLAN 10 as a new interface, Port 1.10, configured for IP 10.10.1.1/16. Both ports have their own DHCP domain assigned.
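As a sanity check on the addressing above, here is a minimal Python sketch using the standard `ipaddress` module. It only checks that the two /16 interface subnets do not overlap, and shows how a host's netmask decides whether it sends traffic to the gateway or tries to reach the other subnet directly. The workstation and server addresses are made-up placeholders (the post does not give them), and the helper `is_on_link` is a hypothetical name.

```python
import ipaddress

# Interface addresses as described in the post.
port1 = ipaddress.ip_interface("10.1.1.1/16")      # untagged, VLAN 1
port1_10 = ipaddress.ip_interface("10.10.1.1/16")  # tagged, VLAN 10
old_flat = ipaddress.ip_network("10.0.0.0/8")      # earlier flat test network

# Hypothetical host addresses, for illustration only.
WORKSTATION_IP = "10.1.0.50"
SERVER_IP = "10.10.0.20"


def is_on_link(local_ip: str, prefix_len: int, remote_ip: str) -> bool:
    """Return True if a host with this address and mask would treat remote_ip
    as directly reachable (ARP for it) rather than use its default gateway."""
    local = ipaddress.ip_interface(f"{local_ip}/{prefix_len}")
    return ipaddress.ip_address(remote_ip) in local.network


if __name__ == "__main__":
    print("subnets overlap:", port1.network.overlaps(port1_10.network))
    print("/16 workstation treats server as on-link:",
          is_on_link(WORKSTATION_IP, 16, SERVER_IP))
    print("/8 workstation treats server as on-link:",
          is_on_link(WORKSTATION_IP, 8, SERVER_IP))
```

With a /16 mask the workstation hands traffic for 10.10.x.x to the XG. A host still holding the old /8 mask would instead consider 10.10.x.x local, so comparing the mask each device actually reports against the plan is a cheap thing to rule out.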
After I did all this and bounced the server NICs, the server obtained an address via DHCP on the 10.10.x.x subnet, which I believe confirms that the switches are configured correctly for trunking the VLANs and that the DHCP server for Port 1.10 is working. The server was also able to access the Internet (ping 8.8.8.8) via the pre-existing "Allow all LAN to WAN" firewall rule. So far so good, and at this stage I could no longer ping from the server (VLAN 10) to the workstation (VLAN 1) or vice versa, another indication that things are working as desired.
The next thing I did was to create two IP objects: one called Workstation, corresponding to the workstation's static DHCP reservation, and the other called VLAN_10, for the 10.10.0.0/16 network. I then used the objects in a new firewall rule to allow all services with a source of "Workstation" on the LAN zone to the destination of VLAN_10 on the LAN zone. At this point I could ping from the workstation to the server and receive replies, and could not ping from the server to the workstation. This is exactly what I expected, which led me to believe that I'd succeeded. However, I soon found out this was not the case. :(
The first area where I noticed trouble was in the Samba shares I had mapped from the server on the workstation. They weren't connected because they had originally been created using the NetBIOS(?) name, i.e. \\Server\Media. Rather than figure out how to resolve those names over the VLANs, I simply remapped them as \\10.10.x.x\Media and they came back online. Unfortunately, the connection to these shares is unstable and unusable. Attempting to browse through the share hangs File Explorer for 20-30 seconds each time I try to open a folder. If I attempt to play a video file, I get 10-15 seconds of good video, then it freezes and an error message comes up about having lost the connection.
The second area I noticed trouble was attempting to SSH from the workstation into the server. I know PuTTY makes an initial connection, as I get the "Using username ____" message, and the "Password for _____:" prompt. However, as soon as I put in my password and press enter it hangs for several seconds, then PuTTY pops up a window saying something about the software being disconnected.
So, long story short, it seems like I have basic "connectivity" between my VLANs, since ping works and SSH starts to build up a connection before it dies. Unfortunately, I don't have any kind of usable communication between the VLANs. Here are a couple of things that I already tried / looked at:
1.) I tried to create another firewall rule, identical to the one above, except with VLAN_10 as the source and Workstation as the destination. This allowed me to ping the other direction, from server to workstation, but I don't think it affected the Samba or SSH issues. In theory this rule shouldn't be needed (?) since XG should automatically allow replies to established connections (as demonstrated by successful ping replies before even creating this rule)?
2.) I checked CPU & memory utilization on the XG. CPU was in the single digits, memory around 15%. It's an 8-core, 2.4GHz Atom C2758 with 16GB RAM. I'm sure it's not a routing performance issue; it has to be something with the settings/rules...
3.) I checked the firewall logs, looking for any kind of drop message (though I have no "drop" rules, just the LAN-to-WAN accept and the Workstation-to-VLAN_10 accept rules). I saw nothing in the logs except for messages allowed from Workstation IP to server IP, TCP service and port whatever. No indication of return packets being either accepted or dropped, presumably because I have no specific rule telling it to log those packets. Should I / can I create some extra rules for the sole purpose of logging?
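The assumption in item 1, that replies to an established connection should pass without a reverse rule, can be sketched as a toy model in Python. This is only an illustration of generic stateful filtering under simplified assumptions, not how XG is actually implemented; the class names, host labels and port numbers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Flow:
    src: str
    dst: str
    proto: str
    sport: int
    dport: int

    def reversed(self) -> "Flow":
        # The reply direction of the same conversation.
        return Flow(self.dst, self.src, self.proto, self.dport, self.sport)


@dataclass
class ToyStatefulFirewall:
    allowed_pairs: set  # (src, dst) pairs that have an explicit accept rule
    established: set = field(default_factory=set)

    def handle(self, pkt: Flow) -> str:
        # Packets belonging to a tracked conversation pass without a rule lookup.
        if pkt in self.established:
            return "accept (established)"
        # New conversations need an explicit rule; both directions are then tracked.
        if (pkt.src, pkt.dst) in self.allowed_pairs:
            self.established.add(pkt)
            self.established.add(pkt.reversed())
            return "accept (rule)"
        return "drop"


if __name__ == "__main__":
    fw = ToyStatefulFirewall(allowed_pairs={("workstation", "server")})
    ssh_out = Flow("workstation", "server", "tcp", 50000, 22)
    print(fw.handle(ssh_out))             # new flow, matched by the rule
    print(fw.handle(ssh_out.reversed()))  # reply to the tracked flow
    print(fw.handle(Flow("server", "workstation", "tcp", 40000, 445)))  # unsolicited
```

In this model the reverse rule from item 1 only matters for connections the server initiates. If replies to workstation-initiated sessions are being lost, the model suggests looking at whether the firewall sees both directions of each flow rather than at a missing accept rule.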
I apologize for the long post but I wanted to try to cover some of the basics of things I've already done that I believe are working. I feel like this has to be some noob error where I overlooked a small step, because everything is "almost" working, I just can't think of what that step would be... If I can figure this out, I should be able to just repeat the recipe for all my other ports and VLANs and be golden. Thanks in advance to anyone with an idea or two for me to try.