Security Heartbeat Functionality and Issues

I sent an email to my account manager and sales engineer and am awaiting a reply, but it might be a few days.  I hope the community can help me out in the meantime.

 

We have a fairly new XG 230 implementation at a school.  We are trying to use Security Heartbeat, but it seems pretty unusable there.  Other implementations with smaller XGs and a dozen or fewer business users seem fine.

 

Here is why:

 

#1

Alerts are generated for offline computers.  My support gets a bunch of emails and they are not happy.

Is there a threshold that can be adjusted via CLI (say, the condition must match for x minutes before an alert is raised)?

Is there something that can be done to suppress messages for offline computers?

 

#2

Had a machine that could not download updates.  Condition was red because it’s not up to date.  Can’t update because it is red.  However the machine could reach internal network resources.

 

I modified the LAN to WAN rule so it could access the internet if yellow (no restriction!).  Then it could update.  What’s the point though?

 

I observed the LAN to LAN rule is set to block if greater than yellow.  There must be something more to do here…LAN to LAN doesn’t route through XG for same subnet.

 

Then I found…

https://community.sophos.com/products/xg-firewall/f/initial-setup/82578/how-to-make-security-heartbeat-work

What do we think of the solutions here?

I created a rule with the FQDNs below and put it on top, with no heartbeat restriction.  I see some traffic through that allow rule, but not much...its counter hasn't incremented for hours.

sophos.com

mojave.net

sophosupd.com

sophosupd.net

sophosxl.net
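
To make the rule ordering concrete, here is a minimal sketch of top-down, first-match rule evaluation with an optional heartbeat requirement per rule. It is not how the XG is implemented; `Health`, `Rule`, `host_matches` and `first_match` are hypothetical names, and two things are assumptions: that health is ordered green < yellow < red, and that a matching rule whose heartbeat requirement fails drops the traffic rather than falling through to the next rule.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional, Tuple


class Health(IntEnum):
    GREEN = 0
    YELLOW = 1
    RED = 2


# The FQDNs listed above, treated as domain suffixes (an assumption).
SOPHOS_DOMAINS = ("sophos.com", "mojave.net", "sophosupd.com",
                  "sophosupd.net", "sophosxl.net")


@dataclass
class Rule:
    name: str
    domains: Optional[Tuple[str, ...]]   # None means any destination
    worst_allowed: Optional[Health]      # None means no heartbeat restriction


def host_matches(host: str, domains: Tuple[str, ...]) -> bool:
    """True if host equals one of the domains or is a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in domains)


def first_match(rules, host: str, health: Health):
    """Walk the rules top-down and return (rule name, allowed) for the first hit."""
    for rule in rules:
        if rule.domains is not None and not host_matches(host, rule.domains):
            continue  # destination does not match, try the next rule
        if rule.worst_allowed is not None and health > rule.worst_allowed:
            return (rule.name, False)  # matched, but endpoint health is too poor
        return (rule.name, True)
    return ("default-drop", False)


rules = [
    Rule("sophos-updates", SOPHOS_DOMAINS, None),   # on top, no restriction
    Rule("lan-to-wan", None, Health.YELLOW),        # internet if yellow or better
]
```

With this ordering, a red endpoint asking for a host under sophosupd.com hits the first rule and is allowed, while the same endpoint asking for any other host falls to the LAN to WAN rule and is blocked.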

 

Then I called Support as I am anxious to hear something. 

I’m told that it’s normal that when a machine is shut down, it will report and we will get an email message.  For a school with student laptops (awake, asleep, awake, asleep every class and sometimes several times in a class), this means each time a laptop goes offline there will be an alert (for 50 computers, say 400 messages a day might be typical).  This makes the product unmanageable from an alerting point of view.  Unable to see the fire through the weeds.

 

Here’s how I would like to see it work:

  • When a machine is turned off, the heartbeat will stop.   
  • Some machines take a while to shut down.
  • Make a failed heartbeat equal to no heartbeat plus a successful ping to the endpoint.
  • Have a CLI threshold where I can say I want 4 failed heartbeats prior to sending an alert.
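
The behaviour in the list above can be sketched as follows. This is only an illustration of the request, not an existing XG or Central feature: `MissingHeartbeatDebouncer` and `observe` are hypothetical names, the default threshold of 4 comes from the last bullet, and resetting the counter when the ping fails (machine treated as off or asleep) is my own design choice.

```python
from dataclasses import dataclass


@dataclass
class MissingHeartbeatDebouncer:
    threshold: int = 4      # consecutive failed heartbeats before alerting
    failed: int = 0
    alerted: bool = False

    def observe(self, heartbeat_received: bool, ping_ok: bool) -> bool:
        """Record one polling interval; return True only when an alert should be sent."""
        if heartbeat_received:
            # Healthy again: clear the counter and re-arm alerting.
            self.failed = 0
            self.alerted = False
            return False
        if not ping_ok:
            # No heartbeat and no ping reply: treat as powered off or asleep.
            self.failed = 0
            return False
        # No heartbeat, but the endpoint still answers ping: a failed heartbeat.
        self.failed += 1
        if self.failed >= self.threshold and not self.alerted:
            self.alerted = True
            return True
        return False
```

A laptop that takes two intervals to shut down and then stops answering ping never reaches the threshold, so no email goes out. A machine that stays reachable with no heartbeat raises exactly one alert, on the fourth interval.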

 

On to the topic of isolating a machine that’s alerting…we spoke about this and I was told that a machine cannot be isolated from the local network.  It can only be isolated if its traffic is flowing through the UTM (I think she meant XG).  My understanding (and Brian's and Garth's) was that a machine that has caused an alert would be blocked from all communication.  Can someone explain this mechanism?

 

Here’s how I would like to see it work:

  • The XG communicates its rules to Central.  Why?  Because I have a rule that allows communication to WAN when Red to my Kaseya management server and to a random list of Sophos servers (see earlier email below) so they can update and so I can remote in and work on them (we are 2 or more hours away from some client sites).
  • The Endpoint alerts Central.  Rules in the XG block traffic based on the rules.  As this isn’t really effective though:
    • A network driver is installed on the endpoint and this filters network access based on the XG rules that allow communication, so an out of date machine might still be able to get updates and it can be blocked from local LAN communication.

 

Any input is appreciated so I can understand how this works, is supposed to work, or won't work.  I've turned it off for the moment for one customer.  Leaving it on for the little ones.

Regards,

David



This thread was automatically locked due to age.
  • Hi David,

    I can comment a little on how security heartbeat works:

    The most important aspect is that you must segment the traffic you want to "protect" via heartbeat rules.  So workstations should not be on the same network segment as your servers, etc.  This forces the traffic to traverse the XG and get processed for allow/deny rules (there is a new feature in development that will isolate endpoints, but that's down the road as I understand).

    In practice, the firewall, once registered with Central, knows which endpoints are managed and learns the IP that endpoints will use for heartbeat.  Once the firewall has registered, Central notifies the clients, at which point the endpoints send their heartbeat to the firewall directly via this magic IP (you can find it in the heartbeat.xml config file).

    Once this is done, the endpoint sends a heartbeat every 15 seconds or so.  The heartbeat is small and the firewall can process these packets pretty quickly.
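
As a rough illustration of what that interval means for detection, the sketch below counts how many heartbeats have been missed since an endpoint was last seen. The 15-second default is taken from the "15 seconds or so" above and is approximate; `missed_heartbeats` is a hypothetical helper, not a firewall function.

```python
def missed_heartbeats(last_seen: float, now: float, interval: float = 15.0) -> int:
    """Number of whole heartbeat intervals elapsed since the last heartbeat.

    Times are in seconds. A negative elapsed time (clock skew) counts as zero.
    """
    elapsed = now - last_seen
    return max(0, int(elapsed // interval))
```

For example, `missed_heartbeats(0, 60)` gives 4, so a tolerance of four missed heartbeats would correspond to about a minute of silence at this interval.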

    As you have learned, you must be aware of what rules you are writing and what you are blocking with respect to heartbeat rules.  We want to make sure that the endpoint can still communicate with Central and/or RMM tools.  How you handle that is really personal preference in my opinion.  I like to allow traffic for automated incident cleanup but I have seen others that block it and use jump boxes for remote troubleshooting/cleanup.

    Hope that helps a little on how the endpoints communicate their heartbeat to the firewall.

    Cheers.

  • Thanks Axsom1.  I appreciate your response.

     

    Sophos literature says that an endpoint under duress (my words, not theirs) will be isolated.  So if a machine needs an update, it's cut off from getting updates.  Catch-22.  Most small businesses and schools don't run their segments through their firewall.  I could do that in this situation, but it needs a few hundred feet of cabling as the firewall isn't near the server closet.  I could VLAN it, but that doubles the traffic on the existing link due to the existing network layout.

     

    I'm not having any issue with getting heartbeat to the XG.  The issue is that when a machine is turned off, we still get alerts.  My guess is the client stops sending heartbeat as it shuts down (Windows shutdown can take up to a minute or so, even longer if updates need to be installed).  During the shutdown, there is no heartbeat, but the machine is still up.  It would be nice to, say, require 4 missing heartbeats before sending an alert email instead of beating my support desk up with hundreds of unnecessary alerts.

    My solution is to turn off Security Heartbeat in the XG until Sophos can properly address this.  I'll use it for the 12 user networks.  But anything over that, forget it.

     

  • David,

    I agree with your point of view. If the computer is shutting down, the heartbeat mechanism should allow a grace period or a ping check before considering the PC unhealthy. Communication and delay management should be better. Open a feature request and post the link here so other users can vote for it.

    As Axsom1 suggested, segmentation is one of the rules of building a secure network. At a minimum, guest, company and server networks should be separated using VLANs. It takes a while but you will have several benefits afterward: you can put IPS or another firewall between VLANs, etc.

    Do not confuse Sophos Heartbeat with a NAC product. At the moment HB can only block a computer's communications that cross the firewall, for example from LAN to WAN or from one LAN to another LAN (only across Layer 3).
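
A quick way to see which traffic a heartbeat rule can affect is to check whether the destination is on the sender's own subnet. The sketch below uses Python's standard `ipaddress` module; `is_on_link` is a hypothetical helper and the addresses in the note are made-up examples, not taken from anyone's network in this thread.

```python
import ipaddress


def is_on_link(src_with_prefix: str, dst: str) -> bool:
    """True if dst is in the same subnet as the source interface.

    On-link traffic is switched at Layer 2 and never reaches the firewall,
    so firewall rules (heartbeat-based or not) cannot block it.
    """
    iface = ipaddress.ip_interface(src_with_prefix)
    return ipaddress.ip_address(dst) in iface.network
```

For a workstation at 192.168.1.50/24, a server at 192.168.1.10 is on-link and so out of the firewall's reach, while a server at 192.168.2.10 is routed and can be subject to a LAN to LAN rule.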

    In 2018, Sophos will extend HB to completely block unhealthy computers from the rest of the network (even on the same LAN). This technology is called Stonewalling.

    I would recommend segmenting the network for future improvements (even if you move to another brand). The other option is to have a look at NAC products if you need granular control.

    Regards

  • Thanks for the reply lferrara.  

    The network is at a school we manage remotely several hours away.  Picked them up a couple months ago.  Poor architecture by their previous IT company is why there is no segmentation of their network.  I'll look at making changes to that recommendation when I go onsite to remove some of the 9 switch hops between server and endpoints (in a school district with less than 200 students??) and clean up a few other design issues - I'm limited to 4 hours or so onsite though because of the commute and agreed billable time.  Think I'll build out a second vSwitch in their server, connect to a new VLAN and migrate over.  Just want to be onsite when setting untagged connection to XG in the event something becomes inaccessible.

    I didn't see anything in Sophos literature about HB blocking only across the firewall.  Others where I work thought the same as I did: that the client was already capable of stonewalling when working in concert with the XG.  We picked that up from a conference and other sources.  I spoke with Paul Zindell and he explained the entire thing to me, so now we all know.

  • Yes, the missing heartbeat detection could be improved.  I understand why they classify it as red, since some compromise attempts would disable services.  What are you using for reporting/alerting?  SFM or on-box?

    I wonder if you could filter the missing notifications from the red?  If missing and endpoint is online, I would guess that the end user would notify IT since they would be denied.  We have been slower to roll this to K12 clients, so no practical experience to relay here.

  • The last reply here is from 2017!

     

    We still have these problems!

    I set up a new XG230, several clients... some segmentation... all works fine. Our Workstations have no problems...

     

    ...but when a laptop switches from cable-firstLAN to wireless-secondLAN, or even when they switch off... the XG detects "missing heartbeats".

    Later they come back up and all is fine.

    Also, I saw that a client was shut down, and several minutes (maybe an hour) later the XG detected the missing heartbeat (but the client had been off for a while!).

    So, can anyone confirm that this problem still exists?

  • Hi,

    We have exactly the same issue with Sophos Endpoint Heartbeat and an XG230.

    Every time one of our laptops goes offline, e.g. shuts down or enters sleep mode, we get the missing heartbeat email.

    Does anyone have a solution?

    Kind regards
    FZB

  • I have been experiencing the same issue for over a year now.

  • Hi.
    I want to let you know that I had a nice dialogue with a Sophos Support employee in the last couple of weeks and I'm glad to tell you that for me the problem is gone.

    To sum up what we did:
    We disabled Synchronized Application Control on the XG230 and enabled it again after two weeks.

    And yes, that was actually everything we did.

    Kind regards
    FZB