
UTM 9 HA falls apart on DL380p G8 Hyper-V

I thought it was finally time to give XG a chance (the last two times I tried, it ended in tears because the platform was still WAY too green and lacking) and figured it would finally be mature enough. Maybe it is on Sophos hardware, but it sure doesn't seem to be under Hyper-V! I ran into nothing but problems on all the hardware I tried. I went back to UTM and had no trouble setting up an HA config on both PowerEdge R720s and R730s. However, when I tried to do the same on HP DL380p G8 boxes, I ran into trouble. The individual units build and function properly, but as soon as I put them into an HA pair (automatic config), their CPUs spike to 100% and they become completely unresponsive. I've rebuilt them multiple times to no avail: same behavior every time. Any thoughts are welcome, thanks!



This thread was automatically locked due to age.
  • It occurred to me that the other variable at the problem location is that it uses stacked Netgear switches vs. HP switches at the other locations. So I shifted the VMs to run over a crossover cable between the two hosts, and the problem did not occur. I guess something about the switches is making the nodes wig out? I'll move the cabling from the crossovers to our other HP switches onsite when I get the chance, to make sure the problem doesn't happen there as well.

  • Konichiwa - welcome to the UTM Community!

    Thanks for contributing the solution to teach others - I saw a similar situation here and at a client site within the last week.

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • Hi Bob, interestingly, we have another site with the same stacked Netgear switches that has one Dell and one HP server. I did not experience the problem when I created an HA UTM pair there last night, so now I'm not sure where the blame lies...

  • Well, actually it looks like it is happening; it just took much longer to manifest (at the other office it was pretty much instantaneous). With this one I was able to see the flapping/heartbeat lost/another master messages in the HA log and PuTTY in to set the heartbeat keepalive from 3 to 10 (per one of your other threads). I went to bed and woke up to them being frozen with CPU out of control. So maybe Netgear is the problem after all.
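
For anyone wanting to check how often these messages show up before the nodes lock up, here is a rough sketch that counts them in a saved copy of the HA log. The two search phrases come from the post above; the exact wording in a real UTM HA log may differ, and the sample lines and the function name `count_ha_events` are made up for illustration.

```python
from collections import Counter

# Phrases taken from the post; the exact wording in a real HA log is an
# assumption and may need adjusting.
PATTERNS = ("heartbeat lost", "another master")


def count_ha_events(lines):
    """Count how many lines mention each pattern (case-insensitive)."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        for pattern in PATTERNS:
            if pattern in lowered:
                counts[pattern] += 1
    return counts


# Made-up sample lines for illustration only, not real UTM log output.
sample = [
    "node 1: heartbeat lost on eth1",
    "node 2: another master detected",
    "node 1: Heartbeat lost on eth1",
    "node 1: status ok",
]
print(dict(count_ha_events(sample)))
```

To run it against a real file, pass an open file handle instead of `sample`; a count that keeps climbing would back up the flapping theory.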

  • Just a thought related to #7.7 in Rulz (last updated 2021-02-16) - what if you set a fixed speed & duplex on the 'Hardware' tab of 'Interfaces' for the NIC connected to the Netgear?

    Cheers - Bob

  • OK, this is getting very strange. We set up a pair of PHYSICAL SG115 boxes at the original problem site with a home license to see if there was any difference. It seemed fine for days, and now today it is exhibiting the same problem as the virtual clusters: CPU going nuts on both nodes for no reason. Also, the virtual one at the other site, which had been working fine, is doing it too. Ugh.

  • Did you try #7.7 as I suggested in my previous post here?

    Cheers - Bob

  • No, as it doesn't apply since the Netgear is not in the picture any more. The virtual Sophos units at the problem site were moved to HP switches (identical to our working site) but still had the problem. So we set up physical Sophos units, and today they wigged out too.