
UTM 9 HA falls apart on DL380p G8 Hyper-V

I thought it was finally time to give XG a chance (the last two times I tried, it ended in tears because the platform was still WAY too green and lacking), and I figured it would finally be mature enough. Maybe it is on Sophos hardware, but it sure doesn't seem to be under Hyper-V! I ran into nothing but problems on all the hardware I tried. I went back to UTM and had no trouble setting up an HA config on both PowerEdge R720s and R730s. However, when I tried to do the same on HP DL380p G8 boxes, I ran into trouble. The individual units build and function properly, but as soon as I put them into an HA pair (automatic config), their CPUs spike to 100% and they become completely unresponsive. I've rebuilt them multiple times to no avail, with the same behavior every time. Any thoughts are welcome, thanks!



This thread was automatically locked due to age.
  • It occurred to me that the other variable at the problem location is that it uses stacked Netgear switches rather than the HP switches at the other locations. So I shifted the VMs to run over a crossover cable between the two hosts, and the problem did not occur. I guess something about the switches is making the nodes wig out? I will move the cabling from crossovers to our other HP switches onsite when I get the chance, to confirm the problem doesn't happen there as well.

  • Konichiwa - welcome to the UTM Community!

    Thanks for contributing the solution to teach others - I saw a similar situation here and at a client site within the last week.

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • Hi Bob, interestingly, we have another site with the same stacked Netgear switches that has one Dell and one HP server. I did not experience the problem when I created an HA UTM pair there last night, so now I’m not sure where the blame lies...

  • Well, actually it looks like it is happening there too; it just took much longer to manifest (at the other office it was pretty much instantaneous). With this one I was able to see the flapping / heartbeat lost / another master messages in the HA log, and to connect with PuTTY and raise the heartbeat keepalive from 3 to 10 (per one of your other threads). I went to bed and woke up to find them frozen with CPU out of control. So maybe Netgear is the problem after all.

  • Just a thought related to #7.7 in Rulz (last updated 2021-02-16) - what if you set a fixed speed & duplex on the 'Hardware' tab of 'Interfaces' for the NIC connected to the Netgear?

    Cheers - Bob

     
  • OK, this is getting very strange. We set up a pair of PHYSICAL SG115 boxes at the original problem site with a home license to see if there was any difference. It seemed fine for days, and now today it is exhibiting the same problem as the virtual clusters: CPU going nuts on both nodes for no reason. Also, the virtual one at the other site, which had been working fine, is doing it too. Ugh.
