
Slave node stuck in up2date state

The slave node is stuck in the UP2DATE state, and rebooting it does not solve the problem.

MASTER: 2 Node2 198.19.250.2 9.403004 ACTIVE
SLAVE: 1 Node1 198.19.250.1 9.356003 UP2DATE
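
To make the mismatch in the two status lines above explicit, here is a minimal Python sketch that parses them and compares the firmware versions. The field layout (role, node ID, name, IP, version, state) is inferred only from the two lines quoted in this post, and `HaNode`, `parse_ha_line` and `versions_match` are hypothetical helper names, not part of any Sophos tool.

```python
from dataclasses import dataclass


@dataclass
class HaNode:
    role: str
    node_id: int
    name: str
    ip: str
    version: str
    state: str


def parse_ha_line(line: str) -> HaNode:
    # Assumed layout: "ROLE: id name ip version STATE"
    role, rest = line.split(":", 1)
    node_id, name, ip, version, state = rest.split()
    return HaNode(role.strip(), int(node_id), name, ip, version, state)


def versions_match(nodes) -> bool:
    # True only if every node reports the same firmware version string.
    return len({n.version for n in nodes}) == 1


status = """MASTER: 2 Node2 198.19.250.2 9.403004 ACTIVE
SLAVE: 1 Node1 198.19.250.1 9.356003 UP2DATE"""

nodes = [parse_ha_line(line) for line in status.splitlines()]
for n in nodes:
    print(n.role, n.name, n.version, n.state)
print("versions match:", versions_match(nodes))
```

For the output quoted here, the check reports that the two nodes are on different firmware versions.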

Logging in to the slave node fails with "permission denied". Changing the SSH passwords for the slave via WebAdmin does not help.

Possible Reason:
version of package '/var/up2date/sys/u2d-sys-9.356003-404005.tgz.gpg' doesn't fit, skipping
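
The file name in that log line appears to carry two version numbers. The sketch below assumes the pattern `u2d-sys-<installed version>-<target version suffix>.tgz.gpg`; that is only a reading of this one file name, not a documented format, and `parse_u2d_package` is a hypothetical helper.

```python
import re

# Assumed pattern: u2d-sys-<major>.<from suffix>-<to suffix>.tgz.gpg
PKG_RE = re.compile(r"u2d-sys-(\d+)\.(\d{6})-(\d{6})\.tgz\.gpg$")


def parse_u2d_package(path: str):
    """Return (from_version, to_version) under the assumed naming pattern."""
    m = PKG_RE.search(path)
    if m is None:
        raise ValueError(f"unrecognised package name: {path}")
    major, from_suffix, to_suffix = m.groups()
    return f"{major}.{from_suffix}", f"{major}.{to_suffix}"


from_v, to_v = parse_u2d_package(
    "/var/up2date/sys/u2d-sys-9.356003-404005.tgz.gpg"
)
print("from:", from_v, "to:", to_v)
```

If the assumption holds, this package would take a node from 9.356003 to 9.404005, which is not the 9.403004 the master is running. The log line itself does not say why the package was rejected.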



This thread was automatically locked due to age.
  • Hi Technik.

    You're saying that you are not able to ssh into your slave node using ha_utils ssh?

    If so, you are in a bit of a pickle here. I would suggest contacting support ASAP if this is a commercial license. Another course of action would be to destroy your cluster, upgrade both nodes to the same version and rebuild the cluster. Don't worry about losing anything: when you destroy the cluster, it will automatically factory reset and shut down your slave node, and your master node will remain intact.

    If you decide to rebuild your cluster, just make sure that your master node's uptime is higher than your slave node's. Otherwise your slave node might end up being selected as master and wipe out the master node's configuration, logs and reports. Just reboot the slave node a few times before joining it to the cluster and Bob's your uncle.
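
The uptime precaution above can be written down as a simple pre-join check. This is a minimal Python sketch, not a Sophos tool: it assumes you have read the uptime on each node from the standard Linux `/proc/uptime` file, the sample values are invented, and the 60-second safety margin is an arbitrary choice. How the HA election actually weighs uptime is not documented in this thread.

```python
def parse_proc_uptime(text: str) -> float:
    """Return uptime in seconds from the contents of /proc/uptime.

    The file holds two numbers; the first is seconds since boot.
    """
    return float(text.split()[0])


def master_has_higher_uptime(master_s: float, slave_s: float,
                             margin_s: float = 60.0) -> bool:
    # margin_s is a hypothetical safety margin, not a Sophos setting.
    return master_s - slave_s >= margin_s


# Invented sample contents of /proc/uptime on each node.
master = parse_proc_uptime("864000.12 1700000.00")  # about ten days
slave = parse_proc_uptime("95.40 180.00")           # freshly rebooted

print("safe to join:", master_has_higher_uptime(master, slave))
```

If the check fails, reboot the slave once more and compare again before connecting it.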

    Regards - Giovani

  • Hi, and welcome to the UTM Community!

    First, you should get Sophos Support involved.

    That said, if you want to do this yourself, I would try the following:

    1. Power down the Slave.
    2. Up2Date the Master to 9.404. (This will cause a reboot and about 5 minutes of downtime.)
    3. Power up the Slave. It should Up2Date itself and the Master should sync to it.

    If that doesn't solve the problem, you will need to disconnect the Slave, re-image it, bring it up to the same version as the Master, power it down, re-connect it and power it back up.
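
Before reconnecting a node, it can help to check which nodes are still behind the version you are aiming for. The Python sketch below assumes that a version string such as `9.356003` splits into major 9, release 356 and patch 3; that split is an assumption about the numbering, and `version_key` and `nodes_behind` are hypothetical helpers. The node versions are the ones quoted in the first post.

```python
def version_key(version: str):
    """Turn '9.356003' into (9, 356, 3) so versions compare numerically."""
    major, rest = version.split(".")
    if len(rest) != 6 or not rest.isdigit():
        raise ValueError(f"unexpected version format: {version}")
    return int(major), int(rest[:3]), int(rest[3:])


def nodes_behind(versions: dict, target: str):
    """Return the names of nodes whose version is lower than the target."""
    wanted = version_key(target)
    return sorted(name for name, v in versions.items()
                  if version_key(v) < wanted)


versions = {"Node2": "9.403004", "Node1": "9.356003"}

# Compared with the Master's current version, only the Slave is behind.
print(nodes_behind(versions, "9.403004"))
# Compared with a 9.404 build (patch level 5 is assumed here), both are.
print(nodes_behind(versions, "9.404005"))
```

An empty list for the target version means the nodes are level and the Slave can be reconnected.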

    Support needs to know about this though, so I reiterate the suggestion to get them involved first.

    Cheers - Bob

    EDIT: Later the same day:  I moved Giovani's post from the duplicate thread and then deleted the duplicate thread.

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • Hi BAlfson,

    Your how-to worked fine for me. I had the same problem as described in the initial post, with one exception: I have "keep node(s) reserved during Up2Date" enabled (Management -> High Availability -> Configuration -> Advanced), so I had to start the upgrade on the slave node manually. My HA is finally working again. Thanks!

     

    Cheers,

    Lars

  • Hi Giovani

    I was able to solve the issue with your quick guide.

    > destroy cluster
    > factory reset slave and bring it to newest firmware version
    > update master to newest firmware version
    > re-create cluster (pay attention to uptime)

    > sync nodes... all good!

    Bob's my uncle now! ;)

    Many thanks for your help.

    Regards
    Marcel