
HA stuck in Syncing

Hi,

Since the last server crash (our UTM is a virtual machine), we have been getting these error messages in the HA log, and the state has been "Syncing" for three days:

repctl[4062]: [e] db_connect(2058): error while connecting to database(DBI:Pg:dbname=repmgr;host=198.19.250.2): could not connect to server: Connection refused
repctl[4062]: [e] master_connection(1904): could not connect to server: Connection refused
repctl[4062]: [e] db_connect(2058): error while connecting to database(DBI:Pg:dbname=repmgr;host=198.19.250.2): could not connect to server: Connection refused
repctl[4062]: [e] master_connection(1904): could not connect to server: Connection refused
repctl[4062]: [e] db_connect(2058): error while connecting to database(DBI:Pg:dbname=repmgr;host=198.19.250.2): could not connect to server: Connection refused
repctl[4062]: [e] master_connection(1904): could not connect to server: Connection refused
repctl[4062]: [e] db_connect(2058): error while connecting to database(DBI:Pg:dbname=repmgr;host=198.19.250.2): could not connect to server: Connection refused
repctl[4062]: [e] master_connection(1904): could not connect to server: Connection refused
repctl[4062]: [i] execute(1627): pg_ctl: could not send stop signal (PID: 6256): No such process
repctl[4062]: [i] recover_master(2296): Using previous master 198.19.250.1 for recovery
repctl[4062]: [i] recover_master(2329): Testing SLAVE/WORKER nodes for rsyncd
repctl[4062]: [c] hasyncmsg(1468): this is a primary node
repctl[4062]: [i] recover_master(2402): MASTER: syncing folder /global/pg_control from 198.19.250.1
repctl[4062]: [i] execute(1627): rsync: failed to connect to 198.19.250.1: Connection refused (111)
repctl[4062]: [c] recover_master(2419): rsync failed on $VAR1 = {
repctl[4062]: [c] recover_master(2428): sync aborted
Is there a way for me to fix this without reinstalling the firewall?

Kind regards
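
For anyone triaging a similar ha-log, a small script can group the repctl messages and list which addresses refused connections. This is only a sketch: the line format is inferred from the excerpt above, and `summarize` and `LINE_RE` are hypothetical names, not part of any Sophos tooling.

```python
import re
from collections import Counter

# Line format inferred from the excerpt above (an assumption), e.g.:
#   repctl[4062]: [e] db_connect(2058): error while connecting ...
LINE_RE = re.compile(
    r"^repctl\[(?P<pid>\d+)\]: \[(?P<level>[a-z])\] "
    r"(?P<func>\w+)\((?P<line>\d+)\): (?P<msg>.*)$"
)


def summarize(log_text):
    """Count messages per (level, function) and collect hosts that refused connections."""
    counts = Counter()
    refused_hosts = set()
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip lines that do not look like repctl output
        counts[(m["level"], m["func"])] += 1
        if "Connection refused" in m["msg"]:
            # Pick up any IPv4 address mentioned in the refused message
            refused_hosts.update(re.findall(r"\d+\.\d+\.\d+\.\d+", m["msg"]))
    return counts, refused_hosts


# A few lines copied from the log in this thread
SAMPLE = """\
repctl[4062]: [e] db_connect(2058): error while connecting to database(DBI:Pg:dbname=repmgr;host=198.19.250.2): could not connect to server: Connection refused
repctl[4062]: [e] master_connection(1904): could not connect to server: Connection refused
repctl[4062]: [i] execute(1627): rsync: failed to connect to 198.19.250.1: Connection refused (111)
repctl[4062]: [c] recover_master(2428): sync aborted
"""

counts, hosts = summarize(SAMPLE)
for (level, func), n in sorted(counts.items()):
    print(level, func, n)
print("refused:", sorted(hosts))
```

On the excerpt above, this shows that both the database connection to 198.19.250.2 and the rsync connection to 198.19.250.1 are being refused, which suggests neither node is currently accepting the other's sync traffic.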


  • Rudolf, have you tried rebooting the Slave?  If that doesn't work, try disabling HA, thereby forcing the Slave to do a Factory Reset, and then re-establish Hot-Standby.  Any luck with that?

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
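
Before rebooting or rebuilding the Slave, it may help to confirm whether the services named in the log are listening at all. The sketch below is a plain TCP reachability check. The two addresses are taken from the log in the question; the ports are assumptions (5432 is the PostgreSQL default and 873 the rsync daemon default, and the UTM may use different ones). It would have to run from a machine that can reach the HA sync network, and whether Python is available there is also an assumption. `port_open` and `CHECKS` are hypothetical names.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks
        return False


# Addresses come from the log in the question; ports are ASSUMED defaults.
CHECKS = [
    ("198.19.250.2", 5432),  # PostgreSQL default port (assumption)
    ("198.19.250.1", 873),   # rsync daemon default port (assumption)
]

if __name__ == "__main__":
    for host, port in CHECKS:
        state = "open" if port_open(host, port) else "closed or unreachable"
        print(f"{host}:{port} is {state}")
```

If both ports report closed, the problem is more likely that the database and rsync services are not running on the nodes than that the HA link itself is down.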
  • Hi Bob,

    I know this is old, but could you please clarify? Is the Slave you mention the same as the Swarm instance? By disabling HA, do you mean stopping the ha_aws process? I believe I am not using HA, but I somehow have the same issue.

  • Hi Efren and welcome to the UTM Community!

    There are too many unknowns for me to make any suggestions.  I recommend you get a case open with Sophos Support.  When you get your issue resolved, please come back here and describe the solution.

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA