[9.304-9] HA selfcheck warning

Hi Folks,

Since upgrading to 9.3 yesterday evening, I have been getting a warning mail nearly every hour.
Each mail has a txt file attached that says:
"HA SELFMON WARN: Restarting repctl for SLAVE"

The system is an HA cluster of two ASG320 appliances.
Does anybody have the same issue, or know how to get this fixed?

Regards!


  • Same here after upgrading the software UTM in an active/passive HA cluster from 9.2x to 9.307-6. Rebooting the slave does not solve the problem. Setting the preferred master to "none" is not an option because the slave node runs on weaker hardware. This cluster worked fine for 2 years...

    HA log (system reboot at 08:05; the first dozen lines of successful cluster initialisation are skipped):

    2015:02:10-08:15:07 berlin-2 ha_daemon[3968]: id="38A0" severity="info" sys="System" sub="ha" seq="S: 36 07.166" name="Initial synchronization finished!"
    
    2015:02:10-08:15:07 berlin-2 ha_daemon[3968]: id="38A0" severity="info" sys="System" sub="ha" seq="S: 37 07.166" name="state change SYNCING(2) -> ACTIVE(0)"
    2015:02:10-08:15:07 berlin-1 ha_daemon[4551]: id="38A0" severity="info" sys="System" sub="ha" seq="M: 77 07.667" name="Node 2 changed state: SYNCING(2) -> ACTIVE(0)"
    2015:02:10-09:05:07 berlin-2 repctl[4018]:  execute(2275): waiting for server to shut down...
    2015:02:10-09:05:07 berlin-2 repctl[4018]:  execute(2275): .
    2015:02:10-09:05:08 berlin-2 repctl[4018]:  execute(2275): done
    2015:02:10-09:05:08 berlin-2 repctl[4018]:  execute(2275): waiting for server to start....
    2015:02:10-09:05:09 berlin-2 repctl[4018]:  execute(2275): done
    2015:02:10-09:05:09 berlin-2 repctl[4018]: [c] sql_execute(2341): SQL execute: fetch failed
    2015:02:10-09:05:12 berlin-2 ha_mode[9075]: daemonized...
    2015:02:10-09:05:12 berlin-2 repctl[9109]:  execute(2275): pg_ctl: server is running (PID: 9093)
    2015:02:10-09:05:12 berlin-2 ha_daemon[3968]: id="38A1" severity="info" sys="System" sub="ha" seq="S: 38 12.891" name="control_sync(): we are not in state SYNCING, ignoring sync for database/1"
    2015:02:10-09:05:12 berlin-2 repctl[9109]:  execute(2275): waiting for server to shut down....
    2015:02:10-09:05:12 berlin-2 ha_mode[9075]: HA SELFMON WARN: Restarting repctl for SLAVE
    2015:02:10-09:05:13 berlin-2 repctl[9109]:  execute(2275): done
    2015:02:10-09:05:15 berlin-2 repctl[9109]:  start_backup_mode(927): starting backup mode at 00000001000001B800000095
    2015:02:10-09:05:15 berlin-2 ha_daemon[3968]: id="38A1" severity="info" sys="System" sub="ha" seq="S: 39 15.386" name="control_sync(): we are not in state SYNCING, ignoring sync for database/1"
    2015:02:10-09:05:57 berlin-2 repctl[9109]:  stop_backup_mode(948): stopped backup mode at 00000001000001B800000095
    2015:02:10-09:05:57 berlin-2 repctl[9109]:  execute(2275): waiting for server to start...
    2015:02:10-09:05:57 berlin-2 repctl[9109]:  execute(2275): .
    2015:02:10-09:05:58 berlin-2 repctl[9109]:  execute(2275): done
    2015:02:10-09:05:58 berlin-2 ha_daemon[3968]: id="38A0" severity="info" sys="System" sub="ha" seq="S: 40 58.709" name="Deactivating sync process for database on node 1"
    2015:02:10-09:05:58 berlin-2 repctl[9109]:  start_monitor(1548): monitor started (pid 9202)
    2015:02:10-09:05:58 berlin-2 repctld[9202]:  start_monitor(1578): starting repctld (pid 9202)
    2015:02:10-09:05:58 berlin-2 repctl[9109]:  setup_replication(233): checkinterval 300
    2015:02:10-09:20:07 berlin-1 repctl[6933]: [c] sql_execute(2341): SQL execute: fetch failed
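For anyone comparing their own HA log against the excerpt above, here is a minimal sketch that pulls out the SELFMON warnings and the "fetch failed" errors with their timestamps and node names. It assumes the line layout seen in this excerpt (timestamp, host, process[pid], message); the names `parse_line` and `find_events` and the sample list are hypothetical and not part of any Sophos tooling.

```python
import re
from datetime import datetime

# Assumed layout, taken from the excerpt: "YYYY:MM:DD-HH:MM:SS host proc[pid]: message"
LINE_RE = re.compile(
    r"^(\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2}) (\S+) ([\w-]+)\[(\d+)\]: (.*)$"
)


def parse_line(line):
    """Return (timestamp, host, process, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp, host, proc, _pid, message = match.groups()
    return datetime.strptime(stamp, "%Y:%m:%d-%H:%M:%S"), host, proc, message


def find_events(lines, needles=("HA SELFMON WARN", "fetch failed")):
    """Collect parsed lines whose message contains any of the given substrings."""
    events = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and any(n in parsed[3] for n in needles):
            events.append(parsed)
    return events


# Sample lines copied from the log excerpt in this thread.
sample = [
    "2015:02:10-09:05:09 berlin-2 repctl[4018]: [c] sql_execute(2341): SQL execute: fetch failed",
    "2015:02:10-09:05:12 berlin-2 ha_mode[9075]: daemonized...",
    "2015:02:10-09:05:12 berlin-2 ha_mode[9075]: HA SELFMON WARN: Restarting repctl for SLAVE",
    "2015:02:10-09:20:07 berlin-1 repctl[6933]: [c] sql_execute(2341): SQL execute: fetch failed",
]

events = find_events(sample)
for when, host, proc, message in events:
    print(when.isoformat(), host, proc, message.strip())
```

To run it against a real log, pass the open file object instead of `sample`; the path to the HA log on a given system is not stated in this thread, so it is left to the reader.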