
HA logs

I'm trying to find out how often our HA actually fails over. It's not easy.

You would think it would be easy to find the last ten times HA swung into action (and maybe there is a way), but I'm not familiar with it.
What would you search the logs for, e.g. "is now master" or something similar?

On the HA status page, why can't there be a simple grid underneath that shows the last 10 failovers, each with a link to the relevant log file?

 

I'm asking because our logs look like the excerpt below. Is this normal behavior, with master and slave just reporting that everything is fine?

2018:07:05-00:02:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:04:35 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:07:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:08:45 gw01-2 ha_daemon[4302]: id="38A0" severity="info" sys="System" sub="ha" seq="S:  841 45.485" name="Executing (wait) /usr/local/bin/confd-setha mode slave"
2018:07:05-00:08:45 gw01-2 ha_daemon[4302]: id="38A0" severity="info" sys="System" sub="ha" seq="S:  842 45.626" name="Executing (nowait) /etc/init.d/ha_mode check"
2018:07:05-00:08:45 gw01-2 ha_mode[20052]: calling check
2018:07:05-00:08:45 gw01-2 ha_mode[20052]: check: waiting for last ha_mode done
2018:07:05-00:08:45 gw01-2 ha_mode[20052]: check_ha() role=SLAVE, status=ACTIVE
2018:07:05-00:08:45 gw01-2 ha_mode[20052]: check done (started at 00:08:45)
2018:07:05-00:09:35 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:12:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:14:35 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:17:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:19:36 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:22:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:24:36 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:27:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:27:51 gw01-1 ha_daemon[4352]: id="38A0" severity="info" sys="System" sub="ha" seq="M:  309 51.905" name="Executing (wait) /usr/local/bin/confd-setha mode master master_ip 198.19.250.1 slave_ip 198.19.250.2"
2018:07:05-00:27:52 gw01-1 ha_daemon[4352]: id="38A0" severity="info" sys="System" sub="ha" seq="M:  310 52.120" name="Executing (nowait) /etc/init.d/ha_mode check"
2018:07:05-00:27:52 gw01-1 ha_mode[26530]: calling check
2018:07:05-00:27:52 gw01-1 ha_mode[26530]: check: waiting for last ha_mode done
2018:07:05-00:27:52 gw01-1 ha_mode[26530]: check_ha() role=MASTER, status=ACTIVE
2018:07:05-00:27:52 gw01-1 ha_mode[26530]: check done (started at 00:27:52)
2018:07:05-00:29:36 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:32:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:34:36 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:37:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:39:37 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:42:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:44:37 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:47:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:49:37 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:52:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:54:37 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:57:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:59:38 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:02:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:03:45 gw01-2 ha_daemon[4302]: id="38A0" severity="info" sys="System" sub="ha" seq="S:  843 45.485" name="Executing (wait) /usr/local/bin/confd-setha mode slave"
2018:07:05-01:03:45 gw01-2 ha_daemon[4302]: id="38A0" severity="info" sys="System" sub="ha" seq="S:  844 45.626" name="Executing (nowait) /etc/init.d/ha_mode check"
2018:07:05-01:03:45 gw01-2 ha_mode[25393]: calling check
2018:07:05-01:03:45 gw01-2 ha_mode[25393]: check: waiting for last ha_mode done
2018:07:05-01:03:45 gw01-2 ha_mode[25393]: check_ha() role=SLAVE, status=ACTIVE
2018:07:05-01:03:45 gw01-2 ha_mode[25393]: check done (started at 01:03:45)
2018:07:05-01:04:38 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:07:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:09:38 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:12:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:14:38 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:17:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:17:51 gw01-1 ha_daemon[4352]: id="38A0" severity="info" sys="System" sub="ha" seq="M:  311 51.905" name="Executing (wait) /usr/local/bin/confd-setha mode master master_ip 198.19.250.1 slave_ip 198.19.250.2"
2018:07:05-01:17:52 gw01-1 ha_daemon[4352]: id="38A0" severity="info" sys="System" sub="ha" seq="M:  312 52.068" name="Executing (nowait) /etc/init.d/ha_mode check"
2018:07:05-01:17:52 gw01-1 ha_mode[3677]: calling check
2018:07:05-01:17:52 gw01-1 ha_mode[3677]: check: waiting for last ha_mode done
2018:07:05-01:17:52 gw01-1 ha_mode[3677]: check_ha() role=MASTER, status=ACTIVE
2018:07:05-01:17:52 gw01-1 ha_mode[3677]: check done (started at 01:17:52)
2018:07:05-01:19:39 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:22:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:24:39 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:27:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:29:39 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:32:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:34:39 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:37:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:39:40 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:42:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:44:40 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:47:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:49:40 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:52:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:54:40 gw01-1 repctl[11242]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:57:45 gw01-2 repctl[7541]: [i] recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
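In the excerpt above, every check_ha() line shows gw01-1 as MASTER and gw01-2 as SLAVE, so the roles never change. One way to confirm that over a longer log is to print only the moments when a node reports a different role than it did at its previous check. This is a minimal sketch, not an official Sophos tool: the function name ha_role_changes is made up here, and it assumes the log lines keep the "check_ha() role=..., status=..." layout shown above.

```shell
# Sketch: report only the moments when a node's role changes between checks.
# Reads HA log lines on stdin; prints "timestamp host OLD -> NEW".
# ha_role_changes is a hypothetical helper name, not part of UTM.
ha_role_changes() {
  awk '
    /check_ha\(\) role=/ {
      host = $2                  # second field is the node name, e.g. gw01-2
      role = $0
      sub(/.*role=/, "", role)   # drop everything up to "role="
      sub(/,.*/, "", role)       # drop ", status=..." after the role
      if ((host in last) && last[host] != role)
        print $1, host, last[host] " -> " role
      last[host] = role
    }
  '
}
```

Feed it the HA log on stdin, for example `ha_role_changes < /var/log/high-availability.log` (that path is an assumption; use whichever file the excerpt came from). No output means no node changed role in that file.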




  • Louis, try:

    zgrep 'INFO-080' /var/log/notifier/2018/*/* | tail

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
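The raw zgrep output carries the file name and the full notifier message on every line. If only the time and the node are of interest, the output can be trimmed before taking the tail. The following is a rough sketch under a few assumptions: last_failovers is a made-up helper name, the lines look like "file:timestamp host notifier[pid]: ... INFO-080", and the file paths contain no spaces.

```shell
# Sketch: reduce zgrep output for INFO-080 to "timestamp host", newest last.
# Reads matching lines on stdin; the optional argument is how many to keep.
# last_failovers is a hypothetical helper name, not part of UTM.
last_failovers() {
  n="${1:-10}"
  awk '
    /INFO-080/ {
      # $1 is either "timestamp" or "file.gz:timestamp"; keep the timestamp part
      if (match($1, /[0-9][0-9][0-9][0-9]:[0-9][0-9]:[0-9][0-9]-[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/))
        print substr($1, RSTART, RLENGTH), $2
    }
  ' | tail -n "$n"
}
```

Usage would be along the lines of `zgrep 'INFO-080' /var/log/notifier/2018/*/* | last_failovers 10`. The "newest last" ordering relies on the shell expanding the glob in alphabetical order, which should match date order if the files are named by date.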
  • Interesting topic; I was looking for info on how to check the status of HA. What does this command tell us?

    In my case: /var/log/notifier/2018/12/notifier-2018-12-01.log.gz:2018:12:01-05:56:41 hostnamesophos notifier[15456]: processing notification request for INFO-080

     

    Thanks.

  • The INFO-080 notification occurs when there's a change of the Master in HA.

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
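Since each INFO-080 line marks a change of master, counting those lines per month gives a rough answer to the original "how often" question. This is only a sketch: failovers_per_month is a made-up helper name, and it assumes one notifier line per master change, which the thread does not confirm.

```shell
# Sketch: count INFO-080 notifications per month from zgrep output on stdin.
# Prints "YYYY-MM count", one line per month, sorted by month.
# failovers_per_month is a hypothetical helper name, not part of UTM.
failovers_per_month() {
  awk '
    /INFO-080/ {
      # find the "YYYY:MM:DD-" start of the timestamp inside the first field
      if (match($1, /[0-9][0-9][0-9][0-9]:[0-9][0-9]:[0-9][0-9]-/)) {
        ts = substr($1, RSTART, RLENGTH)
        count[substr(ts, 1, 4) "-" substr(ts, 6, 2)]++
      }
    }
    END { for (m in count) print m, count[m] }
  ' | sort
}
```

For example: `zgrep 'INFO-080' /var/log/notifier/2018/*/* | failovers_per_month`. Months with no master change simply do not appear in the output.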