HA logs

I'm trying to find out how often our HA actually kicks in. It's not easy...

You would think it would be easy to find the last ten times HA swung into action (and maybe there is a way), but I'm not familiar with it.
What would you search the logs for, e.g. "is now master" or something similar?

On the HA status page, why can't they just have a simple grid underneath that shows the last 10 failovers, each with a link to the relevant log file?

I'm asking because our logs look like the excerpt below. Is this normal behavior, with master and slave just reporting that everything is fine?

2018:07:05-00:02:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:04:35 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:07:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:08:45 gw01-2 ha_daemon[4302]: id="38A0" severity="info" sys="System" sub="ha" seq="S:  841 45.485" name="Executing (wait) /usr/local/bin/confd-setha mode slave"
2018:07:05-00:08:45 gw01-2 ha_daemon[4302]: id="38A0" severity="info" sys="System" sub="ha" seq="S:  842 45.626" name="Executing (nowait) /etc/init.d/ha_mode check"
2018:07:05-00:08:45 gw01-2 ha_mode[20052]: calling check
2018:07:05-00:08:45 gw01-2 ha_mode[20052]: check: waiting for last ha_mode done
2018:07:05-00:08:45 gw01-2 ha_mode[20052]: check_ha() role=SLAVE, status=ACTIVE
2018:07:05-00:08:45 gw01-2 ha_mode[20052]: check done (started at 00:08:45)
2018:07:05-00:09:35 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:12:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:14:35 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:17:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:19:36 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:22:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:24:36 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:27:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:27:51 gw01-1 ha_daemon[4352]: id="38A0" severity="info" sys="System" sub="ha" seq="M:  309 51.905" name="Executing (wait) /usr/local/bin/confd-setha mode master master_ip 198.19.250.1 slave_ip 198.19.250.2"
2018:07:05-00:27:52 gw01-1 ha_daemon[4352]: id="38A0" severity="info" sys="System" sub="ha" seq="M:  310 52.120" name="Executing (nowait) /etc/init.d/ha_mode check"
2018:07:05-00:27:52 gw01-1 ha_mode[26530]: calling check
2018:07:05-00:27:52 gw01-1 ha_mode[26530]: check: waiting for last ha_mode done
2018:07:05-00:27:52 gw01-1 ha_mode[26530]: check_ha() role=MASTER, status=ACTIVE
2018:07:05-00:27:52 gw01-1 ha_mode[26530]: check done (started at 00:27:52)
2018:07:05-00:29:36 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:32:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:34:36 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:37:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:39:37 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:42:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:44:37 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:47:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:49:37 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:52:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:54:37 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:57:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-00:59:38 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:02:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:03:45 gw01-2 ha_daemon[4302]: id="38A0" severity="info" sys="System" sub="ha" seq="S:  843 45.485" name="Executing (wait) /usr/local/bin/confd-setha mode slave"
2018:07:05-01:03:45 gw01-2 ha_daemon[4302]: id="38A0" severity="info" sys="System" sub="ha" seq="S:  844 45.626" name="Executing (nowait) /etc/init.d/ha_mode check"
2018:07:05-01:03:45 gw01-2 ha_mode[25393]: calling check
2018:07:05-01:03:45 gw01-2 ha_mode[25393]: check: waiting for last ha_mode done
2018:07:05-01:03:45 gw01-2 ha_mode[25393]: check_ha() role=SLAVE, status=ACTIVE
2018:07:05-01:03:45 gw01-2 ha_mode[25393]: check done (started at 01:03:45)
2018:07:05-01:04:38 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:07:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:09:38 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:12:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:14:38 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:17:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:17:51 gw01-1 ha_daemon[4352]: id="38A0" severity="info" sys="System" sub="ha" seq="M:  311 51.905" name="Executing (wait) /usr/local/bin/confd-setha mode master master_ip 198.19.250.1 slave_ip 198.19.250.2"
2018:07:05-01:17:52 gw01-1 ha_daemon[4352]: id="38A0" severity="info" sys="System" sub="ha" seq="M:  312 52.068" name="Executing (nowait) /etc/init.d/ha_mode check"
2018:07:05-01:17:52 gw01-1 ha_mode[3677]: calling check
2018:07:05-01:17:52 gw01-1 ha_mode[3677]: check: waiting for last ha_mode done
2018:07:05-01:17:52 gw01-1 ha_mode[3677]: check_ha() role=MASTER, status=ACTIVE
2018:07:05-01:17:52 gw01-1 ha_mode[3677]: check done (started at 01:17:52)
2018:07:05-01:19:39 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:22:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:24:39 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:27:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:29:39 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:32:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:34:39 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:37:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:39:40 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:42:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:44:40 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:47:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:49:40 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:52:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:54:40 gw01-1 repctl[11242]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
2018:07:05-01:57:45 gw01-2 repctl[7541]: Idea recheck(1057): got ALRM: replication recheck triggered Setup_replication_done = 1
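
One way to cut through the noise in an excerpt like the one above is to look only at the check_ha() lines and print those where a node reports a different role than it did the time before. This is a minimal sketch, not a confirmed method: it assumes that a failover shows up as a role change in those lines, and the function name ha_role_changes and the log path in the usage line are hypothetical.

```shell
# Hypothetical helper: print every check_ha() line where a node's role
# differs from the role that same node reported previously.
# Assumes the line layout seen in the excerpt:
#   <timestamp> <host> ha_mode[pid]: check_ha() role=<ROLE>, status=<STATUS>
ha_role_changes() {
    awk '
        /check_ha\(\) role=/ {
            host = $2
            role = $0
            sub(/.*role=/, "", role)   # drop everything up to the role value
            sub(/,.*/, "", role)       # drop ", status=..."
            # Report only when this host was seen before with another role.
            if (host in last && last[host] != role)
                print $1, host, last[host], "->", role
            last[host] = role
        }
    ' "$@"
}
```

Usage would be something like ha_role_changes /path/to/ha.log (placeholder path). For the excerpt above it should print nothing, since gw01-1 reports MASTER and gw01-2 reports SLAVE every time.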


  • Louis, try:

    zgrep 'INFO-080' /var/log/notifier/2018/*/* | tail

    Cheers - Bob

  • In reply to BAlfson:

    Interesting topic, I was looking for info on how to check the status of the HA.

    Hi, what does this command tell us?

    In my case: /var/log/notifier/2018/12/notifier-2018-12-01.log.gz:2018:12:01-05:56:41 hostnamesophos notifier[15456]: processing notification request for INFO-080

    Thanks.

  • In reply to papali:

    The INFO-080 notification occurs when there's a change of the Master in HA.

    Cheers - Bob
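
Building on the zgrep command above, the list of recent master changes can be reduced to just a timestamp and a hostname per event. This is only a sketch: the function name last_failovers is hypothetical, and it assumes that each notifier line starts with the timestamp followed by the hostname (as in the sample output in this thread) and that the files are passed in chronological order, which the dated file names should give when the shell expands the glob.

```shell
# Hypothetical helper: show timestamp and host of the last N INFO-080
# notifications found in the given (optionally gzipped) notifier logs.
#   $1      number of events to show
#   $2...   log files, oldest first
last_failovers() {
    count="$1"
    shift
    # -h suppresses the file name prefix so that field 1 is the timestamp.
    zgrep -h 'INFO-080' "$@" | awk '{ print $1, $2 }' | tail -n "$count"
}
```

For example, last_failovers 10 /var/log/notifier/2018/*/* would list the ten most recent INFO-080 entries of 2018, one per line.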