Spam release using cluster (ASG320 V9.106)

We are having a very strange issue with releasing spam via email. The web proxy blocks the request even though the site has been added to the exceptions. The system has three nodes. If we shut down two of the nodes, the release works fine, but when all three are running it times out with the following log entries. BTW, it can be released from outside the operation with no issues.

2013:10:30-15:58:54 westerville-3 httpproxy[6318]: id="0002" severity="info" sys="SecureWeb" sub="http" name="web request blocked" action="block" method="GET" srcip="192.168.1.112" dstip="192.168.1.233" user="lhoshor" statuscode="504" cached="0" profile="REF_FZoxISIdPn (HTTP)" filteraction="REF_DefaultHTTPCFFAction (Default content filter action)" size="2644" request="0xe1e4da0" url="blank.com:3840/release.plc
2013:10:30-15:59:55 westerville-3 httpproxy[6318]: id="0002" severity="info" sys="SecureWeb" sub="http" name="web request blocked" action="block" method="GET" srcip="192.168.1.112" dstip="192.168.1.233" user="lhoshor" statuscode="504" cached="0" profile="REF_FZoxISIdPn (HTTP)" filteraction="REF_DefaultHTTPCFFAction (Default content filter action)" size="2553" request="0xe2af7b0" url="blank:3840/favicon.ico" exceptions="" error="Connection to server timed out" category="9998" reputation="neutral" categoryname="Uncategorized"
2013:10:30-16:00:26 westerville-3 httpproxy[6318]: id="0002" severity="info" sys="SecureWeb" sub="http" name="web request blocked" action="block" method="GET" srcip="192.168.1.112" dstip="192.168.1.233" user="lhoshor" statuscode="504" cached="0" profile="REF_FZoxISIdPn (HTTP)" filteraction="REF_DefaultHTTPCFFAction (Default content filter action)" size="2644" request="0xe2ca7c0" url="blank:3840/release.plc
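
To compare what each node logs while the release is failing, the blocked requests to the release port (3840 in the entries above) can be boiled down to one short line each. This is only a rough sketch: the function name `blocked_release_summary` is made up for illustration, and the log location in the usage note is an assumption, not something confirmed in this thread.

```shell
# Hypothetical helper: summarize blocked quarantine-release requests.
# Reads httpproxy log lines on stdin and prints "timestamp statuscode url".
blocked_release_summary() {
  grep 'action="block"' | grep ':3840/' | while IFS= read -r line; do
    # The timestamp is the first space-separated field.
    ts=${line%% *}
    # Pull the quoted values out of the key="value" pairs.
    status=$(printf '%s\n' "$line" | sed -n 's/.* statuscode="\([0-9]*\)".*/\1/p')
    # Also copes with entries whose url value was cut off before the closing quote.
    url=$(printf '%s\n' "$line" | sed -n 's/.* url="\([^"]*\).*/\1/p')
    printf '%s %s %s\n' "$ts" "$status" "$url"
  done
}
```

Usage, assuming the web filtering log is at /var/log/http.log on your unit: `blocked_release_summary < /var/log/http.log`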

Haven't seen this before. Could be a bug.

0 BAlfson over 11 years ago
Mike, if there are more than two units in the cluster, I'm not sure how to adjust the following. Hopefully Sophos Support has already fixed this.

To rebuild the SMTP PostgreSQL database:

/var/mdw/scripts/smtp stop
dropdb -U postgres smtp
createdb -U postgres smtp
/var/mdw/scripts/smtp start

EDIT 2016-07-12 - changed line 3 to createdb, which is the new standard

From the command line, you then need to get to the Slave: ha_utils ssh

And then, rebuild the database there as above.
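
Since the same four commands have to be run on each unit, here is a minimal sketch that wraps them in a function and stops at the first step that fails. The function name `rebuild_smtp_db` and the `DRY_RUN` variable are made up for illustration; the commands themselves are the ones listed above. It defaults to printing the steps only, because dropdb discards the existing SMTP database.

```shell
# Sketch: run the four rebuild steps in order, stopping at the first failure.
# DRY_RUN is a hypothetical variable; it defaults to 1, so steps are only printed.
rebuild_smtp_db() {
  for step in \
    "/var/mdw/scripts/smtp stop" \
    "dropdb -U postgres smtp" \
    "createdb -U postgres smtp" \
    "/var/mdw/scripts/smtp start"
  do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "would run: $step"
    else
      # Unquoted on purpose so the command and its arguments are split into words.
      $step || { echo "failed: $step" >&2; return 1; }
    fi
  done
}
```

Calling `rebuild_smtp_db` on its own only lists the steps; `DRY_RUN=0 rebuild_smtp_db` actually runs them. On the Slave, that would be after connecting with ha_utils ssh.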

Please let us know the result and anything different that Support did.

Cheers - Bob
Sophos UTM Community Moderator
Sophos Certified Architect - UTM
Sophos Certified Engineer - XG
Gold Solution Partner since 2005

MediaSoft, Inc. USA
0 ratz over 11 years ago

Actually, this is a little easier to do than it was in the past. In an HA Active/Active cluster, you have to make each unit MASTER, one at a time. Then check whether the function fails on each unit until you find the units that have bad databases.

As an example: if the mail PostgreSQL table is toast, the summary stats on the Mail Manager tab will be blank. Or, as in this case, the spam quarantine table will not release mail.

When you locate the bad unit, AND it is the master, run:

sudo /etc/init.d/postgresql92 rebuild

After the rebuild finishes, that unit will be working again. Repeat with each unit, looking for broken tables. Note that the command changed with the most recent updates, as PostgreSQL was up-rev'd. You should verify that you have /etc/init.d/postgresql92 before proceeding, or use the older command.
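
A minimal sketch of that check, for anyone who wants to script it. The function name `rebuild_route` is made up for illustration; it only reports which route applies and does not run the rebuild itself.

```shell
# Print which rebuild route applies on this unit.
# $1: path to the init script (defaults to the one named in this post).
rebuild_route() {
  script=${1:-/etc/init.d/postgresql92}
  if [ -x "$script" ]; then
    echo "sudo $script rebuild"
  else
    echo "not found: $script (use the older procedure)"
  fi
}
```

Run `rebuild_route` on the unit that is currently master and follow whichever line it prints.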

Sadly, the recent bug with "preferred masters" caused many of our clients' mail databases to get torqued. We have all of ours fixed now. For completeness, here is the process we used to fix it at customer locations.

1) Connect; find the node in the best condition from whatever data we can scrounge from active mode. This is subjective, based on experience. We usually go with the one with the best history data.
2) Make the best unit Master by rebooting the others.
3) From the HA Control panel, set preferred to "none".
4) Wait 30 seconds.
5) From the HA Control panel, "Shutdown" all units except the master.
6) When each node is "dead", remove it.
7) Make sure "Enable automatic configuration of new devices" is turned on.
8) Boot one "dead" node. Wait for it to join and finish syncing.
9) Repeat (8) each time the system gets to the full "ACTIVE" state, until all nodes are running.

At this point your clustering is fixed, and you just need to fix the units that have torqued databases. Odds are you have at least one with a bad database; we had at least one failed unit in every cluster.

A) Make each unit master in turn and make sure you can view the mail database and release spam. If you cannot, run the above rebuild command on that unit.
B) Repeat (A) until every unit works as master.

That is it. It is boring and it is slow, but it is easy, and it will fix the damage caused to the databases by the preferred master bug. I am not with the support team, but at this point only Alan in support has been around longer than me. Since I do not post often, one of the regular gurus can verify that my solution is a good one. Normal disclaimers apply.
0 BAlfson over 11 years ago

Thanks, Ratz! I didn't realize that the SMTP database issue was related to the Preferred Master problem. Since there's no postgresql command in the prescription above that I learned in V7, can you confirm whether it still works in V9, or whether the old prescription is completely replaced by the new postgresql92 rebuild command?

Cheers - Bob

Sorry for any short responses. Posted from my iPhone.
