For a few months we have had an unusual problem with our Sophos UTM, which acts as an SMTP proxy only. Sometimes we get "zombie emails" that clog up the whole email processing of the SMTP proxy. I will try to explain the problem as accurately as possible:
- For example, a "zombie email" arrives at 09:00 am and is moved to the work queue.
- From this moment on, no newly delivered or sent emails show up in the SMTP log in Mail Manager.
- On the Mail Manager overview page, the counter of emails that "wait to be delivered (spooled)" increases.
- In smtp.log via the CLI, emails arrive and are only "moved to work queue".
- The directory "/var/chroot-smtp/spool/work/" fills up with files.
At this point no email is delivered, neither inbound nor outbound. Mails are only moved to the work queue. CPU usage climbs to 90%-100%.
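To catch the stall early, the growing work directory could be watched. This is only a minimal sketch: the path is the one from my description above, while the threshold of 50 files and the helper names are assumptions chosen for illustration, and it presumes Python 3 is available somewhere with read access to that directory.

```python
import os

# Path taken from the post above; the threshold is an arbitrary assumption.
WORK_DIR = "/var/chroot-smtp/spool/work/"
THRESHOLD = 50


def count_spool_files(path):
    """Return the number of regular files directly inside `path`."""
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_file())


def queue_is_stuck(path, threshold):
    """Return True when `path` holds more files than `threshold`."""
    return count_spool_files(path) > threshold


if __name__ == "__main__":
    # Only report when the directory actually exists on this machine.
    if os.path.isdir(WORK_DIR):
        n = count_spool_files(WORK_DIR)
        state = "possibly stuck" if queue_is_stuck(WORK_DIR, THRESHOLD) else "ok"
        print(f"{n} files in {WORK_DIR}: {state}")
```

Run periodically, a check like this would only flag the symptom (the queue filling up); it does not identify which mail is blocking the scanner.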
After a random time span, for example 15 or 30 minutes, an email appears in Mail Manager under "smtp-spool" with status "error". Now all spooled mails are processed and delivered. If you select "retry" as the action, the process starts again: everything stalls and nothing is sent, as described above. The only way to handle this is to download the email and send it to the internal recipient via Outlook. After that we have to delete the email from the Sophos UTM.
So we tried to figure out what the problem with these emails is. We downloaded each email and opened it. In every case there was a PDF file attached, but not a "normal" PDF. It was the kind of PDF file that asks to be saved when you close it (like a PDF form). In most cases it was a PDF created by one of these handy-dandy smartphone PDF scan apps. Switching between dual scan and single scan, or between the Avira and Sophos AV engines, made no difference. I don't think making an exception for PDF attachments is a good approach either. We had firmware 9.704 and updated to 9.705, but that did not solve the problem.
Has anybody else had this problem? I tried to figure it out with Sophos support, but this was a nightmare. The case got closed because I gave up.
Sorry for my unusual expressions, but I am struggling very hard with this problem and with Sophos support. Maybe someone can help me with that!
For one of the zombie emails, copy here the related lines from the SMTP log file, including the 'connection from' and 'moved to work queue' lines. Also include the relevant lines from when the system errors out. Do I understand correctly that all other mails are delivered after the zombie email errors out?
Cheers - Bob
These are the corresponding lines for a zombie email. The email arrives at the UTM SMTP proxy, the checks are made, and it is moved to the work queue. Then the scanner fails repeatedly until the email ends up in the error state. Between the moment the mail is moved to the work queue and the moment it enters the error state, all incoming and outgoing mails are moved to the work queue too, but are not sent out. After the email enters the error state, it takes a few seconds or minutes, then all the queued mails are processed and delivered.
2021:05:10-12:26:17 mailgw-2 exim-in: 2021-05-10 12:26:17 [18.104.22.168] F=<email@example.com> R=<firstname.lastname@example.org> Verifying recipient address in Active Directory
2021:05:10-12:26:27 mailgw-2 exim-in: 2021-05-10 12:26:27 1lg37B-000FWV-1o <= email@example.com H=postin23.emailrelay.de [22.214.171.124]:46570 P=esmtp S=45088919 firstname.lastname@example.org
2021:05:10-12:26:29 mailgw-2 smtpd: QMGR: 1lg37B-000FWV-1o moved to work queue
2021:05:10-12:26:30 mailgw-2 smtpd: SCANNER: 1lg37O-000FRe-2R <= email@example.com R=1lg37B-000FWV-1o P=INPUT S=45084291
2021:05:10-12:32:00 mailgw-2 smtpd: MASTER: 1lg37B-000FWV-1o Scanner timeout or deadlock
2021:05:10-12:33:01 mailgw-2 smtpd: SCANNER: 1lg3Dh-000Fju-H9 <= firstname.lastname@example.org R=1lg37B-000FWV-1o P=INPUT S=45084291
2021:05:10-12:39:00 mailgw-2 smtpd: MASTER: 1lg37B-000FWV-1o Scanner timeout or deadlock
2021:05:10-12:40:00 mailgw-2 smtpd: SCANNER: 1lg3KS-000G75-Qu <= email@example.com R=1lg37B-000FWV-1o P=INPUT S=45084291
2021:05:10-12:46:00 mailgw-2 smtpd: MASTER: 1lg37B-000FWV-1o Scanner timeout or deadlock
2021:05:10-12:49:10 mailgw-2 smtpd: MASTER: Action: Deleting mail in error state: 1lg37
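To find out which queue IDs keep hitting the scanner timeout, the log can be filtered for the "Scanner timeout or deadlock" lines. The following is a rough sketch, not a Sophos tool: the regular expression is modelled only on the lines shown in this post, and the function and variable names are made up for illustration.

```python
import re
from collections import Counter

# Modelled on lines such as:
# 2021:05:10-12:32:00 mailgw-2 smtpd: MASTER: 1lg37B-000FWV-1o Scanner timeout or deadlock
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2}) \S+ smtpd: MASTER: "
    r"(?P<qid>\S+) Scanner timeout or deadlock"
)


def scanner_timeouts(lines):
    """Count 'Scanner timeout or deadlock' events per queue ID."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line.strip())
        if match:
            counts[match.group("qid")] += 1
    return counts


# Sample lines copied from the log excerpt in this post.
SAMPLE = [
    "2021:05:10-12:26:29 mailgw-2 smtpd: QMGR: 1lg37B-000FWV-1o moved to work queue",
    "2021:05:10-12:32:00 mailgw-2 smtpd: MASTER: 1lg37B-000FWV-1o Scanner timeout or deadlock",
    "2021:05:10-12:39:00 mailgw-2 smtpd: MASTER: 1lg37B-000FWV-1o Scanner timeout or deadlock",
    "2021:05:10-12:46:00 mailgw-2 smtpd: MASTER: 1lg37B-000FWV-1o Scanner timeout or deadlock",
]

for qid, n in scanner_timeouts(SAMPLE).items():
    print(f"{qid}: {n} scanner timeouts")  # prints "1lg37B-000FWV-1o: 3 scanner timeouts"
```

Fed with the lines of the real smtp.log instead of SAMPLE, any queue ID with repeated timeouts is a candidate zombie email.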
Stefan, there should be more lines above the first one you showed. Show us the other lines beginning with the one containing 'connection from'.
I've not seen this problem before - is it always the same sender(s)?
These are the other lines corresponding to this mail:
2021:05:10-12:26:17 mailgw-2 exim-in: 2021-05-10 12:26:17 SMTP connection from [126.96.36.199]:46570 (TCP/IP connection count = 1)
2021:05:10-12:26:17 mailgw-2 exim-in: 2021-05-10 12:26:17 H=postin23.emailrelay.de [188.8.131.52]:46570 Warning: Exception matched: Skipping antispam for this message
2021:05:10-12:26:17 mailgw-2 exim-in: 2021-05-10 12:26:17 H=postin23.emailrelay.de [184.108.40.206]:46570 Warning: recipientsemail.de profile excludes greylisting: Skipping greylisting for this message
(IP Addresses and Hostnames are edited!)
The sender is not the problem. The problem seems to be the attached PDF file (the kind that asks to be saved when you close it in the PDF reader).
It seems that support has not seen this problem either and is not able to help me.
Please show a picture of the Edit of the Exception.
Nothing special. We skip these checks because our provider already performs them and then relays the emails to our SMTP proxy.
"Scanner timeout or deadlock" - What happens if you also select 'Schadsoftware-Prüfung' (malware check) and/or 'Antispam-Überprüfung' (antispam check)?
I think this would be a temporary workaround, but then I don't have AV scanning anymore. The underlying problem would still exist.
Stefan, the Exception only applies to the networks ('Netzwerke') listed.
Yes, but the networks in the exception are all of the "upstream" servers of our security provider, which relay every incoming email to our system. The effect would be that every incoming email bypasses the AV scan.
As a workaround, just create an extra exception for this sender's email address only.
BERGMANN engineering & consulting GmbH, Wien/Austria