I want to use a regular expression to block spam we receive. The problem is that the spammer is Base64-encoding the Subject and From headers as UTF-8. The emails seem to slip by because the Subject and From are encoded rather than plain text.
Questions
1. Does the regular expression filter see the email headers before or after the UTF-8 encoding is decoded?
2. Can I write a regular expression that filters on Subject and From header lines that are encoded in UTF-8?
The decoded Subject is "RE: Your mailbox is running out of data storage kindly update your mailbox to avoid email loss"
and the decoded From is "Webmaster Support"
3. Does anyone have a defense against this?
Example below (I replaced my address with domain.com).
Received: from gateway.domain.com (10.0.0.1) by server.domain.com (10.0.0.60) with Microsoft SMTP Server (TLS) id 14.3.498.0; Fri, 14 May 2021 05:43:12 -0700
Received: from ns31246108.ip-151-106-32.eu ([151.106.32.106]:53768) by gateway.domain.com with esmtps (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from <dtc@digitstradingph.secureserver.net>) id 1lhX9p-0001UO-0l for me@domain.com; Fri, 14 May 2021 05:43:09 -0700
Received: from dtc by digitstradingph.secureserver.net with local (Exim 4.94.2) (envelope-from <dtc@digitstradingph.secureserver.net>) id 1lhX9n-0002cQ-At for me@domain.com; Fri, 14 May 2021 20:43:07 +0800
X-CTCH-RefID: str=0001.0A702F1B.609E705D.0005,ss=1,re=0.000,recu=0.000,reip=0.000,cl=1,cld=1,fgs=0
To: <me@domain.com>
Subject: =?UTF-8?B?WW91ciBtYWlsYm94IGlzIHJ1bm5pbmcgb3V0IG9mIGRhdGEgc3RvcmFnZSBraW5kbHkgdXBkYXRlIHlvdXIgbWFpbGJveCB0byBhdm9pZCBlbWFpbCBsb3Nz?=
X-PHP-Script: aimfs.digitstrading.ph/database/alexusMailer_v2.0.php for 102.89.1.144
X-PHP-Originating-Script: 1006:alexusMailer_v2.0.php
From: =?UTF-8?B?V2VibWFzdGVyIFN1cHBvcnQ=?= <kumaranayagam@woolimlanka.com>
MIME-Version: 1.0;
Content-Type: multipart/mixed; boundary="--w3TfW1QSbT"
Message-ID: <E1lhX9n-0002cQ-At@digitstradingph.secureserver.net>
Date: Fri, 14 May 2021 20:43:07 +0800
X-AntiAbuse: This header was added to track abuse, please include it with any abuse report
X-AntiAbuse: Primary Hostname - digitstradingph.secureserver.net
X-AntiAbuse: Original Domain - domain.com
X-AntiAbuse: Originator/Caller UID/GID - [1006 993] / [47 12]
X-AntiAbuse: Sender Address Domain - digitstradingph.secureserver.net
X-Get-Message-Sender-Via: digitstradingph.secureserver.net: authenticated_id: dtc/only user confirmed/virtual account not confirmed
X-Authenticated-Sender: digitstradingph.secureserver.net: dtc
X-Source:
X-Source-Args: php-fpm: pool aimfs_digitstrading_ph
X-Source-Dir: digitstrading.ph:/public_html/aimfs/database
Return-Path: dtc@digitstradingph.secureserver.net
X-MS-Exchange-Organization-AuthSource: SERVER1.domain.local
X-MS-Exchange-Organization-AuthAs: Anonymous
X-MS-Exchange-Organization-Antispam-Report: IPOnAllowList
X-MS-Exchange-Organization-SCL: -1
X-Auto-Response-Suppress: DR, OOF, AutoReply
Hi Robert and welcome to the UTM Community!
I don't recall seeing this issue here before. The REGEX for ?UTF-8?B? is \?UTF-8\?B\? (the "\" is the escape character that tells the REGEX evaluator to treat the following "?" as a literal character instead of a quantifier). Have you tried that? Then again, I haven't seen the SMTP log to know what the proxy sees; if the Subject is already decoded there, you might simply be able to use mailbox is running out of data storage.
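To illustrate the escaping, here is a small sketch in Python rather than in the UTM expression list, so treat it only as a way to test the pattern offline. The function name has_base64_utf8_marker is made up for this example, and Python's re module may not behave identically to the REGEX engine in the SMTP proxy.

```python
import re

# Raw Subject line exactly as it appears in the posted headers.
RAW_SUBJECT = (
    "Subject: =?UTF-8?B?WW91ciBtYWlsYm94IGlzIHJ1bm5pbmcgb3V0IG9mIGRhdGEgc3Rv"
    "cmFnZSBraW5kbHkgdXBkYXRlIHlvdXIgbWFpbGJveCB0byBhdm9pZCBlbWFpbCBsb3Nz?="
)

# Each "?" is escaped so it is matched literally instead of acting as a
# quantifier. IGNORECASE is an assumption: senders may write "utf-8" or "b".
ENCODED_WORD_MARKER = re.compile(r"=\?UTF-8\?B\?", re.IGNORECASE)


def has_base64_utf8_marker(header_line: str) -> bool:
    """Return True if the line contains a Base64 UTF-8 encoded-word marker."""
    return ENCODED_WORD_MARKER.search(header_line) is not None


if __name__ == "__main__":
    print(has_base64_utf8_marker(RAW_SUBJECT))
    print(has_base64_utf8_marker("Subject: Lunch on Friday?"))
```

Keep in mind that this only matches if the filter sees the raw, still-encoded header, and that legitimate mail with non-ASCII subjects is commonly encoded the same way, so blocking on the marker alone could catch wanted mail.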
Cheers - Bob
Hello BAifson,
Thank you for the reply. I took a look at the SMTP proxy log. The Subject line is present there and already decoded. I could not find the From header, though; perhaps it is treated as part of the body of the email? So you are correct: the Subject is decoded by the time the REGEX is applied in the SMTP proxy. I will try the text "mailbox is running out of data storage". I think the spammer is changing the text slightly, which is why it is still getting through.
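For anyone who wants to experiment outside the UTM, here is a rough Python sketch of the two ideas in this thread: decode the Subject and From values first, then match a looser pattern that tolerates small wording changes. The names decode_mime_header and LOOSE_MAILBOX_PATTERN, and the pattern itself, are only illustrative guesses, not something taken from the UTM or from the spammer's other messages.

```python
import re
from email.header import decode_header, make_header

# Header values copied from the posted example.
SUBJECT_VALUE = (
    "=?UTF-8?B?WW91ciBtYWlsYm94IGlzIHJ1bm5pbmcgb3V0IG9mIGRhdGEgc3RvcmFnZSBr"
    "aW5kbHkgdXBkYXRlIHlvdXIgbWFpbGJveCB0byBhdm9pZCBlbWFpbCBsb3Nz?="
)
FROM_VALUE = "=?UTF-8?B?V2VibWFzdGVyIFN1cHBvcnQ=?= <kumaranayagam@woolimlanka.com>"


def decode_mime_header(value: str) -> str:
    """Decode encoded words in a header value into readable text."""
    return str(make_header(decode_header(value)))


# Hypothetical loose pattern: "mailbox" (or "mail box"), then up to 40
# characters, then a storage-related word. Adjust the word list as needed.
LOOSE_MAILBOX_PATTERN = re.compile(
    r"mail\s*box\b.{0,40}\b(?:storage|quota|space)\b",
    re.IGNORECASE,
)


def looks_like_mailbox_phish(subject_value: str) -> bool:
    """Decode the subject, then test it against the loose pattern."""
    decoded = decode_mime_header(subject_value)
    return LOOSE_MAILBOX_PATTERN.search(decoded) is not None


if __name__ == "__main__":
    print(decode_mime_header(SUBJECT_VALUE))
    print(decode_mime_header(FROM_VALUE))
    print(looks_like_mailbox_phish(SUBJECT_VALUE))
```

A looser pattern catches more variants but also raises the chance of false positives, so it is worth testing against a sample of legitimate subjects before using anything like it in a filter.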