Filtering emails with non-English characters

Question

Hi All, 
 Recently we have been inundated with emails containing Russian and Chinese characters. Is there any way to filter these emails to quarantine via UTM 9 anti-spam? 
 Thanks,

DouglasFoster · Answer

You should be able to do this with a regex. 
 I don't normally post non-Sophos links, but I found this resource which explains Unicode and Regex and the syntax differences between different Regex libraries. 
 https://www.regular-expressions.info/refflavors.html 
 I do not know which regex library is used on UTM. Mr. Alfson or Mr. Jaydeep, can you answer that? 
 Recently I tried this regex in one of my three spam filters. I think this particular syntax works in all regex libraries that support Unicode property classes. 
 [\p{S}\p{C}\P{Latin}] 
 That is supposed to match any one of: 
  
 Any symbol (heart, smiley, etc.) 
 Any control character 
 Any character outside the Latin script 
 
 We actually disabled the rule because it was matching too much. In retrospect, I think the issue was that I was checking both Body and Subject, and the body has line feeds, so the control character match is not appropriate for the Body check. 
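 The line-feed effect is easy to reproduce outside UTM. Below is a minimal sketch in Go, whose standard regexp package accepts the \p{...} syntax; whether UTM's regex engine behaves the same way is an assumption, and the subject and body strings are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// controlChars matches any character in Unicode general category C
// (control, format, private use, surrogate).
var controlChars = regexp.MustCompile(`\p{C}`)

// hasControl reports whether s contains at least one category-C character.
func hasControl(s string) bool {
	return controlChars.MatchString(s)
}

func main() {
	// Hypothetical sample data, not taken from a real message.
	subject := "Quarterly report"
	body := "Hello,\nplease see the attached file."

	fmt.Println(hasControl(subject)) // false: a single-line subject has no control characters
	fmt.Println(hasControl(body))    // true: the line feed "\n" is a control character
}
```

 If the engine on UTM agrees with this, the \p{C} part would be safe for a Subject-only check but would hit nearly every multi-line Body.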
 For your purposes, [\P{Latin}] should catch Russian and Chinese text (and Korean, Katakana, etc.).
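
 One caveat worth testing before enabling this: Unicode assigns spaces, digits, and most punctuation to the Common script rather than Latin, so the non-Latin class (written with its backslash as \P{Latin}) also matches those. The sketch below, again in Go as a stand-in for whatever engine UTM uses, compares it with a narrower class that names the unwanted scripts explicitly. The narrower pattern and the sample strings are my own illustration, not something confirmed to work on UTM.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// notLatin matches any character whose script is not Latin.
	// That includes spaces, digits, and punctuation (script "Common").
	notLatin = regexp.MustCompile(`\P{Latin}`)

	// foreignScript is a hypothetical narrower alternative that lists
	// the scripts to catch instead of excluding Latin.
	foreignScript = regexp.MustCompile(`[\p{Cyrillic}\p{Han}\p{Hiragana}\p{Katakana}\p{Hangul}]`)
)

func main() {
	// Hypothetical subjects: plain English, Russian, Chinese.
	samples := []string{"Invoice 42 attached", "Привет", "你好"}

	for _, s := range samples {
		fmt.Printf("%q notLatin=%v foreignScript=%v\n",
			s, notLatin.MatchString(s), foreignScript.MatchString(s))
	}
}
```

 With these samples, the first pattern matches all three strings (the English one because of its spaces and digits), while the second matches only the Russian and Chinese ones. If UTM accepts script names like these, the explicit list is probably the safer expression to quarantine on.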