RegEx URL: Exactly what implementation of regular expressions is used on Sophos XG (SFOS15)?

Question

Is it POSIX, Extended RegExp, Perl, ECMAScript or something else? I have had a hard time finding the correct syntax for HTTP bypass rules, and the documentation does not make it clear.
 It would also be very nice to have a RegEx tester built in, to check that your syntax actually matches what you want and does not by mistake match every URL! (Is there somewhere in the logs to check this?)
 - Martin
 EDIT: And what is the sane explanation for why it is not possible to use RegEx bypass rules for HTTPS scanning?!? This does not make any sense...

sachingurung · Accepted Answer

Hi Martin, 
 Here's an update:
 The RegEx is checked in several places. It should be compatible with both Perl and JavaScript, the number of URL RegExes in an exception list should be < 128, and the length of each should be < 100.
 HTTP Proxy:
 The proxy compiles the RegExes entered in the UI using pcre_compile, i.e. "Perl-compatible regular expressions" (PCRE).
 API:
 URL RegExes can't start with ^https:// or ^http://
 RegExes are not automatically anchored; you must anchor them yourself if desired. Example: ^microsoft\.com/ will match http://microsoft.com/ but not http://www.microsoft.com/ . Without the anchor, microsoft\.com/ will match both http://microsoft.com/ and http://www.microsoft.com/ .
 The max length of a URL RegEx is 100; this is restricted by the DB schema.
 The URL RegEx is validated by the Perl compiler.
 UI:
 URL RegExes can't start with ^https:// or ^http:// (there's a bug though, see NC-11547).
 The max length of a URL RegEx is 100.
 We use the JavaScript RegExp library to validate the syntax of the RegExes:
 1. Check the number of groups (e.g. if \2 is used, there must be at least 2 groups).
 2. Check [] content (e.g. [] is not allowed because it is empty).
 The total number of URL RegExes in an exception must be < 128.
 Hope that helps :)
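
 Since there is no built-in tester, the rules listed above can be roughly checked offline. The following is only a sketch in Python, under several assumptions: Python's re module is not PCRE (the two agree on simple patterns like the ones in this thread, but differ on some advanced features); the scheme stripping in strip_scheme is inferred from the microsoft.com example, not from documentation; and all function and constant names are made up for illustration.

```python
import re

MAX_REGEX_LENGTH = 100  # "max length of URL RegEx is 100"
MAX_REGEX_COUNT = 128   # total per exception must be < 128


def strip_scheme(url):
    """Assumption: patterns are matched against the URL without its scheme."""
    return re.sub(r"^https?://", "", url)


def url_matches(pattern, url):
    """Unanchored search: a pattern without ^ may match anywhere in the URL."""
    return re.search(pattern, strip_scheme(url)) is not None


def check_exception(patterns):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if len(patterns) >= MAX_REGEX_COUNT:
        problems.append(f"{len(patterns)} patterns, must be fewer than {MAX_REGEX_COUNT}")
    for pattern in patterns:
        if len(pattern) > MAX_REGEX_LENGTH:
            problems.append(f"{pattern!r}: longer than {MAX_REGEX_LENGTH} characters")
        if re.match(r"\^https?://", pattern):
            problems.append(f"{pattern!r}: must not start with ^http:// or ^https://")
        try:
            # Python's compiler stands in for the Perl/PCRE one here.
            re.compile(pattern)
        except re.error as exc:
            problems.append(f"{pattern!r}: does not compile ({exc})")
    return problems


if __name__ == "__main__":
    urls = ("http://microsoft.com/", "http://www.microsoft.com/")
    for pattern in (r"^microsoft\.com/", r"microsoft\.com/"):
        for url in urls:
            print(pattern, url, url_matches(pattern, url))
    print(check_exception([r"^https://microsoft\.com/", "[]", r"(a)\2"]))
```

 Running a candidate pattern against a handful of URLs that should and should not match is a quick way to catch a pattern that accidentally matches everything.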

Michael Dunn · Answer

Please be aware of this KB 
 https://community.sophos.com/kb/en-us/127270 
 Summary: URL Groups and Categories do not use RegEx; the KB describes the substring matching that is done instead.
 
 In XG, under Web > Exceptions, we do not automatically anchor on the left side. This gives admins more flexibility. Yes, that includes the flexibility to be inefficient. We will not change this because it would affect existing customers.
 
 In both XG and the UTM (and, I would argue, any computer system anywhere), when you create a new object you should copy the existing out-of-the-box (OOB) objects as much as possible.
 In the XG, one of the OOB exceptions is:
 ^([A-Za-z0-9.-]*\.)?apple\.com\.?/
 So if you want to create a new exception, match that style.
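
 To see what that style buys you, here is a small Python sketch that builds a pattern in the same shape and tries it on a few hosts. It is an illustration only: domain_exception and is_excepted are hypothetical helper names, Python's re stands in for the PCRE engine the proxy uses, and the inputs are assumed to be the URL without its scheme.

```python
import re


def domain_exception(domain):
    """Build a pattern in the style of the OOB apple.com exception.

    ^                      anchor at the start of the host
    ([A-Za-z0-9.-]*\\.)?   optional subdomain part, ending in a dot
    <escaped domain>       literal domain, dots escaped
    \\.?/                  optional trailing dot, then the first slash
    """
    return r"^([A-Za-z0-9.-]*\.)?" + re.escape(domain) + r"\.?/"


def is_excepted(pattern, host_and_path):
    """Assumption: the input is the URL with http:// or https:// removed."""
    return re.search(pattern, host_and_path) is not None


if __name__ == "__main__":
    pattern = domain_exception("apple.com")
    print(pattern)
    samples = [
        "apple.com/",
        "www.apple.com/store",
        "apple.com./",
        "notapple.com/",
        "apple.com.evil.example/",
    ]
    for sample in samples:
        print(sample, is_excepted(pattern, sample))
```

 The left anchor plus the optional subdomain group is what keeps look-alike hosts such as notapple.com or apple.com.evil.example from matching, while still covering apple.com and any of its subdomains.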

Michael Dunn · Answer

https://community.sophos.com/kb/en-us/127270 
 Custom Category with Keyword. 
 
 You may also want to look up Content Filters. These will log or block whenever a webpage contains certain keywords. They are typically used by schools to monitor for potential cyberbullying.

lferrara · Answer

Josimar, 
 it seems that for the moment, XG will not implement regex for blocking web traffic: 
 https://community.sophos.com/products/xg-firewall/f/firewall-and-policies/118423/how-do-i-block-using-regex-in-xg/429219#429219 
 https://ideas.sophos.com/forums/330219-xg-firewall/suggestions/39728035-allow-blocking-of-website-using-regex-to-allow-for 
 Please vote for the feature request.
 Regards