regular expression rule vs forbidden file extension

We use Web Protection with Web Filtering. To allow service downloads for a particular software, a (regular) filter exception has been created.

^https?://([A-Za-z0-9.-]*\.)?www\.schueco-service\.de/servicepack/GDF/[A-Z-a-z0-9._-]*\.exe$

This exception skips the following checks: Authentication / Caching / Block by download size / Antivirus / Extension blocking / Content Removal / SSL scanning


Downloading the file    

    www.schueco-service.de/.../SP05-2017R2_x64.exe

works very well, while downloading the file

    www.schueco-service.de/.../SchueCam_Setup_1_V20.17.04.07 PBX.exe

is blocked.

The log entry for the blocked request is:

httpproxy[16056]: id="0064" severity="info" sys="SecureWeb" sub="http" name="web request blocked, forbidden file extension detected" action="block" method="GET" user="" group="" ad_domain="" statuscode="403" cached="0" profile="REF_DefaultHTTPProfile (Default Web Filter Profile)" filteraction="REF_DefaultHTTPCFFAction (Default content filter action)" size="2602" request="0xdd126400" url="www.schueco-service.de/.../SchueCam_Setup_1_V20.17.04.07 PBX.exe" referer="" error="" authtime="0" dnstime="0" cattime="332" avscantime="0" fullreqtime="2138782" device="0" auth="0" ua="Visual Patch 3.5" exceptions="" category="105" reputation="neutral" categoryname="Business" reason="extension" extension="exe" filename="SchueCam_Setup_1_V20.17.04.07 PBX.exe"


For some reason the exception is ignored for this file (maybe a problem with the whitespace character?).
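
For anyone reading such a log entry, a minimal Python sketch for pulling the key="value" fields apart may help. The `log_line` below is only an excerpt of the entry above, and the field-splitting regex is an assumption about the log format, not a Sophos tool. The empty `exceptions` field, together with `reason="extension"`, appears to indicate that no exception matched this request.

```python
import re

# Excerpt of the log entry above (fields copied verbatim, others omitted).
log_line = (
    'name="web request blocked, forbidden file extension detected" '
    'action="block" statuscode="403" exceptions="" reason="extension" '
    'extension="exe" filename="SchueCam_Setup_1_V20.17.04.07 PBX.exe"'
)

# Assumption: every field has the form key="value" with no embedded quotes.
fields = dict(re.findall(r'(\w+)="([^"]*)"', log_line))

for key in ("action", "reason", "extension", "exceptions", "filename"):
    print(f"{key:<11}= {fields[key]!r}")
```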

Any ideas and suggestions are welcome.

Many thanks in advance.



  • The space is percent-encoded to %20 and your regex does not allow %.

     

    ^https?://([A-Za-z0-9.-]*\.)?www\.schueco-service\.de/servicepack/GDF/[A-Z-a-z0-9._-]*\.exe$

    Try this instead:

    ^https?://([A-Za-z0-9.-]*\.)?www\.schueco-service\.de/servicepack/GDF/[A-Z-a-z0-9._-%]*\.exe$
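
The percent-encoding point can be checked outside the UTM with a short Python sketch. It tests only the filename part of the pattern (everything after `GDF/`), because the full URL path is elided in the log. Python's `re` module stands in for the UTM's regex engine here, which is an assumption, and the `%` is placed first in the class to keep it away from the hyphens.

```python
import re
from urllib.parse import quote

working = "SP05-2017R2_x64.exe"                      # downloads fine
blocked = "SchueCam_Setup_1_V20.17.04.07 PBX.exe"    # from the log
encoded = quote(blocked)                             # space becomes %20

# Filename part of the original exception and a variant that allows '%'.
original = re.compile(r"[A-Z-a-z0-9._-]*\.exe")
with_percent = re.compile(r"[%A-Z-a-z0-9._-]*\.exe")

# fullmatch() mimics the anchoring between "GDF/" and the trailing "$".
print(encoded)
print(bool(original.fullmatch(working)))
print(bool(original.fullmatch(blocked)))
print(bool(original.fullmatch(encoded)))
print(bool(with_percent.fullmatch(encoded)))
```

Neither the literal space nor the `%` of `%20` is in the original character class, so the original pattern fails on both forms of the second filename.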

     

  • Hi Michael,

    The log shows a literal space, not the encoded space. I think he should use the space, as I suggested.

    Give it a try ;)

    Regards mod

  • Hi mod,

    If you actually test it you would find that the UI does not allow spaces.

    Actually I found the UI does not like

    ^https?://([A-Za-z0-9.-]*\.)?www\.schueco-service\.de/servicepack/GDF/[A-Z-a-z0-9._-%]*\.exe$

    but it is ok with this,

    ^https?://([A-Za-z0-9.-]*\.)?www\.schueco-service\.de/servicepack/GDF/[%A-Z-a-z0-9._-]*\.exe$

     

    It works when downloading that file with a browser, which converts spaces to percent-encoding. His user agent, ua="Visual Patch 3.5", shows he is not using a browser, so it is possible that it won't work, but I doubt it. A literal space in a URL is against the HTTP spec, and the UTM will print an error if it sees one. Worst case, just use a full wildcard:

    ^https?://([A-Za-z0-9.-]*\.)?www\.schueco-service\.de/servicepack/GDF/.*\.exe$

     

  • Hi Michael,

    I've tested "^https?://([A-Za-z0-9.-]*\.)?www\.schueco-service\.de/servicepack/GDF/[A-Z-a-z0-9 ._-]*\.exe$" without any problem. The GUI accepts my exception.

    Regards mod

  • You are correct.  And I learned something new about RegEx.  :)

    In a character class like this:

    [A-Z-a-z0-9._-]

    If a hyphen is the first or last character then it is a literal hyphen.  Otherwise it is a range (such as a-z).

    Therefore these (with a space or percent after the hyphen) are illegal, because the hyphen then forms an invalid range from the underscore down to the space or percent:

    [A-Z-a-z0-9._- ]

    [A-Z-a-z0-9._-%]

    But if you put the space or percent anywhere else (such as after the 9 as you did) then it should work.

    To play it safe, they can do both.  And while you are at it, there is no need to support additional hostnames before the www.  Therefore

    ^https?://www\.schueco-service\.de/servicepack/GDF/[A-Z-a-z0-9 %._-]*\.exe$

     

    https://www.regular-expressions.info/charclass.html

    "the caret ^ and the hyphen - can be included by escaping them with a backslash, or by placing them in a position where they do not take on their special meaning.
    [...]
    The hyphen can be included right after the opening bracket, or right before the closing bracket, or right after the negating caret. Both [-x] and [x-] match an x or a hyphen. [^-x] and [^x-] match any character that is not an x or a hyphen. This works in all flavors discussed in this tutorial. Hyphens at other positions in character classes where they can't form a range may be interpreted as literals or as errors. Regex flavors are quite inconsistent about this."
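
The hyphen-placement rule can be tried out with a small Python sketch. Python's `re` module is used as a stand-in, which is an assumption: as the quote above says, flavors differ, so the UTM's engine may behave differently. The `compiles` helper is hypothetical and only reports whether a pattern is accepted.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

classes = [
    r"[A-Z-a-z0-9._-]",    # hyphen last: literal hyphen
    r"[A-Z-a-z0-9._- ]",   # '_- ' is a reversed range (underscore to space)
    r"[A-Z-a-z0-9._-%]",   # '_-%' is a reversed range (underscore to percent)
    r"[%A-Z-a-z0-9._-]",   # percent moved to the front
    r"[A-Z-a-z0-9 %._-]",  # space and percent moved away from the hyphen
]

results = {c: compiles(c) for c in classes}
for c, ok in results.items():
    print(f"{c:<22} {'accepted' if ok else 'rejected'}")
```

In Python the two classes with a character after the final hyphen are rejected as bad ranges, while the other three compile.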

  • Hi Michael,

    The GUI accepted the regex, but my exception didn't work. The following exception works; thanks for your hint ;)

    https?://www\.schueco-service\.de/servicepack/GDF/[A-Z-a-z0-9%._-]*\.exe$

    Tested with the original file download :)

    Regards mod

  • Mod, I think you meant:

    https?://www\.schueco-service\.de/servicepack/GDF/[A-Za-z0-9%._-]*\.exe$

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
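
As a rough check of the correction above, the following Python sketch compares the character class with the stray hyphen (`A-Z-a-z`) against the cleaned-up one over all ASCII characters. It assumes Python's `re` semantics, where a hyphen directly after a range is taken literally; other flavors may reject it, which is a good reason to prefer the cleaned-up form.

```python
import re

with_stray_hyphen = re.compile(r"[A-Z-a-z0-9%._-]")
cleaned = re.compile(r"[A-Za-z0-9%._-]")

ascii_chars = [chr(code) for code in range(128)]

# Collect every ASCII character each class accepts.
matched_stray = {c for c in ascii_chars if with_stray_hyphen.fullmatch(c)}
matched_clean = {c for c in ascii_chars if cleaned.fullmatch(c)}

same = matched_stray == matched_clean
print("identical character sets:", same)
print("space allowed:", " " in matched_clean)
print("percent allowed:", "%" in matched_clean)
```

Under this assumption both classes accept the same characters, so the fix is cosmetic in Python; a literal space is still excluded, and only the percent-encoded form can match.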
  • Hi Bob,

    You are right, this was a typing mistake ;)

    Regards mod