This discussion has been locked.

Intrusion Prevention (IPS) high cpu usage - Snort

Hello,

In our company we have about 60-80 users. Each department has its own VLAN, all running over one port.

XGS2100 (SFOS 19.0.1 MR-1-Build365)

Over the past year I set up the Sophos XG and added all the firewall rules. All departments are in one zone and have an "any" rule to our servers with the specific ports needed (each server has its own rule). I think I am at about 80-85 rules now for everything. (Yes, I am using zones to group the departments at least.)

Most of them now have IPS and the other features activated (AV/WEB/APP/IPS/LOG), but even after switching to the predefined IPS policies (LAN TO WAN and LAN to DMZ) in the hope of reducing some load, the CPU usage is still high.

I read that some people do not use all of this for LAN to DMZ. Is that a good idea? My thinking is: if a user gets infected, say over mail, and the malware then uses an Exchange vulnerability, I at least have my Sophos with IPS and zero-day protection, right?

It usually happens during break times, but sometimes also between them. You can always see high CPU usage in the Diagnostics graph. The whole network then gets sluggish and sometimes disconnects applications, which is hell for running Teams meetings and remote sessions.

Under the command "top" I can see multiple snort processes at 99% CPU usage, and I see all CPUs at 100% most of the time.

All patterns should be up to date (interval set to high). The IPS max packets setting is still at 8.

Now my questions:

Is our Sophos too small? Frankly speaking, this got worse some months ago, maybe at the start of September, or maybe when we upgraded from version 18.xx to 19.xx.

Can I optimize the IPS profiles?

My problem is that no matter where I look, I can't find good documentation on what all the categories mean (like misc, scan). Sure, I can click os-windows, or server and client.

Are "clients" only Win 10? Or does that indicate the direction in which the attack is happening?

If I look at the LAN to WAN IPS template, first you see many entries like OS-windows, then browser, then Windows clients and then all clients. What does "all clients" mean? Why are there earlier entries for Windows/Linux if I have an entry for all clients? Is there a difference, or is the idea that I use it as a template and delete what I don't need?

My biggest problem right now is that I am missing the tools to investigate further. The reporting would be useful if you could specify a time, but you can only ever select whole days.



  • Snort is not IPS. Snort is the entire DPI Engine. Therefore, if the packet is flowing over the Fastpath, it is also Snort.

    The problem right now is that XGS is a dual-processor appliance, so the view in atop/top may not show the whole truth. It may cover only the x86 CPU.

    This should be investigated by Support in more detail. 

  • Hello, yes, I opened a case, and they even got a senior technician to look through it.

    They recommended that I disable all IPS features on my LAN to LAN (servers) and LAN to WAN rules, and only enable IPS for WAN to LAN.

    We only have one port open right now, for our mail server.

    Is this recommended? I always thought IPS should be enabled for any WAN traffic, not just for the incoming open port to my mail server.

    Instead I should enable Advanced Threat Protection. What's the difference between those two engines? IPS already sounded like what I needed, and I can't really find anything about ATP.

    Something else he told me: in the web exceptions, use \. instead of . (dot). I will definitely try this out.

  • Is this recommended?

    No, and you should not be satisfied with that answer. I would also think your firewall is fast enough for your number of users, but of course that depends on the specific setup.

    It may be caused by server backups running through firewall rules with IPS enabled, and things like that.

    When your machine runs at 100% CPU all the time, something is completely wrong.

    What is shown in Diagnostics > System graphs?

    One example, when high traffic is pushed over an XG starting at 16:00:

    from the useless Dashboard Activity monitor: [screenshot]

    and from System Graphs: [screenshot]

    So we can see the backups impacting load from 16:00 on, but this does not impact performance.

    The firewall rule that handles the traffic has IPS disabled.

    Check the TLS exceptions as mentioned; wrongly set exceptions can make things worse.

  • Hello,

    Yeah, there are quite a lot of web exceptions and a bunch inside the local TLS exclusion list.

    I started to use web exceptions more, since you can name them better there.

    Most of them look like this: ^[A-Za-z0-9.-]*\.sophosxl\.net/

    But that's merely two pages. I don't think it's a lot, but I wouldn't call it few either. Traffic like backups is scheduled at night, and IPS is already off there. The only backup job runs every hour for a file of about 250 MB, so a very light job. I know how to read graphs.

  • We have breaks at 9:15 and 12:15, and you can clearly see that something is happening there, but we also have days where it looks like a rollercoaster without any low times.

    Like I said, I have IPS and the web protection features on for most of the rules. For SSL/TLS inspection, I only decrypt in the direction of WAN. I could understand if this is too much, but I really can't find any confirmation. My next step would be to get bigger hardware to see if that resolves it.

  • On the yearly graph you can also see that a change definitely happened. I updated the firmware around then, and by April/May I was mostly done with the configuration. Maybe I added some web exceptions over time.

  • We had a customer with around 500 web exceptions that were like this:
    [A-Za-z0-9.-]*\.sophosxl\.net/

    And 100% CPU.  We added the starting caret to make them like this
    ^[A-Za-z0-9.-]*\.sophosxl\.net/

    And the CPU went down to 10-20%. The exact specifics of the regex matter.

    If you have web exceptions that have this regex
    www.example.com

    It will perform poorly.
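
To illustrate why a bare hostname makes a loose pattern, here is a minimal Python sketch. It assumes Python's `re` module behaves like the firewall's regex engine for these simple patterns, which may not hold exactly; the URLs are made up for the example.

```python
import re

# Hypothetical URLs in the scheme-less "host/path" form used in this thread.
urls = [
    "www.example.com/index.html",           # the intended site
    "wwwXexampleYcom.evil.test/",           # unescaped dots match any character
    "evil.test/redirect/www.example.com/",  # unanchored pattern matches inside the path
]

loose = re.compile(r"www.example.com")       # as typed: no anchor, dots unescaped
strict = re.compile(r"^www\.example\.com/")  # anchored, dots escaped, slash required

loose_hits = [bool(loose.search(u)) for u in urls]
strict_hits = [bool(strict.search(u)) for u in urls]

print(loose_hits)   # the loose pattern accepts all three URLs
print(strict_hits)  # the strict pattern accepts only the first one
```

The loose form has to be tried at every position of every URL and also accepts hosts it was never meant to cover, so it costs more and protects less.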

  • Why will subdomain\.domain\.com\.?/ perform poorly? Is this not less work?

    How should I define it then, if I only want this specific subdomain or domain?

  • I am testing it right now. We had about 40 web exceptions where I missed the ^ at the start, so basically what you wrote.

    I hope the rules still work, but so far it looks pretty good!

    Are there any recommendations left? Sometimes I use whole IP address subnets, or specific domains or subdomains, like:

    example\.com/?

    subexample.domain\.com/?

    8.8.0.0/16

  • If you do:
    example\.com/?

    It will search this url:
    example.com.fakesite.com/fooledyou/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbccccccccccccccccccccccccccccccccccc/ddddddddddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeefffffffffffffffffffffffffffff/ggggggggggggggggggggggggggggggggggggghhhhhhhhhhhhhhhhhhhhhhhh/iiiiiiiiiiiiiiiiiiiiiiiiii//aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbccccccccccccccccccccccccccccccccccc/ddddddddddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeefffffffffffffffffffffffffffff/ggggggggggggggggggggggggggggggggggggghhhhhhhhhhhhhhhhhhhhhhhh/iiiiiiiiiiiiiiiiiiiiiiiiii//aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbccccccccccccccccccccccccccccccccccc/ddddddddddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeefffffffffffffffffffffffffffff/ggggggggggggggggggggggggggggggggggggghhhhhhhhhhhhhhhhhhhhhhhh/iiiiiiiiiiiiiiiiiiiiiiiiii//aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbccccccccccccccccccccccccccccccccccc/ddddddddddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeefffffffffffffffffffffffffffff/ggggggggggggggggggggggggggggggggggggghhhhhhhhhhhhhhhhhhhhhhhh/iiiiiiiiiiiiiiiiiiiiiiiiii//aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbccccccccccccccccccccccccccccccccccc/ddddddddddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeefffffffffffffffffffffffffffff/ggggggggggggggggggggggggggggggggggggghhhhhhhhhhhhhhhhhhhhhhhh/iiiiiiiiiiiiiiiiiiiiiiiiii//aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbccccccccccccccccccccccccccccccccccc/ddddddddddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeefffffffffffffffffffffffffffff/ggggggggggggggggggggggggggggggggggggghhhhhhhhhhhhhhhhhhhhhhhh/iiiiiiiiiiiiiiiiiiiiiiiiii//aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbccccccccccccccccccccccccccccccccccc/ddddddddddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeee
eeeeeeeeeeeefffffffffffffffffffffffffffff/ggggggggggggggggggggggggggggggggggggghhhhhhhhhhhhhhhhhhhhhhhh/iiiiiiiiiiiiiiiiiiiiiiiiii//aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbccccccccccccccccccccccccccccccccccc/ddddddddddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeefffffffffffffffffffffffffffff/ggggggggggggggggggggggggggggggggggggghhhhhhhhhhhhhhhhhhhhhhhh/iiiiiiiiiiiiiiiiiiiiiiiiii/example.com/jjjjjjjjjjjjjjjjjjjjjj/

    It would start the search at the front of the string and begin matching, first the e, then the x. However, after the .com there is no / (this walkthrough treats the trailing / as required; as written, /? makes the slash optional, so the pattern would already match here), so it gives up on that match. It then tries to match anywhere else in the 2400-byte URL, and eventually it finds the example.com/ that exists in the path and applies the exception.

    If you do:
    ^example\.com/?

    Then it limits the search to only the beginning of the string. When there is no / (with the slash required, that is, without the trailing ?) it gives up and stops, which is both safer (more accurate) and faster.
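
A minimal Python sketch of the spoofed-URL case may help here. It assumes Python's `re` module is close enough to the firewall's regex flavor for these simple patterns, which is not guaranteed, and the padded URL is a shortened, made-up stand-in for the long one in the post.

```python
import re

# Hypothetical spoofed URL: the real host is example.com.fakesite.com,
# and "example.com/" only appears deep inside the path.
padding = "/".join(["a" * 40] * 50)
url = "example.com.fakesite.com/fooledyou/" + padding + "/example.com/jjjj/"

unanchored = re.search(r"example\.com/", url)        # slash required, no anchor
anchored = re.search(r"^example\.com/", url)         # slash required, anchored
optional_slash = re.search(r"^example\.com/?", url)  # anchored, slash optional

print(unanchored.start())      # match found far into the path
print(anchored)                # None: the host is not example.com
print(optional_slash.group())  # "example.com": the optional slash lets the spoofed host through
```

The anchor only makes the exception safe when the slash after the hostname is required; with `/?` the anchored pattern still accepts `example.com.fakesite.com`.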

    For a bigger example of a performance gain consider this url
    somerandomsite.com/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/1234567890/12345678
90/

    If you do:
    ^example\.com/?

    It does one match attempt - the first character s does not equal e, so it gives up on trying to match anything else.

    If you do:
    example\.com/?

    Then after s != e it goes to the next character and o != e, etc.  So it has to do ~2000 comparisons to determine if the regex matches, instead of 1.  And that is to find a "not a match" which 99.99% of URLs will be.
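
The comparison counts can be reproduced with a toy matcher. This is only a sketch of a naive literal search written for illustration; real regex engines (including whatever the firewall uses) apply optimizations, so the numbers show the trend rather than actual engine cost. The function name and the shortened URL are made up.

```python
def count_comparisons(literal: str, text: str, anchored: bool) -> int:
    """Count character comparisons a naive matcher makes looking for `literal`."""
    comparisons = 0
    starts = [0] if anchored else range(len(text))
    for start in starts:
        for offset, wanted in enumerate(literal):
            if start + offset >= len(text):
                break  # ran off the end of the text
            comparisons += 1
            if text[start + offset] != wanted:
                break  # mismatch: give up on this start position
        else:
            return comparisons  # every character of the literal matched
    return comparisons


# Hypothetical long URL that does not contain the literal at all.
url = "somerandomsite.com/" + "1234567890/" * 180

anchored_cost = count_comparisons("example.com/", url, anchored=True)
unanchored_cost = count_comparisons("example.com/", url, anchored=False)
print(len(url), anchored_cost, unanchored_cost)
```

With the anchor the matcher stops after a single comparison; without it, it makes at least one comparison per character of the URL before it can report "no match".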

    Out of curiosity I tried pumping an example into regex101.com. This only gives performance numbers on matching, not non-matching. But matching a URL at the beginning of a 2000-character string took "13 steps, 0.00ms" and near the end took "529 steps, 0.1ms".


    You should also note in this:
    ^[A-Za-z0-9.-]*\.sophosxl\.net/

    The first [A-Za-z0-9.-]* effectively is "any number of any character that is valid in a domain name" but it does NOT include /.  Therefore as soon as the first / is found (indicating path) this wildcard match stops.  If you were to replace it with ^.*\.sophosxl\.net/ you would be in the same boat as before and searching the entire url.
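
The difference between the character class and `.*` can be checked with a short Python sketch. It assumes Python's `re` semantics match the firewall's for these patterns, and the spoofed URL is invented for the example.

```python
import re

host_class = re.compile(r"^[A-Za-z0-9.-]*\.sophosxl\.net/")  # wildcard stops at the first /
dot_star = re.compile(r"^.*\.sophosxl\.net/")                # wildcard runs across slashes

genuine = "gw.sophosxl.net/lookup"
spoofed = "evil.test/path/x.sophosxl.net/lookup"  # hypothetical URL

class_results = (bool(host_class.search(genuine)), bool(host_class.search(spoofed)))
star_results = (bool(dot_star.search(genuine)), bool(dot_star.search(spoofed)))

print(class_results)  # only the genuine host matches
print(star_results)   # both match, because .* reaches into the path
```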

    The above example requires ".sophosxl.net" so it matches subdomains but not the main domain name.  Most people probably want to use this format, which makes the subdomains optional.

    ^([A-Za-z0-9.-]*\.)?teamviewer\.com/?
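
As a quick check of the optional-subdomain form, the sketch below runs the pattern against a few made-up URLs in Python. It assumes Python's `re` module treats this pattern the same way the firewall does, which may not be exactly true.

```python
import re

# Pattern quoted in the post; the group makes the "subdomain." part optional.
pattern = re.compile(r"^([A-Za-z0-9.-]*\.)?teamviewer\.com/?")

# Hypothetical URLs mapped to whether the exception is expected to apply.
samples = {
    "teamviewer.com/download": True,     # main domain
    "www.teamviewer.com/": True,         # one subdomain level
    "a.b.teamviewer.com/x": True,        # several subdomain levels
    "notteamviewer.com/": False,         # lookalike host without a separating dot
    "evil.test/teamviewer.com/": False,  # name appears only in the path
}

results = {url: bool(pattern.search(url)) for url in samples}
for url, matched in results.items():
    print(f"{url:30} {matched}")
```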


    If you are more curious about regex performance you can google it.  I bet there are even courses on this.

  • To answer the other part of the question I missed:

    8.8.0.0/16

    If you mean the ip addresses as the hostname in a url:

    8.8.1.2/path/resource

    Google "regex to match ip range".  Note: there are some differences in flavors of regex, the XG may not support a cut&paste from some random website that may be using a different variant.

    If you mean actual IP addresses then don't use regex on the url, the exceptions support destination IP addresses (and include CIDR ranges).
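
The two cases can be sketched in Python as follows. The regex is a deliberately loose illustration written for this example, not a pattern taken from Sophos documentation, and the CIDR check uses Python's standard `ipaddress` module only to show what a range match means; on the firewall the exception's destination IP field does this for you.

```python
import ipaddress
import re

# Case 1: the IP address is the hostname inside a URL, so a regex applies.
# Deliberately loose sketch: \d{1,3} also accepts values above 255.
ip_host = re.compile(r"^8\.8\.\d{1,3}\.\d{1,3}/")

# Case 2: an actual destination address, checked against a CIDR range.
network = ipaddress.ip_network("8.8.0.0/16")


def in_range(address: str) -> bool:
    """Return True if `address` falls inside 8.8.0.0/16."""
    return ipaddress.ip_address(address) in network


url_hit = bool(ip_host.search("8.8.1.2/path/resource"))
url_miss = bool(ip_host.search("8.9.1.2/path/resource"))
print(url_hit, url_miss, in_range("8.8.1.2"), in_range("8.9.0.1"))
```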

  • Thanks very much, you solved the performance issue. Sophos should add quick help there (grayed-out examples).

    If the ^ matches up to the first /, wouldn't the ? at the end be useless?

    And at the end you wrote ^.*\.sophosxl\.net/ ; wouldn't it be more like ^*\.sophosxl\.net/ ?

    If I got it right, ^.*\.sophosxl\.net/ would search for aaaaaaaaaaaaaaaaaaaaaaa.sophosxl.net and every other combination, instead of cutting off the string at the beginning and matching only the end part.

  • When we first saw customers having performance problems with exceptions and TLS exclusions I wrote KB-000043654 with the catchy title "Sophos Firewall: Exclude a website from TLS inspection" to explain the performance impacts. The focus was on how to exclude, not how to diagnose performance issues. When I have some spare time I am trying to write a new article that will have some of the same information but specifically about (and titled) high CPU in snort. Right now I am working on collecting several different causes that have come through escalations. Question: should I include the "why", such as the examples I wrote above? Usually KBs are more about the answer ("write all your regex in this style") than explaining why the answer is correct.

    You are correct about the teamviewer example. I had just cut and pasted it out of the XG without thinking about it. Offhand I do not know why we wrote it like that. It might be that some TeamViewer things do weird but valid stuff like server.teamviewer.com:80/foobar. I would agree that newly written ones should not have an ending ? (which makes the slash optional). The Microsoft example is a better one.

    Sophos maintains several different servers and subdomains, such as gw.sophosxl.net and primary.wing.sophosxl.net. The regex as written will match all subdomains but not the main domain (where we never have a server).

    To parse the regex I used in the example.

    ^.*\.sophosxl

    ^ matches the beginning of string
    .  matches any character
    * means the preceding thing can match zero-to-many times
    \ next character is a literal
    . is a literal period if preceded by a backslash (or any character if there is no backslash)

    So...
    ^ beginning of string
    .* any number of any character (including slashes)
    \. literal period

    ^*\. is what you asked about, which is "any number of beginning of string" which is not what you want.

    The actual RegEx we use is
    ^[A-Za-z0-9.-]*\.sophosxl

    ^ beginning of string
    [A-Za-z0-9.-]* any number of any valid fqdn character (not including slashes)
    \. literal period (not optional)
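
The same breakdown can be written as an annotated pattern. This sketch uses Python's `re.VERBOSE` flag purely as a way to attach comments to each piece; the firewall's exception field is not assumed to support that syntax, and the test URLs are made up.

```python
import re

annotated = re.compile(
    r"""
    ^                 # beginning of the string
    [A-Za-z0-9.-]*    # any number of characters valid in a hostname (no slash)
    \.                # a literal period (required)
    sophosxl          # the literal text "sophosxl"
    """,
    re.VERBOSE,
)

subdomain = bool(annotated.search("gw.sophosxl.net/"))     # period before sophosxl: matches
main_domain = bool(annotated.search("sophosxl.net/"))      # no leading period: no match
in_path = bool(annotated.search("evil.test/x.sophosxl/"))  # class cannot cross the slash
print(subdomain, main_domain, in_path)
```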