Questions about the new DPI Engine.

First of all, I'm just a home user, so I feel like I shouldn't be complaining that much in here, or even making this post. ¯\_(ツ)_/¯

---

First Question:

 

v18 introduced the brand-new DPI Engine, which, as described, is a:

"Single high-performance streaming DPI engine with proxyless scanning of all traffic for AV, IPS, and web threats as well as providing Application Control and SSL Inspection."

The problem is in "Single high-performance streaming DPI engine with proxyless scanning of all traffic for AV": the AV is the problem.

 

(I'm not a professional, so I'm sorry if there are mistakes. Please tell me if there are.)

On v17.5, and also on v18 if you use the Legacy Web Proxy, the "avd" service is used for AV.

In my understanding, the new DPI Engine, which uses Snort as its service, would also use it for AV. But while using the DPI Engine you can see the "avd" service being spawned and used by XG for AV scanning.

This is not an issue. I'm not a Sophos dev, so in my opinion this is just weird. Again, it's not an issue.

 

Playing with my home setup, the main noticeable difference between the Web Proxy and the DPI Engine (with HTTP(S) scanning on) is throughput: the DPI Engine is much faster than the Web Proxy.

Now the "issue": it's CPS. Both Web Proxy and "Xstream SSL Inspection" can handle the same amount of CPS with the AV, which in my setup is 750~ (wrong, check the edits). So the main "issue" here is the AV.

CPS = Connections per Second, here it's just HTTPS.

Edit: I shouldn't have done this testing at midnight, I was way too sleepy for it. The numbers for the DPI Engine are correct, but the Web Proxy is slower than I wrote before. Now it makes more sense.

Edit: Also, good job to the Sophos devs, the difference between the Legacy Web Proxy and the new DPI Engine is impressive.

Edit 2: Some additional information that makes the difference between Web Proxy and DPI Engine even more impressive: I used TLS v1.3 with the DPI Engine, while the Web Proxy was on TLS v1.2.

Edit 3: Web Proxy was using this cipher/auth combination: TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, 256-bit keys, TLS v1.2;

Edit 3: DPI Engine was using this cipher/auth combination: TLS_AES_256_GCM_SHA384, 256-bit keys, TLS v1.3;

Edit 4: I will try to force TLS v1.2 on the DPI Engine to see if the difference is even bigger.

| XG v18 EAP3 Refresh-1 | Web Proxy TLS v1.2 | DPI Engine TLS v1.2 | DPI Engine TLS v1.3 |
| --- | --- | --- | --- |
| CPS with AV | 275~ | 980~ | 750~ |
| CPS without AV | 430~ | 4700~ | 4100~ |
| Latency >90%/Connections | 0.070/Sec | 0.014/Sec | 0.101/Sec |

Disable HTTP(S) scanning and there you go: more than 5x the CPS while using the DPI Engine. It feels like the current XG AV is holding the DPI Engine back and not letting it reach the performance it's capable of.

After all this *** writing, my question is: is this expected? Will XG keep using "avd" as the AV for HTTP(S), IMAP, and so on? I'm not saying it's wrong or bad; again, it's just weird. And if so, then why "Single high-performance streaming DPI engine with proxyless scanning of all traffic for AV"? In my understanding of what was said there, shouldn't Snort also be taking care of AV?
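
As a rough sanity check on that "5x", here is a minimal sketch that divides the CPS with scanning off by the CPS with scanning on, using the approximate readings from my table above. The names (`cps`, `speedup_without_scan`) are only illustrative, and the ratio covers everything that HTTP(S) scanning switches on, not necessarily the AV alone.

```python
# Approximate CPS readings from the table above (XG v18 EAP3 Refresh-1).
cps = {
    "Web Proxy TLS v1.2": {"with_av": 275, "without_av": 430},
    "DPI Engine TLS v1.2": {"with_av": 980, "without_av": 4700},
    "DPI Engine TLS v1.3": {"with_av": 750, "without_av": 4100},
}


def speedup_without_scan(row):
    """Ratio of CPS with scanning disabled to CPS with scanning enabled."""
    return row["without_av"] / row["with_av"]


# Round to two decimals; the inputs are only approximate anyway.
ratios = {name: round(speedup_without_scan(row), 2) for name, row in cps.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x")
```

With these readings the ratio comes out at roughly 1.56x for the Web Proxy, 4.8x for the DPI Engine on TLS v1.2, and 5.47x for the DPI Engine on TLS v1.3.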

---

Second Question:

 

Why the hell does "avd" use 99.9% of a single core while scanning .txt files?

This becomes an issue when you're running any Linux distro whose package manager downloads .txt files to find out whether there are packages to upgrade. The CPU usage of a single core goes all the way up to 100%.

A single "pacman -Syu", which first downloads 4 .txt files, can take up to 45 seconds, and they total at most 6 MB. (I'm on a 400/200 Mbit/s WAN, and the package manager mirror is capable of pushing my link to its limits.)

This doesn't happen with any other file format; hell, .exe scanning feels instant compared to .txt.
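
To show how far off that is, a small back-of-the-envelope sketch. It assumes 1 MB = 10^6 bytes, uses the 400 Mbit/s downstream figure, and ignores protocol overhead; the function names are only illustrative.

```python
def ideal_transfer_seconds(size_mb, link_mbit_s):
    """Time to move size_mb megabytes over a link of link_mbit_s megabits/s."""
    return size_mb * 8 / link_mbit_s


def effective_mbit_s(size_mb, seconds):
    """Average rate actually achieved for a transfer that took `seconds`."""
    return size_mb * 8 / seconds


SIZE_MB = 6          # upper bound on the total size of the .txt files
LINK_MBIT_S = 400    # downstream WAN speed
OBSERVED_S = 45      # worst case seen for the download

ideal_s = ideal_transfer_seconds(SIZE_MB, LINK_MBIT_S)
effective = effective_mbit_s(SIZE_MB, OBSERVED_S)
slowdown = OBSERVED_S / ideal_s

print(f"ideal: {ideal_s:.2f} s, effective: {effective:.2f} Mbit/s, slowdown: {slowdown:.0f}x")
```

Under those assumptions the ideal transfer time is about 0.12 seconds, so 45 seconds is roughly 375 times slower, an effective rate of about 1.07 Mbit/s.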

---

Third Question:

 

Why is "avd" a single-core service?

---

 

That's it.

Again, it feels like all of this is expected, but I'm just a home user, so I feel like I shouldn't be complaining that much in here.

Also 750~ CPS is more than sufficient for a Home network.

 

Thanks.

  • I believe I understand why I will probably never get answers to these questions. It can be one of two things:

    • I'm completely wrong about everything that I wrote. // (I'm almost sure about this one.)
    • Or this is already known.

     

    Anyways, this picture will haunt my dreams tonight.

    A full-blown 8C/16T with 12 GB of DDR4 RAM, limited by avd, while all the Snort processes sit basically idle at 10% usage on each core.

    At least It's fast when It doesn't use "avd". (Nice touch changing to GB/s instead of showing as xxxx MB/s :D)

     

    Also, sorry for whining too much in here. It's just a bit frustrating seeing all this. If needed, I'll delete this thread.

    Thanks!



  • Quick answers:

    There is one instance of the AV engine running in a process called avd. Web proxy, FTP proxy, mail proxy, and DPI mode all call into the one AV instance, so that we don't need to run four copies on the box. Also remember there is configuration, such as dual scan or single scan, and which AV engine you use (Sophos or Avira). Yes, that is how it will continue to work. There is very little performance impact from whether the AV thread is part of the snort process or part of the avd process (or rather, other things have a bigger impact).

    When you enable HTTPS scanning the system needs to do a lot of SSL decryption, which takes CPU cycles (and lowers CPS). In addition, it means that files will be AV scanned, which takes CPU (and lowers CPS). However, I suspect that the decryption part is the bigger factor. Do not enable/disable HTTPS scanning and then claim you are measuring with and without AV. If you want to measure the impact of AV, then leave decryption on and toggle "Scan HTTP and decrypted HTTPS" to turn AV on/off. Make sure that you are using a web policy (eg not set to None).

    As for avd using 99% CPU, that might be an artifact of "top". Can you give me a real-world example/impact? eg a specific curl for a text file that took a long time to scan.
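
    To make that comparison concrete, here is a minimal sketch of how the two runs could be recorded and compared. The `Run` fields and `av_cps_drop` are hypothetical names for illustration, not anything on the firewall, and the CPS values in the usage lines are placeholders rather than measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    decryption_on: bool  # decryption stays enabled for the run
    av_scan_on: bool     # state of "Scan HTTP and decrypted HTTPS"
    web_policy: str      # name of the applied web policy, must not be "None"
    cps: float           # measured connections per second


def av_cps_drop(av_off: Run, av_on: Run) -> float:
    """Fraction of CPS lost when AV scanning is turned on.

    Refuses to compare runs that differ in anything but the AV setting.
    """
    for run in (av_off, av_on):
        if not run.decryption_on:
            raise ValueError("decryption must stay on in both runs")
        if run.web_policy == "None":
            raise ValueError("a web policy must be applied in both runs")
        if run.cps <= 0:
            raise ValueError("CPS must be positive")
    if av_off.av_scan_on or not av_on.av_scan_on:
        raise ValueError("the runs must differ only in the AV scan setting")
    return (av_off.cps - av_on.cps) / av_off.cps


# Placeholder numbers, only to show the call.
off = Run(decryption_on=True, av_scan_on=False, web_policy="Default", cps=1000.0)
on = Run(decryption_on=True, av_scan_on=True, web_policy="Default", cps=800.0)
drop = av_cps_drop(off, on)
print(f"CPS drop attributable to AV: {drop:.0%}")
```

    The check at the top is the point of the sketch: a run with decryption off is rejected, because then the difference would mix decryption cost with AV cost.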
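
    If curl is not handy, a rough equivalent can be sketched in Python with the standard library. `time_download` is a hypothetical helper name, and the URL in the commented usage line is a placeholder to be replaced with the text file that was slow.

```python
import time
import urllib.request


def time_download(url, timeout=120.0, chunk_size=64 * 1024):
    """Download `url` and return (bytes_received, elapsed_seconds)."""
    start = time.perf_counter()
    total = 0
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


# Placeholder URL; substitute the real text file before running.
# size, seconds = time_download("https://example.org/some-file.txt")
# print(f"{size} bytes in {seconds:.2f} s")
```

    Running it once with AV scanning on and once with it off, against the same file, would give the kind of real-world number asked for above.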

  • As an aside,

    For any connection handled by snort, there are several different processes that are called out to. 

     

    Starting at the connection (eg the first packet from the client) we check authentication, which could potentially call out to two different processes.  Note if the client re-uses the connection this cost does not occur again.

    Starting at the request (eg the first packet from the client) we are doing web categorization, a call out to a different process.  This could potentially even call out to make a request to a cloud server.

    At the end of the request (eg the last packet from the server) we are doing AV scanning, a call out to a different process.

    There might be other processes that snort calls out to, but only one per request.

    On the other hand decryption is something that happens on every single packet.

     

    For decryption, if you are downloading 100 1MB files or 1 100MB file I think the CPU cost of decryption is roughly the same.  But for the other costs, the traffic type changes things.
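
    A toy cost model can illustrate this. It is only a sketch: the cost weights are arbitrary placeholder units rather than measured values, the 1460-byte payload per packet is an assumption, and `traffic_cost` is a hypothetical name.

```python
import math


def traffic_cost(requests, bytes_per_request, connections,
                 per_packet=1.0, per_request=500.0, per_connection=200.0,
                 payload=1460):
    """Split CPU cost into per-packet, per-request and per-connection parts.

    per_packet     : decryption work, paid on every packet
    per_request    : categorization and AV call-outs, paid once per request
    per_connection : authentication check, paid once per new connection
    """
    packets = requests * math.ceil(bytes_per_request / payload)
    return {
        "decrypt": packets * per_packet,
        "per_request": requests * per_request,
        "per_connection": connections * per_connection,
    }


# Same 100 MB of traffic in two different shapes.
many_small = traffic_cost(requests=100, bytes_per_request=1_000_000, connections=100)
one_large = traffic_cost(requests=1, bytes_per_request=100_000_000, connections=1)

print("100 x 1MB :", many_small)
print("1 x 100MB :", one_large)
```

    In this model the decryption term is nearly identical for both shapes, while the per-request and per-connection terms are 100 times larger for the many-small-files case.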

     

  • Michael Dunn said:
    For decryption, if you are downloading 100 1MB files or 1 100MB file I think the CPU cost of decryption is roughly the same.

    That is interesting to know because usually it doesn't work like that and one large data stream is never equal to multiple small data streams taking the same bandwidth.

    Thanks for more testing. I agree that XG v18 is really snappy for regular surfing and they have made great improvements in surfing performance. DPI can only get better from here, and if they can keep the memory footprint and CPU under control, v18 should be a good release.

    Regards

  • Billybob said:

     

    That is interesting to know because usually it doesn't work like that and one large data stream is never equal to multiple small data streams taking the same bandwidth.

    I'm only talking generally, and only about the CPU cost of decrypting traffic.  Yes, there is overhead per request.  But at the low level, if it needs to decrypt 1 million packets, it doesn't care how many TCP connections are involved.

    For the "overhead per request", such as categorization and AV scanning, the number of requests/data streams matters.  For the "overhead per packet", such as SSL decryption, the number of requests/streams does not.

     

    When using the system as a user or admin, you don't care about any of that.  When you are trying to do performance testing, understanding how the shape of your test traffic affects the test can make a difference.

     

    I am glad that, while a lot of people were complaining about performance in EAP1/2/3, with EAP3-Refresh people are finding the performance good.  My understanding is that they are still doing some tuning so that it is good on both high-end and low-end appliances, as some models are seeing better benefits than others.
