Questions about the new DPI Engine.

Question

First of all, I'm just a home user, so I feel like I shouldn't be complaining that much in here, or even making this post. ¯\_(ツ)_/¯ 
 --- 
 First Question: 
 
 v18 introduced the brand new DPI Engine, which, as Michael Dunn put it, is a: 
 "Single high-performance streaming DPI engine with proxyless scanning of all traffic for AV, IPS, and web threats as well as providing Application Control and SSL Inspection." 
 My problem is with the part that says "Single high-performance streaming DPI engine with proxyless scanning of all traffic for AV": the AV is the problem. 
 
 (I'm not a professional, so I'm sorry if there are mistakes; please tell me if there are.) 
 On v17.5, and also on v18 if you use the Legacy Web Proxy, the "avd" service is used for AV. 
 In my understanding, the new DPI Engine, which uses Snort as its service, would also use it for AV. But while using the DPI Engine you can see the "avd" service being spawned and used by XG for AV scanning. 
 This is not an issue. I'm not a Sophos dev, so in my opinion this is just weird; again, it's not an issue. 
 
 Playing with my home setup, the main noticeable difference between the Web Proxy and the DPI Engine (with HTTP(s) scanning on) is throughput: the DPI Engine is much faster than the Web Proxy. 
 Now the "issue": it's CPS. Both the Web Proxy and "Xstream SSL Inspection" can handle the same amount of CPS with the AV, which in my setup is 750~ (wrong, check the edits). So the main "issue" here is the AV. 
 CPS = connections per second; here it's just HTTPS. 
 Edit: I shouldn't have done this testing at midnight; I was way too sleepy for this. The DPI Engine numbers are correct, but the Web Proxy is slower than I wrote before. Now it makes more sense. 
 Edit: Also, good job to the Sophos devs; the difference between the Legacy Web Proxy and the new DPI Engine is impressive. 
 Edit 2: Some additional information that makes the difference between the Web Proxy and the DPI Engine even more impressive: I used TLS v1.3 with the DPI Engine, while the Web Proxy was using TLS v1.2. 
 Edit 3: The Web Proxy was using this cipher/auth combination: TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, 256 bit keys, TLS v1.2; 
 Edit 3: The DPI Engine was using this cipher/auth combination: TLS_AES_256_GCM_SHA384, 256 bit keys, TLS v1.3; 
 Edit 4: I will try to force TLS v1.2 on the DPI Engine to see if there's an even bigger difference.

| XG v18 EAP3 Refresh-1 | Web Proxy TLS v1.2 | DPI Engine TLS v1.2 | DPI Engine TLS v1.3 |
| --- | --- | --- | --- |
| CPS with AV | 275~ | 980~ | 750~ |
| CPS without AV | 430~ | 4700~ | 4100~ |
| Latency >90%/Connections | 0.070/Sec | 0.014/Sec | 0.101/Sec |

Disable HTTP(s) scanning, and there you go: more than 5x the CPS while using the DPI Engine. It feels like the current XG AV is holding the DPI Engine back and not letting it reach the performance it's capable of. 
 After all this *** writing, my question is: is this expected? Will XG keep using "avd" as the AV for HTTP(s), IMAP, and so on? I'm not saying it's wrong or bad; again, it's just weird. Also, if it is, then why "Single high-performance streaming DPI engine with proxyless scanning of all traffic for AV"? In my understanding of what was said, shouldn't Snort also be taking care of AV? 
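
 As a quick check of the "more than 5x" figure, here is a minimal Python sketch that divides the "CPS without AV" row of the table by the "CPS with AV" row. The numbers are the approximate ("~") values from the table; the dictionary layout and the `speedup` helper are only illustrative names.

```python
# Approximate CPS figures copied from the table in this post.
cps = {
    "Web Proxy TLS v1.2": {"with_av": 275, "without_av": 430},
    "DPI Engine TLS v1.2": {"with_av": 980, "without_av": 4700},
    "DPI Engine TLS v1.3": {"with_av": 750, "without_av": 4100},
}


def speedup(row):
    """Ratio of CPS with scanning disabled to CPS with scanning enabled."""
    return row["without_av"] / row["with_av"]


for name, row in cps.items():
    print(f"{name}: {speedup(row):.2f}x")
```

 With these inputs the ratio is about 1.56x for the Web Proxy, 4.80x for the DPI Engine on TLS v1.2, and 5.47x for the DPI Engine on TLS v1.3, so the "more than 5x" applies to the TLS v1.3 run.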
 --- 
 Second Question: 
 
 Why the hell does "avd" use 99.9% of a single core while scanning .txt files? 
 
 This becomes an issue when you're running any Linux distro whose package manager downloads .txt files to check whether there are packages to upgrade. The CPU usage of a single core goes all the way up to 100%. 
 A single "pacman -Syu", which first downloads 4 .txt files, can take up to 45 seconds, and they are at most 6MB in total. (I'm on a 400/200Mbit/s WAN, and the package manager mirror is capable of pushing my link to its limits.) 
 This doesn't happen with any other kind of file format; hell, .exe scanning feels instant compared to .txt. 
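
 To put the 45 seconds in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes 1 MB = 8 Mbit (decimal units), that the full 400 Mbit/s downstream is available, and it ignores protocol overhead; the function names are just for illustration.

```python
def ideal_seconds(size_mb, link_mbit_per_s):
    """Transfer time if the link were the only bottleneck."""
    return size_mb * 8 / link_mbit_per_s


def effective_mbit_per_s(size_mb, observed_seconds):
    """Throughput actually achieved over the observed duration."""
    return size_mb * 8 / observed_seconds


size_mb = 6          # upper bound on the total size of the .txt files
link = 400           # downstream link speed in Mbit/s
observed = 45        # worst-case duration of the download in seconds

ideal = ideal_seconds(size_mb, link)
print(f"ideal: {ideal:.2f} s")
print(f"effective: {effective_mbit_per_s(size_mb, observed):.2f} Mbit/s")
print(f"slowdown: {observed / ideal:.0f}x")
```

 Under those assumptions the files should take about 0.12 seconds on the wire, while 45 seconds works out to roughly 1.07 Mbit/s, a slowdown of about 375x.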
 --- 
 Third Question: 
 
 Why is "avd" a single-core service? 
 --- 
 
 That's it. 
 Again, it feels like all of this is expected, but I'm just a home user, so I feel like I shouldn't be complaining that much in here. 
 Also, 750~ CPS is more than sufficient for a home network. 
 
 Thanks.

Prism · Accepted Answer

rfcat_vk said: 1/. AV - I assume you mean anti-virus not audio and video? 
 Exactly, Anti-Virus. 
 rfcat_vk said: 2/. CPS - characters per second, cycle per second? 
 Connections per second. 
 
 I didn't measure HTTP transactions per second since, well, I'm not NSS Labs. 
 
 Thanks,

Michael Dunn · Answer

Quick answers: 
 There is one instance of the AV engine running in a process called avd. Web proxy, FTP proxy, mail proxy, and DPI mode all call into the one AV instance, so that we don't need to run four copies on the box. Also remember there is configuration for Dual scan or Single scan, and for which AV engine you use (Sophos or Avira). Yes, that is how it will continue to work. There is very little performance impact to whether the AV thread is a part of the snort process or a part of the avd process (or rather, other things have a bigger impact). 
 When you enable HTTPS scanning the system needs to do a lot of SSL decryption, which takes CPU cycles (and lowers CPS). In addition, it means that files will be AV scanned, which takes CPU (and lowers CPS). However, I suspect that the decryption part is the bigger factor. Do not enable/disable HTTPS scanning and then claim you are measuring with and without AV. If you want to measure the impact of AV, leave Decryption on and toggle "Scan HTTP and decrypted HTTPS" to turn AV on/off. Make sure that you are using a web policy (e.g. not set to None). 
 As for avd using 99% CPU, that might be an artifact of "top". Can you give me a real-world example/impact? E.g. a specific curl for a text file that took a long time to scan.
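
 One way to produce that kind of reproducible timing, sketched in Python rather than curl. This only uses the standard library; the `time_fetch` name and the URL in the usage comment are placeholders, not a real mirror or a Sophos tool.

```python
import time
import urllib.request


def time_fetch(url, timeout=120):
    """Download url once and return (bytes_received, elapsed_seconds)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read()
    elapsed = time.perf_counter() - start
    return len(body), elapsed


# Usage (placeholder URL, replace with the text file that scans slowly):
# size, seconds = time_fetch("https://mirror.example/path/to/file.txt")
# print(f"{size} bytes in {seconds:.2f} s")
```

 Running it once with AV scanning on and once with it off, against the same file, would give the kind of specific example asked for here.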

Michael Dunn · Answer

As an aside, 
 For any connection handled by snort, there are several different processes that are called out to. 
 
 Starting at connection (eg the first packet from the client) we check authentication, which could potentially call out to two different processes. Note that if the client re-uses the connection, this cost does not occur again. 
 Starting at the request (eg the first packet from the client) we are doing web categorization, a call out to a different process. This could potentially even call out to make a request to a cloud server. 
 At the end of the request (eg the last packet from the server) we are doing AV scanning, a call out to a different process. 
 There might be other processes that snort calls out to, but only one per request. 
 On the other hand decryption is something that happens on every single packet. 
 
 For decryption, if you are downloading 100 1MB files or 1 100MB file I think the CPU cost of decryption is roughly the same. But for the other costs, the traffic type changes things.
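
 The point about traffic type can be illustrated with a toy cost model in Python. All three constants are made-up units chosen only to show the shape of the argument, not measured XG figures: decryption scales with bytes, while authentication, categorization, and AV call-outs scale with the number of connections and requests.

```python
# Hypothetical cost units, for illustration only.
DECRYPT_PER_MB = 1.0      # paid on every packet, so proportional to bytes
PER_REQUEST = 5.0         # categorization + AV call-outs, once per request
PER_CONNECTION = 2.0      # authentication check, once per connection


def cpu_cost(num_files, file_size_mb, reuse_connection=False):
    """Return a cost breakdown for downloading num_files files of equal size."""
    total_mb = num_files * file_size_mb
    connections = 1 if reuse_connection else num_files
    breakdown = {
        "decryption": total_mb * DECRYPT_PER_MB,
        "per_request": num_files * PER_REQUEST,
        "per_connection": connections * PER_CONNECTION,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


many_small = cpu_cost(num_files=100, file_size_mb=1)
one_large = cpu_cost(num_files=1, file_size_mb=100)
print(many_small)
print(one_large)
```

 In this model both cases pay the same decryption cost for 100MB, but the 100 small files pay the per-request and per-connection costs 100 times over, which is why a connections-per-second test stresses different parts of the system than a bulk download.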

Michael Dunn · Answer

I wrote a lengthier reply, most of it informational about packet processing and ultimately not important. 
 
 Rest assured that the architecture people know their stuff and are running lots of testing and optimization. Remember that XG is also meant for customers with 5000 clients all simultaneously downloading things and 100 things being AV scanned at the same time. A one-client test may max things out in a scenario that is not very real-world. 
 IIRC there are several parts of XG that look at the number of CPUs and cores and change behavior. The number of threads on customer hardware may differ from that on similar XG hardware, and it certainly differs between an XG110 and an XG750. 
 Though it is interesting to speculate, at some point you have to trust that we know what we are doing. :)