Internet becomes unresponsive after several days?

This is the second time this has happened since I started using the v18 EAP. It also happened a couple of times on v17, but less often. With the v18 EAP, after Sophos XG has been running for several days (over a week), the internet sometimes becomes unresponsive and I can't reach anything. For example, if I try to open a website, it keeps trying to load and eventually times out. At first I thought it was an ISP issue, so I reset my cable modem, but that didn't fix it. I can still reach devices on my local network just fine, including the Sophos XG web UI.

What I did notice in the web UI is that the "Sessions" count under System in the Control Center is very high while this is happening. It fluctuates between roughly 800 and 2.5k. I have about 30-40 devices on my network (one computer, mobile devices, smart home devices, etc.), and my Sessions count is typically around 20-50. After restarting Sophos XG, the count drops back to what I normally see and everything works fine.

Anyone else experiencing similar issues? Is there any specific log I can save when this issue occurs? Unfortunately, I'm running this on my home network so I can't just leave it in an unusable state.

0 GavinDaniels over 4 years ago

Hi there

What country are you in?

What is the model of the cable modem?

How is the WAN interface in the Sophos configured?

Regards,

Gavin Daniels. DipIT(Networking)
0 shred over 4 years ago in reply to GavinDaniels

I’m located in the USA. It’s a Motorola MB8600 cable modem, and the WAN interface uses the default configuration. The only change I made was adding an IPv6 interface.

---

Sophos XG guides for home users: https://shred086.wordpress.com/
0 GavinDaniels over 4 years ago in reply to shred

Hi,

In Australia we use HFC cable and Arris cable modems. Several ISPs use PPPoE authentication for their connections.

I found that the interface would lock up for a while when the WAN port was configured with an MTU of 1500. Dropping it to 1492, to allow for the 8-byte PPPoE header, fixed that.

When I moved to another carrier that also uses PPPoE, over a VLAN connection, I needed to reduce the MTU to 1460.
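
The 1492 figure above is plain header arithmetic. A minimal sketch in Python, assuming a standard 1500-byte Ethernet MTU and the 8 bytes of PPPoE overhead mentioned above (a 6-byte PPPoE header plus a 2-byte PPP protocol ID). The function name is made up for illustration, and the 1460 value on the second carrier is not derived here; it is simply what that connection needed.

```python
ETHERNET_MTU = 1500   # standard Ethernet payload size, in bytes
PPPOE_OVERHEAD = 8    # 6-byte PPPoE header + 2-byte PPP protocol ID


def pppoe_mtu(link_mtu: int = ETHERNET_MTU) -> int:
    """Largest IP packet that still fits once PPPoE framing is added.

    pppoe_mtu is a hypothetical helper name, not part of any Sophos tool.
    """
    if link_mtu <= PPPOE_OVERHEAD:
        raise ValueError("link MTU is too small to carry PPPoE")
    return link_mtu - PPPOE_OVERHEAD


print(pppoe_mtu())  # 1492
```

If the WAN port is left at 1500 on a PPPoE link, full-size packets exceed what the link can carry by those 8 bytes, which is one plausible reason for the stalls described above.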

That is also where I added the forum request to make the MTU of an unconfigured WAN port adjustable, as configuring the hardware port as WAN or DMZ with a static IP was a waste.

Regards,

Gavin Daniels. DipIT(Networking)
0 shred over 4 years ago in reply to GavinDaniels

Ah, I see. My ISP is Cox, which does not use PPPoE.

Rana Sharma, the issue started occurring again last night, with the same symptoms. I left everything alone to see if the problem would still be there in the morning and, sure enough, it was. I left everything as is for about an hour while I dug around for anything abnormal, but nothing appeared out of the ordinary other than the fluctuating session counts that I also see with this issue. Instead of restarting the entire thing, I tried restarting just the ips service from the advanced shell, and that appears to have resolved the issue. Session counts are back down to “normal” levels, and I also noticed that memory usage dropped from ~50% to ~38%. I did take a Consolidated Troubleshooting Report while the issue was occurring.

So, as far as I can tell, the issue is caused by the ips service and is resolved by restarting that service. Here are my ips settings:

Sophos Firmware Version SFOS 18.0.0 EAP3-Refresh1

console> show ips-settings
-------------IPS Settings-------------
stream on
lowmem off
maxsesbytes 0
maxpkts 8
enable_appsignatures on
http_response_scan_limit 65535
search_method hyperscan
sip_preproc enabled
sip_ignore_call_channel enabled
inspect untrusted-content

-------------IPS Instances------------
IPS CPU
1 0
2 1
3 2
4 3

0 shred over 4 years ago in reply to shred

Well, restarting the ips service seemed to have fixed it for about an hour. Unfortunately, the issue is back.

Edit: So I restarted the ips service again and everything has been working fine for the past two hours.

I’m wondering if the issue is with ATP. Last night I enabled ATP, and the issue started occurring shortly after. I notice that when I enable ATP, memory usage jumps up quite a bit. Even after I disabled ATP, both the issue and the higher memory usage remained. This morning, after restarting the ips service to see if that fixed the issue, I enabled ATP again and the issue started occurring shortly after. The symptoms were the same after disabling ATP (the increased memory usage remained and the issue still existed). I restarted the ips service this last time but left ATP enabled and, so far, everything seems to be working fine.

0 Akilae over 4 years ago in reply to shred

It seems I experienced the exact same issue after activating ATP on EAP3, and I have left ATP off since then. I'm going to try it again now to see if it still persists. I notice it when my internet-connected IoT devices, like Tado heating and so on, are suddenly offline. Turning off ATP fixes this. If I don’t do that, sooner or later normal browsing is affected as well.
0 rfcat_vk over 4 years ago in reply to Akilae

Hi,

There is a bug in ATP which, I am advised, will be fixed in v18 GA. You will need to restart your XG if you disable ATP and re-enable it.

Ian

XG115W - v20.0.3 MR-3 - Home

XG on VM 8 - v21 GA

If a post solves your question please use the 'Verify Answer' button.
0 Michael Dunn over 4 years ago in reply to Akilae

There is a known issue with ATP, fixed in GA, that may be related. I think shred is experiencing this, but I don't know if it is the cause of the unresponsive connection.

Take a look at /log/ips.log. If you see a lot of "failed to get sessiontbl data for session id" then you may be experiencing the issue.

Workaround:

Advanced Threat > Enable Advanced Threat protection : Off

System Services > Services > IPS : Stop and then Start
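
The log check above can be sketched as a short script. This is only a rough illustration in Python: the log path and the message text come from the post, the function names are hypothetical, and it assumes the log has been copied to a machine with Python 3 (whether Python is available on the XG shell itself is not something this thread confirms). The thread does not define how many matches count as "a lot", so the script only reports a number.

```python
MARKER = "failed to get sessiontbl data for session id"


def count_marker(lines, marker=MARKER):
    """Return how many of the given log lines contain the marker text."""
    return sum(1 for line in lines if marker in line)


def count_in_file(path="/log/ips.log", marker=MARKER):
    """Count marker lines in a log file (path taken from the post above).

    errors="replace" keeps the scan going if the log contains odd bytes.
    """
    with open(path, errors="replace") as handle:
        return count_marker(handle, marker)
```

For example, `count_in_file("ips.log")` on a copied log returns the number of matching lines, which can be compared before and after applying the workaround.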
0 shred over 4 years ago in reply to Michael Dunn

I think I may be having some other issues as well, based on some information from Michael Dunn via PM, but it also looks like I have the "failed to get sessiontbl data for session id" message in my ips logs. Example:

[Feb 03 16:47:37 :29704]:failed to get sessiontbl data for session id 528 rev 64837,dropping packet

[Feb 03 16:47:37 :29704]:failed to get sessiontbl data for session id 528 rev 64837,dropping packet

[Feb 03 17:23:12 :29703]:failed to get sessiontbl data for session id 574 rev 36762,dropping packet

[Feb 03 17:23:12 :29703]:failed to get sessiontbl data for session id 574 rev 36762,dropping packet

[Feb 03 17:23:12 :29703]:failed to get sessiontbl data for session id 574 rev 36762,dropping packet

[Feb 03 17:23:42 :29704]:failed to get sessiontbl data for session id 350 rev 3155,dropping packet

[Feb 03 17:25:49 :29701]:failed to get sessiontbl data for session id 146 rev 36239,dropping packet

[Feb 03 17:25:49 :29701]:failed to get sessiontbl data for session id 462 rev 38120,dropping packet

However, I've had ATP off for the past couple days (and restarted the IPS service after turning it off). I haven't had any issues with the internet being unresponsive since then.

0 Michael Dunn over 4 years ago in reply to shred

Assuming that shred does not get this again with ATP off, we can assume that is the cause. Tracked in NC-55333 and already fixed in GA.
0 shred over 4 years ago in reply to Michael Dunn

I upgraded to v18 GA yesterday and enabled ATP. Everything seems to be working okay so far, but when I checked my ips logs tonight I saw a bunch of the messages below. No internet unresponsiveness issues yet.

[Feb 19 14:31:03 :7535]:Error reading session data,status -1

[Feb 19 14:31:03 :7535]:failed to get sessiontbl data for session id 340 rev 59828 pkt_len 0 datalink_type 228 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

[Feb 19 14:31:36 :7534]:Error reading session data,status -1

[Feb 19 14:31:36 :7534]:failed to get sessiontbl data for session id 92 rev 57781 pkt_len 0 datalink_type 228 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

[Feb 19 14:31:36 :7535]:Error reading session data,status -1

[Feb 19 14:31:36 :7535]:failed to get sessiontbl data for session id 1487 rev 12236 pkt_len 0 datalink_type 228 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

[Feb 19 14:39:14 :7537]:Error reading session data,status -1

[Feb 19 14:39:14 :7537]:failed to get sessiontbl data for session id 1488 rev 1869 pkt_len 0 datalink_type 228 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

[Feb 19 14:39:31 :7535]:Error reading session data,status -1

[Feb 19 14:39:31 :7535]:failed to get sessiontbl data for session id 121 rev 2682 pkt_len 0 datalink_type 228 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

[Feb 19 14:40:07 :7535]:Error reading session data,status -1

[Feb 19 14:40:07 :7535]:failed to get sessiontbl data for session id 377 rev 50558 pkt_len 0 datalink_type 228 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

[Feb 19 14:40:07 :7534]:Error reading session data,status -1

[Feb 19 14:40:07 :7534]:failed to get sessiontbl data for session id 376 rev 50558 pkt_len 0 datalink_type 228 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

[Feb 19 14:40:07 :7537]:Error reading session data,status -1

[Feb 19 14:40:07 :7537]:failed to get sessiontbl data for session id 863 rev 2481 pkt_len 0 datalink_type 228 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

[Feb 19 14:41:49 :7536]:Error reading session data,status -1

[Feb 19 14:41:49 :7536]:failed to get sessiontbl data for session id 233 rev 37009 pkt_len 0 datalink_type 228 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

1582153705.901504922 [ 7537/0x0] [nsg_nse_policy.c:1312:__nsg_error] 172.16.16.25:53859 to 75.2.53.94:443: Error from nse: NSE:Internal [0xb0000582;code:130;sub:5] Flow timeout

[Feb 19 15:11:24 :7534]:Error reading session data,status -1

[Feb 19 15:11:24 :7534]:failed to get sessiontbl data for session id 509 rev 19812 pkt_len 0 datalink_type 229 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

[Feb 19 15:11:24 :7534]:Error reading session data,status -1

[Feb 19 15:11:24 :7534]:failed to get sessiontbl data for session id 854 rev 2453 pkt_len 0 datalink_type 229 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

[Feb 19 15:11:24 :7534]:Error reading session data,status -1

[Feb 19 15:11:24 :7534]:failed to get sessiontbl data for session id 497 rev 20104 pkt_len 0 datalink_type 229 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

[Feb 19 15:11:24 :7535]:Error reading session data,status -1

[Feb 19 15:11:24 :7535]:failed to get sessiontbl data for session id 499 rev 20110 pkt_len 0 datalink_type 229 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

[Feb 19 15:11:24 :7537]:Error reading session data,status -1

[Feb 19 15:11:24 :7537]:failed to get sessiontbl data for session id 505 rev 20042 pkt_len 0 datalink_type 229 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

1582156094.971507455 [ 7537/0x0] [nsg_nse_policy.c:1312:__nsg_error] 2600:8801:7f06:1:f04c:135:d6da:3b53:50053 to 2607:fb90:c13f:fff6::2:443: Error from nse: NSE:Internal [0xb0000582;code:130;sub:5] Flow timeout

[Feb 19 15:48:14 :7537]:Error reading session data,status -1

[Feb 19 15:48:14 :7537]:failed to get sessiontbl data for session id 305 rev 38341 pkt_len 0 datalink_type 229 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

[Feb 19 15:48:14 :7537]:Error reading session data,status -1

[Feb 19 15:48:14 :7537]:failed to get sessiontbl data for session id 305 rev 38341 pkt_len 0 datalink_type 229 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

[Feb 19 16:08:59 :7534]:Error reading session data,status -1

[Feb 19 16:08:59 :7534]:failed to get sessiontbl data for session id 558 rev 33231 pkt_len 0 datalink_type 228 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

[Feb 19 16:09:30 :7536]:Error reading session data,status -1

[Feb 19 16:09:30 :7536]:failed to get sessiontbl data for session id 501 rev 20258 pkt_len 0 datalink_type 228 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

[Feb 19 17:14:46 :7536]:Error reading session data,status -1

[Feb 19 17:14:46 :7536]:failed to get sessiontbl data for session id 381 rev 40659 pkt_len 0 datalink_type 228 direction 0 daq_source 2 is_tcp 0 nseid 0 is_ssl_non_app_appdata 0, dropping packet

1582163910.312337972 [ 7534/0xe1f900000057] [nsg_tcphold.c:314:process_event] Could not find session for key and unique_id.
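
To summarise output like the above rather than reading it line by line, the drop messages can be parsed and grouped. A rough sketch in Python, with several assumptions: the regular expression is inferred only from the sample lines in this thread, the number after the colon in the bracket is treated as a worker ID (presumably a process ID, which the thread does not confirm), and all function names are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the sample lines in this thread; other firmware
# versions may format these messages differently.
DROP_RE = re.compile(
    r"^\[(?P<stamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}) :(?P<worker>\d+)\]:"
    r"failed to get sessiontbl data for session id "
    r"(?P<session>\d+) rev (?P<rev>\d+)"
)


def parse_drop(line):
    """Return the fields of one drop message, or None for any other line."""
    match = DROP_RE.match(line.strip())
    if match is None:
        return None
    return {
        "stamp": match["stamp"],
        "worker": int(match["worker"]),    # assumed to be a process ID
        "session": int(match["session"]),
        "rev": int(match["rev"]),
    }


def drops_per_worker(lines):
    """Count drop messages per worker ID."""
    counts = Counter()
    for line in lines:
        record = parse_drop(line)
        if record is not None:
            counts[record["worker"]] += 1
    return counts


SAMPLE = [
    "[Feb 19 14:31:03 :7535]:Error reading session data,status -1",
    "[Feb 19 14:31:03 :7535]:failed to get sessiontbl data for session id 340 "
    "rev 59828 pkt_len 0 datalink_type 228 direction 0 daq_source 2 is_tcp 0 "
    "nseid 0 is_ssl_non_app_appdata 0, dropping packet",
    "[Feb 03 16:47:37 :29704]:failed to get sessiontbl data for session id 528 "
    "rev 64837,dropping packet",
]

print(drops_per_worker(SAMPLE))
```

The pattern only reads up to the `rev` field, so it matches both the shorter EAP-era lines and the longer GA lines shown in this thread.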

0 Michael Dunn over 4 years ago in reply to shred

Hi shred. We have seen that internally as well. Assuming that it is the same cause, you can ignore that error. It can occur when both sides of a connection close simultaneously: 16 seconds later, something times out and prints that.