Hi there,
Sophos XG230 on v19.0.1. The garner process permanently uses about 30% CPU here. Looking closer at its log with "tail", you can see the following:
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_GW_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
I don't think that is correct. How can I get this problem solved? Thanks,
Stefan
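To see which module the failures above refer to most often, the messages can be tallied per module. This is a minimal sketch, assuming the message format stays exactly as quoted and that `grep -o` and `awk` are available in the Advanced Shell; the function name `count_failures_by_module` is made up for illustration.

```shell
# Count resolve_gr_cache failures per module from log text on stdin.
# Entries may be run together on one line, so grep -o extracts each match separately.
count_failures_by_module() {
    grep -o 'resolve_gr_cache for [A-Z_]* failed' \
        | awk '{ count[$3]++ } END { for (m in count) print count[m], m }' \
        | sort -rn
}

# Possible use on the firewall (log path as mentioned later in this thread):
#   tail -n 500 /log/garner.log | count_failures_by_module
```

The output is one line per module, highest count first, which makes it easy to compare FW_PBR_MODULE against FW_GW_MODULE.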
Hello StefanS,
Thank you for reaching out to the community. On the CLI, select option 5. Device Management, then option 3. Advanced Shell. Type the command:
service garner:restart -ds nosync
And then check the logs again.
Thanks & Regards,
Vivek Jagad | Team Lead, Global Support & Services
Nope, still the cache error.
Can you share the following outputs:
1.) df -kh
2.) ls -larth /var/cores
3.) tail -f /log/garner.log
4.) tail -f /log/syslog.log
5.) tail -f applog.log
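Since `tail -f` keeps following a file until it is interrupted, a fixed-size snapshot can be easier to paste into a reply. The helper below is only a sketch: `snapshot_logs` is a hypothetical name, and it assumes the standard `tail -n` option is available in the Advanced Shell.

```shell
# Print the last N lines of each given log file under a small header.
# tail -n exits after printing, whereas tail -f keeps following the file.
snapshot_logs() {
    lines=$1
    shift
    for f in "$@"; do
        echo "== $f (last $lines lines) =="
        if [ -r "$f" ]; then
            tail -n "$lines" "$f"
        else
            echo "(not readable)"
        fi
    done
}

# Possible use with the paths requested above:
#   snapshot_logs 20 /log/garner.log /log/syslog.log
```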
df -kh
Filesystem                Size      Used Available Use% Mounted on
none                      1.6G     14.3M      1.5G   1% /
none                      3.8G     28.0K      3.8G   0% /dev
none                      3.8G     21.0M      3.8G   1% /tmp
none                      3.8G     14.6M      3.8G   0% /dev/shm
/dev/boot               127.7M     31.9M     93.0M  26% /boot
/dev/mapper/mountconf   957.7M     77.1M    876.6M   8% /conf
/dev/content             11.2G    424.2M     10.8G   4% /content
/dev/var                 87.1G     29.2G     57.9G  34% /var
ls -larth /var/cores
-rw-------    1 root     0         888.7M Jul 27 13:44 core.snort
drwxrwxrwt    2 root     0           4.0K Jul 27 23:58 .
drwxr-xr-x   41 root     0           4.0K Nov 14 10:42 ..
tail -f /log/garner.log
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_GW_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_GW_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_GW_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_GW_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_GW_MODULE failed
usercache_output: resolve_gr_cache for FW_GW_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
Nov 14 11:05:42Z: OPPOSTGRES: move_table_to_usedqueue: moving table 'available_fwapplicationv7_1668251090' FD: 14
Nov 14 11:05:42Z: OPPOSTGRES: move_table_to_usedqueue: table 'available_fwapplicationv7_1668251090' is moved to 'tbl_used_fwapplicationv7' queue
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_GW_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
tail -f /log/syslog.log
Nov 14 10:45:13Z localhost exim: looking for plugins in '/usr/lib/sasl2', failed to open directory, error: No such file or directory
Nov 14 10:45:29Z localhost exim: looking for plugins in '/usr/lib/sasl2', failed to open directory, error: No such file or directory
Nov 14 10:49:11Z localhost exim: looking for plugins in '/usr/lib/sasl2', failed to open directory, error: No such file or directory
Nov 14 10:50:20Z localhost exim: looking for plugins in '/usr/lib/sasl2', failed to open directory, error: No such file or directory
Nov 14 10:51:54Z localhost exim: looking for plugins in '/usr/lib/sasl2', failed to open directory, error: No such file or directory
Nov 14 10:55:37Z localhost exim: looking for plugins in '/usr/lib/sasl2', failed to open directory, error: No such file or directory
Nov 14 10:59:17Z localhost exim: looking for plugins in '/usr/lib/sasl2', failed to open directory, error: No such file or directory
Nov 14 11:00:25Z localhost exim: looking for plugins in '/usr/lib/sasl2', failed to open directory, error: No such file or directory
Nov 14 11:01:26Z localhost exim: looking for plugins in '/usr/lib/sasl2', failed to open directory, error: No such file or directory
Nov 14 11:06:45Z localhost exim: looking for plugins in '/usr/lib/sasl2', failed to open directory, error: No such file or directory
tail -f applog.log
Nov 14 11:00:01Z apiInterface:request mode -> 1201.
Nov 14 11:00:01Z apiInterface:Current ver :::'1500.1'
Nov 14 11:00:01Z apiInterface:entityjson::::::::cli::alert=HASH(0xb71dd00)
Nov 14 11:00:01Z Info:: Transaction will not be rolled back for opcode setAlertSettings. If any operation fails, request is part of multiple request :
Nov 14 11:00:06Z String Built : No|LiveUserids|BWids 1|2,25,24|10,0,10
Nov 14 11:00:11Z Checking new IPs of fqdn for mta
Nov 14 11:01:12Z getpublickey success Key: 79e509a6409a522f667eb9d53c95aa487eeb
Nov 14 11:03:04Z String Built : No|LiveUserids|BWids 1|24,25|10,0
Nov 14 11:05:10Z Checking new IPs of fqdn for mta
Nov 14 11:06:04Z String Built : No|LiveUserids|BWids 1|25,24|0,0
Do you have IPsec tunnels configured? If yes, how many?
Are they policy-based or tunnel-based IPsec tunnels?
And have you just updated the firmware from v18.5.4 MR-4 to v19.0.1 MR-1?
Are you facing this issue since the upgrade?
>Do you have IPsec tunnels configured? If yes, how many?
Three IPsec cross-site tunnels and one for Sophos Connect remote IPsec.
>Are they policy-based or tunnel-based IPsec tunnels?
Route-based VPN.
>And have you just updated the firmware from v18.5.4 MR-4 to v19.0.1 MR-1?
From v18.5.2.
>Are you facing this issue since the upgrade?
No.
How many route-based tunnels are configured?
And since when have you noticed these logs being generated?
As the garner service is responsible for logging and reporting, have you faced any problems generating reports from the reports dashboard, or noticed missing logs in the log viewer?
>How many route-based tunnels are configured?
Only two, but with more routes and larger subnets.
>And since when have you noticed these logs being generated?
After switching to v19.0.1, I noticed that on average, when the users are in the company, we have 50-60% CPU load. That was about 15-20% less before. What is also noticeable in this context is that snort and garner are the main causes of the high CPU load.
Snort is responsible for IPS. During this increase in CPU load, have you faced any trouble with:
1.) Access to the firewall web GUI?
2.) SSH access?
3.) Number of users? Do they face any trouble accessing the internet, or any slowness?
4.) Network impact on LAN/WAN?
>Snort is responsible for IPS.
I know :)
Regarding points 1, 2 and 4: no problems at all, except this garner and snort CPU problem.
Regarding point 3: 120 users, currently about 20 in HO (home office).
That should not be a problem for the XG230. From what I read and see, however, there are recurring complaints about garner and snort causing high CPU utilization. This is already an older problem.
Is there at least a chance to solve the garner / cache problem?
Can you share the output of:
less garner.log | grep threshold
and a screenshot of System services > Log settings?
Hey StefanS, please also share the output of:
less garner.log | grep threshold
Are you using all the modules/services, and do you require logging for each and every option that is ticked?
less garner.log | grep threshold
sethreshold_event: timestamp: '1668397800' --> SE COUNT: '43461', Drop events count 0
sethreshold_event: timestamp: '1668397801' --> SE COUNT: '774', Drop events count 0
sethreshold_event: timestamp: '1668397860' --> SE COUNT: '42811', Drop events count 0
sethreshold_event: timestamp: '1668397920' --> SE COUNT: '42574', Drop events count 0
sethreshold_event: timestamp: '1668397981' --> SE COUNT: '44601', Drop events count 0
sethreshold_event: timestamp: '1668398040' --> SE COUNT: '42601', Drop events count 0
sethreshold_event: timestamp: '1668398100' --> SE COUNT: '44057', Drop events count 0
sethreshold_event: timestamp: '1668398101' --> SE COUNT: '659', Drop events count 0
sethreshold_event: timestamp: '1668398161' --> SE COUNT: '42348', Drop events count 0
sethreshold_event: timestamp: '1668398221' --> SE COUNT: '43394', Drop events count 0
sethreshold_event: timestamp: '1668398281' --> SE COUNT: '43059', Drop events count 0
sethreshold_event: timestamp: '1668398340' --> SE COUNT: '42813', Drop events count 0
sethreshold_event: timestamp: '1668398400' --> SE COUNT: '44245', Drop events count 0
sethreshold_event: timestamp: '1668398401' --> SE COUNT: '724', Drop events count 0
sethreshold_event: timestamp: '1668398461' --> SE COUNT: '42706', Drop events count 0
sethreshold_event: timestamp: '1668398521' --> SE COUNT: '42717', Drop events count 0
sethreshold_event: timestamp: '1668398581' --> SE COUNT: '43595', Drop events count 0
sethreshold_event: timestamp: '1668398641' --> SE COUNT: '43426', Drop events count 0
sethreshold_event: timestamp: '1668398700' --> SE COUNT: '43507', Drop events count 0
sethreshold_event: timestamp: '1668398701' --> SE COUNT: '360', Drop events count 0
sethreshold_event: timestamp: '1668398760' --> SE COUNT: '42634', Drop events count 0
sethreshold_event: timestamp: '1668398820' --> SE COUNT: '43458', Drop events count 0
sethreshold_event: timestamp: '1668398881' --> SE COUNT: '43343', Drop events count 0
sethreshold_event: timestamp: '1668398941' --> SE COUNT: '44051', Drop events count 0
sethreshold_event: timestamp: '1668399001' --> SE COUNT: '44058', Drop events count 0
sethreshold_event: timestamp: '1668399002' --> SE COUNT: '429', Drop events count 0
sethreshold_event: timestamp: '1668399060' --> SE COUNT: '42286', Drop events count 0
sethreshold_event: timestamp: '1668399120' --> SE COUNT: '42492', Drop events count 0
sethreshold_event: timestamp: '1668399181' --> SE COUNT: '43077', Drop events count 0
sethreshold_event: timestamp: '1668399241' --> SE COUNT: '43979', Drop events count 0
sethreshold_event: timestamp: '1668399300' --> SE COUNT: '43729', Drop events count 0
sethreshold_event: timestamp: '1668399301' --> SE COUNT: '548', Drop events count 0
sethreshold_event: timestamp: '1668399360' --> SE COUNT: '43171', Drop events count 0
sethreshold_event: timestamp: '1668399420' --> SE COUNT: '42865', Drop events count 0
sethreshold_event: timestamp: '1668399480' --> SE COUNT: '42672', Drop events count 0
sethreshold_event: timestamp: '1668399541' --> SE COUNT: '44439', Drop events count 0
sethreshold_event: timestamp: '1668399601' --> SE COUNT: '44283', Drop events count 0
sethreshold_event: timestamp: '1668399601' --> SE COUNT: '38', Drop events count 0
sethreshold_event: timestamp: '1668399660' --> SE COUNT: '42354', Drop events count 0
sethreshold_event: timestamp: '1668399720' --> SE COUNT: '42993', Drop events count 0
sethreshold_event: timestamp: '1668399780' --> SE COUNT: '43841', Drop events count 0
sethreshold_event: timestamp: '1668399840' --> SE COUNT: '43175', Drop events count 0
sethreshold_event: timestamp: '1668399901' --> SE COUNT: '43590', Drop events count 0
sethreshold_event: timestamp: '1668399902' --> SE COUNT: '1042', Drop events count 0
sethreshold_event: timestamp: '1668399962' --> SE COUNT: '43042', Drop events count 0
sethreshold_event: timestamp: '1668400021' --> SE COUNT: '42922', Drop events count 0
sethreshold_event: timestamp: '1668400080' --> SE COUNT: '42854', Drop events count 0
sethreshold_event: timestamp: '1668400140' --> SE COUNT: '43417', Drop events count 0
sethreshold_event: timestamp: '1668400201' --> SE COUNT: '43764', Drop events count 0
sethreshold_event: timestamp: '1668400202' --> SE COUNT: '793', Drop events count 0
sethreshold_event: timestamp: '1668400261' --> SE COUNT: '42428', Drop events count 0
sethreshold_event: timestamp: '1668400321' --> SE COUNT: '43708', Drop events count 0
sethreshold_event: timestamp: '1668400381' --> SE COUNT: '44033', Drop events count 0
sethreshold_event: timestamp: '1668400440' --> SE COUNT: '42492', Drop events count 0
sethreshold_event: timestamp: '1668400500' --> SE COUNT: '43310', Drop events count 0
sethreshold_event: timestamp: '1668400501' --> SE COUNT: '1050', Drop events count 0
sethreshold_event: timestamp: '1668400561' --> SE COUNT: '42383', Drop events count 0
sethreshold_event: timestamp: '1668400621' --> SE COUNT: '43137', Drop events count 0
sethreshold_event: timestamp: '1668400681' --> SE COUNT: '42849', Drop events count 0
sethreshold_event: timestamp: '1668400741' --> SE COUNT: '44394', Drop events count 0
sethreshold_event: timestamp: '1668400800' --> SE COUNT: '42627', Drop events count 0
sethreshold_event: timestamp: '1668400801' --> SE COUNT: '863', Drop events count 0
So there are no drop events, as the event count has not exceeded the threshold. Still, I would ask you to untick the logging options that are not required, at least on the local reporting side.
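The "no drop events" reading can be checked mechanically rather than by eye. The following is a rough sketch, assuming the `sethreshold_event` entries keep the format shown in the output above; `summarize_threshold` is a hypothetical helper name, and it relies only on standard `grep -o` and `awk`.

```shell
# Summarize sethreshold_event entries read from stdin:
# number of entries, total and maximum SE COUNT, and total dropped events.
summarize_threshold() {
    grep -o "SE COUNT: '[0-9]*', Drop events count [0-9]*" \
        | awk '{
              c = $3
              gsub(/[^0-9]/, "", c)   # strip the quotes and comma around the count
              c += 0                  # force a numeric value for the comparison below
              n++; total += c; drops += $7
              if (c > max) max = c
          }
          END {
              if (n == 0) { print "no sethreshold_event entries found"; exit }
              printf "entries=%d total_events=%d max_events=%d dropped=%d\n", n, total, max, drops
          }'
}

# Possible use, mirroring the command from the thread:
#   grep threshold garner.log | summarize_threshold
```

A `dropped` value of 0 corresponds to the observation that no events were dropped in the sampled period.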
StefanS, the log "usercache_output: resolve_gr_cache for FW_PBR_MODULE failed" does not seem normal. Is it possible to give support access to the device? The engineering team would like to check what is causing that log to be generated continuously. - Shrikant
Certainly, no problem. Where should I send the login data?
You can PM me or Shrikant directly.
Sophos Support access ID enablement: https://docs.sophos.com/nsg/sophos-firewall/19.0/Help/en-us/webhelp/onlinehelp/AdministratorHelp/Diagnostics/ConnectionList/SupportAccess/index.html
StefanS, copy/paste the access ID into the PM.
Vivek, you have a PM
Thank you, we will check and get back to you soon!
Hello StefanS,
Here are the points of investigation:
1.) The logs appearing in garner.log are related to the *resolver failure for the migrated policy routes.
2.) About the high CPU usage of garner post-upgrade: with the currently enabled logging, this utilization is apparently expected. So I hope you have disabled the unnecessary logging on the local reporting side, as discussed previously.
*Note: these resolve failures do not contribute much to the garner load, contrary to what you suspected.
Workaround/Suggestions:
Convert the migrated policy routes to SD-WAN routes, then delete the migrated policy routes.
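After applying the workaround, one way to confirm that the messages have stopped is to count them in only the most recent part of the log. This is a sketch under the same assumptions as before (message format as quoted in the thread, standard `tail`, `grep -o` and `awk`); `recent_failures` is a made-up helper name and the default of 200 lines is arbitrary.

```shell
# Count resolve_gr_cache failures in the last N lines of a log file (default 200).
recent_failures() {
    file=$1
    lines=${2:-200}
    tail -n "$lines" "$file" \
        | grep -o 'resolve_gr_cache for [A-Z_]* failed' \
        | awk 'END { print NR }'
}

# Possible use some time after the routes were converted:
#   recent_failures /log/garner.log 500
```

A result of 0 over a window written after the change would suggest the failures are gone; a non-zero result only shows that matching entries still exist in that window.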