Hi there, Sophos XG230 on v19.01. We permanently see about 30% CPU from the garner process here. Looking closer with "tail", you can see the following:
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_GW_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
[the FW_PBR_MODULE line repeats continuously]
I don't think that is correct. How can this problem be solved? Thanks
Stefan
Hello StefanS, thank you for reaching out to the community. On the CLI, select option 5. Device Management, then option 3. Advanced Shell, and type the command: service garner:restart -ds nosync
Then check the logs again!
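For reference, a minimal sketch of that sequence in the Advanced Shell (the /log/garner.log path is an assumption on my side; adjust it if your garner log lives elsewhere):

# from the console menu: 5. Device Management > 3. Advanced Shell
service garner:restart -ds nosync      # restart garner without syncing, as quoted above
tail -f /log/garner.log                # then watch the garner log again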
Thanks & Regards,_______________________________________________________________
Vivek Jagad | Team Lead, Global Support & Services
Sophos Community | Product Documentation | Sophos Techvids | SMS
If a post solves your question please use the 'Verify Answer' button.
Nope, still the cache errors.
Snort is responsible for IPS. During this increase in CPU load, have you faced any troubles such as:
1.) Access to the firewall's web GUI?
2.) SSH access?
3.) Number of users? Do they face any trouble accessing the internet, or any slowness?
4.) Network impact - LAN/WAN?
>Snort is responsible for IPS.
I know :)
Regarding points 1, 2 and 4: no problems at all, apart from this garner and Snort CPU problem.
On point 3: 120 users, currently about 20 of them in HO (home office).
That should not be a problem for the XG230. From what I read and see, however, there are recurring complaints about garner / Snort in terms of high CPU utilization, and this is already an older problem.
Is there at least a chance to solve the garner / cache problem?
Can you share the output of: less garner.log | grep threshold
> Screenshot of System services > Log settings
Hey StefanS, also the output of: less garner.log | grep threshold
Are you using all the modules/services, and do you require logging for each and every ticked option?
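If you only need the most recent entries, something like this from the Advanced Shell is enough (the /log/garner.log path is assumed):

grep threshold /log/garner.log | tail -n 20    # last 20 threshold lines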
less garner.log | grep threshold
sethreshold_event: timestamp: '1668397800' --> SE COUNT: '43461', Drop events count 0
sethreshold_event: timestamp: '1668397801' --> SE COUNT: '774', Drop events count 0
sethreshold_event: timestamp: '1668397860' --> SE COUNT: '42811', Drop events count 0
sethreshold_event: timestamp: '1668397920' --> SE COUNT: '42574', Drop events count 0
sethreshold_event: timestamp: '1668397981' --> SE COUNT: '44601', Drop events count 0
sethreshold_event: timestamp: '1668398040' --> SE COUNT: '42601', Drop events count 0
sethreshold_event: timestamp: '1668398100' --> SE COUNT: '44057', Drop events count 0
sethreshold_event: timestamp: '1668398101' --> SE COUNT: '659', Drop events count 0
sethreshold_event: timestamp: '1668398161' --> SE COUNT: '42348', Drop events count 0
sethreshold_event: timestamp: '1668398221' --> SE COUNT: '43394', Drop events count 0
sethreshold_event: timestamp: '1668398281' --> SE COUNT: '43059', Drop events count 0
sethreshold_event: timestamp: '1668398340' --> SE COUNT: '42813', Drop events count 0
sethreshold_event: timestamp: '1668398400' --> SE COUNT: '44245', Drop events count 0
sethreshold_event: timestamp: '1668398401' --> SE COUNT: '724', Drop events count 0
sethreshold_event: timestamp: '1668398461' --> SE COUNT: '42706', Drop events count 0
sethreshold_event: timestamp: '1668398521' --> SE COUNT: '42717', Drop events count 0
sethreshold_event: timestamp: '1668398581' --> SE COUNT: '43595', Drop events count 0
sethreshold_event: timestamp: '1668398641' --> SE COUNT: '43426', Drop events count 0
sethreshold_event: timestamp: '1668398700' --> SE COUNT: '43507', Drop events count 0
sethreshold_event: timestamp: '1668398701' --> SE COUNT: '360', Drop events count 0
sethreshold_event: timestamp: '1668398760' --> SE COUNT: '42634', Drop events count 0
sethreshold_event: timestamp: '1668398820' --> SE COUNT: '43458', Drop events count 0
sethreshold_event: timestamp: '1668398881' --> SE COUNT: '43343', Drop events count 0
sethreshold_event: timestamp: '1668398941' --> SE COUNT: '44051', Drop events count 0
sethreshold_event: timestamp: '1668399001' --> SE COUNT: '44058', Drop events count 0
sethreshold_event: timestamp: '1668399002' --> SE COUNT: '429', Drop events count 0
sethreshold_event: timestamp: '1668399060' --> SE COUNT: '42286', Drop events count 0
sethreshold_event: timestamp: '1668399120' --> SE COUNT: '42492', Drop events count 0
sethreshold_event: timestamp: '1668399181' --> SE COUNT: '43077', Drop events count 0
sethreshold_event: timestamp: '1668399241' --> SE COUNT: '43979', Drop events count 0
sethreshold_event: timestamp: '1668399300' --> SE COUNT: '43729', Drop events count 0
sethreshold_event: timestamp: '1668399301' --> SE COUNT: '548', Drop events count 0
sethreshold_event: timestamp: '1668399360' --> SE COUNT: '43171', Drop events count 0
sethreshold_event: timestamp: '1668399420' --> SE COUNT: '42865', Drop events count 0
sethreshold_event: timestamp: '1668399480' --> SE COUNT: '42672', Drop events count 0
sethreshold_event: timestamp: '1668399541' --> SE COUNT: '44439', Drop events count 0
sethreshold_event: timestamp: '1668399601' --> SE COUNT: '44283', Drop events count 0
sethreshold_event: timestamp: '1668399601' --> SE COUNT: '38', Drop events count 0
sethreshold_event: timestamp: '1668399660' --> SE COUNT: '42354', Drop events count 0
sethreshold_event: timestamp: '1668399720' --> SE COUNT: '42993', Drop events count 0
sethreshold_event: timestamp: '1668399780' --> SE COUNT: '43841', Drop events count 0
sethreshold_event: timestamp: '1668399840' --> SE COUNT: '43175', Drop events count 0
sethreshold_event: timestamp: '1668399901' --> SE COUNT: '43590', Drop events count 0
sethreshold_event: timestamp: '1668399902' --> SE COUNT: '1042', Drop events count 0
sethreshold_event: timestamp: '1668399962' --> SE COUNT: '43042', Drop events count 0
sethreshold_event: timestamp: '1668400021' --> SE COUNT: '42922', Drop events count 0
sethreshold_event: timestamp: '1668400080' --> SE COUNT: '42854', Drop events count 0
sethreshold_event: timestamp: '1668400140' --> SE COUNT: '43417', Drop events count 0
sethreshold_event: timestamp: '1668400201' --> SE COUNT: '43764', Drop events count 0
sethreshold_event: timestamp: '1668400202' --> SE COUNT: '793', Drop events count 0
sethreshold_event: timestamp: '1668400261' --> SE COUNT: '42428', Drop events count 0
sethreshold_event: timestamp: '1668400321' --> SE COUNT: '43708', Drop events count 0
sethreshold_event: timestamp: '1668400381' --> SE COUNT: '44033', Drop events count 0
sethreshold_event: timestamp: '1668400440' --> SE COUNT: '42492', Drop events count 0
sethreshold_event: timestamp: '1668400500' --> SE COUNT: '43310', Drop events count 0
sethreshold_event: timestamp: '1668400501' --> SE COUNT: '1050', Drop events count 0
sethreshold_event: timestamp: '1668400561' --> SE COUNT: '42383', Drop events count 0
sethreshold_event: timestamp: '1668400621' --> SE COUNT: '43137', Drop events count 0
sethreshold_event: timestamp: '1668400681' --> SE COUNT: '42849', Drop events count 0
sethreshold_event: timestamp: '1668400741' --> SE COUNT: '44394', Drop events count 0
sethreshold_event: timestamp: '1668400800' --> SE COUNT: '42627', Drop events count 0
sethreshold_event: timestamp: '1668400801' --> SE COUNT: '863', Drop events count 0
So there are no drop events, as the threshold has not been exceeded. Yes, I would request you to untick the options that are not required, at least on the local reporting side!
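If you want to double-check that yourself, here is a small sketch that summarises those sethreshold_event lines from the Advanced Shell (the /log/garner.log path and the exact line format are assumed from the output pasted above):

grep sethreshold_event /log/garner.log | awk '
  { se = $7; gsub(/[^0-9]/, "", se);   # SE COUNT value with quotes/comma stripped
    total += se; n++; drops += $NF }   # last field is the drop-events count
  END { if (n) printf "intervals: %d  avg SE count: %.0f  dropped events: %d\n", n, total/n, drops }'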
StefanS, the log "usercache_output: resolve_gr_cache for FW_PBR_MODULE failed" does not seem normal. Is it possible to give support access to the device? The engineering team would like to check what is causing that log entry to appear continuously. - Shrikant
Certainly, no problem. Where should I send the login data?
You can PM it directly to me or Shrikant. Sophos support access ID enablement: https://docs.sophos.com/nsg/sophos-firewall/19.0/Help/en-us/webhelp/onlinehelp/AdministratorHelp/Diagnostics/ConnectionList/SupportAccess/index.html - copy/paste the access ID into the PM, StefanS.
Vivek, you have a PM
Thank you, we will check and get back to you soon!
Hello StefanS
So here are the points of the investigation:
1.) The entries appearing in garner.log are related to resolver failures for the migrated policy routes.
2.) Regarding the high CPU usage of garner post-upgrade: with the currently enabled logging, this utilization is apparently expected, so I hope you have disabled the unnecessary logging on the local reporting side as discussed previously.
*Note: these resolve failures do not contribute much to the garner load, as you suspected!
Workaround/Suggestions:
Convert the migrated policy routes to SD-WAN routes and delete the migrated policy routes.
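Once they have been recreated under Routing > SD-WAN routes in the web admin and the migrated policy routes are deleted, you can sanity-check the routing lookup order from the console; the command below is a standard SFOS console command, though the exact output varies by version:

console> system route_precedence show    # shows the order in which static, SD-WAN and VPN routes are evaluated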