[garner] constant 30% CPU, resolve cache error

Hi there,
We are running a Sophos XG230 on v19.01.
The garner process permanently uses 30% CPU.
Looking closer with "tail", you can see the following:

usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_GW_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed

I don't think this is normal.
How can I get this problem solved?

thx

Stefan



Edited TAGs
[edited by: emmosophos at 10:41 PM (GMT -8) on 14 Nov 2022]
  • Hello,

    Thank you for reaching out to the community. On the CLI, select option 5 (Device Management), then option 3 (Advanced Shell), and run the following command:
    service garner:restart -ds nosync

    Then check the logs again.

    Thanks & Regards,
    _______________________________________________________________

    Vivek Jagad | Technical Account Manager 3 | Cyber Security Evolved


    Sophos Community | Product Documentation | Sophos Techvids | SMS
    If a post solves your question please use the 'Verify Answer' button.

  • Hey, please also share the output of: less garner.log | grep threshold
    Are you using all the modules/services, and do you need logging for every option that is ticked?

    Thanks & Regards,
    Vivek Jagad | Technical Account Manager 3 | Cyber Security Evolved

  • less garner.log | grep threshold
    sethreshold_event: timestamp: '1668397800' --> SE COUNT: '43461', Drop events count 0
    sethreshold_event: timestamp: '1668397801' --> SE COUNT: '774', Drop events count 0
    sethreshold_event: timestamp: '1668397860' --> SE COUNT: '42811', Drop events count 0
    sethreshold_event: timestamp: '1668397920' --> SE COUNT: '42574', Drop events count 0
    sethreshold_event: timestamp: '1668397981' --> SE COUNT: '44601', Drop events count 0
    sethreshold_event: timestamp: '1668398040' --> SE COUNT: '42601', Drop events count 0
    sethreshold_event: timestamp: '1668398100' --> SE COUNT: '44057', Drop events count 0
    sethreshold_event: timestamp: '1668398101' --> SE COUNT: '659', Drop events count 0
    sethreshold_event: timestamp: '1668398161' --> SE COUNT: '42348', Drop events count 0
    sethreshold_event: timestamp: '1668398221' --> SE COUNT: '43394', Drop events count 0
    sethreshold_event: timestamp: '1668398281' --> SE COUNT: '43059', Drop events count 0
    sethreshold_event: timestamp: '1668398340' --> SE COUNT: '42813', Drop events count 0
    sethreshold_event: timestamp: '1668398400' --> SE COUNT: '44245', Drop events count 0
    sethreshold_event: timestamp: '1668398401' --> SE COUNT: '724', Drop events count 0
    sethreshold_event: timestamp: '1668398461' --> SE COUNT: '42706', Drop events count 0
    sethreshold_event: timestamp: '1668398521' --> SE COUNT: '42717', Drop events count 0
    sethreshold_event: timestamp: '1668398581' --> SE COUNT: '43595', Drop events count 0
    sethreshold_event: timestamp: '1668398641' --> SE COUNT: '43426', Drop events count 0
    sethreshold_event: timestamp: '1668398700' --> SE COUNT: '43507', Drop events count 0
    sethreshold_event: timestamp: '1668398701' --> SE COUNT: '360', Drop events count 0
    sethreshold_event: timestamp: '1668398760' --> SE COUNT: '42634', Drop events count 0
    sethreshold_event: timestamp: '1668398820' --> SE COUNT: '43458', Drop events count 0
    sethreshold_event: timestamp: '1668398881' --> SE COUNT: '43343', Drop events count 0
    sethreshold_event: timestamp: '1668398941' --> SE COUNT: '44051', Drop events count 0
    sethreshold_event: timestamp: '1668399001' --> SE COUNT: '44058', Drop events count 0
    sethreshold_event: timestamp: '1668399002' --> SE COUNT: '429', Drop events count 0
    sethreshold_event: timestamp: '1668399060' --> SE COUNT: '42286', Drop events count 0
    sethreshold_event: timestamp: '1668399120' --> SE COUNT: '42492', Drop events count 0
    sethreshold_event: timestamp: '1668399181' --> SE COUNT: '43077', Drop events count 0
    sethreshold_event: timestamp: '1668399241' --> SE COUNT: '43979', Drop events count 0
    sethreshold_event: timestamp: '1668399300' --> SE COUNT: '43729', Drop events count 0
    sethreshold_event: timestamp: '1668399301' --> SE COUNT: '548', Drop events count 0
    sethreshold_event: timestamp: '1668399360' --> SE COUNT: '43171', Drop events count 0
    sethreshold_event: timestamp: '1668399420' --> SE COUNT: '42865', Drop events count 0
    sethreshold_event: timestamp: '1668399480' --> SE COUNT: '42672', Drop events count 0
    sethreshold_event: timestamp: '1668399541' --> SE COUNT: '44439', Drop events count 0
    sethreshold_event: timestamp: '1668399601' --> SE COUNT: '44283', Drop events count 0
    sethreshold_event: timestamp: '1668399601' --> SE COUNT: '38', Drop events count 0
    sethreshold_event: timestamp: '1668399660' --> SE COUNT: '42354', Drop events count 0
    sethreshold_event: timestamp: '1668399720' --> SE COUNT: '42993', Drop events count 0
    sethreshold_event: timestamp: '1668399780' --> SE COUNT: '43841', Drop events count 0
    sethreshold_event: timestamp: '1668399840' --> SE COUNT: '43175', Drop events count 0
    sethreshold_event: timestamp: '1668399901' --> SE COUNT: '43590', Drop events count 0
    sethreshold_event: timestamp: '1668399902' --> SE COUNT: '1042', Drop events count 0
    sethreshold_event: timestamp: '1668399962' --> SE COUNT: '43042', Drop events count 0
    sethreshold_event: timestamp: '1668400021' --> SE COUNT: '42922', Drop events count 0
    sethreshold_event: timestamp: '1668400080' --> SE COUNT: '42854', Drop events count 0
    sethreshold_event: timestamp: '1668400140' --> SE COUNT: '43417', Drop events count 0
    sethreshold_event: timestamp: '1668400201' --> SE COUNT: '43764', Drop events count 0
    sethreshold_event: timestamp: '1668400202' --> SE COUNT: '793', Drop events count 0
    sethreshold_event: timestamp: '1668400261' --> SE COUNT: '42428', Drop events count 0
    sethreshold_event: timestamp: '1668400321' --> SE COUNT: '43708', Drop events count 0
    sethreshold_event: timestamp: '1668400381' --> SE COUNT: '44033', Drop events count 0
    sethreshold_event: timestamp: '1668400440' --> SE COUNT: '42492', Drop events count 0
    sethreshold_event: timestamp: '1668400500' --> SE COUNT: '43310', Drop events count 0
    sethreshold_event: timestamp: '1668400501' --> SE COUNT: '1050', Drop events count 0
    sethreshold_event: timestamp: '1668400561' --> SE COUNT: '42383', Drop events count 0
    sethreshold_event: timestamp: '1668400621' --> SE COUNT: '43137', Drop events count 0
    sethreshold_event: timestamp: '1668400681' --> SE COUNT: '42849', Drop events count 0
    sethreshold_event: timestamp: '1668400741' --> SE COUNT: '44394', Drop events count 0
    sethreshold_event: timestamp: '1668400800' --> SE COUNT: '42627', Drop events count 0
    sethreshold_event: timestamp: '1668400801' --> SE COUNT: '863', Drop events count 0
    


    >Are you using all the modules/services and do you require logging of each and every option ticked ?
    True, some could be deactivated, but most are transferred to a central Elasticsearch cluster for log analysis.

  • So there are no drop events, because the event count has not exceeded the threshold.
    Still, I would ask you to untick the options that are not required, at least on the local reporting side.
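
As a rough way to double-check this reading of the output, the `sethreshold_event` lines can be summarized offline. The sketch below is not a Sophos tool; it assumes the line format is exactly as pasted above, and the function name `summarize` and the dictionary keys are made up for illustration. It reports how many samples were seen, the largest and average SE COUNT, and the total of the "Drop events count" fields.

```python
import re
from statistics import mean

# Matches lines in the format pasted in this thread, e.g.
# sethreshold_event: timestamp: '1668397800' --> SE COUNT: '43461', Drop events count 0
LINE_RE = re.compile(
    r"sethreshold_event: timestamp: '(\d+)' --> SE COUNT: '(\d+)', Drop events count (\d+)"
)

def summarize(lines):
    """Summarize sethreshold_event lines (hypothetical helper, not part of SFOS)."""
    rows = [tuple(map(int, m.groups())) for m in map(LINE_RE.search, lines) if m]
    counts = [se_count for _, se_count, _ in rows]
    return {
        "samples": len(rows),
        "max_se_count": max(counts, default=0),
        "mean_se_count": mean(counts) if counts else 0.0,
        "total_drops": sum(drops for _, _, drops in rows),
    }

# Three lines copied from the output above; in practice, read them from a
# saved copy of the grep output instead.
sample = [
    "sethreshold_event: timestamp: '1668397800' --> SE COUNT: '43461', Drop events count 0",
    "sethreshold_event: timestamp: '1668397801' --> SE COUNT: '774', Drop events count 0",
    "sethreshold_event: timestamp: '1668397860' --> SE COUNT: '42811', Drop events count 0",
]
summary = summarize(sample)
print(summary)
```

A `total_drops` of 0 matches the statement that no events were dropped. The SE COUNT values only show how many events were counted per logged interval, not what the threshold itself is.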

    Thanks & Regards,
    Vivek Jagad | Technical Account Manager 3 | Cyber Security Evolved

  • The log message "usercache_output: resolve_gr_cache for FW_PBR_MODULE failed" does not seem normal. Is it possible to give support access to the device? The engineering team would like to check what is causing that message to appear continuously. - Shrikant
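
To get a feel for which modules are affected and how often, the failure lines can be tallied from a saved copy of the tail output. This is only an illustrative sketch: it assumes the message format shown in the first post, and `count_failures` is a hypothetical helper name, not something shipped with the firewall.

```python
import re
from collections import Counter

# Matches lines such as:
# usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
FAIL_RE = re.compile(r"usercache_output: resolve_gr_cache for (\w+) failed")

def count_failures(lines):
    """Count resolve_gr_cache failures per module name (hypothetical helper)."""
    return Counter(m.group(1) for m in map(FAIL_RE.search, lines) if m)

# Lines copied from the first post; replace with the contents of a saved log excerpt.
sample = [
    "usercache_output: resolve_gr_cache for FW_PBR_MODULE failed",
    "usercache_output: resolve_gr_cache for FW_PBR_MODULE failed",
    "usercache_output: resolve_gr_cache for FW_GW_MODULE failed",
]
failures = count_failures(sample)
for module, n in failures.most_common():
    print(module, n)
```

In the excerpt from the first post, almost all entries name FW_PBR_MODULE, with a single FW_GW_MODULE entry.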

  • Certainly, no problem.
    Where should I send the login data?

  • Thank you, we will check and get back to you soon.

    Thanks & Regards,
    Vivek Jagad | Technical Account Manager 3 | Cyber Security Evolved

  • Hello,

    Here are the findings of the investigation:

    1.) The messages appearing in garner.log are caused by resolver failures* for the migrated policy routes.

    2.) Regarding the high CPU usage of garner after the upgrade: with the logging that is currently enabled, this utilization is apparently expected. I hope you have disabled the unnecessary logging on the local reporting side, as discussed previously.

    *Note: contrary to what you suspected, these resolve failures do not contribute much to the garner load.

    Workaround/Suggestions:

    Convert the migrated policy routes to SD-WAN routes, then delete the migrated policy routes.

    Thanks & Regards,
    Vivek Jagad | Technical Account Manager 3 | Cyber Security Evolved
