
[garner] constant 30% CPU, resolve cache error

Hi there,
We are running a Sophos XG230 on v19.01.
The garner process is permanently using about 30% CPU.
Looking closer with "tail", the log shows the following:

usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_GW_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
usercache_output: resolve_gr_cache for FW_PBR_MODULE failed
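
To see at a glance which modules these failures refer to, the lines can be tallied per module. The following is only a minimal Python sketch run against a copy of the log lines quoted above; the line format is taken from this thread alone, not from Sophos documentation, and the function name `count_resolve_failures` is a hypothetical helper.

```python
import re
from collections import Counter

# Pattern for the failure lines as quoted in this thread (assumed format).
FAILURE_RE = re.compile(r"resolve_gr_cache for (\w+) failed")

def count_resolve_failures(lines):
    """Return a Counter mapping module name to number of failure lines."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# A few lines copied from the excerpt above.
sample = [
    "usercache_output: resolve_gr_cache for FW_PBR_MODULE failed",
    "usercache_output: resolve_gr_cache for FW_PBR_MODULE failed",
    "usercache_output: resolve_gr_cache for FW_GW_MODULE failed",
    "usercache_output: resolve_gr_cache for FW_PBR_MODULE failed",
]

failure_counts = count_resolve_failures(sample)
for module, count in failure_counts.most_common():
    print(module, count)
```

Lines that do not match the pattern are ignored, so the same function can be fed a whole log file line by line.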

I don't think this is normal.
How can I solve this problem?

Thanks,

Stefan



  • Hello,

    Thank you for reaching out to the community. On the CLI, select option 5. Device Management, then option 3. Advanced Shell, and run this command:

    service garner:restart -ds nosync

    Then check the logs again.

    Thanks & Regards,
    _______________________________________________________________

    Vivek Jagad | Team Lead, Global Support & Services 

    Log a Support Case | Sophos Service Guide
    Best Practices – Support Case


    Sophos Community | Product Documentation | Sophos Techvids | SMS
    If a post solves your question please use the 'Verify Answer' button.

  • Hey, please also share the output of: less garner.log | grep threshold
    Are you using all the modules/services, and do you need logging for every option that is ticked?


  • less garner.log | grep threshold
    sethreshold_event: timestamp: '1668397800' --> SE COUNT: '43461', Drop events count 0
    sethreshold_event: timestamp: '1668397801' --> SE COUNT: '774', Drop events count 0
    sethreshold_event: timestamp: '1668397860' --> SE COUNT: '42811', Drop events count 0
    sethreshold_event: timestamp: '1668397920' --> SE COUNT: '42574', Drop events count 0
    sethreshold_event: timestamp: '1668397981' --> SE COUNT: '44601', Drop events count 0
    sethreshold_event: timestamp: '1668398040' --> SE COUNT: '42601', Drop events count 0
    sethreshold_event: timestamp: '1668398100' --> SE COUNT: '44057', Drop events count 0
    sethreshold_event: timestamp: '1668398101' --> SE COUNT: '659', Drop events count 0
    sethreshold_event: timestamp: '1668398161' --> SE COUNT: '42348', Drop events count 0
    sethreshold_event: timestamp: '1668398221' --> SE COUNT: '43394', Drop events count 0
    sethreshold_event: timestamp: '1668398281' --> SE COUNT: '43059', Drop events count 0
    sethreshold_event: timestamp: '1668398340' --> SE COUNT: '42813', Drop events count 0
    sethreshold_event: timestamp: '1668398400' --> SE COUNT: '44245', Drop events count 0
    sethreshold_event: timestamp: '1668398401' --> SE COUNT: '724', Drop events count 0
    sethreshold_event: timestamp: '1668398461' --> SE COUNT: '42706', Drop events count 0
    sethreshold_event: timestamp: '1668398521' --> SE COUNT: '42717', Drop events count 0
    sethreshold_event: timestamp: '1668398581' --> SE COUNT: '43595', Drop events count 0
    sethreshold_event: timestamp: '1668398641' --> SE COUNT: '43426', Drop events count 0
    sethreshold_event: timestamp: '1668398700' --> SE COUNT: '43507', Drop events count 0
    sethreshold_event: timestamp: '1668398701' --> SE COUNT: '360', Drop events count 0
    sethreshold_event: timestamp: '1668398760' --> SE COUNT: '42634', Drop events count 0
    sethreshold_event: timestamp: '1668398820' --> SE COUNT: '43458', Drop events count 0
    sethreshold_event: timestamp: '1668398881' --> SE COUNT: '43343', Drop events count 0
    sethreshold_event: timestamp: '1668398941' --> SE COUNT: '44051', Drop events count 0
    sethreshold_event: timestamp: '1668399001' --> SE COUNT: '44058', Drop events count 0
    sethreshold_event: timestamp: '1668399002' --> SE COUNT: '429', Drop events count 0
    sethreshold_event: timestamp: '1668399060' --> SE COUNT: '42286', Drop events count 0
    sethreshold_event: timestamp: '1668399120' --> SE COUNT: '42492', Drop events count 0
    sethreshold_event: timestamp: '1668399181' --> SE COUNT: '43077', Drop events count 0
    sethreshold_event: timestamp: '1668399241' --> SE COUNT: '43979', Drop events count 0
    sethreshold_event: timestamp: '1668399300' --> SE COUNT: '43729', Drop events count 0
    sethreshold_event: timestamp: '1668399301' --> SE COUNT: '548', Drop events count 0
    sethreshold_event: timestamp: '1668399360' --> SE COUNT: '43171', Drop events count 0
    sethreshold_event: timestamp: '1668399420' --> SE COUNT: '42865', Drop events count 0
    sethreshold_event: timestamp: '1668399480' --> SE COUNT: '42672', Drop events count 0
    sethreshold_event: timestamp: '1668399541' --> SE COUNT: '44439', Drop events count 0
    sethreshold_event: timestamp: '1668399601' --> SE COUNT: '44283', Drop events count 0
    sethreshold_event: timestamp: '1668399601' --> SE COUNT: '38', Drop events count 0
    sethreshold_event: timestamp: '1668399660' --> SE COUNT: '42354', Drop events count 0
    sethreshold_event: timestamp: '1668399720' --> SE COUNT: '42993', Drop events count 0
    sethreshold_event: timestamp: '1668399780' --> SE COUNT: '43841', Drop events count 0
    sethreshold_event: timestamp: '1668399840' --> SE COUNT: '43175', Drop events count 0
    sethreshold_event: timestamp: '1668399901' --> SE COUNT: '43590', Drop events count 0
    sethreshold_event: timestamp: '1668399902' --> SE COUNT: '1042', Drop events count 0
    sethreshold_event: timestamp: '1668399962' --> SE COUNT: '43042', Drop events count 0
    sethreshold_event: timestamp: '1668400021' --> SE COUNT: '42922', Drop events count 0
    sethreshold_event: timestamp: '1668400080' --> SE COUNT: '42854', Drop events count 0
    sethreshold_event: timestamp: '1668400140' --> SE COUNT: '43417', Drop events count 0
    sethreshold_event: timestamp: '1668400201' --> SE COUNT: '43764', Drop events count 0
    sethreshold_event: timestamp: '1668400202' --> SE COUNT: '793', Drop events count 0
    sethreshold_event: timestamp: '1668400261' --> SE COUNT: '42428', Drop events count 0
    sethreshold_event: timestamp: '1668400321' --> SE COUNT: '43708', Drop events count 0
    sethreshold_event: timestamp: '1668400381' --> SE COUNT: '44033', Drop events count 0
    sethreshold_event: timestamp: '1668400440' --> SE COUNT: '42492', Drop events count 0
    sethreshold_event: timestamp: '1668400500' --> SE COUNT: '43310', Drop events count 0
    sethreshold_event: timestamp: '1668400501' --> SE COUNT: '1050', Drop events count 0
    sethreshold_event: timestamp: '1668400561' --> SE COUNT: '42383', Drop events count 0
    sethreshold_event: timestamp: '1668400621' --> SE COUNT: '43137', Drop events count 0
    sethreshold_event: timestamp: '1668400681' --> SE COUNT: '42849', Drop events count 0
    sethreshold_event: timestamp: '1668400741' --> SE COUNT: '44394', Drop events count 0
    sethreshold_event: timestamp: '1668400800' --> SE COUNT: '42627', Drop events count 0
    sethreshold_event: timestamp: '1668400801' --> SE COUNT: '863', Drop events count 0
    


    >Are you using all the modules/services and do you require logging of each and every option ticked ?
    True, some could be deactivated, but most are transferred to a central Elasticsearch cluster for log analysis.

  • So there are no drop events, because the event count has not exceeded the threshold.
    Still, I would ask you to untick the options that are not required, at least on the local reporting side.
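
The reading above (no drop events in the threshold output) can be checked mechanically rather than by eye. The following is only a rough Python sketch over a copy of the grep output; the line format is taken from this thread, the meaning of the "SE COUNT" field is not documented here, and `summarize_threshold_events` is a hypothetical helper name.

```python
import re

# Pattern for the sethreshold_event lines as quoted in this thread (assumed format).
EVENT_RE = re.compile(
    r"sethreshold_event: timestamp: '(\d+)' --> SE COUNT: '(\d+)', "
    r"Drop events count (\d+)"
)

def summarize_threshold_events(lines):
    """Sum the SE COUNT and drop fields and track the largest SE COUNT seen."""
    summary = {"total_events": 0, "total_dropped": 0, "peak_count": 0}
    for line in lines:
        match = EVENT_RE.search(line)
        if not match:
            continue  # skip anything that is not a threshold line
        se_count = int(match.group(2))
        dropped = int(match.group(3))
        summary["total_events"] += se_count
        summary["total_dropped"] += dropped
        summary["peak_count"] = max(summary["peak_count"], se_count)
    return summary

# First four lines of the output posted above.
sample = [
    "sethreshold_event: timestamp: '1668397800' --> SE COUNT: '43461', Drop events count 0",
    "sethreshold_event: timestamp: '1668397801' --> SE COUNT: '774', Drop events count 0",
    "sethreshold_event: timestamp: '1668397860' --> SE COUNT: '42811', Drop events count 0",
    "sethreshold_event: timestamp: '1668397920' --> SE COUNT: '42574', Drop events count 0",
]

summary = summarize_threshold_events(sample)
print(summary)
```

A non-zero `total_dropped` would indicate that events were being dropped, which is not the case in the output posted in this thread.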


  • The log entry "usercache_output: resolve_gr_cache for FW_PBR_MODULE failed" does not look normal. Is it possible to give support access to the device? The engineering team would like to check what is causing that entry to be logged continuously. - Shrikant

  • Certainly, no problem.
    Where should I send the login data?


  • Thank you, we will check and get back to you soon.


  • Hello,

    Here are the findings of the investigation:

    1.) The entries appearing in garner.log are related to resolver failures* for migrated policy routes.

    2.) Regarding the high CPU usage of garner after the upgrade: with the logging currently enabled, this utilization is apparently expected. I hope you have disabled the unnecessary logging on the local reporting side, as discussed previously.

    *Note: these resolve failures do not contribute much to the garner load, contrary to what you suspected.

    Workaround/Suggestions:

    Convert the migrated policy routes to SD-WAN routes, then delete the migrated policy routes.


