XG 135 random reboots

Hi

 

I have an XG 135 running SFOS 17.5.8 MR-8, and it has very recently started rebooting by itself at random during the day. It has now happened on two separate occasions.

All I can see in the logs is that the device has started, with no reason given for the sudden reboot, and all statistics (e.g. under Firewall Rules) have reset to zero.

Anyone else experiencing this?

Louis

  • Hi  

    Sorry for the inconvenience caused!

    I would recommend the below steps.

    1. Check the system graphs and verify CPU and other resource usage around the approximate time the issue occurred: https://community.sophos.com/kb/en-us/123186

    2. Please run the commands below from the Sophos SSH console and share the output in a log file: https://community.sophos.com/kb/en-us/133678

                  console> system diagnostics show sysmsg

                  console> system diagnostics show syslog lines 1000
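
    Once the output of the two commands above has been saved to a text file, a quick keyword scan can help narrow down where to look. The following is only a rough Python sketch: the keyword list is an assumption based on generic Linux kernel trouble messages, not a Sophos-documented list, and the function name `find_suspect_lines` is made up for illustration.

```python
import re
import sys

# Assumed keywords: generic Linux kernel trouble markers, not an official Sophos list.
SUSPECT = re.compile(
    r"\b(panic|oops|watchdog|soft lockup|stall(ed)?|out of memory|oom-killer|machine check)\b",
    re.IGNORECASE,
)


def find_suspect_lines(lines):
    """Return (line_number, text) pairs for lines that contain a suspect keyword."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if SUSPECT.search(line)
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # The path to the saved console output is passed as the first argument.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for number, text in find_suspect_lines(handle):
            print(f"{number}: {text}")
```

    Saved under a name of your choice and run with the path of the saved output as its only argument, it prints the matching lines with their line numbers. An empty result does not rule out a problem; it only means none of the assumed keywords appeared.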

  • In reply to Keyur:

    The incident happened just before 08:47 in the morning.

    The graphs seem normal at the time, except for a brief spike on boot, which I think is normal.

    I had a look at the console log output as suggested and, with my limited understanding, I can see nothing specific at the time of the incident. I would prefer not to share the logs publicly, as they contain sensitive IP addresses and other information.
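
    For anyone in the same position, one option is to mask the addresses before sharing a log. Below is a minimal Python sketch under a few assumptions: it only handles IPv4 addresses in dotted form, the `<IP-n>` placeholder format and the function name `redact_ips` are invented for illustration, and it does nothing about IPv6 addresses, hostnames, usernames, or other sensitive data.

```python
import re

# Four dot-separated groups of one to three digits; octet ranges are checked below.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact_ips(text):
    """Replace each distinct IPv4 address with a stable placeholder such as <IP-1>."""
    placeholders = {}

    def replace(match):
        token = match.group(0)
        # Skip strings that only look like addresses (an octet above 255).
        if any(int(part) > 255 for part in token.split(".")):
            return token
        if token not in placeholders:
            placeholders[token] = f"<IP-{len(placeholders) + 1}>"
        return placeholders[token]

    return IPV4_CANDIDATE.sub(replace, text)
```

    The same address always maps to the same placeholder within one call, so the flow of traffic between hosts stays readable after masking. The result should still be reviewed by hand before it is posted anywhere.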

  • In reply to Louis Swanepoel:

    We haven't seen reboots, but we've seen our 430 just completely lock up and not pass traffic. After it happened twice within a two-week span, Sophos support sent me a replacement unit. I'm not sure if it's actually a hardware or software issue; support didn't say.

  • In reply to Jbo:

    Yeah... mine locks up too for a minute or so, during which zero traffic will pass, and then it hard resets/reboots itself.

  • In reply to Louis Swanepoel:

    Hi  

    Thank you for sharing the details.

    The graphs seem to be normal. You can PM us the log file if you have captured it.

    Please make sure that SSH and other device access from the WAN zone are disabled, and that proper IPS is applied if there is a DNAT configuration on the firewall.

    To capture the logs for such an instance in the future, we need to follow the steps given below, though I hope it won't happen at all.

    1. Connect to the device over serial console access: https://community.sophos.com/kb/en-us/130693

    2. This will capture all the logs and allow us to analyze such a situation.

  • In reply to Keyur:

    Case logged, ID: #9305406

  • In reply to Louis Swanepoel:

    Hi  

    Thank you for sharing the case number. I will keep an eye on the service request.

  • Louis,

    I would also suggest checking the HW components (memory, disks, Ethernet) by following this KB:

    https://community.sophos.com/kb/en-us/125025

    Regards

  • In reply to lferrara:

    We had it happen again today with our 430. I was searching in the firewall logs and it completely locked up. When I looked at the front panel of the device, the drive activity light was almost solid blue. Also, the front LCD panel did nothing; I couldn't cycle through any of the options. So I had to power cycle the device to get it back up.

     

    We are replacing the unit tomorrow during a maintenance window with the replacement Sophos sent us. Still not sure if it's hardware or software.

  • In reply to Jbo:

    I would investigate that by performing the HW checks, so at least you know whether the issue was the HW, and next time you can save time and reduce the number of reboots.

  • In reply to lferrara:

    This issue could be caused by a bug, which should be resolved by replacing the hardware.

    Maybe check:

    console> system auto-reboot-on-stall show
    Auto reboot system on CPU stall is disabled

    With this enabled, the hardware would at least reboot if it freezes.

  • In reply to LuCar Toni:

    Lucar,

    What do you mean by "a bug"?

    A SW bug? So is everyone who has this model affected?

    Are you sure about that?

  • In reply to lferrara:

    I am not involved in the bug follow-up process, but I know of some customers with such an issue. They replaced their hardware and the issue was gone.

    So I am not quite sure what was going on there, but I assume it is some kind of kernel panic.

    But compared to the number of installations I monitor (partners and customers), this is a small number of issues.