
Reboot taking 40 minutes on XGS 3100

The last couple of times I have had to reboot our XGS 3100 it has taken a very long time to become active again.

I paid more attention this last time, and the hard disk LED was almost constantly lit. There were very brief periods when it went out, started flickering and then went solid again. It took 30 minutes for the network interfaces to come up and a further 10 minutes before all connectivity was restored.

The last two reboots were to restore full functionality to our site-to-site IPsec tunnels, but other than that the XGS was working fine.

Hard disk space utilisation:
configuration 10%
content 4%
report 22%

Running SFOS 21.0.0 GA-Build169

Any suggestions?



Added TAGs
[edited by: Raphael Alganes at 10:25 AM (GMT -8) on 14 Jan 2025]
  • You could connect a serial console to it and record the session next time (or to reproduce the problem).

    https://support.sophos.com/support/s/article/KBA-000003810?language=en_US

    MobaXterm has an option to record the entire session. I even have a Raspberry Pi that dumps all console output (from USB) to a log file, which is very handy for this kind of problem in general, not only with Sophos.
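
For anyone who wants to try the log-to-file idea on a Raspberry Pi or any Linux laptop, here is a minimal sketch using only the Python standard library. The device path, the script name and the 38400 8N1 serial settings are assumptions; check them against the console article linked above and your own USB adapter before relying on them.

```python
import os
import sys
import time


def timestamp_lines(stream, out, clock=time.time):
    """Copy lines from a binary stream to a text log, one timestamp per line."""
    count = 0
    for raw in iter(stream.readline, b""):
        text = raw.decode("utf-8", errors="replace").rstrip("\r\n")
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(clock()))
        out.write(f"{stamp} {text}\n")
        out.flush()  # keep the log current if the capture is interrupted
        count += 1
    return count


def open_serial(path):
    """Open a serial device in raw 8N1 mode (Unix only)."""
    import termios  # imported here because it does not exist on Windows

    fd = os.open(path, os.O_RDONLY | os.O_NOCTTY)
    attrs = termios.tcgetattr(fd)
    attrs[0] = 0  # input flags: no translation
    attrs[1] = 0  # output flags: no translation
    attrs[2] = termios.CS8 | termios.CREAD | termios.CLOCAL  # 8 data bits, no parity
    attrs[3] = 0  # local flags: no echo, no line editing
    attrs[4] = termios.B38400  # input speed (assumed, verify for your appliance)
    attrs[5] = termios.B38400  # output speed
    attrs[6][termios.VMIN] = 1  # block until at least one byte arrives
    attrs[6][termios.VTIME] = 0
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
    return os.fdopen(fd, "rb", buffering=0)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python3 console_log.py /dev/ttyUSB0 boot.log
    with open_serial(sys.argv[1]) as port, \
            open(sys.argv[2], "a", encoding="utf-8") as log:
        try:
            timestamp_lines(port, log)
        except KeyboardInterrupt:
            pass  # stop with Ctrl-C once the appliance is back up
```

Start it before triggering the reboot and stop it with Ctrl-C afterwards. The timestamps come from the logging machine's clock, so they are independent of whatever the appliance thinks the time is.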


  • I can do this but it will have to wait until I have swapped it out of production. The I/O errors from the log don't look good and I can't afford to be offline 40 minutes every time I try something. Access to everything goes through the XGS, not just the WAN.

    I like the idea of the Raspberry Pi, but I don't have enough knowledge to set one up easily, and it wouldn't be worth the time investment to learn, given how rarely I would use it. I'll just have to make do with a boring old laptop. Sounds neat though.

  • The first step would be to find out what the appliance does when it boots.
    It sounds to me like extensive checks. Your log will show when the "Linux Kernel" is loaded; if that entry appears about 40 minutes after you started the reboot, the system is not loading Linux during that time and is running checks instead.

    The serial session will give more insight into what the appliance is doing.

    Do you have an HA cluster? It would be interesting to know whether both appliances have the same issue.
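
Once a console capture exists, the longest pauses between consecutive lines show where the boot time goes. The sketch below assumes every captured line starts with a `YYYY-MM-DD HH:MM:SS` timestamp added by the capturing tool; that format is an assumption about the capture, not something SFOS is known to emit, and `largest_gaps` is a hypothetical helper name.

```python
from datetime import datetime

STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
STAMP_LEN = len("2025-01-14 10:25:00")


def largest_gaps(lines, top=3):
    """Return the `top` longest pauses as (seconds, line_before, line_after)."""
    stamped = []
    for line in lines:
        try:
            when = datetime.strptime(line[:STAMP_LEN], STAMP_FORMAT)
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        stamped.append((when, line.rstrip("\n")))
    gaps = [
        ((after[0] - before[0]).total_seconds(), before[1], after[1])
        for before, after in zip(stamped, stamped[1:])
    ]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]
```

Calling `largest_gaps(open("boot.log", encoding="utf-8"))` on a capture file (the file name is a placeholder) returns the lines on either side of each long pause, which indicates whether the time is lost before or after the kernel starts.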

