
Cannot log in to UI after a while

I am running XG at home (v21), and after a few days I am unable to log in to the firewall by any method: SSH, web UI, or local console. The login simply fails. I know the password is correct because it is accepted when things are working.

While I am unable to log in, the firewall still passes traffic and I have Internet access. However, SMTP messages from the XG or via the XG mail relay stop working as well.

I can tell when the issue starts because the CAPTCHA appears on the web admin login screen even though I have disabled it for web admin.

The only solution is to restart the hardware (hard reset) and let the firewall reboot. 

Is there any way to find out what is causing this issue? I do not see anything in the system logs that would point to a problem. There does not seem to be a fixed interval at which this happens. It is usually 3-5 days.

Aside from restarting the system regularly, I am not sure what options I have other than finding another solution. I have been using the XG for several years, and this problem started back on v17 or v18, if I recall.

Any help would be appreciated, thank you.



Edited TAGs
[edited by: Erick Jan at 12:07 AM (GMT -8) on 11 Dec 2024]
  • Hi,

    did you restart the XG?

    Ian

    XG115W - v20.0.3 MR-3 - Home

    XG on VM 8 - v21 GA

    If a post solves your question please use the 'Verify Answer' button.

  • I did clean up some linked NAT rules that were migrated from old versions, refined the existing rules, and deleted anything I was not using. If the rules run top down, I am not sure how a loop could occur. Regardless, the cleanup didn't hurt. Zombie processes are still around 24k in top and 2300 showing in ATOP.

    So, I guess that did not have an impact, yet.

  • Thanks for the info.  I was hoping a fix was on the horizon.  It would be nice if it applied to my issue too.

  • I had an XGS 3100 acting similarly. Not a full lockout, but the web console was slowly becoming unusable. A reboot fixed it.

    Support claims an emergency hot patch was pushed out the other day that may deal with this. Hopefully it resolves your issue as well.

  • Hi,

    I would suggest you review all your firewall and NAT rules before rebuilding to ensure you don't have any loops.

    Ian


  • What do the zombie processes represent? Is there any way to track the issue down? If I were to rebuild and restore from backup, would I be bringing the problem with me?
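
A zombie is a process that has already exited but whose parent has not yet collected its exit status with `wait()`. It uses no CPU and almost no memory, but it keeps its process-table entry and PID. Thousands of them usually mean that one parent process is not reaping its children, so the practical question is which parent owns them. Below is a minimal sketch for finding that out. It assumes a Linux `/proc` filesystem and a Python 3 interpreter, which may not exist on the firewall's advanced shell (in that case the same `/proc/<pid>/stat` fields can be read by hand). The function names are made up for illustration.

```python
import os
from collections import Counter


def parse_stat(text):
    """Parse one /proc/<pid>/stat record into (pid, comm, state, ppid)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    fields = text[right + 1:].split()
    # After comm, the first field is the state and the second is the parent PID.
    return pid, comm, fields[0], int(fields[1])


def read_proc(proc_root="/proc"):
    """Collect stat records for every numeric directory under proc_root."""
    records = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "stat")) as fh:
                records.append(parse_stat(fh.read()))
        except (OSError, ValueError):
            # The process vanished or the record was unreadable; skip it.
            continue
    return records


def zombies_by_parent(records):
    """Return [(ppid, parent_name, zombie_count)], largest count first."""
    names = {pid: comm for pid, comm, _, _ in records}
    counts = Counter(ppid for _, _, state, ppid in records if state == "Z")
    return [(ppid, names.get(ppid, "?"), n) for ppid, n in counts.most_common()]


if __name__ == "__main__":
    for ppid, comm, n in zombies_by_parent(read_proc())[:10]:
        print(f"{n:6d} zombies  parent pid {ppid} ({comm})")
```

Zombies cannot be killed directly. They only disappear when the parent reaps them or the parent itself exits, so the parent name at the top of this list is the thing to report to support.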

  • Hi,

    There appears to be something wrong with your configuration, given the number of zombie processes.

    I checked my XG115W and it has 0 zombies.

    Ian


  • Physical machine. Only 30% disk usage.

    atop:

    ATOP - localhost    2024/12/13  12:09:42    ----x---------     3d0h19m1s elapsed
    PRC | sys   87m46s | user  99m45s | #proc  22447 | #zombie 22e3 | no  procacct |
    CPU | sys       3% | user      5% | irq       0% | idle    391% | wait      0% |
    CPL | avg1    0.25 | avg5    0.29 | avg15   0.18 | csw 138160e4 | intr 77902e4 |
    MEM | tot     5.8G | free  265.8M | cache   1.4G | buff  181.1M | slab  602.7M |
    SWP | tot     5.8G | free    5.7G |              | vmcom  11.6G | vmlim   8.7G |
    PAG | scan 2996949 | steal 2876e3 | stall      0 | swin    1705 | swout  35808 |
    LVM |    mountconf | busy      0% | read    7281 | write 367207 | avio 0.51 ms |
    LVM |        nvram | busy      0% | read    7856 | write     41 | avio 1.84 ms |
    LVM |    cryptroot | busy      0% | read  424594 | write      0 | avio 0.01 ms |
    LVM |       fwmgmt | busy      0% | read      14 | write      0 | avio 3.43 ms |
    DSK |          sda | busy      0% | read  423489 | write 2003e3 | avio 0.49 ms |
    NET | transport    | tcpi 5665674 | tcpo 5088625 | udpi 1461964 | udpo 2292282 |
    NET | network      | ipi  1505712 | ipo 10274467 | ipfrw 2808e3 | deliv 7382e3 |
    NET | Port2     0% | pcki 24335e3 | pcko 18823e3 | si  536 Kbps | so  425 Kbps |
    NET | Port1     0% | pcki 20276e3 | pcko 23503e3 | si  437 Kbps | so  496 Kbps |
      PID SYSCPU USRCPU  VGROW  RGROW  RDDSK  WRDSK  THR S CPUNR  CPU CMD     1/3742
                                                                                    
     3212 41m59s  0.00s     0K     0K     0K     0K    1 S     1   1% vfp_mflow_time
    22580  3m14s  6m15s   2.1G 855.4M  1465K    24K    7 S     2   0% snort        

    top:

    top - 12:47:29 up 3 days, 56 min,  1 user,  load average: 0.25, 0.21, 0.19      
    Tasks: 22693 total,   1 running, 457 sleeping,   0 stopped, 22235 zombie        
    %Cpu(s):  2.6 us,  3.3 sy,  0.0 ni, 94.1 id,  0.1 wa,  0.0 hi,  0.0 si,  0.0 st
    MiB Mem :   5948.0 total,    301.0 free,   3832.2 used,   1814.8 buff/cache     
    MiB Swap:   5944.3 total,   5799.6 free,    144.8 used.   1512.0 avail Mem      
                                                                                    
      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND    
    27893 root      20   0   43936  28196   2064 R   7.1   0.5   0:00.68 top        
     3212 root      20   0       0      0      0 S   0.9   0.0  42:20.47 vfp_mflow+
    22580 root      20   0 2204840 875892  29600 S   0.6  14.4   9:40.24 snort      
    22581 root      20   0 2204840 875988  29636 S   0.6  14.4   9:43.19 snort      
       28 root      20   0       0      0      0 S   0.3   0.0   0:10.24 ksoftirqd+
     2071 nobody    20   0   79776  14632  14288 S   0.3   0.2   0:26.36 postgres   
     4733 root      20   0 3000288 347692  14556 S   0.3   5.7   4:50.27 java       
     4734 root      20   0  245216   4084   3724 S   0.3   0.1   6:37.42 appcached  
     4762 nobody    20   0   61908   3096   1676 S   0.3   0.1   2:44.77 redis-ser+
     4782 nobody    20   0   61908   3208   1960 S   0.3   0.1   2:43.32 redis-ser+
     6615 root      20   0   23632   4520   2692 S   0.3   0.1   1:02.80 fqdnd      
     8095 nobody    20   0 1999572  13492   4368 S   0.3   0.2   0:02.88 httpd      
    22578 root      20   0 2204840 875900  29620 S   0.3  14.4  12:55.02 snort      
    22579 root      20   0 2204840 876172  29812 S   0.3  14.4  10:19.03 snort      
        1 root      20   0   22936   2084   2064 S   0.0   0.0   0:01.44 init       
        2 root      20   0       0      0      0 S   0.0   0.0   0:00.00 kthreadd   
        3 root      20   0       0      0      0 I   0.0   0.0   0:00.00 kworker/0+
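
For a rough sense of scale, the numbers in the top output above can be turned into a growth rate. This is a back-of-the-envelope sketch, not a diagnosis. It assumes zombies accumulate at a steady rate, and it assumes the kernel's PID limit is the common Linux default of 32768, which is not confirmed for SFOS (`cat /proc/sys/kernel/pid_max` shows the real value). The function name is hypothetical.

```python
def pid_exhaustion_estimate(zombies, total_tasks, uptime_hours, pid_max=32768):
    """Return (zombies per hour, hours left until pid_max tasks exist).

    Assumes linear growth; pid_max=32768 is an assumed default, not a
    value read from the firewall.
    """
    rate = zombies / uptime_hours
    remaining = max(pid_max - total_tasks, 0)
    return rate, remaining / rate


# Figures taken from the top output: up 3 days 56 min,
# 22693 tasks in total, 22235 of them zombies.
uptime_hours = 3 * 24 + 56 / 60
rate, hours_left = pid_exhaustion_estimate(22235, 22693, uptime_hours)
days_from_boot = (uptime_hours + hours_left) / 24

print(f"about {rate:.0f} new zombies per hour")
print(f"PID limit reached roughly {days_from_boot:.1f} days after boot")
```

Under those assumptions the PID space would run out a little over four days after boot, which is in the same range as the 3-5 days reported in the first post. If that is what happens, new processes (including login sessions) could no longer be started while existing ones keep running, but the match is suggestive at best until the real `pid_max` and the owning parent process are known.
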
  • This is a physical machine. The disk is about 30% used.

    TOP:

    top - 12:08:53 up 3 days, 18 min,  1 user,  load average: 0.44, 0.33, 0.19      
    Tasks: 22442 total,   1 running, 456 sleeping,   0 stopped, 21985 zombie        
    %Cpu(s):  9.5 us,  3.7 sy,  0.0 ni, 86.6 id,  0.1 wa,  0.0 hi,  0.1 si,  0.0 st
    MiB Mem :   5948.0 total,    275.3 free,   3836.7 used,   1836.0 buff/cache     
    MiB Swap:   5944.3 total,   5807.3 free,    137.0 used.   1521.1 avail Mem      
                                                                                    
      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND    
     7353 root      20   0   43948  28004   2160 R   7.7   0.5   0:01.95 top        
     1882 root      20   0   29220  13316   5320 S   4.6   0.2   4:50.76 garner     
     2819 root      20   0   97352  61424   8072 S   1.2   1.0   9:27.17 garner     
     3212 root      20   0       0      0      0 S   0.9   0.0  41:58.05 vfp_mflow+


  • What is the disk capacity?

    Is this a physical machine or a VM?

    Ian
