XG-managed APX 740 APs randomly going offline and dropping all clients

Feature and severity: I appear to have hit a bug with SFOS v18 v3 and APX 740 wireless access points that I consider moderately impacting.

Summary: I am unsure of the trigger, but every now and then (seemingly at random, though multiple times per day) all 3 of my APX 740s “appear” to go offline, then come back a minute or two later.

Observed behavior: All 3 APs drop all clients and stop broadcasting the SSIDs, then come back after two or three minutes. All the clients then need to re-home to the best AP. I run 3 APX 740s with an XG210 as the controller and have 3 SSIDs (one 2.4 GHz only, one 5 GHz only, and a guest SSID on both 2.4 and 5 GHz).

I thought it might be related to auto channel selection, so I manually set the channel on both the 5 GHz and 2.4 GHz radios on all APs (different channels, of course), but the problem persisted. I don’t recall having this issue previously, but it may have been happening without my being aware of it: I’ve recently added a significant number of home automation devices, so now it’s very noticeable when it happens.

I tried Sophos Central wireless and it was worse. I won’t go back to Central until several more releases have come out.

Reproduce it: This happens on its own many times a day, but I’m not aware of anything that forces it.

Supporting logs: The log viewer, under “SYSTEM”, shows the following (a brief excerpt):

2020-01-05 14:13:20 | SYSTEM | WirelessProtection | [MASTER] sending notification about offline AP P210018WVRKDY75 | 18006
2020-01-05 13:00:20 | SYSTEM | WirelessProtection | Successfully sent config to AP [P210018V7XJJH9F]. | 18007
2020-01-05 13:00:05 | SYSTEM | WirelessProtection | Successfully sent config to AP [P210018WVRKDY75]. | 18007
2020-01-05 12:59:48 | SYSTEM | WirelessProtection | [MASTER] sending notification about offline AP P210018V7XJJH9F | 18006
2020-01-05 12:59:27 | SYSTEM | WirelessProtection | [MASTER] sending notification about offline AP P210018WVRKDY75 | 18006
2020-01-05 12:57:29 | SYSTEM | WirelessProtection | Successfully sent config to AP [P210018V7XJJH9F]. | 18007
2020-01-05 12:56:49 | SYSTEM | WirelessProtection | [MASTER] sending notification about offline AP P210018V7XJJH9F | 18006
2020-01-05 12:56:05 | SYSTEM | WirelessProtection | Successfully sent config to AP [P210018WVRKDY75]. | 18007
2020-01-05 12:55:25 | SYSTEM | WirelessProtection | [MASTER] sending notification about offline AP P210018WVRKDY75 | 18006
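
To put a number on how long each AP is reported as gone, the entries in the excerpt above can be paired up per AP serial. This is only a minimal sketch: it assumes that the next "Successfully sent config" entry for a serial marks the AP reconnecting after its offline notification, which the log itself does not state. The names `EVENTS` and `offline_gaps` are made up for this illustration, and the figures measure notification-to-config-push time, not the actual radio outage.

```python
import re
from datetime import datetime

# (timestamp, message) pairs copied from the log excerpt, newest first.
EVENTS = [
    ("2020-01-05 14:13:20", "[MASTER] sending notification about offline AP P210018WVRKDY75"),
    ("2020-01-05 13:00:20", "Successfully sent config to AP [P210018V7XJJH9F]."),
    ("2020-01-05 13:00:05", "Successfully sent config to AP [P210018WVRKDY75]."),
    ("2020-01-05 12:59:48", "[MASTER] sending notification about offline AP P210018V7XJJH9F"),
    ("2020-01-05 12:59:27", "[MASTER] sending notification about offline AP P210018WVRKDY75"),
    ("2020-01-05 12:57:29", "Successfully sent config to AP [P210018V7XJJH9F]."),
    ("2020-01-05 12:56:49", "[MASTER] sending notification about offline AP P210018V7XJJH9F"),
    ("2020-01-05 12:56:05", "Successfully sent config to AP [P210018WVRKDY75]."),
    ("2020-01-05 12:55:25", "[MASTER] sending notification about offline AP P210018WVRKDY75"),
]

OFFLINE = re.compile(r"offline AP (\w+)")
CONFIG = re.compile(r"sent config to AP \[(\w+)\]")


def offline_gaps(events):
    """Return (serial, offline_time, seconds_until_next_config_push) tuples."""
    pending = {}  # serial -> time of the latest unmatched offline notification
    gaps = []
    # Timestamps are in YYYY-MM-DD HH:MM:SS form, so sorting the strings
    # puts the events in chronological order.
    for stamp, message in sorted(events):
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        match = OFFLINE.search(message)
        if match:
            pending[match.group(1)] = when
            continue
        match = CONFIG.search(message)
        if match and match.group(1) in pending:
            start = pending.pop(match.group(1))
            gaps.append((match.group(1), start, int((when - start).total_seconds())))
    return gaps


for serial, start, seconds in offline_gaps(EVENTS):
    print(f"{serial} reported offline at {start:%H:%M:%S}, config pushed {seconds}s later")
```

Under that assumption, every matched pair in this excerpt is well under a minute apart. The 14:13:20 notification has no matching config entry because the excerpt ends there.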

 

  • Check your DHCP server settings and logviewer for strange behaviour.

    Ian

  • In reply to rfcat_vk:

    The XG is my DHCP server. What are you suggesting I check? It’s one subnet with a few manual reservations (I could manually reserve the APs, but that seems unusual to me), and the rest of the settings are default. As for the log viewer, I thought what I saw was already strange, so I’m not sure I’d be able to identify anything stranger, so to speak.

    Only thing I think I changed was I enabled “conflict resolution” under DHCP.

    If you’re alluding to something on the client side: every single wirelessly connected device drops and the SSIDs stop broadcasting, so I don’t believe it’s that.

  • In reply to Dawg13:

    I am not alluding to anything, I am asking questions about your configuration that might/could be causing the issue.

    Next, have you deregistered the devices from Central wireless so they are managed only by the XG?

    Ian

  • In reply to rfcat_vk:

    Yes, I deregistered from Central. Candidly, this was happening before my trial with Central. I was hoping Central would fix it, but it was a worse experience for several reasons.

  • In reply to Dawg13:

    I should add that I rebooted the firewall two hours ago and haven’t seen any of the log messages since. I’ve done this before as well, though, so perhaps it’s an uptime-gated thing. I’ll post in the morning if any of the events are logged.

  • In reply to Dawg13:

    No entries about APs going offline. I’ll look again tonight and, if there is still nothing, check daily to make sure it’s not something that builds up over time.

  • In reply to Dawg13:

    No new offline entries as of tonight but I noticed this immediately following the reboot yesterday:

    2020-01-06 06:41:57 | SYSTEM | Wireless Protection | new firmware detected for APX740: 11.0.009-2 | 17998
    2020-01-06 06:41:57 | SYSTEM | Wireless Protection | new firmware detected for APX530: 11.0.009-2 | 17998

    The version shown under pattern updates is 11.0.009, not 11.0.009-2, but I’m unsure whether that’s cosmetic. The pattern update page shows no new firmware available.

    I’ll check the logs daily and post if I see the behavior start again.

  • Hello Jamie,

    Can you please specify which fixed channels you selected?

    Kind Regards,

    Suzzyx

  • In reply to suzzyx:

    Hi Suzzyx,

     

    For 2.4 GHz (3 APs):

    1, 6, 11

    For 5 GHz (2 APs):

    36, 44

    Side note: I could re-enable auto selection (I only ever used auto on 5 GHz) and see if it happens. Still no new AP offline entries in the log.
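
The fixed-channel plan above can be sanity-checked for overlap with a few lines of code. This is a minimal sketch, assuming a 20 MHz channel width on both bands (the thread does not say which width is configured) and using the standard channel-to-frequency mapping. Wider bonded channels are not modelled, and the helper names `center_mhz` and `overlapping_pairs` are invented for the example.

```python
from itertools import combinations


def center_mhz(channel):
    """Center frequency in MHz for 2.4 GHz channels 1-13 and 5 GHz channels 36-177."""
    if 1 <= channel <= 13:
        return 2407 + 5 * channel
    if 36 <= channel <= 177:
        return 5000 + 5 * channel
    raise ValueError(f"unsupported channel number: {channel}")


def overlapping_pairs(channels, width_mhz=20):
    """Return channel pairs whose centers are closer together than one channel width."""
    return [
        (a, b)
        for a, b in combinations(sorted(channels), 2)
        if abs(center_mhz(a) - center_mhz(b)) < width_mhz
    ]


print(overlapping_pairs([1, 6, 11]))  # 2.4 GHz plan from the post
print(overlapping_pairs([36, 44]))    # 5 GHz plan from the post
```

Channels 1, 6 and 11 are 25 MHz apart and channels 36 and 44 are 40 MHz apart, so neither list produces an overlapping pair at 20 MHz. That suggests the channel plan itself is not the cause, which fits with the problem persisting after auto selection was disabled.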

  • In reply to Dawg13:

    Started to occur again. I noticed memory use creeping up slowly: I have an XG210, and at boot it starts just under 30%; now it’s around 50%. With auto channel selection turned off, no firmware updates, and no changes made by me, I have no idea what config is even being sent to the APs:

     

    2020-01-08 12:31:41 | SYSTEM | Wireless Protection | Successfully sent config to AP [P210018V7XJJH9F]. | 18007
    2020-01-08 12:31:10 | SYSTEM | Wireless Protection | [MASTER] sending notification about offline AP P210018V7XJJH9F | 18006
    2020-01-08 11:59:51 | SYSTEM | Wireless Protection | Successfully sent config to AP [P210018V7XJJH9F]. | 18007
    2020-01-08 11:59:19 | SYSTEM | Wireless Protection | Successfully sent config to AP [P210018V7XJJH9F]. | 18007
    2020-01-08 11:58:46 | SYSTEM | Wireless Protection | [MASTER] sending notification about offline AP P210018V7XJJH9F | 18006
    2020-01-08 07:00:33 | SYSTEM | Wireless Protection | Successfully sent config to AP [P210018V7XJJH9F]. | 18007

  • In reply to Dawg13:

    Again:

    2020-01-10 16:39:38 | SYSTEM | Wireless Protection | Successfully sent config to AP [P210018WVRKDY75]. | 18007
    2020-01-10 16:39:05 | SYSTEM | Wireless Protection | [MASTER] sending notification about offline AP P210018WVRKDY75 | 18006
    2020-01-08 12:31:41 | SYSTEM | Wireless Protection | Successfully sent config to AP [P210018V7XJJH9F]. | 18007
    2020-01-08 12:31:10 | SYSTEM | Wireless Protection | [MASTER] sending notification about offline AP P210018V7XJJH9F | 18006
    2020-01-08 11:59:51 | SYSTEM | Wireless Protection | Successfully sent config to AP [P210018V7XJJH9F]. | 18007

  • In reply to Dawg13:

    Memory use up to 53% now as well.  Maybe unrelated but worth mentioning with the unexpected reboot 3 days ago.

  • In reply to Dawg13:

    Could you show us a little network diagram with your APs? 

    Are you using VLANs or are the APs directly attached? 

  • In reply to LuCar Toni:

    Happened again:

    2020-01-12 04:05:18 | SYSTEM | Wireless Protection | Successfully sent config to AP [P210018WVRKDY75]. | 18007
    2020-01-12 04:04:44 | SYSTEM | Wireless Protection | [MASTER] sending notification about offline AP P210018WVRKDY75 | 18006
    2020-01-11 08:29:30 | SYSTEM | Wireless Protection | Successfully sent config to AP [P210018V7XJJH9F]. | 18007
    2020-01-11 08:26:51 | SYSTEM | Wireless Protection | Successfully sent config to AP [P210018V7XJJH9F]. | 18007

  • In reply to LuCar Toni:

    Yes, I can. It’s a small test network with roughly 50 devices, most of them wireless. I’m not using VLANs, although I have a test VLAN configured (tagged) on all 48 switch ports and a VLAN interface on the firewall that isn’t used (it is, however, configured on the only LAN interface I have). The DHCP network for that VLAN is turned off as well. Much of this was put in place to test Central wireless. I’ll put a straw diagram together today and post it.