User Authenticated in Current Activities but not being used in Firewall/Web

Hi all,

I have run into an issue where users log in via STAS and show up on the Current Activities screen, but they are then blocked because the Log Viewer shows them as unauthenticated. It's a really weird disconnect I've never seen before: they are authenticated, but their auth either isn't being passed to the other services or isn't being referenced.

I have rebuilt the node. It was originally a v17.1.3 install that had been upgraded, so in case that was the issue I wiped it and installed v17.4.0-Beta2 directly.

I have double and triple checked all the auth configuration and it is a pretty standard, almost basic, setup: 2 DCs, SSO Suite in the same group on STAS on XG, services are configured correctly, Bob's your uncle.

If someone has seen this before, or if this is a known issue in Jira for the next Beta/GA release, could you let me know? That'd be fab!

I haven't had a chance to trawl through the intense amount of logs yet for the access_server and awarrenhttp.

Emile

  • What does the output of "ipset -L lusers" show for the source address? When a packet comes into the LAN, the firewall first checks whether it already knows any "stas" status for it. If it does not, it sends off a query to the collector. The collector will try a WMI query or registry query against the client and, based on the result, pass its answer back to the firewall. While the query is running there is a bit of a grace period where traffic is allowed until the query times out or the collector responds.

    The output of "ipset -L lusers" has 4 states: Learning, LearningRetry1, Unauthenticated and authenticated. The authenticated state has no label of its own; it is indicated by the user's numeric "id" being appended to the address in the output.

    Learning: the first state; a packet from an unknown address has arrived.

    LearningRetry1: the second attempt at correlating a user to the IP.

    Unauthenticated: the Learning and LearningRetry1 phases were unsuccessful; traffic from the source address is dropped for the time period defined by the "unauth-traffic" value.

    It would be interesting to see what stas state your test system is in when you see the problem.
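
To make the progression described above easier to follow, here is a toy Python model of it. It is only a sketch of the sequence as described in this post, not the firewall's actual logic; the function name `walk_states` and the idea of feeding it one boolean per lookup attempt are assumptions made for illustration.

```python
from typing import Iterable, List

# State names as used in the post above. The two probing states are tried in
# order; this is an illustration, not the real firewall implementation.
PROBE_STATES = ["Learning", "LearningRetry1"]


def walk_states(collector_answers: Iterable[bool]) -> List[str]:
    """Return the states an address passes through.

    collector_answers yields one boolean per lookup attempt: True when the
    collector mapped the IP to a user, False (or nothing) when it did not.
    """
    answers = iter(collector_answers)
    visited = []
    for state in PROBE_STATES:
        visited.append(state)
        # A missing answer is treated like a failed or timed-out query.
        if next(answers, False):
            visited.append("Authenticated")
            return visited
    # Both attempts failed: traffic is dropped for the unauth-traffic period.
    visited.append("Unauthenticated")
    return visited


if __name__ == "__main__":
    print(walk_states([False, True]))
    print(walk_states([False, False]))
```

What happens once the "unauth-traffic" period expires is not covered in this thread, so the model stops at Unauthenticated.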

     

    Tom

  • Could you also double check the registered AD servers and make sure the search queries are correct? You can also tail access_server.log; it may shed light on the issue.

    >tail -f /log/access_server.log

     

    Regards,

    Tom 

  • Hi Tom,

    I've looked through the access_server logs in debug and everything looks OK. I can see them going through the phase 1 & 2 auth rigmarole and then being entered in the database. What I cannot see is them being removed from the ipset database. Is there a specific log file for ipset actions?

    In case it was the GUI messing it up, I did the following:

    1. CLIed to XG for Console
    2. system auth cta disable
    3. restarted access_server
    4. stopped Collector
    5. wiped db3
    6. system auth cta enable
    7. system auth cta collector add (collector details)
    8. did not modify the unauth traffic this time,
    9. started collector

    I started seeing users authenticated both on Current Activities Live Users and in ipset. I thought I was on the right track, but when I checked other random users, some of their IPs were showing in ipset as "Unauthenticated".

    The next thing I can think of doing is to wipe all the users in case it's the users database at fault.

    Emile

  • I found a user that was having the issue and it looks to be something to do with stale data during STAS and the learning process. I locked and logged back in, and we could see my auth process went through fine on the XG and the access server, but when doing a watch (every 2s) on the ipset, this is what happened:

    172.24.202.100,2218,71
    172.24.202.100,2218,71
    172.24.202.100,2218,71
    172.24.202.100,2218,71
    172.24.202.100,2218,71
    172.24.202.100,0,0,RetryLearning1
    172.24.202.100,0,0,RetryLearning1
    172.24.202.100,0,0,RetryLearning1
    172.24.202.100,0,0,RetryLearning1
    172.24.202.100,0,0,RetryLearning1
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,0,0,Unauthenticated
    172.24.202.100,2282,71
    172.24.202.100,2282,71
    172.24.202.100,2282,71
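
A long watch like the one above is easier to read once the repeated samples are collapsed into runs. The following Python sketch does that. The function names `classify` and `collapse` are made up for illustration, the second field is assumed to be the numeric user id mentioned earlier in the thread, and the meaning of the third field (71) is not known here, so it is ignored. The sample data is rebuilt from the counts in the output above.

```python
from itertools import groupby

IP = "172.24.202.100"

# Samples rebuilt from the watch output above (same order, same counts).
SAMPLES = (
    [f"{IP},2218,71"] * 5
    + [f"{IP},0,0,RetryLearning1"] * 5
    + [f"{IP},0,0,Unauthenticated"] * 28
    + [f"{IP},2282,71"] * 3
)


def classify(line):
    """Return (ip, state) for one ipset entry line."""
    fields = [field.strip() for field in line.split(",")]
    ip = fields[0]
    if len(fields) >= 4:
        # A fourth field carries the state name, e.g. Unauthenticated.
        return ip, fields[3]
    # No state name: assume the second field is the user's numeric id.
    return ip, f"id {fields[1]}"


def collapse(lines):
    """Collapse consecutive identical samples into (ip, state, count) runs."""
    states = [classify(line) for line in lines if line.strip()]
    return [(ip, state, len(list(run))) for (ip, state), run in groupby(states)]


if __name__ == "__main__":
    for ip, state, count in collapse(SAMPLES):
        print(f"{ip}  {state:<16} x{count}")
```

Run against these samples it reports four runs: 5 samples with id 2218, 5 in RetryLearning1, 28 in Unauthenticated and 3 with id 2282. At a 2 second watch interval that is a little under a minute spent at Unauthenticated.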

     

    At the point where it says RetryLearning1, I had logged into the captive portal, signed out and logged back in (all of which was green across the board in the logs). But as you can see, the learning fails and it sits at Unauthenticated until I delete the user from both the live users in the XG GUI and the Collector live users list, then log back in.

    While writing this, I have just realised that I did not update the unauth timeout...

    Emile

  • Morning Emile,

     Since STAS has a few dependencies, could you try the following from the domain controllers to your clients? Open a command prompt and attempt a WMI query to emulate the STAS steps.

    The /user command sets the account; use the same one you run the STAS service as.

     

    C:\WINDOWS>wmic
    wmic:root\cli>/user: DOMAIN\administrator
    Enter the password :********

    wmic:root\cli>/node: 192.168.1.10
    wmic:root\cli>computersystem get username /value

    UserName=DOMAIN\testuser
    wmic:root\cli>
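
If you end up running that check against a lot of clients, the "Name=Value" output is simple to pick apart. Below is a small Python sketch that parses text shaped like the session above. The helper names `parse_wmic_values` and `split_account` are hypothetical, and the sketch only parses output you have already captured; it does not run wmic itself.

```python
def parse_wmic_values(output: str) -> dict:
    """Parse 'Name=Value' lines, as printed by a wmic '/value' query."""
    values = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition("=")
        # Skip blank lines and prompt lines that contain no '='.
        if sep and key:
            values[key] = value
    return values


def split_account(username: str):
    """Split 'DOMAIN\\user' into (domain, user); domain is '' if absent."""
    domain, sep, user = username.partition("\\")
    return (domain, user) if sep else ("", username)


if __name__ == "__main__":
    captured = "\r\n\r\nUserName=DOMAIN\\testuser\r\n\r\n"
    values = parse_wmic_values(captured)
    print(values)
    print(split_account(values.get("UserName", "")))
```

An empty UserName value would suggest the query itself worked but nobody is logged on interactively, which is a different problem from the query being refused.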

    The STAS steps would be:

    1. A packet arrives at the firewall.
    2. The firewall sends a query to the collectors on port 6677 (know anything about this IP?).
    3. The collector will WMI or remote registry query the endpoint to determine if someone is logged on.
    4. The STAS database is updated.

    Performing the WMI query manually helps to verify step 3.

    Also, after an account has been verified, STAS relies on the AD servers and query strings to determine user group memberships. So it's possible that STAS is able to do the WMI query and answer back to the firewall that the system has a live user, but something in the LDAP lookup is having a problem. You could also check the AD server event logs for evidence of this.

     

    hope this helps,

    Tom

     

     

  • Hi Tom,

    Forgot to mention that there are Macs in this environment so therefore Logoff detection was set to ping. But then we found a large number of devices not responding to ping so we have turned off logoff detection altogether. (FW rule being deployed to resolve ping issues so logoff detection can be re-enabled)

    I appreciate the explanation on STAS as you never know, there may be something missed. However it hasn't really changed much since Copernicus and the issue I was tracking was on top of standard STAS functionality behind the scenes between auth server and ipset.

    AD Server events and Access Server events match up and both log groups appear happy, in sync and with no errors.

    Looking further into it, it might be to do with the new functionality of the unauth traffic system.

    Emile
