
OTP Issues with several users

Hello,

For some days now, OTP authentication has been failing for some of our users, and the number of affected users keeps growing:

-> oath_totp_validate() failed for tokenid xxxxxxxxxxxxxxxxxxxxxx with error The OTP is not valid

- OTP had always worked fine before these issues started
- the users' devices and the firewall have the correct time and date
- when checking the time offset, the offset is either too large to be displayed or extremely high
- the problem occurs with both the old Sophos Authenticator app and Microsoft Authenticator on iOS 17.6
- the problem does not seem to depend on the app or phone (other OTP codes work on the same phone/app)
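
For anyone trying to make sense of the offset symptom, here is a minimal sketch of how a generic TOTP check and offset search work according to RFC 6238. It is not the firewall's actual code: `totp` and `find_offset` are hypothetical helper names, and the 30-second step with HMAC-SHA1 is an assumption based on common authenticator defaults. The point is that a real clock drift shows up as a small step offset, whereas a stored secret that no longer matches the one in the app produces no match at any reasonable offset, which would look like an "extremely high" offset while other accounts on the same phone keep working.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Generic RFC 6238 TOTP (HMAC-SHA1) for a secret at a given Unix time."""
    counter = unix_time // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    pos = mac[-1] & 0x0F  # dynamic truncation offset (RFC 4226)
    code = struct.unpack(">I", mac[pos:pos + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def find_offset(secret, code, server_time, max_steps, step=30, digits=6):
    """Hypothetical helper: return the step offset at which `code` matches
    the server-side secret, searching outward from 0, or None if no offset
    within +/- max_steps matches."""
    for k in range(max_steps + 1):
        for offset in ((0,) if k == 0 else (k, -k)):
            if totp(secret, server_time + offset * step, step, digits) == code:
                return offset
    return None


if __name__ == "__main__":
    phone_secret = b"12345678901234567890"   # secret stored in the app
    other_secret = b"09876543210987654321"   # assumed mismatching server copy
    phone_time = 1_700_000_000
    code = totp(phone_secret, phone_time)

    # Server clock 60 s ahead, same secret: a small offset is found.
    print(find_offset(phone_secret, code, phone_time + 60, max_steps=10))
    # Clocks identical, different secret: typically no offset matches.
    print(find_offset(other_secret, code, phone_time, max_steps=10))
```

In this sketch, correct clocks on both sides do not help once the two secrets differ, which would be consistent with the fact that deleting and re-creating the token restores authentication.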

After deleting the affected users' tokens and letting new ones be auto-created, OTP works again.

Does anybody have an idea what's going on? Is this a bug?
Is there any reason why a previously working OTP token would stop working one day?



Edited TAGs
[edited by: Erick Jan at 10:13 AM (GMT -7) on 5 Aug 2024]
  • We will open a KIL soon. 

    To summarize the issue: 
    It is not related to a firmware update. 
    Instead, the issue is caused by the HA cluster and the takeover. 
    In certain situations, the OTP token can become corrupted: it is valid on the current primary but invalid on the AUX. If you perform a takeover to the AUX, the corrupted token becomes the active one and no longer works for that particular user. 

    We will fix the issue in V20.0 MR3 or V21.0 MR1 so that tokens are no longer corrupted, but we can't repair the tokens themselves if you have already performed a takeover.

    What you can do to fix this issue before performing a takeover or a firmware update: 
    The "Sync Auxiliary Device" option will repair the HA cluster. 

    Use this option to reboot the HA AUX appliance (no downtime required), and it will repair the sync. 


  • Thanks for the technical info and for confirming the bug.

    What is causing the corruption?

    Is there a chance that some MFA tokens on the AUX node could still become corrupted in the short time between the sync and the HA takeover?
