UTM <--> RED failover / backup problem: is this a known issue, and how do you handle it? Link down = OK. Link up but ISP down = malfunction

With the help of an external service provider, we have found a bug in the UTM VPN function that affects failover.

Firmware version: 9.703-3

Test 1. Check whether the RED automatically re-establishes the connection to the UTM over the backup line (WAN2) when the link of the main line (WAN1) is deactivated.
Result: Passed

Test 2. Check whether the RED automatically re-establishes the connection to the UTM over the backup line (WAN2) when the connection on the main line (WAN1) has failed but the link of the main line stays up.
Result: Failed

Afterwards we found that the RED50 did eventually manage to establish the connection to the UTM over the backup line (after about 20 minutes!). However, no data can be sent through the tunnel. The RED connection status shows the connection as online, but the icon carries a yellow exclamation mark.

Can anyone confirm this problem? If so, this is a very serious bug, because it breaks basic RED functionality and urgently needs a fix.
We noticed the error ourselves at one of our sites and could not explain it, so we commissioned the service provider to investigate.

  • Hi Papi-Sanchez,

    Thank you for reaching out to the Community!

    Could you please tell us whether you first noticed this issue after the firmware update, or whether it also occurred on the previous UTM firmware version?

    If you have a support case number for this issue, please PM me the case number for further investigation.

    Thanks,

  • In reply to H_Patel:

    Hello again,

    we have run into a surprising problem with the VPN backup function between a UTM and a RED 50.
    You would normally expect that if a RED 50 has two lines, the main line on WAN1 and an independent backup line (e.g. a Fritz-Box with DSL or UMTS) on WAN2, the RED automatically switches to the backup line when the main line fails.

    Unfortunately, this is not the case in every situation. We found that out the hard way.

    There are two failure scenarios. The first works perfectly. The second does not, and in my opinion that is a really bad surprise.

    1. The link of the main line goes down, i.e. the physical interface loses its link. The RED notices this and switches cleanly to the backup line. Most people will have tested this by simply pulling the LAN cable of the main line. When the link of the main line comes back, the RED switches back to it. So far, so good.

    2. The link stays up, but the ISP behind the main line fails, so the physical link still exists. This is where the problems start. The RED no longer detects this situation and does not switch to the backup line in a working manner. In my tests the RED does switch after about 20 minutes, but it does not build a clean tunnel. As a result, the VPN route is still broken and unusable. The RED overview then shows the connection with a yellow exclamation mark. The only workaround available: pull the LAN cable of the main line.
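
    To make the difference between the two cases concrete, here is a minimal Python sketch. It is only a model for illustration, not the actual logic of the RED firmware, which is not documented in this thread. The `Uplink` class and both `pick_*` functions are hypothetical names; `upstream_ok` stands for the result of any check beyond link state, for example whether the UTM answers over that line.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Uplink:
    name: str
    link_up: bool      # physical carrier present on the port
    upstream_ok: bool  # far end (e.g. the UTM) answers over this line


def pick_by_link_state(uplinks: List[Uplink]) -> Optional[str]:
    """Return the first uplink whose physical link is up."""
    for u in uplinks:
        if u.link_up:
            return u.name
    return None


def pick_by_reachability(uplinks: List[Uplink]) -> Optional[str]:
    """Return the first uplink that has link AND reaches the far end."""
    for u in uplinks:
        if u.link_up and u.upstream_ok:
            return u.name
    return None


# Case 1: cable pulled on WAN1. Case 2: WAN1 link up, ISP behind it dead.
case1 = [Uplink("WAN1", False, False), Uplink("WAN2", True, True)]
case2 = [Uplink("WAN1", True, False), Uplink("WAN2", True, True)]

print(pick_by_link_state(case1), pick_by_reachability(case1))
print(pick_by_link_state(case2), pick_by_reachability(case2))
```

    In case 1 both selections move to WAN2. In case 2 a selection based only on link state stays on WAN1 although nothing behind it responds, which would be consistent with what we see; a selection that also requires the far end to answer moves to WAN2.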

    We are currently using UTM version 9.703-3. I have also run the tests described here with older versions, always with the same result.

    So the question: surely a basic function like failover cannot work in only half of the cases?
    Have you tested this yourself, and what were your results?
    Or how do you deal with it?

    I have the uneasy suspicion that hardly anyone has run the test in point 2 (only the one in point 1), and that this behaviour is little known or not known at all.

    I would be very grateful for information and comparisons. Let's hope for the best; otherwise this would be a real disaster and a no-go for business use.

    I am looking forward to your replies.
    Thanks everyone.

  • In reply to H_Patel:

    We have now investigated the case further and run many tests.
    It turns out that the firmware of the RED 50 does contain bugs, and not only recent ones.
    We have recorded several videos as evidence, which I would like to publish here. The matter is quite serious, especially given that this bug has existed for a long time and affects a basic function of RED VPN.

    The test passed: UTM Ver 9.606-1 with RED50 FW 5317 (basis)

    The test failed: UTM Ver 9.606-1 with RED50 with USE_UNIFIED_FIRMWARE

    The test passed: UTM Ver 9.703-3 with RED50 FW 5214 (basis)

    The test failed: UTM Ver 9.703-3 with RED50 with current firmware

    So the fault clearly lies in the firmware of the RED 50.
    Can someone confirm that?

  • In reply to Papi-Sanchez:

    So once you update the RED to the new unified firmware, only the interface status is checked, not the upper layers of the OSI model?

    As this KB states: https://community.sophos.com/kb/en-us/116573#Deployment%20scenarios

    Note: If any interfaces go down, the interface will be checked until it is working again. The connection will be restored to the original interface if it becomes available again.

    That could refer only to the interface as a hardware component.

    I do not have a RED50/60 to test this right now, but you should open a support case to bring this to the attention of Support.
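
    A check above the interface level could look something like the following Python sketch. It only tests whether a TCP connection to a given host and port can be opened, optionally from a specific local address so that the test goes out over a particular line (assuming the routing on the host sends traffic from that address over it). The function name `tcp_reachable` is made up for this example, and the host, port and source address in the usage comment are placeholders, not values taken from the RED or UTM documentation.

```python
import socket
from typing import Optional


def tcp_reachable(host: str, port: int, timeout: float = 3.0,
                  source_ip: Optional[str] = None) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout`."""
    # Port 0 lets the OS choose the local port for the given source address.
    source = (source_ip, 0) if source_ip else None
    try:
        with socket.create_connection((host, port), timeout=timeout,
                                      source_address=source):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


# Placeholder values, to be replaced with your own:
# tcp_reachable("utm.example.com", 443, source_ip="192.0.2.10")
```

    A link that is physically up but has a dead ISP behind it would pass an interface check and fail a test like this one.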