This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

RED port goes down

Hi all,

I have a RED that unexpectedly shuts down one of the ports.
The port was previously connected directly to a device, currently we put a switch between RED and the device.
But the problem persists. Can someone help?
Here are the messages ... what does the "command" mean?

2023:01:08-23:49:29 FIREWALL-2 red_server[13285]: R6000XXXXXXXX8: command '{"data":{"switch_port_status_v2":{"lan3":"Down","lan1":"Down","lan4":"1Gb\/s","lan2":"Down"}},"type":"STATUS"}' 
2023:01:08-23:49:29 FIREWALL-2 red_server[13285]: R6000XXXXXXXX8: PORTSTATE LAN1: Down, LAN2: Down, LAN3: Down, LAN4: 1Gb/s 
2023:01:08-23:49:45 FIREWALL-2 red_server[13285]: R6000XXXXXXXX8: command '{"data":{"switch_port_status_v2":{"lan3":"1Gb\/s","lan1":"Down","lan4":"1Gb\/s","lan2":"Down"}},"type":"STATUS"}' 
2023:01:08-23:49:45 FIREWALL-2 red_server[13285]: R6000XXXXXXXX8: PORTSTATE LAN1: Down, LAN2: Down, LAN3: 1Gb/s, LAN4: 1Gb/s 
2023:01:08-23:51:37 FIREWALL-2 red_server[13285]: R6000XXXXXXXX8: command '{"data":{"switch_port_status_v2":{"lan3":"Down","lan1":"Down","lan4":"1Gb\/s","lan2":"Down"}},"type":"STATUS"}' 
2023:01:08-23:51:37 FIREWALL-2 red_server[13285]: R6000XXXXXXXX8: PORTSTATE LAN1: Down, LAN2: Down, LAN3: Down, LAN4: 1Gb/s 
2023:01:08-23:51:53 FIREWALL-2 red_server[13285]: R6000XXXXXXXX8: command '{"data":{"switch_port_status_v2":{"lan3":"1Gb\/s","lan1":"Down","lan4":"1Gb\/s","lan2":"Down"}},"type":"STATUS"}' 
2023:01:08-23:51:53 FIREWALL-2 red_server[13285]: R6000XXXXXXXX8: PORTSTATE LAN1: Down, LAN2: Down, LAN3: 1Gb/s, LAN4: 1Gb/s 
2023:01:08-23:52:25 FIREWALL-2 red_server[13285]: R6000XXXXXXXX8: command '{"data":{"switch_port_status_v2":{"lan3":"Down","lan1":"Down","lan4":"1Gb\/s","lan2":"Down"}},"type":"STATUS"}' 
2023:01:08-23:52:25 FIREWALL-2 red_server[13285]: R6000XXXXXXXX8: PORTSTATE LAN1: Down, LAN2: Down, LAN3: Down, LAN4: 1Gb/s 
2023:01:08-23:52:41 FIREWALL-2 red_server[13285]: R6000XXXXXXXX8: command '{"data":{"switch_port_status_v2":{"lan3":"1Gb\/s","lan1":"Down","lan4":"1Gb\/s","lan2":"Down"}},"type":"STATUS"}' 
2023:01:08-23:52:41 FIREWALL-2 red_server[13285]: R6000XXXXXXXX8: PORTSTATE LAN1: Down, LAN2: Down, LAN3: 1Gb/s, LAN4: 1Gb/s

Thanks

Dirk



This thread was automatically locked due to age.
Parents
  • Hey  ,

    Thank you for reaching out to community, can you also share the system.log , kernel.log during the time red goes down ? And also observe the LED status !

    Thanks & Regards,
    _______________________________________________________________

    Vivek Jagad | Team Lead, Global Support & Services 

    Log a Support Case | Sophos Service Guide
    Best Practices – Support Case


    Sophos Community | Product Documentation | Sophos Techvids | SMS
    If a post solves your question please use the 'Verify Answer' button.

  • Hi all,

      There are different switches behind every port ... and no loop. It runs for some days without problems ... then we have some minutes link up/link down  (like before without the switch)

    We have also replaced the RED already.

     

    The Kernel Log seems to be empty ... is this possible?

    System Log
    2023:01:08-23:43:01 FIREWALL-2 /usr/sbin/cron[22279]: (root) CMD (/sbin/audld.plx --trigger)
    2023:01:08-23:45:01 FIREWALL-2 /usr/sbin/cron[22683]: (root) CMD ( /usr/local/bin/rpmdb_backup )
    2023:01:08-23:45:01 FIREWALL-2 /usr/sbin/cron[22684]: (root) CMD (   /usr/local/bin/reporter/system-reporter.pl)
    2023:01:08-23:45:01 FIREWALL-1 /usr/sbin/cron[27141]: (root) CMD ( /usr/local/bin/rpmdb_backup )
    2023:01:08-23:45:01 FIREWALL-1 /usr/sbin/cron[27142]: (root) CMD (   /usr/local/bin/reporter/system-reporter.pl)
    2023:01:08-23:46:01 FIREWALL-1 /usr/sbin/cron[27295]: (root) CMD (/usr/local/bin/create_rrd_graphs.plx --acc)
    2023:01:08-23:47:01 FIREWALL-1 /usr/sbin/cron[28658]: (root) CMD (  nice -n19 /usr/local/bin/gen_inline_reporting_data.plx)
    2023:01:08-23:47:01 FIREWALL-2 /usr/sbin/cron[23068]: (root) CMD (  nice -n19 /usr/local/bin/gen_inline_reporting_data.plx)
    2023:01:08-23:50:01 FIREWALL-1 /usr/sbin/cron[28794]: (root) CMD (/var/mdw/scripts/pmx-blocklist-update)
    2023:01:08-23:50:01 FIREWALL-1 /usr/sbin/cron[28795]: (root) CMD (   /usr/local/bin/reporter/system-reporter.pl)
    2023:01:08-23:50:01 FIREWALL-2 /usr/sbin/cron[23546]: (root) CMD (   /usr/local/bin/reporter/system-reporter.pl)
    2023:01:08-23:50:01 FIREWALL-2 /usr/sbin/cron[23545]: (root) CMD (/var/mdw/scripts/pmx-blocklist-update)
    2023:01:08-23:50:59 FIREWALL-2 dns-resolver[10647]: Updating REF_NetDnsIPrep2t :: iprep2.t.ctmail.com
    2023:01:08-23:53:01 FIREWALL-2 /usr/sbin/cron[24098]: (root) CMD (/usr/local/bin/create_rrd_graphs.plx --acc)
    2023:01:08-23:55:01 FIREWALL-1 /usr/sbin/cron[29082]: (root) CMD (   /usr/local/bin/reporter/system-reporter.pl)
    2023:01:08-23:55:01 FIREWALL-2 /usr/sbin/cron[25370]: (root) CMD (   /usr/local/bin/reporter/system-reporter.pl)
    2023:01:08-23:58:01 FIREWALL-1 /usr/sbin/cron[29286]: (root) CMD (/var/aua/update_ad_bg_members.plx)
    2023:01:08-23:58:01 FIREWALL-2 /usr/sbin/cron[25886]: (root) CMD (/sbin/audld.plx --trigger)
    2023:01:08-23:58:01 FIREWALL-2 /usr/sbin/cron[25885]: (root) CMD (/var/aua/update_ad_bg_members.plx)
    2023:01:08-23:59:01 FIREWALL-2 dns-resolver[10647]: Updating REF_NetDnsIPrep2t :: iprep2.t.ctmail.com
    2023:01:09-00:00:01 FIREWALL -- MARK --
    2023:01:09-00:00:01 FIREWALL-2 /usr/sbin/cron[26399]: (root) CMD (/sbin/hwclock --systz --utc)
    2023:01:09-00:00:01 FIREWALL-2 /usr/sbin/cron[26404]: (root) CMD (/usr/local/bin/logcontrol.sh)
    2023:01:09-00:00:01 FIREWALL-2 /usr/sbin/cron[26408]: (root) CMD (   /usr/local/bin/reporter/system-reporter.pl)
    2023:01:09-00:00:01 FIREWALL-2 /usr/sbin/cron[26405]: (root) CMD (/var/chroot-smtp/cron/expiredletterscleanup.bin)


    Dirk

    Systema Gesellschaft für angewandte Datentechnik mbH  // Sophos Platinum Partner
    Sophos Solution Partner since 2003
    If a post solves your question, click the 'Verify Answer' link at this post.

  • Hey   could be possible, that is fine.
    > Check when the RED last disconnected or connected:
     #grep -i <RED-ID> /var/log/red.log | grep -i "connected“
    > RED-ID = Serial Number of RED
    > Check also archived logs for when a RED last disconnected or connected :
     #zgrep -i <RED-ID> /var/log/red/year/month/* | grep -i "connected"
    > Check blink code of the RED
    =============
    > telnet <hostname of the UTM> 3400
    > telnet red.astaro.com 3400
    Result: Connection timed out
    -> In this case it seems like there is a firewall/router in between which blocks the TCP connection on port 3400
    Result: Unknown host
    -> Check which DNS forwarder and UTM hostname is configured

    Thanks & Regards,
    _______________________________________________________________

    Vivek Jagad | Team Lead, Global Support & Services 

    Log a Support Case | Sophos Service Guide
    Best Practices – Support Case


    Sophos Community | Product Documentation | Sophos Techvids | SMS
    If a post solves your question please use the 'Verify Answer' button.

  • Hi Vivek,

    there is only one "connected" message within the last 7 days (but for another RED - there are 5)

    I (and the customer) can't check the LED's, because after days without problem, we have only 5 minutes.


    Dirk

    Systema Gesellschaft für angewandte Datentechnik mbH  // Sophos Platinum Partner
    Sophos Solution Partner since 2003
    If a post solves your question, click the 'Verify Answer' link at this post.

  • There are different switches behind every port ... and no loop. It runs for some days without problems ... then we have some minutes link up/link down  (like before without the switch)

    The question may sound silly, but maybe faulty LAN cable? Or a grounding issue if both switches are in different parts of the building?

  • Can you PM me the support access id, let me check if I can find anything suspicious in the historical logs...
    Just let me know the date and time frame 2-3 instances that occurred ! 

    Thanks & Regards,
    _______________________________________________________________

    Vivek Jagad | Team Lead, Global Support & Services 

    Log a Support Case | Sophos Service Guide
    Best Practices – Support Case


    Sophos Community | Product Documentation | Sophos Techvids | SMS
    If a post solves your question please use the 'Verify Answer' button.

  • Hi,

    thanks, but this is not so simple ... because we have an isolated environment.
    I have to request additional external access.
    You may remote my Notebook. Possible, I can create a VPN.

    BTW: it works for the last 5 years.


    Dirk

    Systema Gesellschaft für angewandte Datentechnik mbH  // Sophos Platinum Partner
    Sophos Solution Partner since 2003
    If a post solves your question, click the 'Verify Answer' link at this post.

  • did you notice such a log line under the historical logs  "Missing keepalive from reds1?" and I believe the telnet steps would be also useful to diagnose the situation !  

    Thanks & Regards,
    _______________________________________________________________

    Vivek Jagad | Team Lead, Global Support & Services 

    Log a Support Case | Sophos Service Guide
    Best Practices – Support Case


    Sophos Community | Product Documentation | Sophos Techvids | SMS
    If a post solves your question please use the 'Verify Answer' button.

  • no such message the last 7 days ... until we change a cable today 11:51

    The second RED-LAN-Port stay active ... all the time. 

    Telnet red.astaro.com 3400 should not be possible from RED, because the RED has no internet connection. 
    The RED only reach the UTM-IP-Port (unfiltered).  Routing using this Port is not possible.

    There is a second RED within the same MAN, with the same restrictions, connected to the same UTM ... working without problems.


    Dirk

    Systema Gesellschaft für angewandte Datentechnik mbH  // Sophos Platinum Partner
    Sophos Solution Partner since 2003
    If a post solves your question, click the 'Verify Answer' link at this post.

Reply
  • no such message the last 7 days ... until we change a cable today 11:51

    The second RED-LAN-Port stay active ... all the time. 

    Telnet red.astaro.com 3400 should not be possible from RED, because the RED has no internet connection. 
    The RED only reach the UTM-IP-Port (unfiltered).  Routing using this Port is not possible.

    There is a second RED within the same MAN, with the same restrictions, connected to the same UTM ... working without problems.


    Dirk

    Systema Gesellschaft für angewandte Datentechnik mbH  // Sophos Platinum Partner
    Sophos Solution Partner since 2003
    If a post solves your question, click the 'Verify Answer' link at this post.

Children
  • did you tried deleting that RED's existing configuration and adding them again ? 

    Thanks & Regards,
    _______________________________________________________________

    Vivek Jagad | Team Lead, Global Support & Services 

    Log a Support Case | Sophos Service Guide
    Best Practices – Support Case


    Sophos Community | Product Documentation | Sophos Techvids | SMS
    If a post solves your question please use the 'Verify Answer' button.

  • We replaced the RED50 with SD-RED60 ... without success. (ok, it works most of the time .. but we see the same error)


    Dirk

    Systema Gesellschaft für angewandte Datentechnik mbH  // Sophos Platinum Partner
    Sophos Solution Partner since 2003
    If a post solves your question, click the 'Verify Answer' link at this post.

  • What is the version of the UTM 8 and are the patterns on UTM 9 are up2date ?
    to check the latest available patterns on the UTM 9 go to Management > Up2date > Configuration > change the Pattern Download/Installation interval to Manual then when you visit the Overview page you'll be able to see the latest available pattern. 

    And everything is up2date, then may if you get a chance try connecting the RED with a console to check the console logs !

    Thanks & Regards,
    _______________________________________________________________

    Vivek Jagad | Team Lead, Global Support & Services 

    Log a Support Case | Sophos Service Guide
    Best Practices – Support Case


    Sophos Community | Product Documentation | Sophos Techvids | SMS
    If a post solves your question please use the 'Verify Answer' button.

  • We run 9.7.12-13. But the Problem starts while using  9.7.11-05 (not with the time as this version is installed ... some time later).

    Pattern up2date are downloaded every 15 min. I'll check the pattern-version asap. 

    (other RED devices work with this UTM) 


    Dirk

    Systema Gesellschaft für angewandte Datentechnik mbH  // Sophos Platinum Partner
    Sophos Solution Partner since 2003
    If a post solves your question, click the 'Verify Answer' link at this post.

  • Hallo Dirk,

    It seems to me that the UTM can't fix this problem and that it's something specific to the environment of the RED.  If you switch the cable from LAN3 to one of the unused ports, does the same thing occur?  How about changing theport on the internal switch that the RED is connected to?  Changing the switch?  Changing the cable?

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • Hi Bob,

    we switched the ports yesterday and wait for the results. 

    Something must occur, because we place the switches between the RED and the "device" because we have the "link up/down" problem.

    Since the "device" behind the RED is a UTM-Custer too, i wanted to see if the link-problem comes from RED or this UTM-Cluster.
    Seems the link goes down between RED and Switch and not Switch and UTM-Cluster.


    Network sketch:  UTM --- MAN --- RED(WAN)-RED(LAN) --- Switch --  UTM(HA)Node1
                                                      \--- Switch --  UTM(HA)Node2        


    Dirk

    Systema Gesellschaft für angewandte Datentechnik mbH  // Sophos Platinum Partner
    Sophos Solution Partner since 2003
    If a post solves your question, click the 'Verify Answer' link at this post.

  • Uhm, quite an important information.

    We see this behavior with our UTM clusters too.

    The active node is steadily connected to the Uplink, the passive node is not. It just regularly polls the presence of the uplink.

    Network sketch in this case:

    Internet --- Telekom Router --- DMZ switch 1 --- UTM node 1
                    (BRG)           |
                                    | (fiber link to other building)
                                    |
                                    DMZ switch 2 --- UTM node 2

    In this scenario only the node with active role is steadily connected. The other one is prepared to take over both MAC and IP, but appears to be down at switch port level.

  • Hi, thanks for the Answer.

    like stated ... the link goes down between RED and switch (physicaly) and not between switch and UTM-Node.

    As we can see from the disruptions, the active node is also or only affected.

    And the third location ... also with a Sophos Cluster connected over RED (exactly like the problem-location) runs without any problem.


    Dirk

    Systema Gesellschaft für angewandte Datentechnik mbH  // Sophos Platinum Partner
    Sophos Solution Partner since 2003
    If a post solves your question, click the 'Verify Answer' link at this post.

  • Sorry, I'm out here - no more ideas.

    I think your best bet is if the switches are somewhat manageable - at least so far that they can throw a log.
    They normally write why they take a port down - i.e. STP, loop protectopn, packet storm or RX errors.