Ethernet Port Issue - State-Up, Link-Error

Hi:

I have a two-month-old SG330 running 9.315-2.

This has happened four times. On the dashboard, the external interface shows as State - Up, Link - Error. It first happened maybe a month ago.

Since then it has happened three more times: last Thursday (Aug 27), last Saturday (Aug 29), and today (Aug 31).

When it happens, our external users lose their VPN connections and we lose internet access until our backup internet connection kicks in. We have uplink balancing with both ISP connections active, plus multipath rules that steer users out the primary connection and fail over to the backup if needed, and that steer some of our BYOB wifi hotspots out the backup connection as their primary.

Anyway, when this happened last Thursday, I changed the ISP port on the Sophos from Eth4 (where it was) to Eth5, and also changed the Ethernet cable (all of 3 feet long) between the ISP's box and the Sophos.

We're on an SM fiber connection, with a vendor-provided Cisco 3000 series switch that has three active ports: one for the fiber SFP, one Ethernet port for the internet connection, and one Ethernet port that carries our phone PRI.

When the internet connection goes down, there is no issue with the phone.

As I said, last Thursday, I switched the ISP primary connection on the Sophos from Eth4 to Eth5. 

On Saturday it went down again, and I got a call at home (we are a 24x7 operation). I could ping the ISP's DG, but could not connect to the firewall through the primary ISP. I logged into the SG remotely through the backup internet connection. Same as before: the dashboard page showed Eth5 as State - Up, Link - Error. I disabled it and re-enabled it, and it came back up.

Today, same situation. I was on site, saw it happen (before people started calling), logged into the SG through the internal interface, and Eth5 showed the same state. I went to the Interfaces tab under Interfaces & Routing, disabled and re-enabled the interface, and it came right back up.

There are several additional addresses on that port also - we have a /28 from the carrier and have some of the other additional addresses NAT-ing to servers, etc. 
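
For anyone following along who wants to sanity-check how many addresses a /28 actually gives you for additional addresses and NAT, here is a small sketch. The prefix 203.0.113.0/28 is a documentation-only placeholder (RFC 5737), not the carrier's real block, which is not given in this thread.

```python
import ipaddress

# Placeholder prefix from the RFC 5737 documentation range (an assumption,
# standing in for the carrier-assigned /28 that the post does not list).
block = ipaddress.ip_network("203.0.113.0/28")

total = block.num_addresses      # every address in the block
usable = list(block.hosts())     # excludes the network and broadcast addresses

print(f"{block}: {total} addresses, {len(usable)} usable")
print(f"first usable: {usable[0]}, last usable: {usable[-1]}")
```

One of the usable addresses normally belongs to the ISP's gateway, so the number left over for the firewall and its additional addresses is one fewer than the usable count.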

Now the ISP is telling me that it's the firewall.

I had Sophos support on the line today. They connected in and, in their words, they see line fluctuations and the port not responding, which in their minds points to an issue with the vendor's switch.

The ISP says they see no issues, other than when they see me disable/enable the port.

I've changed the port on the SG from Eth4 to Eth5, and changed the Ethernet cable. There is nothing other than the Ethernet cable between the two boxes.

I've attached some snippets of log files that Sophos pulled up. 

Ideas? 

Thanks,

John S.


This thread was automatically locked due to age.
  • I'm experiencing the same problem. Sophos UTM home, Intel NIC.

    Firmware 9.407-3, but it happened on previous versions also.

    eth1 | External (WAN) | Ethernet | Up Error

    No problems connecting to the internet, though.

  • Hi Arie,

    Go to Interfaces & Routing > Uplink Monitoring > Advanced and add a new monitoring host, for example Google DNS: create a host definition with the IP address 8.8.8.8, then save and apply.

    Note:

    Uncheck Automatic Monitoring so that the UTM uses Google DNS for uplink monitoring.

    Regards,

     

    Jason
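
To make the idea behind a monitoring host concrete, here is a minimal sketch of the kind of check uplink monitoring performs. This is not Sophos code. The function names and the rule "Error only when no monitoring host answers" are assumptions for illustration, and `ping_once` relies on the standard Linux `ping` options `-c` (count) and `-W` (reply timeout in seconds).

```python
import subprocess
from typing import Callable, Iterable


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo with the Linux `ping` tool; True if a reply arrived."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def uplink_state(hosts: Iterable[str],
                 probe: Callable[[str], bool] = ping_once) -> str:
    """Hypothetical rule: 'Up' if any monitoring host answers, else 'Error'."""
    results = [probe(host) for host in hosts]
    if not results:
        return "Unknown"  # nothing configured to probe
    return "Up" if any(results) else "Error"
```

Calling `uplink_state(["8.8.8.8"])` would probe Google DNS once. With a rule like this, a single monitoring host that stops answering pings is enough to flag the link, even though the Ethernet link itself is fine.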

  • Hi and thanks for your answer. I now understand that I have had two different problems.

    1. Some kind of negotiation problem between my UTM hardware and the fiber box. This caused the link to go down, and a restart or a manual disable/enable of the interface restored operation when it happened.
      1. I solved this by simply creating a VLAN with two ports on my ProCurve switch and putting the switch in between. Not a beautiful solution, but it works.
    2. Disturbances in the uplink monitoring, which were indicated by the Link being in Error.
      1. I have had uplink monitoring configured this way for a long time. I actually found out later yesterday that my ISP had disturbances that caused my problems.

     

    I did not realize that the Link being in Error on the Dashboard could indicate a high-level problem reported by uplink monitoring. I always thought it indicated a low-level Ethernet problem. This is very confusing, and I think uplink monitoring problems should be indicated in some other way.

    Regards 

     

    Christer
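
The two failure modes in the post above can be told apart from a Linux shell by looking at the physical carrier separately from the monitoring result. The sketch below reads the standard Linux sysfs file `/sys/class/net/<iface>/carrier`. Whether that path is readable on a given UTM build is an assumption, and `diagnose` with its wording is a hypothetical helper, not anything the UTM provides.

```python
import os
from typing import Optional


def has_carrier(iface: str, sysfs_root: str = "/sys/class/net") -> Optional[bool]:
    """Return True/False for the physical link, or None if it cannot be read."""
    path = os.path.join(sysfs_root, iface, "carrier")
    try:
        with open(path) as f:
            return f.read().strip() == "1"
    except OSError:
        # Interface missing, or administratively down (the read fails then).
        return None


def diagnose(carrier: Optional[bool], monitoring_ok: bool) -> str:
    """Map the two independent signals to a human-readable guess."""
    if carrier is None:
        return "physical link state unknown"
    if not carrier:
        return "physical link down (cable, port, or negotiation)"
    if not monitoring_ok:
        return "physical link up, monitoring hosts unreachable (upstream or ISP)"
    return "link healthy"
```

For example, `diagnose(has_carrier("eth1"), monitoring_ok=False)` separates a negotiation problem like item 1 from a monitoring disturbance like item 2.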

  • Apologies for the delay - been busy moving...

     

    Your solutions worked great! Just three items for my wish list:

    1. Better documentation of this setting (or even a link from the dashboard).
    2. Use of DNS groups.
    3. Overview of which hosts are up/down.

    Also, I noticed somewhere in the documentation that the UTM is able to make different types of requests to hosts (e.g. ICMP, HTTP). Any idea how/where that's configured?
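
This does not answer where the UTM configures it, but as a rough illustration of what non-ICMP request types mean in practice, here is a sketch of a TCP probe and an HTTP probe using only the Python standard library. The function names, the 2-second timeout, and the choice to treat any HTTP status below 500 as "reachable" are assumptions, not UTM behavior.

```python
import http.client
import socket


def tcp_probe(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def http_probe(host: str, port: int = 80, path: str = "/",
               timeout_s: float = 2.0) -> bool:
    """True if the host answers an HTTP HEAD request with a status below 500."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout_s)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status < 500
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
```

A TCP or HTTP probe is useful when the monitoring host, or something on the path to it, drops ICMP echo requests while still serving normal traffic.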

  • We are experiencing the same issue.

    Two ISP lines, both working like a charm if I force the traffic to both interfaces.
    On the Dashboard, Line A is shown as On/Up, Line B as On/Error.

    BGP shows no error, and I can reach any host on the internet over both lines from the shell as loginuser.
    I turned automatic monitoring off in uplink monitoring and entered only Google DNS (8.8.8.8) as a test, but the error remains.

    Is there a log file that records the reason for the Error state?
    I could still manually force parts of the traffic to Line B, but we would like automatic fallback if one or the other line fails.

    Kind regards
    Dietmar
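
To show why the Error state matters for automatic fallback, here is a minimal sketch of priority-based uplink selection. It is a generic illustration, not the UTM's multipath logic. The line names and the "Up"/"Error" strings are taken from the dashboard wording above, and `pick_uplink` is a hypothetical helper.

```python
from typing import Mapping, Optional, Sequence


def pick_uplink(priority: Sequence[str],
                state: Mapping[str, str]) -> Optional[str]:
    """Return the first uplink in priority order whose link state is 'Up'."""
    for name in priority:
        if state.get(name) == "Up":
            return name
    return None  # no usable uplink


# Dashboard as described above: Line A is Up, Line B is in Error.
state = {"Line A": "Up", "Line B": "Error"}
print(pick_uplink(["Line A", "Line B"], state))

# If Line A now fails while Line B is still flagged, nothing is left to fall back to.
state["Line A"] = "Error"
print(pick_uplink(["Line A", "Line B"], state))
```

Under a rule like this, a line that carries traffic fine but is marked Error is never chosen, so the false Error has to be resolved before fallback can work.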
