This discussion has been locked.

Automatic Monitoring not setting error state on 0% loss 2600ms RTT link.

TL;DR - ISP WAN error. The first hop on a PPPoE chain has a 2600 ms RTT (about the same RTT as to the moon?), but there is 0% packet loss, only delay.

How do I get the UTM to recognize this as an error and flag the interface as being in error?

This UTM box has four (4) uplinks, but since one of the high-priority ones is flagged as good despite its 2600 ms RTT, everything grinds to a halt.

 

Is there a way to set a max RTT for the monitoring hosts? (I've not seen one so far...)

 

Suggestions?

 

(I realize this might need a hack, but I'm willing to hack it... because this is just stupid.)



  • Hei,

    Show us a picture of the 'Advanced' tab in Uplink monitoring.  Have you tried not selecting 'Automatic monitoring'?

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • Hi!

    Always good getting feedback from you, it tends to put it into perspective.

    So... I'm mostly after hacking the Automatic Monitoring, more than putting manual monitoring into play.

    In particular I like: "By default, the monitoring host is the third ping-allowing hop on the route to one of the root DNS servers. However, you can define the hosts for monitoring the server pool yourself. For these hosts you can select another service instead of ping, and modify the monitoring interval and timeout."

    For me, with a mix of PPPoE, 4G/5G and leased lines, it seems better to leave the host selection to the function above.

    But... when I go where you suggest (a place I have been before), I notice the "timeout 5".

    Does this mean that the default Sophos UTM bad link RTT timeout value is 5000ms?

    When I read the docs on this, I understood it to be the total time allowed for a response from any of the hosts.

    "

    Interval: Enter a time interval in seconds at which the hosts are checked.

    Timeout: Enter a maximum time span in seconds for the monitoring hosts to send a response. If all monitoring hosts of an interface do not respond during this time, the interface will be regarded as dead.

    "
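    A minimal sketch of how I read the quoted Timeout wording, not the actual Sophos implementation: the interface only counts as dead when every monitoring host fails to answer within the timeout. The function name `interface_is_dead` and the input format are hypothetical, made up for illustration.

```python
def interface_is_dead(response_times_s, timeout_s=5.0):
    """Return True if no monitoring host answered within timeout_s.

    response_times_s: one entry per monitoring host, the response time in
    seconds, or None if the host did not answer at all (hypothetical format).
    """
    # Per the quoted docs: dead only if ALL hosts miss the timeout.
    return all(t is None or t > timeout_s for t in response_times_s)


# A 2.6 s response is well inside the default 5 s timeout, so the link
# counts as alive; with a 1 s timeout the same link would count as dead.
slow_link = [2.6, 2.6, 2.6]
print(interface_is_dead(slow_link, timeout_s=5.0))
print(interface_is_dead(slow_link, timeout_s=1.0))
```

    Under this reading, a slow but lossless link never trips the default timeout, which matches what I am seeing.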

    I did not, and still don't, think the Timeout is a max RTT value; I read it more as a timeout for the whole check itself. It could work as one in my case, though, if I put it down to something like 1 sec...

    But I'm assuming there is an Uplink Monitoring script someplace... and I'm thinking more about adding something like:

    1) if the value exceeds N times the average, flag as error;

    2) if the value exceeds a maximum RTT, flag as error;
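    The two checks above could be sketched roughly like this. It is only an illustration under my own assumptions: the function name `link_is_bad`, the 1000 ms ceiling and the factor of 5 are made-up values, and nothing here is taken from the UTM's actual monitoring code.

```python
from statistics import mean


def link_is_bad(rtt_history_ms, latest_rtt_ms, max_rtt_ms=1000.0, factor=5.0):
    """Return True if the latest RTT sample should flag the uplink as in error.

    All names and default thresholds are hypothetical, chosen for illustration.
    """
    # Check 2: hard ceiling on the RTT.
    if latest_rtt_ms > max_rtt_ms:
        return True
    # Check 1: latest RTT is more than `factor` times the recent average.
    if rtt_history_ms and latest_rtt_ms > factor * mean(rtt_history_ms):
        return True
    return False


history = [22.0, 25.0, 19.0, 24.0]   # recent healthy samples, in ms
print(link_is_bad(history, 2600.0))  # True: above the 1000 ms ceiling
print(link_is_bad(history, 30.0))    # False: within 5x the 22.5 ms average
```

    The relative check would catch a link that degrades badly while staying under the ceiling; the absolute check covers the case where the average itself has drifted up.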

    So while your suggestion might be valid, it's not what I'm looking for.

    (Remember that time when there was this value that could only be edited in manual mode, but also stuck in automatic mode...? I might try it... but how to test it...)
