
[BUG] XG v16/17 PPPoE doesn't endlessly try to reconnect

Hi All,

I have a weird issue concerning clients with xDSL lines, where a bridged modem is connected to the WAN port of the XG.

There are more and more complaints about internet failures (different clients, different DSL providers, different modems), and in most cases the PPPoE WAN port status is "Disconnected".

Just by clicking on "Connect", the connection comes up immediately!

Is there a way to force the XG to retry endlessly to reconnect?! (This should be a basic feature!)

Thanks !



  • Hello All,

    If you are experiencing issues related to this, please raise a support case mentioning the tracking ID NC-62029 and also PM me with your support case number for further investigation and follow up.

    As a temporary workaround you can perform the following:

    • Set a specific time when the PPPoE link reconnects (May work for certain situations)
    • Manually save the PPPoE configuration or connect manually from the UI

    Regards,


     
    Emmanuel (EmmoSophos)
    Community Support Engineer | Sophos Technical Support
    Sophos Support Videos  |  Product Documentation  |  @SophosSupport  | Sign up for SMS Alerts
    If a post solves your question use the 'Verify Answer' link.
  • Hi,

    I haven't tried this yet, because if I broke it an on-site visit would be required to fix it, and that isn't possible at the moment (thanks, COVID-19!). One idea I had was to inject "persist maxfail 0" into the pppd command line, which would keep pppd running and retrying forever instead of exiting after a link failure and leaving XG to manage the reconnection (which is where this is going wrong). This could be done like:

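    mount -o remount,rw /
    mv /usr/bin/pppd /usr/bin/pppd.orig
    # Quote the delimiter so "$@" is written literally, not expanded right now
    cat <<'EOF' >/usr/bin/pppd
    #!/bin/sh
    # Wrapper: pass the caller's arguments through and append the retry options
    exec /usr/bin/pppd.orig "$@" persist maxfail 0
    EOF
    chmod +x /usr/bin/pppd
    mount -o remount,ro /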

    The above is completely untested though, so it could contain all kinds of typos and other errors. Also, XG expects pppd to exit after a connection failure so it can clean up the old connection etc., so having pppd hang around and create a new PPP session without XG knowing about it could upset things and introduce all kinds of strange behavior, especially if you have a dynamic IP (do ISPs still do that?).
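
Since the wrapper is untested, one low-risk way to check the idea is to try it on any ordinary Linux machine, away from the firewall, with a stand-in script in place of the real pppd. The sketch below does that in a temporary directory. The stand-in binary, the `sandbox` directory, and the sample arguments (`user "test account" nodetach`) are assumptions made up for illustration; they say nothing about how XG actually invokes pppd. Only `persist` and `maxfail 0` come from the post.

```shell
#!/bin/sh
# Sandbox demo: nothing here touches the real /usr/bin/pppd.
sandbox=$(mktemp -d)

# Stand-in for the real pppd (hypothetical): prints each argument on its own line.
cat <<'EOF' >"$sandbox/pppd.orig"
#!/bin/sh
for arg in "$@"; do
    printf '%s\n' "$arg"
done
EOF

# Wrapper in the spirit of the post. The quoted 'EOF' keeps "$@" literal in the
# file, so it is expanded when the wrapper runs, not while the file is written.
cat <<'EOF' >"$sandbox/pppd"
#!/bin/sh
exec "$(dirname "$0")/pppd.orig" "$@" persist maxfail 0
EOF
chmod +x "$sandbox/pppd" "$sandbox/pppd.orig"

# Call the wrapper with made-up arguments, one of which contains a space.
wrapper_output=$("$sandbox/pppd" user "test account" nodetach)
printf '%s\n' "$wrapper_output"

rm -rf "$sandbox"
```

If the stand-in prints the original arguments unchanged, followed by `persist`, `maxfail` and `0`, the wrapper is forwarding its command line correctly. With an unquoted heredoc delimiter the shell would expand `$@` while writing the file, and the wrapper would silently drop whatever arguments it was called with.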

    And of course Sophos Support won't want to touch your device if you've been tinkering!
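
For anyone who does try the wrapper, a way back is worth having before Support looks at the box. The following is a minimal sketch that simply reverses the rename step; `restore_pppd` is a hypothetical helper name, and it is just as untested on a real XG as the wrapper itself.

```shell
#!/bin/sh
# restore_pppd DIR: put DIR/pppd.orig back in place of the wrapper DIR/pppd.
# Hypothetical helper; it refuses to do anything if no saved original exists.
restore_pppd() {
    dir=$1
    if [ ! -f "$dir/pppd.orig" ]; then
        echo "no $dir/pppd.orig found, nothing to restore" >&2
        return 1
    fi
    mv -f "$dir/pppd.orig" "$dir/pppd"
}

# On the firewall this would sit between the same remount steps as the wrapper:
#   mount -o remount,rw /
#   restore_pppd /usr/bin
#   mount -o remount,ro /
```

The directory is passed as a parameter so the function can be exercised on a scratch directory first.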

    James

  • hello  

    Case #10039810 opened for an XG106 running 17.5 MR12 with a PPPoE link stuck in the Disconnected state. Fortunately that link is not used anymore, so I can leave the firewall in this state. I hope support will come back to me quickly!

  • Hi there

    I have logged 2 cases today for this issue. Although I have 3 units that have experienced the problem, one of the units is not under maintenance.

    One unit is an XG230 running 17.5 MR9. This unit currently has an uptime of 159 days. It was a hardware upgrade from an XG125 in March 2020 and ran fine until 2 weeks ago; since then it has experienced the problem 4 times.

    The other unit is an XG210 running 17.5 MR8. This unit currently has an uptime of 211 days. It has not been upgraded, initially due to scheduling issues with the customer, and more recently due to known firmware issues that would impact their operations. Today is the first occurrence of the issue there. This site is on a dedicated fiber link with no other issues, just the PPPoE dropping at 4:50 am with no attempt to reconnect.

    Given the details on these 2 units, this has to be a bug introduced by either the SQL injection fix or the HTTP/S Bookmarks retirement. I also continually see alert messages about these fixes being applied, several months after the fixes were released. The alerts have been viewed, but are still listed as under 5 minutes old.

    With ISPs moving to PPPoE on dedicated links like fixed wireless and fiber rather than fixed IP, this issue needs serious investigation now.

    Regards,

    Gavin Daniels. DipIT(Networking)

     

     
  • Hello Guillaume,

    Thank you for the follow-up.

    I see the ticket was opened today and has been escalated to a senior engineer for review.

    Regards,


     
    Emmanuel (EmmoSophos)
    Community Support Engineer | Sophos Technical Support
  • Hello Gavin,

    Thank you for contacting the Sophos Community.

    I have followed up with both engineers and replied to your PM.

    Regards,


     
    Emmanuel (EmmoSophos)
    Community Support Engineer | Sophos Technical Support
  • hi  

    We haven't heard back from anybody on the PPPoE issue!

    Another 2 clients were impacted last Sunday. I will consider sending Sophos the bill for my technicians' extra overtime hours, incurred only because Sophos is not able to maintain or reconnect a PPPoE link, which a simple 20€ modem can!

    This is not acceptable (again), and the time being taken to troubleshoot this major issue is no more acceptable.

  • Hi,

    Have you started a support case for each of your affected customers?
    Ian

     
    V18.5.x - e3-1225v5 6gb ram with 4 ports - 20w. 
  • hi  

    No, I can't spend my time opening tickets for dozens of XGs over a system issue that does not seem to be taken seriously by Sophos.

    I opened one ticket, for an ideal case because the PPPoE link could stay disconnected for investigation, and I have had no news for days.

    Yes, a 20€ modem should not do better than my XG at maintaining the connection.

    I personally plugged my provider's box back in (you know, we need to work sometimes).

    I really think my last option is to change to another brand.

  • Hey,

     

    I received an email from Support just before the migration to the new Support Portal. They looked at the unit which has had only the single failure and deemed it no problem. Then requested information on how to consistently generate the fault.

    They don't appear to have looked at the logs for the XG230 unit which has exhibited the issue 4 times.

    Support have also failed to understand that with Victoria in Stage 4 lockdown due to COVID-19, the reliance on VPN facilities here is high, and having a unit fail and require a manual connection restart is an issue, especially since at times no staff are allowed in the customers' offices.

    While I have read and appreciate that you have been experiencing the issue for some time, the problems on my customers' units (2 under maintenance and 3 with no maintenance) only started in the last 2 months. The older firmware, uptimes in excess of 100 days (an XG210 with an uptime over 200 days), and no previous issues to report should assist in pinpointing the problem for everybody.

    While it may not be a hotfix that is the cause of the issue, the hotfixes have certainly made the issue more prevalent.

    I have just recovered my service spare from another customer's site. I am going to set it up with 17.5.9, write my config to it, and run it in the office to see if I can generate the fault; then they can debug with it.

    Regards,

    Gavin Daniels. DipIT(Networking)