WAN port LAG does not acquire IP address via DHCP

On our SG430 with a 2-port LAG for one WAN line, we no longer receive an IP address, so that WAN line is unusable.

The cluster is HA; firmware version: 9.711-5

When I disable and re-enable the LAG interface, it tries to get an IP address via DHCP but does not receive an offer.

I confirmed the ISP router is providing IP addresses.

The LAG terminates at a WAN switch where the router is connected.

On the SG shell and in its logs I see:

<M> fw:/home/login # ifconfig -v eth2
eth2      Link encap:Ethernet  HWaddr 00:1A:8C:F0:22:C2
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:31068470 errors:0 dropped:2 overruns:0 frame:0
          TX packets:34109934 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2870126907 (2737.1 Mb)  TX bytes:6440355307 (6142.0 Mb)

<M> fw:/home/login # ifconfig -v eth6
eth6      Link encap:Ethernet  HWaddr 00:1A:8C:F0:22:C2
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:34802665 errors:0 dropped:1 overruns:0 frame:0
          TX packets:763212 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3043897001 (2902.8 Mb)  TX bytes:120283402 (114.7 Mb)

<M> fw:/home/login # ifconfig -v lag3
lag3      Link encap:Ethernet  HWaddr 00:1A:8C:F0:22:C3
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:4000300544 errors:104 dropped:331275 overruns:0 frame:57
          TX packets:1905373056 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:5325821115540 (5079098.8 Mb)  TX bytes:286893189767 (273602.6 Mb)
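
Both slaves show UP and RUNNING above, but ifconfig does not show the per-member link state of the LAG itself. A rough sketch of how that could be checked, assuming the LAG is a standard Linux bonding device and the kernel exposes /proc/net/bonding/lag3 (neither is confirmed for the UTM in this thread; the function name lag_slave_status is made up for illustration):

```shell
#!/bin/bash
# Sketch only: print "<slave> <MII status>" for each member of a bond/LAG.
# Assumption: the status file has the usual Linux bonding layout with
# "Slave Interface:" and "MII Status:" lines.
lag_slave_status() {
    # $1 = path to a bonding status file, e.g. /proc/net/bonding/lag3
    awk '
        # Remember which slave the following lines belong to.
        /^Slave Interface:/ { slave = $3 }
        # The first "MII Status" line is for the bond itself; skip it
        # because no slave name has been seen yet.
        /^MII Status:/ && slave != "" { print slave, $3; slave = "" }
    ' "$1"
}
```

It would be called as `lag_slave_status /proc/net/bonding/lag3`; a slave reported as down would point at the LAG or the WAN switch rather than at the DHCP client.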

<M> fw:/home/login # /var/mdw/scripts/dhcpc restart
[dhcpc] :: restart  - from pid=30031, parent_pid=8658(bash)
:: Stop - interface info missing!!
[dhcpc] :: flock released (parent=8658(bash))
[dhcpc] :: flock aquired (parent=8658(bash))
[dhcpc] :: Start - interface info missing!
[dhcpc] :: flock released (parent=8658(bash))
[ failed ]

dhclient is running:

14811         00:00:00 dhclient

2022:10:18-00:01:52 fw-320-2 [user:notice] ' 
2022:10:18-00:01:57 fw-320-2 [daemon:info] dhcp_updown[32355]:  lag3 - reason:FAIL
2022:10:18-00:02:27 fw-320-2 [user:notice] cluster_sync[31896]:   

2022:10:18-13:20:49 fw-320-2 dhclient: DHCPDISCOVER on lag3 to 255.255.255.255 port 67 interval 5
2022:10:18-13:20:54 fw-320-2 dhclient: DHCPDISCOVER on lag3 to 255.255.255.255 port 67 interval 9
2022:10:18-13:21:03 fw-320-2 dhclient: DHCPDISCOVER on lag3 to 255.255.255.255 port 67 interval 7
2022:10:18-13:21:10 fw-320-2 dhclient: No DHCPOFFERS received.
2022:10:18-13:21:10 fw-320-2 dhclient: No working leases in persistent database - sleeping.
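
The lines above show three DHCPDISCOVER attempts on lag3 and no offer. To summarize a longer log the same way, something like the following sketch could be used. It assumes offer lines contain "dhclient: DHCPOFFER " (the exact wording may differ between dhclient versions), and the function name dhcp_log_summary is made up for illustration:

```shell
#!/bin/bash
# Sketch only: count DHCPDISCOVER lines for one interface and
# DHCPOFFER lines in dhclient log output read from stdin.
dhcp_log_summary() {
    # $1 = interface name, e.g. lag3
    awk -v ifc="$1" '
        # Matches "DHCPDISCOVER on <ifc> to ..." for the given interface only.
        index($0, "DHCPDISCOVER on " ifc " ") { discovers++ }
        # Assumed offer format; "No DHCPOFFERS received." does not match.
        /dhclient: DHCPOFFER / { offers++ }
        END { printf "discovers=%d offers=%d\n", discovers, offers }
    '
}
```

Fed the four dhclient lines from 13:20:49 to 13:21:10, it prints `discovers=3 offers=0`, i.e. requests leave lag3 but nothing is seen coming back.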





  • I guess it would be dhcpc? But that failed, as can be seen in my first post.

    I will restart dhcpd.

    <M> fw:/home/login # ps aux | grep dhcp
    root     18219  0.0  0.0   5668   748 pts/0    S+   14:44   0:00 grep dhcp
    <M> fw:/home/login # cat /var/log/selfmon.log
    <M> fw:/home/login # cat /var/mdw-debug.log
    cat: /var/mdw-debug.log: No such file or directory
    <M> fw:/home/login # version

    Current software version...: 9.711005
    Hardware type..............: 430r1
    Serial number..............: S4000xxxxDF1
    Installation image.........: 9.403-4.1
    Installation type..........: ssi
    Installed pattern version..: 214529
    Downloaded pattern version.: 214529
    Up2Dates applied...........: 40 (see below)
                                 sys-9.403-9.404-4.5.3.tgz (Jul  1  2016)
                                 sys-9.404-9.405-5.5.1.tgz (Aug 18  2016)
                                 sys-9.405-9.406-5.3.1.tgz (Sep 26  2016)
                                 sys-9.406-9.407-3.3.1.tgz (Oct  7  2016)
                                 sys-9.407-9.408-3.4.1.tgz (Nov 10  2016)
                                 sys-9.408-9.409-4.9.1.tgz (Jan  4  2017)
                                 sys-9.409-9.410-9.6.1.tgz (Feb  6  2017)
                                 sys-9.410-9.411-6.3.3.tgz (Feb  9  2017)
                                 sys-9.411-9.412-3.2.2.tgz (May 30  2017)
                                 sys-9.412-9.413-2.4.3.tgz (May 30  2017)
                                 sys-9.413-9.414-4.2.3.tgz (Jul 12  2017)
                                 sys-9.414-9.501-2.5.1.tgz (Oct  7  2017)
                                 sys-9.501-9.502-5.4.1.tgz (Oct  7  2017)
                                 sys-9.502-9.503-4.4.2.tgz (Oct  7  2017)
                                 sys-9.503-9.504-3.1.4.tgz (Nov  3  2017)
                                 sys-9.504-9.505-1.4.1.tgz (Nov  3  2017)
                                 sys-9.505-9.506-4.2.2.tgz (Dec 13  2017)
                                 sys-9.506-9.507-2.1.4.tgz (Mar 26  2018)
                                 sys-9.507-9.508-1.10.1.tgz (Mar 26  2018)
                                 sys-9.508-9.509-10.3.2.tgz (Jun  8  2018)
                                 sys-9.509-9.510-3.5.2.tgz (Aug 20  2018)
                                 sys-9.510-9.600-5.5.1.tgz (Apr 11  2019)
                                 sys-9.600-9.601-5.5.2.tgz (Apr 11  2019)
                                 sys-9.601-9.602-5.3.1.tgz (Jul 20  2019)
                                 sys-9.602-9.603-3.1.1.tgz (Jul 20  2019)
                                 sys-9.603-9.604-1.2.1.tgz (Jul 20  2019)
                                 sys-9.604-9.605-2.1.4.tgz (Oct 10  2019)
                                 sys-9.605-9.700-1.5.2.tgz (Jan 11  2020)
                                 sys-9.700-9.701-5.6.1.tgz (Mar 28  2020)
                                 sys-9.701-9.702-6.1.1.tgz (Mar 28  2020)
                                 sys-9.702-9.703-1.3.3.tgz (Sep  1  2020)
                                 sys-9.703-9.704-3.2.3.tgz (Oct 14  2020)
                                 sys-9.704-9.705-2.3.1.tgz (Oct 14  2020)
                                 sys-9.705-9.706-3.8.1.tgz (May 20  2021)
                                 sys-9.706-9.706-8.9.1.tgz (Jul  3  2021)
                                 sys-9.706-9.707-9.5.1.tgz (Sep  8  2021)
                                 sys-9.707-9.708-5.6.1.tgz (Mar 12  2022)
                                 sys-9.708-9.709-6.3.1.tgz (Mar 12  2022)
                                 sys-9.709-9.710-3.1.1.tgz (May 19 17:31)
                                 sys-9.710-9.711-1.5.1.tgz (May 19 17:32)
    Up2Dates available.........: 1
    Factory resets.............: 0
    Timewarps detected.........: 1

    <M> fw:/home/login # rpm -qa | grep dhcp
    dhcp-chroot-client-4.4.1-3.g629f991.rb5
    dhcp-chroot-server-4.4.1-3.g629f991.rb5
    ep-chroot-dhcpc-9.70-14.gde59063.rb5
    ep-chroot-dhcps-9.70-15.ge43a374.rb6
    <M> fw:/home/login # cat /var/mdw/scripts/dhcpd
    #!/bin/bash
    #
    # Copyright (C) 2005-2010 Astaro AG
    # For copyright information look at /doc/astaro-license.txt
    # or www.astaro.com/.../astaro-license.txt
    #
    # Author: Stephan Scholz <sscholz@astaro.com>
    # Maintainer: Ulrich Weber <uweber@astaro.com>
    #
    ##############################################################################

    PATH=/sbin:/bin:/usr/sbin:/usr/bin
    PNAME="DHCP Daemon"
    PROG="dhcpd"
    NOSELFM="/etc/no-selfmonitor/dhcpd"
    CHROOT="/var/chroot-dhcps"

    function usage() {
            echo "Usage: $0 [start|stop|restart|trigger]"
            exit 1
    }

    ret_code=0
    case "$1" in
            start)
                    echo ":: Starting $PNAME"
                    PID=`pidof $PROG`
                    if  [ ! -z "$PID" ] ; then
                            echo "   $PNAME already running"
                            if [ -e $NOSELFM ] ; then
                                    echo "no-selfmonitor file exists so deleting it."
                                    rm -f $NOSELFM
                            fi
                            ret_code=1
                    else
                            read -a INTERFACES < $CHROOT/etc/dhcpd.ifaces
                            chroot $CHROOT /usr/sbin/__dhcpd -cf /etc/dhcpd.conf ${INTERFACES[@]} >/dev/null 2>&1|| ret_code=1
                            if [ $ret_code = 0 ] ; then #only remove $NOSELFM if start succeeded otherwise selfmon will keep spamming
                                    rm -f $NOSELFM
                            fi
                    fi
                    ;;

            stop)
                    echo ":: Stopping $PNAME"
                    touch $NOSELFM
                    killproc -p /var/chroot-dhcps/var/run/dhcpd.pid /var/chroot-dhcps/usr/sbin/dhcpd || ret_code=1
                    num_try=0
                    PID=`pidof $PROG`
                    while  [[ ! -z "$PID" ]] && [[ $num_try -lt 40 ]]
                    do
                            echo "   $PNAME still running"
                            sleep 0.25
                            ((num_try++))
                    done
                    ;;

            restart|trigger)
                    [ "$1" = "trigger" ] && [ -e $NOSELFM ] && exit 0
                    $0 stop ||  ret_code=1
                    $0 start $@ ||  ret_code=1
                    echo -e "\033[33m\033[1m:: Restarting $PNAME\033[m"
                    ;;

            *)
                    usage
                    ;;
    esac

    /var/mdw/scripts/retcode $ret_code
    exit $ret_code;
    <M> fw:/home/login #
    <M> fw:/home/login # cat /var/mdw-debug.log
    cat: /var/mdw-debug.log: No such file or directory
    <M> fw:/home/login # cat /var/mdw/debug.log
    cat: /var/mdw/debug.log: No such file or directory
    <M> fw:/home/login #

    <M> fw-320:/home/login # /var/mdw/scripts/dhcpd restart
    :: Stopping DHCP Daemon
    [ ok ]
    :: Starting DHCP Daemon
    [ failed ]
    :: Restarting DHCP Daemon
    [ failed ]
    <M> fw-320:/home/login #

  • <M> fw-320:/home/login # /var/mdw/scripts/dhcpd stop
    :: Stopping DHCP Daemon
    [ ok ]
    <M> fw-320:/home/login # /var/mdw/scripts/dhcpd start
    :: Starting DHCP Daemon
    [ failed ]

    <M> fw-320:/home/login # /var/mdw/scripts/dhcpd stop
    :: Stopping DHCP Daemon
    [ ok ]
    <M> fw-320:/home/login # ps aux | grep dhcp
    root     28424  0.0  0.0   5668   744 pts/0    S+   14:49   0:00 grep dhcp


    <M> fw-320:/home/login # ps aux | grep dhc
    root      7395  0.0  0.0   7352   192 ?        Ss   May19   0:03 /usr/sbin/dhcrelay -q -i lag1.500 -i lag1.2500 172.16.xxx.xxx
    root     12078  0.0  0.0   7720  2128 ?        Ss   13:55   0:00 /usr/sbin/dhclient -nw -cf /etc/lag3.conf -lf /var/db/lag3.leases -pf /var/run/dhclient_lag3.pid lag3
    root     28550  0.0  0.0   5672   744 pts/0    S+   14:49   0:00 grep dhc

  • Did an HA failover; the lag3 WAN interface is working again. I will fall back to the previous node later and check whether the issue is also resolved on that machine.

    <M> fw:/home/login # ifconfig lag3
    lag3      Link encap:Ethernet  HWaddr 00:1A:8C:F0:22:C3
              inet addr:10.1.254.23  Bcast:10.1.254.31  Mask:255.255.255.240
              UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
              RX packets:1189715296 errors:0 dropped:615 overruns:0 frame:0
              TX packets:570845987 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:1567178041026 (1494577.4 Mb)  TX bytes:75816317166 (72304.0 Mb)

  • WAN was also fine after falling back to the original firewall (I rebooted it first).

    So it was some weirdness inside the UTM DHCP client.