This discussion has been locked.

Problems with HA (again): the HA port seems to go down randomly

Hi Guys,

I have two XG 135 appliances, both with the latest firmware update installed.

The HA connection was set up with the QuickHA function in Active-Passive configuration, using a 7.5 m straight cable on Port 7 as the dedicated HA link. HA worked normally until two days ago, when I had a problem while working from home: I was kicked out of the VPN, and HA switched back and forth between the Primary and the Secondary a couple of times.

The day after, back in the office, the Auxiliary device was shut down. I had to physically disconnect the firewall from power and plug it back in.

After that restart, I had to restart it once more from the SSH console, and then HA was OK again.

After a couple of hours, HA again switched back and forth between the Primary and the Auxiliary, and the peer went into Fault state.

I have run out of ideas and no longer know what to do. Below are the logs that I found.

Any ideas?

PS: HA was established following the instructions in this support thread: https://community.sophos.com/xg-firewall/f/discussions/123582/ha-doesnt-work-in-any-conditions/450470#450470



  • Can you log in to both appliances via SSH and check dmesg for this interface?

    dmesg | grep Port7 | less
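
If the grep output is long, the link changes can be pulled out with a short script. The sketch below is only an illustration: it assumes the NIC driver logs lines containing `Link is Up` / `Link is Down` (the wording mainline Intel drivers use; the Sophos `igb_nm`/`e1000e_nm` builds may log differently), and the sample lines are made up for the example, not taken from the appliance.

```python
import re

# Assumption: link changes appear as "... Link is Up ..." / "... Link is Down".
LINK_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s.*\bLink is (Up|Down)\b", re.IGNORECASE)


def link_events(dmesg_text, port="Port7"):
    """Return (seconds_since_boot, 'up' or 'down') for each link change on `port`."""
    port_re = re.compile(rf"\b{re.escape(port)}\b")
    events = []
    for line in dmesg_text.splitlines():
        if not port_re.search(line):
            continue  # ignore other interfaces
        match = LINK_RE.search(line)
        if match:
            events.append((float(match.group(1)), match.group(2).lower()))
    return events


def count_flaps(events):
    """Count transitions to 'down' as a rough measure of link flapping."""
    return sum(1 for _, state in events if state == "down")


# Hypothetical dmesg excerpt, invented for illustration only.
sample = """\
[   12.100000] igb 0000:02:00.0 Port7: igb: Port7 NIC Link is Up 1000 Mbps Full Duplex
[  260.200000] igb 0000:02:00.0 Port7: igb: Port7 NIC Link is Down
[  263.900000] igb 0000:02:00.0 Port7: igb: Port7 NIC Link is Up 1000 Mbps Full Duplex
[  300.000000] igb 0000:03:00.0 Port8: igb: Port8 NIC Link is Down
"""

events = link_events(sample)
print(events, count_flaps(events))
```

Running it against the saved output of `dmesg` from both appliances would show whether Port7 drops on one side only or on both at the same time.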
    

    __________________________________________________________________________________________________________________

  • Hi LuCar, take a look at the images below. This is the auxiliary.

    [image 2]

  • Hello Tyler,

    Thank you for contacting the Sophos Community!

    Please open a case with support and provide me the Case ID.

    You are most likely affected by NC-64907; there should be a patch available.

    To confirm whether you are affected, please gather some console logs from the Aux device.

    Note: Be sure that the computer in question does not go into Standby or Hibernate while logging.

    Using PuTTY, go to 'Session' - 'Logging'.
    Here, select 'All session output', and set the log file name and folder for later retrieval.
    Configure the Serial connection to use the proper COM port on your PC and a Speed of 38400.
    Start the session, and log in to confirm that everything works.
    Once logged in, you can leave it there or log out and leave the session at the password prompt. Either way, leave the session active and allow it to capture the output from the next reboot.
    Once that reboot occurs, you can end the Serial connection and provide the logs to Support for further investigation.
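
Once PuTTY has captured a reboot, the log can be checked quickly for crash signatures before sending it in. This is a minimal sketch, assuming the capture contains standard Linux kernel crash lines; the marker list and the `putty.log` file name are assumptions for illustration, and the sample lines are copied from the console output posted later in this thread.

```python
import re

# Strings that normally accompany a Linux kernel crash and reboot.
MARKERS = ("BUG:", "Oops:", "Kernel panic", "Rebooting in")
TS_RE = re.compile(r"\[\s*(\d+\.\d+)\]")  # kernel timestamp, e.g. "[ 260.481168]"


def find_crash_lines(log_text):
    """Return (line_number, kernel_timestamp_or_None, marker) for each crash line."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        for marker in MARKERS:
            if marker in line:
                ts = TS_RE.search(line)
                hits.append((number, float(ts.group(1)) if ts else None, marker))
                break  # one hit per line is enough
    return hits


def scan_file(path):
    """Scan a captured session log; errors='replace' tolerates serial noise."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_crash_lines(handle.read())


sample = """\
Select Menu Number [0-7]: [ 260.481168] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 260.527790] Oops: 0010 [#1] SMP NOPTI
[ 261.878019] Kernel panic - not syncing: Fatal exception in interrupt
[ 261.907668] Rebooting in 3 seconds..
"""

for hit in find_crash_lines(sample):
    print(hit)
```

For a real capture, `scan_file("putty.log")` would be called with whatever file name was set in the PuTTY logging dialog.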

    Also provide applog.log, csc.log, syslog.log, msync.log and networkd.log from the Aux device.

    Regards,


     
    Emmanuel (EmmoSophos)
    Technical Team Lead, Global Community Support
    Sophos Support Videos  |  Product Documentation  |  @SophosSupport  |  Sign up for SMS Alerts
    If a post solves your question use the 'Verify Answer' link.
  • Sorry guys, I was on forced holidays.
    I am proceeding with the log collection now. BTW, the peer is hung: if I SSH directly into the peer device, every option returns "Please try after some times. System is initializing".

  • I didn't even have time to look through the menu before this happened:

    Sophos Firmware Version SFOS 18.0.3 MR-3

    Main Menu

    1. Network Configuration
    2. System Configuration
    3. Route Configuration
    4. Device Console
    5. Device Management
    6. VPN Management
    7. Shutdown/Reboot Device
    0. Exit

    Select Menu Number [0-7]: [ 260.481168] BUG: unable to handle kernel NULL pointer dereference at (null)
    [ 260.504696] IP: (null)
    [ 260.514430] PGD 1f1ae0067 P4D 1f1ae0067 PUD 0
    [ 260.527790] Oops: 0010 [#1] SMP NOPTI
    [ 260.538807] Modules linked in: nf_conntrack_ipslb nfnetmap_queue(O) xt_xfrmpolicy ah4 xt_addrtype nf_nat_ftp nf_conntrack_ftp xt_CT arpt_arpreq_proxy arpt_arpreply_proxy ebt_vlan ebt_arp ebtable_filter ebtable_nat ebtables ip6t_MASQUERADE xt_muser xt_conntrack xt_l4proto xt_auxtoprim_send xt_RCV_SYN_DATA ip6t_ADVERTISEMENT ip6t_SOLICITATION xt_LBS ip6table_filter iptable_filter xt_DNAT xt_SNAT nf_nat_masquerade_ipv6 xt_nat_lookup xt_UST xt_ust xt_firewall nat_rules sfos_rules_framework firewall ip_set_hash_mlmwsticky ip_set_hash_sslvpn iptable_mangle ip_set_hash_mac ip_set_hash_bw nf_conntrack_dns nf_nat_sip nf_conntrack_sip nf_nat_irc nf_conntrack_irc nf_nat_tftp nf_conntrack_tftp nf_nat_h323 nf_conntrack_h323 nf_nat_pptp nf_conntrack_pptp cfg80211 usbhid hid_generic hid ohci_pci ohci_hcd xhci_pci
    [ 260.750619] xhci_hcd uhci_hcd ehci_pci ehci_hcd fw_handle_ngfw_notification fp2sp_api fp_notifier bonding lzo lzo_compress lzo_decompress cifs red red2 appdev nf_conntrack_netlink nf_nat_proto_gre nf_conntrack_proto_gre set_sessiontbl sessiontbl ip_gre gre ipcomp xfrm_ipcomp esp4 xfrm4_mode_transport xfrm4_mode_tunnel xfrm4_tunnel xfrm_user af_key xfrm_algo aesni_intel glue_helper aes_x86_64 crypto_simd cryptd cls_u32 act_mirred sch_ingress ifb sch_hfsc sch_leafprio sch_headprio sch_sfq sch_htb xt_MULTISET xt_MLM xt_SRCNETMAP xt_MARKROUTE xt_CONTINUE xt_LOGDROP xt_ULOG xt_TCPMSS xt_REDIRECT nf_nat_redirect ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_OUT_OUTDEV ip6t_rpfilter ipt_rpfilter ebt_nflog ebt_pkttype xt_serviceset xt_appset xt_hostset xt_pkttype xt_recent xt_state xt_status xt_cet xt_OUTDEV
    [ 260.961993] xt_iprange xt_limit xt_hashlimit xt_tcpudp xt_multiport nf_conntrack_relate xt_IPMACFILTER xt_RANGENAT xt_VHDNAT ip_set_bitmap_vhost xt_FWSET xt_set ip_set_hash_maciface_fp ip_set_hash_ipiface_fp ip_set_bitmap_hotspotuser ip_set_hash_hotspotmac ip_set_bitmap_tlsrule ip_set_bitmap_appset ip_set_bitmap_fwrule ip_set_bitmap_ctrxss ip_set_bitmap_user sp2fp_api ip_set_bitmap_userpolicy ip_set_hash_ipuser ip_set_bitmap_service ip_set_bitmap_host ip_set_hash_ipmaciface ip_set_hash_l2mac ip_set_hash_ipmac ip_set_hash_ip ip_set arptable_filter arp_tables e1000e_nm(O) igb_nm(O) i2c_algo_bit ixgbe_nm(O) vxlan udp_tunnel ip6_udp_tunnel ptp pps_core mdio i2c_i801 i2c_dev i2c_core netmap(O) ip6table_nat nf_nat_ipv6 ip6table_mangle ip6table_raw iptable_nat iptable_raw nf_nat_ipv4 xt_dscp nf_nat ip6_tables
    [ 261.175281] ip_tables tun af_packet 8021q nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 ip6_tunnel tunnel6 sit ip_tunnel tunnel4 ppdev parport_pc parport nf_conntrack lineartable bitmap_api br_netfilter bridge nf_defrag_ipv4 ipv6 stp llc x_tables nfnetlink button evdev [last unloaded: nfnetmap_queue]
    [ 261.256603] CPU: 1 PID: 10971 Comm: winbindd Tainted: G O 4.14.38 #2
    [ 261.279112] Hardware name: Sophos XG/Default string, BIOS 5.13 (Z161-009) 08/09/2018
    [ 261.302398] task: ffff8801e6fe1700 task.stack: ffffc900023a8000
    [ 261.320203] RIP: 0010: (null)
    [ 261.331504] RSP: 0000:ffff8801ffc83e18 EFLAGS: 00010202
    [ 261.347227] RAX: ffffffffa07ab700 RBX: ffff8801f806ec80 RCX: ffff8801f368ec00
    [ 261.368669] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8801f806ec80
    [ 261.390108] RBP: ffff8801f368ec10 R08: 0000000000000001 R09: 0000000000000001
    [ 261.411546] R10: 0000000000000000 R11: ffffc900023abbf0 R12: ffff8801f5268000
    [ 261.432984] R13: ffff8801f5268078 R14: ffff8801f52680a0 R15: 0000000000000008
    [ 261.454425] FS: 0000000000000000(0000) GS:ffff8801ffc80000(0063) knlGS:00000000f6bc3e40
    [ 261.478737] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
    [ 261.496026] CR2: 0000000000000000 CR3: 00000001e6a82000 CR4: 00000000001406e0
    [ 261.517465] Call Trace:
    [ 261.524864] <IRQ>
    [ 261.530973] ? ip_rcv+0x316/0x4c0
    [ 261.540978] ? ip_local_deliver_finish+0x1d0/0x1d0
    [ 261.555403] ? __netif_receive_skb_core+0x3ec/0xac0
    [ 261.570086] ? check_preempt_wakeup+0x90/0x1f0
    [ 261.583467] ? process_backlog+0x86/0x120
    [ 261.595548] ? process_backlog+0x86/0x120
    [ 261.607632] ? net_rx_action+0xcc/0x270
    [ 261.619197] ? __do_softirq+0xc5/0x1ec
    [ 261.630501] ? do_softirq_own_stack+0x2a/0x40
    [ 261.643622] </IRQ>
    [ 261.649988] ? do_softirq.part.2+0x3c/0x40
    [ 261.662333] ? netif_rx_ni+0x1d/0x30
    [ 261.673117] ? dev_loopback_xmit+0xa3/0xc0
    [ 261.685461] ? ip_mc_output+0x176/0x240
    [ 261.697025] ? ip_finish_output2+0x3b0/0x3b0
    [ 261.709887] ? ip_send_skb+0x10/0x40
    [ 261.720673] ? udp_send_skb+0x94/0x240
    [ 261.731977] ? udp_sendmsg+0x2f8/0x8c0
    [ 261.743282] ? release_sock+0x3b/0x90
    [ 261.754324] ? sock_sendmsg+0xe/0x20
    [ 261.765106] ? SyS_sendto+0xad/0x150
    [ 261.775894] ? ep_poll_wakeup_proc+0x20/0x20
    [ 261.788757] ? compat_SyS_socketcall+0x12c/0x210
    [ 261.802660] ? do_int80_syscall_32+0x58/0x100
    [ 261.815781] ? entry_INT80_compat+0x48/0x50
    [ 261.828385] Code: Bad RIP value.
    [ 261.838388] RIP: (null) RSP: ffff8801ffc83e18
    [ 261.854113] CR2: 0000000000000000
    [ 261.864116] ---[ end trace 08955a45c855e07f ]---
    [ 261.878019] Kernel panic - not syncing: Fatal exception in interrupt
    [ 261.897132] Kernel Offset: disabled
    [ 261.907668] Rebooting in 3 seconds..
    [ 264.894714] ACPI ME
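
For comparing this crash with later ones, the key fields of the oops can be extracted with a few lines of Python. This is a rough sketch that assumes the standard Linux oops layout shown above; the function name `summarize_oops` is made up for the example. Note that frames prefixed with `?` are ones the kernel's stack unwinder could not confirm, so the list is a hint, not an exact call chain.

```python
import re

HEADER_RE = re.compile(r"CPU: (\d+) PID: (\d+) Comm: (\S+)")
# Matches e.g. "] ? ip_rcv+0x316/0x4c0" and captures the symbol name.
FRAME_RE = re.compile(r"\]\s+\?\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_oops(text):
    """Extract CPU, PID, process name and call-trace symbols from an oops dump."""
    summary = {"cpu": None, "pid": None, "comm": None, "frames": []}
    in_trace = False
    for line in text.splitlines():
        header = HEADER_RE.search(line)
        if header:
            summary["cpu"] = int(header.group(1))
            summary["pid"] = int(header.group(2))
            summary["comm"] = header.group(3)
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            frame = FRAME_RE.search(line)
            if frame:
                summary["frames"].append(frame.group(1))
            elif "Code:" in line:
                in_trace = False  # the "Code:" line ends the trace section
    return summary


# Lines copied from the console output above (trace shortened).
sample = """\
[ 261.256603] CPU: 1 PID: 10971 Comm: winbindd Tainted: G O 4.14.38 #2
[ 261.517465] Call Trace:
[ 261.524864] <IRQ>
[ 261.530973] ? ip_rcv+0x316/0x4c0
[ 261.540978] ? ip_local_deliver_finish+0x1d0/0x1d0
[ 261.643622] </IRQ>
[ 261.649988] ? do_softirq.part.2+0x3c/0x40
[ 261.828385] Code: Bad RIP value.
"""

print(summarize_oops(sample))
```

On the full dump, this shows the panic was raised in interrupt context while `winbindd` was sending a UDP packet, which may be useful context to include in the support case.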

  • You say there's a known bug causing this; can you give more detail?  Does it only affect XG330s?  Is there a workaround?

    CTO, Convergent Information Security Solutions, LLC

    https://www.convergesecurity.com

    Sophos Platinum Partner

    --------------------------------------

    Advice given as posted on this forum does not construe a support relationship or other relationship with Convergent Information Security Solutions, LLC or its subsidiaries.  Use the advice given at your own risk.

  • Guys, Sophos Support told me that this is a KNOWN BUG that will be fixed in a new firmware release (MR4 for my XG), but the firmware itself is not available yet.

  • Hello Stefano,

    Thank you for the follow-up!

    I am not sure which Known Issue ID she mentioned, so I have asked for an update.

    Regards,


     
    Emmanuel (EmmoSophos)
  • Hello Bruce,

    The Known Issue ID is NC-64907, and it is not limited to any specific appliance model.

    A patch is available, and the issue is resolved in MR4.

    Regards,


     
    Emmanuel (EmmoSophos)
  • Thank you, Emmanuel. Can you confirm that MR4 cannot be downloaded from anywhere at the moment?

    Thank you in advance...

  • Hello Stefano,

    MR4 hasn't been released yet.
    However, since a patch is available, I have asked the engineer assigned to your case to escalate it so that the patch can be applied.

    Regards,


     
    Emmanuel (EmmoSophos)