Mails stuck in SMTP spool when forwarded via smarthost over a certain IPsec site-to-site tunnel

Hello Community,

we have a very nice and puzzling issue with our Sophos UTM and SMTP relay to a smarthost over an IPsec tunnel.

Problem - short version:

Our Exchange server forwards mails to the UTM as smarthost. The UTM forwards the mails to a second (non-public) smarthost via an IPsec tunnel. We have two uplinks (UPLINK1 & UPLINK2): everything works if the IPsec tunnel is established via UPLINK2. If we use UPLINK1 for this tunnel, no email is forwarded.

Environment:

  • Our mail server: Exchange 2010
  • Sophos UTM SG310 (HA cluster, active/passive) - V9.506-2
  • External mail gateway: Proventia Mail Gateway - IP: 10.11.22.33

Uplinks:

  • UPLINK1:
    Telekom Business DSL with fixed IP
    BinTEC xDSL-Bridge
    Sophos: PPPoE connection on eth5 - fixed IP
  • UPLINK2:
    Vodafone Cable Business with fixed IP
    Compal Cable-Router (small /30 transfernetwork with public IPs - so called "Bridgemode at Vodafone")
    Sophos: Ethernet connection on eth1 - fixed IP and RouterIP as gateway-address

VPN-Tunnel:
IPsec site-to-site, AES-256, PSK, bound to the interface (I need a different routing metric for that tunnel)
Routing entries to the other site's networks are available and working

SNAT:
Because we have two uplinks (in case one of them fails), and because the IPsec tunnel doesn't allow direct traffic from the IP addresses of the uplink interfaces, I created a SNAT rule that the UTM uses when the SMTP proxy forwards mails into the tunnel to the external mail gateway.

SNAT: "Uplink Primary Addresses" -> any -> remote-network (uses IP 172.22.55.66, which is allowed via the tunnel)

Mail routing as smarthost - NOT transparent(!):
Exchange 2010 -[LAN]-> Sophos UTM -[IPsec]-> Proventia remote mailgateway

1st scenario:
Exchange 2010 -[LAN]-> Sophos UTM -[IPsec via UPLINK2]-> Proventia remote mailgate
Everything works as expected!

2nd scenario:
Exchange 2010 -[LAN]-> Sophos UTM -[IPsec via UPLINK1]-> Proventia remote mailgate
Mails get stuck in the SMTP spool queue and are not forwarded

LOGS:

SMTP-Proxy Live-Log:
--------------------
2018:02:02-08:03:51 ptgfw001-1 exim-out[15154]: 2018-02-02 08:03:51 1ehUia-0001GE-92 == john@doe.de R=smarthost_route T=smarthost_smtp defer (-18): Remote host 10.11.22.33 [10.11.22.33] closed connection in response to end of data
2018:02:02-08:04:00 ptgfw001-1 exim-out[15317]: 2018-02-02 08:04:00 Start queue run: pid=15317
2018:02:02-08:04:00 ptgfw001-1 exim-out[15319]: 2018-02-02 08:04:00 1ehUia-0001GE-92 == john@doe.de R=smarthost_route T=smarthost_smtp defer (-53): retry time not reached for any host
2018:02:02-08:04:00 ptgfw001-1 exim-out[15317]: 2018-02-02 08:04:00 End queue run: pid=15317
2018:02:02-08:04:00 ptgfw001-2 exim-out[22820]: 2018-02-02 08:04:00 Start queue run: pid=22820
2018:02:02-08:04:00 ptgfw001-2 exim-out[22820]: 2018-02-02 08:04:00 End queue run: pid=22820
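
For anyone working through a longer log of this kind: a minimal Python sketch that pulls the message ID, deferral code, and reason out of `exim-out` lines like the ones above. The regular expression is derived only from these sample lines, and `parse_defer` is a hypothetical helper name, not part of the UTM tooling.

```python
import re

# Pattern built from the two "defer" lines in the live log above (an assumption,
# not an official Exim log grammar).
DEFER_RE = re.compile(
    r"(?P<msgid>\w{6}-\w{6}-\w{2}) == (?P<rcpt>\S+) "
    r"R=(?P<router>\S+) T=(?P<transport>\S+) "
    r"defer \((?P<code>-?\d+)\): (?P<reason>.*)$"
)

def parse_defer(line):
    """Return a dict describing a deferral, or None if the line is not one."""
    match = DEFER_RE.search(line)
    return match.groupdict() if match else None

sample = (
    "2018:02:02-08:03:51 ptgfw001-1 exim-out[15154]: 2018-02-02 08:03:51 "
    "1ehUia-0001GE-92 == john@doe.de R=smarthost_route T=smarthost_smtp "
    "defer (-18): Remote host 10.11.22.33 [10.11.22.33] closed connection "
    "in response to end of data"
)

info = parse_defer(sample)
print(info["msgid"], info["code"], info["reason"])
```

Here the first deferral (-18) is the interesting one: the remote host closed the connection while the UTM was waiting for the reply to the message data. The later -53 entry only says that the retry time has not been reached yet.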

SMTP-Debug Log:
---------------
2018-02-02 08:03:50 MASTER[15110]: 0 messages in work queue, 0 scanners are running
2018-02-02 08:03:52 QMGR[15117]: 1ehUia-0001GE-92 deferred - updated msglog in messages table
2018-02-02 08:04:00 MASTER[15110]: 0 messages in work queue, 0 scanners are running
2018-02-02 08:04:00 MASTER[15110]: 1 messages in output queue with age >2 minutes, running additional queue runner (max. 2)
2018-02-02 08:04:01 MASTER[15110]: queue runner 15317 terminated.
2018-02-02 08:04:10 MASTER[15110]: 0 messages in work queue, 0 scanners are running
2018-02-02 08:04:20 MASTER[15110]: 0 messages in work queue, 0 scanners are running

tcpdump -vvXXeni any host 172.22.55.66:
---------------------------------------
08:03:51.476258  In ethertype IPv4 (0x0800), length 56: (tos 0x0, ttl 63, id 59709, offset 0, flags [DF], proto TCP (6), length 40)
    10.11.22.33.25 > 172.22.55.66.36909: Flags [F.], cksum 0xefe4 (correct), seq 206, ack 129, win 913, length 0
        0x0000:  0000 0200 0000 0000 0000 0000 0000 0800  ................
        0x0010:  4500 0028 e93d 4000 3f06 1734 0a0a 4f4f  E..(.=@.?..4..OO
        0x0020:  0a09 d7fc 0019 902d 94ba 97df e814 dc09  .......-........
        0x0030:  5011 0391 efe4 0000                      P.......
08:03:51.729742  In ethertype IPv4 (0x0800), length 56: (tos 0x0, ttl 63, id 59710, offset 0, flags [DF], proto TCP (6), length 40)
    10.11.22.33.25 > 172.22.55.66.36909: Flags [F.], cksum 0xefe4 (correct), seq 206, ack 129, win 913, length 0
        0x0000:  0000 0200 0000 0000 0000 0000 0000 0800  ................
        0x0010:  4500 0028 e93e 4000 3f06 1733 0a0a 4f4f  E..(.>@.?..3..OO
        0x0020:  0a09 d7fc 0019 902d 94ba 97df e814 dc09  .......-........
        0x0030:  5011 0391 efe4 0000                      P.......
08:03:52.249914  In ethertype IPv4 (0x0800), length 56: (tos 0x0, ttl 63, id 59711, offset 0, flags [DF], proto TCP (6), length 40)
    10.11.22.33.25 > 172.22.55.66.36909: Flags [F.], cksum 0xefe4 (correct), seq 206, ack 129, win 913, length 0
        0x0000:  0000 0200 0000 0000 0000 0000 0000 0800  ................
        0x0010:  4500 0028 e93f 4000 3f06 1732 0a0a 4f4f  E..(.?@.?..2..OO
        0x0020:  0a09 d7fc 0019 902d 94ba 97df e814 dc09  .......-........
        0x0030:  5011 0391 efe4 0000                      P.......
08:03:53.296608  In ethertype IPv4 (0x0800), length 56: (tos 0x0, ttl 63, id 59712, offset 0, flags [DF], proto TCP (6), length 40)
    10.11.22.33.25 > 172.22.55.66.36909: Flags [F.], cksum 0xefe4 (correct), seq 206, ack 129, win 913, length 0
        0x0000:  0000 0200 0000 0000 0000 0000 0000 0800  ................
        0x0010:  4500 0028 e940 4000 3f06 1731 0a0a 4f4f  E..(.@@.?..1..OO
        0x0020:  0a09 d7fc 0019 902d 94ba 97df e814 dc09  .......-........
        0x0030:  5011 0391 efe4 0000                      P.......
08:03:55.381357  In ethertype IPv4 (0x0800), length 56: (tos 0x0, ttl 63, id 59713, offset 0, flags [DF], proto TCP (6), length 40)
    10.11.22.33.25 > 172.22.55.66.36909: Flags [F.], cksum 0xefe4 (correct), seq 206, ack 129, win 913, length 0
        0x0000:  0000 0200 0000 0000 0000 0000 0000 0800  ................
        0x0010:  4500 0028 e941 4000 3f06 1730 0a0a 4f4f  E..(.A@.?..0..OO
        0x0020:  0a09 d7fc 0019 902d 94ba 97df e814 dc09  .......-........
        0x0030:  5011 0391 efe4 0000                      P.......
08:03:59.553365  In ethertype IPv4 (0x0800), length 56: (tos 0x0, ttl 63, id 59714, offset 0, flags [DF], proto TCP (6), length 40)
    10.11.22.33.25 > 172.22.55.66.36909: Flags [F.], cksum 0xefe4 (correct), seq 206, ack 129, win 913, length 0
        0x0000:  0000 0200 0000 0000 0000 0000 0000 0800  ................
        0x0010:  4500 0028 e942 4000 3f06 172f 0a0a 4f4f  E..(.B@.?../..OO
        0x0020:  0a09 d7fc 0019 902d 94ba 97df e814 dc09  .......-........
        0x0030:  5011 0391 efe4 0000                      P.......
08:04:07.905048  In ethertype IPv4 (0x0800), length 56: (tos 0x0, ttl 63, id 59715, offset 0, flags [DF], proto TCP (6), length 40)
    10.11.22.33.25 > 172.22.55.66.36909: Flags [F.], cksum 0xefe4 (correct), seq 206, ack 129, win 913, length 0
        0x0000:  0000 0200 0000 0000 0000 0000 0000 0800  ................
        0x0010:  4500 0028 e943 4000 3f06 172e 0a0a 4f4f  E..(.C@.?.....OO
        0x0020:  0a09 d7fc 0019 902d 94ba 97df e814 dc09  .......-........
        0x0030:  5011 0391 efe4 0000                      P.......
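
A note on reading the capture above: the same FIN from 10.11.22.33 (seq 206, ack 129) arrives seven times, which looks like a TCP retransmission with a doubling timer. The following minimal Python sketch checks that from the timestamps alone; the helper name `gaps` is only for illustration.

```python
from datetime import datetime

# Arrival times of the seven identical FIN packets in the tcpdump output above.
stamps = [
    "08:03:51.476258", "08:03:51.729742", "08:03:52.249914",
    "08:03:53.296608", "08:03:55.381357", "08:03:59.553365",
    "08:04:07.905048",
]

def gaps(timestamps):
    """Return the time difference in seconds between consecutive timestamps."""
    times = [datetime.strptime(t, "%H:%M:%S.%f") for t in timestamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]

intervals = gaps(stamps)
ratios = [later / earlier for earlier, later in zip(intervals, intervals[1:])]

for gap in intervals:
    print(f"{gap:.3f} s")
print("ratios:", [round(r, 2) for r in ratios])
```

If the intervals roughly double each time, the remote side is apparently never getting its FIN acknowledged. In addition, `ack 129` suggests the remote side has seen only 128 bytes from the UTM so far, which would fit with the larger DATA packets not arriving. This is an interpretation of this one capture, not a confirmed diagnosis.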

What have I tested so far?

  • telnet on port 25 from the Exchange server to the UTM smarthost:
    Email created on the UTM is forwarded without any issues
  • telnet on port 25 from the UTM to the Proventia mail gateway:
    no issues
  • Sending mails directly from Exchange to the Proventia mail gateway via the IPsec tunnel
    Mail routing:
    Exchange 2010 -[IPsec]-> Proventia remote mailgateway
    -> Working without issues
  • Replaced the xDSL modem with a router that handles the PPPoE connection and forwards all incoming connections to the UTM (exposed host)
    -> Mail forwarding via the IPsec tunnel works without issues (that was my former network setup, but the router had too little power in case of a complete failover)

Conclusion:
Everything (every service) works well over the IPsec connection. But if we want to forward mails via the UTM as smarthost, it only works if the uplink interface is UPLINK2. On UPLINK1 (PPPoE session), the mails get stuck.

Does anyone have an idea?
Is there an issue with PPP interfaces?

Many thanks!

BR,
Florian

Changes:
5.2.2018: removed empty lines (cleanup); added Mailrouting as smarthost



  • Addendum:

    At the moment we forward the emails from our Exchange directly to the Proventia mail gateway at the remote site without using the UTM as relay. It's a workaround for as long as we cannot forward emails using IPsec over UPLINK1.

    BR,
    Florian

  • Hallo Florian,

    This seems like a VPN issue, not an SMTP issue - is that right?  I'll move this thread to that forum if you think so, too.

    What is the other VPN endpoint - also a UTM?

    Please show us pictures of the Edits of the IPsec Connection and Remote Gateway for both tunnels.  Also, it sounds like you have selected 'Bind tunnel to local interface' in both IPsec Connections, so please also show the Edit of those routes.

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • Hi Bob,

    I guess it's something in between, because every service running over the tunnel works. Only mail forwarding with our UTM as mail relay doesn't. If we send emails directly from our Exchange to the Proventia mail gateway on the remote site, that works as well. I think the problem is related to the fact that the external interface is now a PPPoE interface instead of an Ethernet interface.

    If I bind the tunnel to our other WAN uplink, which is still an Ethernet interface, it works.

    Yes, the tunnel is configured with "Bind tunnel to local interface", and routing is configured correctly. I can provide the screenshots tomorrow (but I have to redact some data because we are a public service). The remote side is a SonicWall.

    We also have to SNAT the IPs of the WAN uplinks so that the UTM is allowed to send traffic into the tunnel. SSL VPN (remote users), which also uses SNAT for that purpose, is working too, but there the SNAT translates the VPN IP range to a valid tunnel IP.

    I mostly believe it comes down to the difference between the two WAN uplinks (PPPoE vs. Ethernet), because the UTM uses the nearest interface's address as source IP. So it could be more of a VPN issue than a mail one. Unfortunately my colleague at the other site has no experience with traffic monitoring, so we can't see which traffic leaves the tunnel. On the UTM, I only see the outgoing traffic, not the incoming one.

    Many thanks for your support ...

    BR,
    Florian

  • Good Morning Bob,

    here is our VPN connection (some data is blacked out):

    Firewall-Rules:

    SNAT:

    WAN1: PPPoE
    WAN2: Ethernet

    Routing:

    If I manually change the local interface of the IPsec connection and the interface route to WAN2 everything works (manual failover).
    But on WAN1 the mails get stuck:

    BR,
    Florian

  • I don't understand the purpose of the SNAT, Florian.  I wonder if you haven't made some incorrect assumptions about how WebAdmin works.

    There are two ways to have automatic failover.  The first, Auto-Failover IPsec VPN Connections, is much easier to create, but failover takes about a minute whereas Sophos UTM multiple S2S IPsec VPN mit Failover – Tutorial (DE) is instantaneous.  Get rid of the SNAT, uncheck 'Bind tunnel to local interface', delete the Static Route and try the easier solution.

    Does that work for you?

    Cheers - Bob

     
  • Hi Bob,

    forget about the auto failover. We want the IPsec tunnel on UPLINK1 (WAN1). A failover will be done manually if required.

    We need the SNAT because, if the UTM sends data into the tunnel, it will use the next (best suited) interface for that, in this case the IP of UPLINK1 (WAN1). The SNAT translates the source IP to an IP inside the IP range that is allowed to go through the tunnel. Could it be that I'm wrong here?

    We have checked our configuration with our reseller and 3rd-level support. We're all a bit puzzled, because it works if we create the tunnel over UPLINK2 (WAN2).

    We also have to bind the tunnel to the local interface, because we route a 10.0.0.0/8 block through the tunnel, but have one /23 block routed locally over an L3 core switch. Because of the binding to the local interface, I can set a different, higher metric on the /8 block compared to our /23 inside this range. Otherwise IPsec traffic would take precedence over our local routing (e.g. when other remote sites or SSL VPN road warriors access our network).

    BTW: we have other IPsec site-to-site VPNs without local interface binding and with automatic failover running without problems. But there we don't use any mail forwarding via smarthost :-) :-).

    Last but not least: what do you mean concerning the use of WebAdmin?
    Do you mean the traffic flow inside the UTM?

    BR,
    Florian
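
The overlap between the /8 sent through the tunnel and the locally routed /23 can be illustrated with a small Python sketch. The thread does not name the /23, so `10.20.30.0/23` is a made-up placeholder, and `pick_route` is a hypothetical helper. The sketch shows plain most-specific-prefix selection only; it does not model how the UTM actually orders IPsec policies, interface-bound tunnels, and route metrics.

```python
import ipaddress

# The /8 is from the thread; the /23 is a hypothetical stand-in for the local block.
TUNNEL_NET = ipaddress.ip_network("10.0.0.0/8")
LOCAL_NET = ipaddress.ip_network("10.20.30.0/23")

ROUTES = [
    (TUNNEL_NET, "IPsec tunnel"),
    (LOCAL_NET, "L3 core switch"),
]

def pick_route(dest, routes=ROUTES):
    """Return the name of the most specific route containing dest, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, name) for net, name in routes if addr in net]
    if not matches:
        return None
    # The longest prefix (largest prefixlen) is the most specific match.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(LOCAL_NET.subnet_of(TUNNEL_NET))  # the /23 lies completely inside the /8
print(pick_route("10.20.31.5"))         # an address inside the local /23
print(pick_route("10.11.22.33"))        # the remote mail gateway
```

Because the /23 lies entirely inside the /8, anything that matches traffic purely on "destination in 10.0.0.0/8" would also catch the local block, which is the situation the interface binding and the metric are meant to avoid.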

  • "We need the SNAT because, if the UTM sends data into the tunnel, it will use the next (best suited) interface for that. In this case IP of UPLINK1 (WAN1). The SNAT translates the source IP to an IP inside the IP-Range, which is allowed to go through the tunnel."

    From the IPsec Connection picture, I gathered that the only source IPs allowed through the tunnel were the LAN IPs.  Why SNAT all traffic from your LAN from a specific IP?  Or am I just confused by all of the information that's blacked out?

    "We also have to bind the tunnel to the local interface, because we route a 10.0.0.0/8 block through the tunnel, but have one /23 block routed locally over an L3 core switch. Because of the binding to the local interface, I can set a different, higher metric on the /8 block compared to our /23 inside this range."

    I see, but have you changed the "VPN Pool" objects to use something other than a 10. subnet?  It's not clear to me why WebAdmin knows to route the 10. traffic through the tunnel - is that in one of your pictures but, again, hidden by black marks?

    "Last but not least: what do you mean concerning the use of WebAdmin?"

    I'm a visual-tactile learner and likely found things to be incongruous because so much information was blacked out.

    Cheers - Bob 

     
  • Hi Bob,

    I know that's hard to understand. But if you want, we can do a TeamViewer session. That should be easier, and I don't have to worry about security issues.

    Tomorrow I'm at work between 07:00 and 18:00 CET. But we can also schedule a meeting during one of the next nights.

    I will check a few things tomorrow and will post the results.

    BR,

    Florian

  • After sleeping on this, it became clear.  The Static Route aims the traffic for 10.0.0.0/8 out WAN1 and the SNAT puts it into the tunnel - niiice!

    Now, why it works on one WAN connection, but not the other is indeed a mystery.  You can watch the traffic in the tunnel with something like:

    espdump -n --conn REF_IpsSitRemoteSite -vv

    If you don't know the REF_ for the IPsec Connection, you can find it by substituting the blacked out name below:

    cc get_object_by_name ipsec_connection site_to_site 'Name of the IPsec Connection'|grep \'ref

    Cheers - Bob

     
  • You got it, Bob! I'll check that, many thanks!

  • Hi Bob,

    sorry for the delay. I had a few days of holiday. espdump works great, as you've shown.
    As there's other traffic on this tunnel, I tried to capture it and analyze it with Wireshark - https://community.sophos.com/kb/en-us/116179. But I can't decrypt it :-(.

    So I will make another attempt in the next few days and watch it live.

    Many thanks for your hint!

    BR,
    Florian

  • Good morning everyone,

    last weekend I did some packet captures with espdump (many thanks for that hint, Bob). They showed that the UTM starts the mail connection over the IPsec tunnel, which is established on the ppp0 (PPPoE) connection of WAN1. But after initiating the connection, the UTM's mail host will not send bigger packets. If I ping with a large packet size (1500) directly from the UTM through the tunnel, or even send mails from other network devices, everything works.

    It seems that there is an MTU problem affecting only the mail part of the UTM.

    Our reseller and support partner also analyzed our packet captures and has now opened a ticket with Sophos.

    @Bob:
    Now it's clear: it's definitely a mail issue, not a VPN one.

    Many thanks for all your support concerning this issue. I'll keep you updated.

    BR,
    Florian
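
To put rough numbers on the MTU suspicion, here is a minimal Python sketch of ESP tunnel-mode overhead. The thread only states AES-256; the CBC mode (16-byte IV and block size), the HMAC-SHA1-96 integrity check (12-byte ICV), the absence of NAT-T, and a ppp0 MTU of 1492 (the usual PPPoE value) are all assumptions. `max_inner_packet` is a hypothetical helper, not a UTM command.

```python
def max_inner_packet(outer_mtu, iv_len=16, block=16, icv_len=12, nat_t=False):
    """Largest inner IP packet that fits into one ESP tunnel-mode packet.

    Assumes IPv4, a block cipher in CBC mode, and a truncated HMAC as ICV.
    """
    outer_ip = 20                  # outer IPv4 header
    udp = 8 if nat_t else 0        # UDP encapsulation, only with NAT-T
    esp_header = 8                 # SPI + sequence number
    esp_trailer = 2                # pad length + next header
    # Bytes left for the encrypted part (inner packet + padding + trailer).
    room = outer_mtu - outer_ip - udp - esp_header - iv_len - icv_len
    # The encrypted part must be a whole number of cipher blocks.
    encrypted = (room // block) * block
    return encrypted - esp_trailer

for name, mtu in (("Ethernet uplink", 1500), ("PPPoE uplink", 1492)):
    inner = max_inner_packet(mtu)
    # 40 bytes = inner IPv4 header + TCP header without options.
    print(f"{name}: MTU {mtu}, max inner packet {inner}, max TCP MSS {inner - 40}")
```

Under these assumptions the tunnel over the PPPoE uplink has 16 bytes less room per packet than the one over the Ethernet uplink. A difference like that only becomes visible when large packets with DF set are neither clamped nor fragmented correctly, which would be consistent with small SMTP commands getting through while the message data does not.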