Several people in this EAP have run into an issue with Let's Encrypt in SFOS, so I wanted to write down some thoughts on how to debug LE in SFOS yourself before we conclude that there is a bug.
Sophos Firewall: v21.0 EAP1: Feedback and experiences (EAP Thread)
First of all, a little recap of Let's Encrypt:
Release Notes:
Let’s Encrypt Certificate Support – A long-requested feature, Let's Encrypt certificate support enables the automatic deployment and renewal of certificates based on certificate signing requests (CSRs). Let’s Encrypt certificates are supported for WAF, SMTP, TLS configuration, hotspot sign-in, the Web Admin console, user portal, captive portal, VPN portal, and SPX portal.
The LE implementation in SFOS v21.0 is very similar to the one in UTM 9.
You can request a new LE certificate for a domain, for example test.domain.com. You need to own this domain and add a DNS record for this FQDN: test.domain.com needs to point to the firewall (WAN). The firewall requests the certificate for test.domain.com, and LE resolves the FQDN via DNS and contacts the address it points to. If this works, you get a valid certificate that you can use everywhere on the firewall. The firewall renews the certificate automatically when needed, with no user interaction required.
For this feature, you do not need a subscription - Base Firewall is fine.
You cannot generate a wildcard certificate (*.domain.com) - that would require the DNS-based challenge (DNS API). SFOS (like UTM) only supports the HTTP-based challenge, which is limited to one domain per request. You can generate certificates for multiple FQDNs per firewall, if needed.
You cannot download the certificate and reuse it somewhere else. For that use case, you should look at your own method, such as certbot or lego.
Let's dive into how Let's Encrypt works.
SFOS uses the HTTP challenge (HTTP-01). You can find more information about the inner workings of LE here: https://letsencrypt.org/docs/challenge-types/
Quick recap: LE will try to reach the firewall via HTTP and expects a certain token hosted on the firewall.
- To host the token, SFOS uses the WAF / reverse proxy.
- When you create an LE certificate, or when an LE certificate needs to be renewed (LE certificates are valid for 90 days), SFOS creates a WAF rule at the top of the rule set.
- To create a certificate, you need a domain (FQDN). This FQDN is added to the CSR.
- SFOS then reaches out to LE to start the process and tries to register the FQDN.
- LE then challenges the WAF from the internet, from an unknown IP, via HTTP (port 80). It uses the FQDN and expects the WAF to be reachable under this FQDN.
- If LE can reach the WAF on port 80 and sees the correct token, it accepts the firewall as the owner of the domain. SFOS removes the WAF rule after approval.
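To make the flow above more concrete, here is a minimal Python sketch of what the HTTP challenge looks like from the outside. It follows the generic ACME HTTP-01 layout (token served under /.well-known/acme-challenge/ on port 80), not SFOS internals, which I cannot see. The function names and the example token and thumbprint values are made up for illustration.

```python
from urllib.parse import urlunsplit

# Path prefix used by the ACME HTTP-01 challenge (generic ACME, not SFOS specific).
ACME_CHALLENGE_PATH = "/.well-known/acme-challenge/"


def challenge_url(fqdn: str, token: str) -> str:
    """Build the plain-HTTP URL that LE requests for a given FQDN and token."""
    return urlunsplit(("http", fqdn, ACME_CHALLENGE_PATH + token, "", ""))


def key_authorization(token: str, account_thumbprint: str) -> str:
    """The expected response body: the token, a dot, and the account key thumbprint."""
    return f"{token}.{account_thumbprint}"


def challenge_passes(body: str, token: str, account_thumbprint: str) -> bool:
    """True if the served body matches the expected key authorization."""
    return body.strip() == key_authorization(token, account_thumbprint)


if __name__ == "__main__":
    # Hypothetical values, only to show the shape of the request.
    print(challenge_url("test.domain.com", "abc123"))
```

The point for debugging: anything that answers on port 80 for this path instead of the WAF (a NAT rule, a router, another web server) will return the wrong body or nothing, and the challenge fails.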
At several steps in this process, the firewall or other components in the network can be misconfigured and break the challenge.
Here are some steps to debug LE challenges and resolve them yourself:
- Use an external DNS check to verify that your domain (FQDN) resolves to your firewall or to the router in front of your firewall.
- From a client on the internet, try to reach the firewall on port 80 and check whether you see those packets in the packet capture of the firewall (Webadmin). If you can't see them, the router or something else in front of the firewall is likely already blocking them or not forwarding them.
- Review your NAT policy - every NAT rule that matches HTTP from WAN to LAN will catch the LE requests and break the challenge.
- Check the logs of the firewall (download them via Webadmin) - the interesting ones are reverseproxy.log (WAF) and letsencrypt.log (LE).
- Check for country blocking as well - as LE uses unknown IPs, they could be blocked by a device in front of SFOS.
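The first two checks can be roughly scripted. Below is a minimal sketch using only the Python standard library. It has to run from a machine outside your network, otherwise you test your internal DNS and routing instead of what LE sees. The helper names, the FQDN and the WAN IP in the example are placeholders I made up.

```python
import socket


def resolve_ipv4(fqdn: str) -> list[str]:
    """Return the IPv4 addresses the local resolver gives for fqdn (empty on failure)."""
    try:
        infos = socket.getaddrinfo(fqdn, 80, family=socket.AF_INET, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def port_open(host: str, port: int = 80, timeout: float = 5.0) -> bool:
    """True if a TCP connection to host:port can be established within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check(fqdn: str, expected_wan_ip: str) -> dict:
    """Run the DNS check and the port 80 check and collect the results."""
    ips = resolve_ipv4(fqdn)
    return {
        "resolved": ips,
        "dns_matches_wan": expected_wan_ip in ips,
        "port_80_open": port_open(fqdn) if ips else False,
    }


if __name__ == "__main__":
    # Placeholder values: replace with your FQDN and the public IP of your firewall.
    print(check("test.domain.com", "203.0.113.10"))
```

An open port 80 only tells you that something answers there. It does not prove that the request reaches the WAF, so the packet capture and the NAT review are still needed.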