QoS dropping connections instead of shaping

I need to pick the brains of the experts here, as my own knowledge is exhausted at the moment.

Situation:

  • UTM in a datacenter, connected to a 1Gbps uplink, with 40Mbps of 95th-percentile bandwidth on it. Excess bandwidth usage is charged at a pretty steep rate.
  • Webserver in a DMZ VLAN, that provides software updates (ipk packages) to devices running embedded Linux.
  • Device owners worldwide seem to have decided to "all" run their updates every day in the same 60 minute window.
  • This causes spikes in outgoing bandwidth of over 200Mbps, leading to a hefty bandwidth charge at the end of the month.
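
For context on why the spike is so costly: with 95th-percentile billing, the provider sorts the (typically 5-minute) bandwidth samples for the month, discards the top 5%, and bills the highest remaining sample. A quick illustration with hypothetical Mbps samples (the numbers are made up for the example):

```shell
# 12 hypothetical 5-minute samples in Mbps: mostly idle, two spike samples
samples="5 6 5 7 6 5 8 210 195 6 5 7"

# Sort the samples, then take the value at the 95th-percentile position;
# the single highest sample (210) falls in the discarded top 5%.
echo "$samples" | tr ' ' '\n' | sort -n \
  | awk '{a[NR]=$1} END {print a[int(NR*0.95)]}'
# → 195
```

So even a brief daily spike lands well inside the billed 95% of samples, and the whole month is charged at near-peak rate instead of the quiet-hours average.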

I've been asked to do something about this, i.e. "flatten" the spike, so that (1) the charge can be avoided but (2) not impede the downloads too much.

I currently have a bandwidth throttle entry defined on the inside VLAN interface: "server-ip:80 -> internet:1-65535", shared 50Mbps. This works for (1), but not for (2): new connections can't get through once the 50Mbps is reached. This not only causes errors on the embedded devices, but their external service monitoring can't get through either, goes berserk and SMSes their server admin out of bed.

A closer look shows that this is implemented by an iptables bandwidth policy, which simply drops packets without any shaping/equalizing. So that doesn't work.

I have tried to replace it with a bandwidth pool definition on the outside interface. This works when I define "any:80 -> internet:1-65535", but not when I define "server-ip:80 -> internet:1-65535", which leads me to believe this runs after NAT, so you can no longer match on an internal IP. Is this correct? As there are more webservices in that and other DMZs, a pool on all outgoing HTTP traffic won't work.

Are there any other options that I can try to achieve this?

p.s. FYI, this is not a paid job for me, client is an "open source development community" that survives on donations, I provide network/security support to them for free. ;-)

  • Digging deeper:

    Throttling does (inbound on eth7, the DMZ VLAN):
    -A IN00c350 -m hashlimit --hashlimit-above 262144b/s --hashlimit-name IN00c350 -g INGRESS_POLICE_DROP
    -A INGRESS_ITF00000000 -s 172.18.22.30/32 -p tcp -m tcp --sport 80 --dport 1:65535 -g IN00c350
    -A INGRESS_POLICE_DROP -j DROP

    Bandwidth pool does (outbound on eth0, Internet):
    -A POSTROUTING -o eth0 -j QOS_ITF00000000
    -A QOS_ITF00000000 -s 172.18.22.30/32 -p tcp -m tcp --sport 80 --dport 1:65535 -g QOS_0x1:0x2
    -A QOS_ITF00000000 -s 172.18.22.30/32 -p tcp -m tcp --sport 1:65535 --dport 80 -g QOS_0x1:0x2
    -A QOS_0x1:0x2 -j CLASSIFY --set-class 0001:0002

    In the normal netfilter flow, this classification would happen before the POSTROUTING SNAT, which suggests it should work. But it doesn't: the POSTROUTING rule doesn't get a single hit. So perhaps the flow for return (established) traffic is different?
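
One way to check where the rule actually lands (a diagnostic sketch, assuming shell access to the UTM) is to list the POSTROUTING chain of each table with packet counters and see which one is being hit:

```shell
# mangle POSTROUTING is traversed before nat POSTROUTING (where SNAT
# happens), so a rule here still sees the internal source address.
iptables -t mangle -L POSTROUTING -v -n --line-numbers

# nat POSTROUTING runs after; a selector here only ever sees the
# already-rewritten public IP.
iptables -t nat -L POSTROUTING -v -n --line-numbers
```

If the counters on the server-IP rule stay at zero in whichever table it was generated into, that table is the wrong place for the selector.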

  • In reply to Harro Verton:

    If I change the selector from "server-internal-ip:80 -> internet:1-65535" to "server-public-ip:80 -> internet:1-65535", then the bandwidth pool rule starts working.

    Which in this case is utterly useless, since it will shape all http return traffic that leaves the UTM, not only the http traffic from that one internal server...

    This means the QoS marking is done in the wrong place, it should happen in the MANGLE table, before the POSTROUTING.
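
Done by hand it would look roughly like this (a hypothetical sketch, not the UTM's own implementation; it assumes the existing tc class 1:2 on eth0 from the rule dump above, and would likely be overwritten by the UTM's rule generation):

```shell
# Mark the server's HTTP replies in the mangle table, where the
# internal source address (172.18.22.30) is still visible,
# i.e. before nat POSTROUTING rewrites it via SNAT.
iptables -t mangle -A FORWARD -s 172.18.22.30 -p tcp --sport 80 \
         -j MARK --set-mark 1

# On egress, classify on the firewall mark instead of the
# (already SNATed) source IP, steering into the existing 1:2 class.
tc filter add dev eth0 parent 1: protocol ip handle 1 fw flowid 1:2
```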

  • I wonder if you don't want a Download Throttling rule on the DMZ interface that limits bandwidth for each source/destination to a bandwidth that still leaves some available uplink bandwidth on the External interface.  So, for example, if there are usually 40 clients downloading updates, limit each connection to 1Mbps.  That will slow them down noticeably, but they shouldn't get errors.

    The best solution to this would be to get the clients to download their updates at different times.  Not knowing much about the details, I can't offer much insight.

    Cheers - Bob

  • In reply to BAlfson:

    There are a few issues with that:

    1. In terms of scale, the average during the day is about 20-30 connections/sec. The peak this morning, at 07:10, was over 12,000 connections.
    2. The project doesn't have control over the clients. They've built in a randomizer using the device's MAC address to spread the downloads, but most owners want to download as soon as an update is published, and alter the configuration accordingly. There are even idiots who use a cron job to constantly check! (I've used fail2ban to move those to the back of the queue with a 12-hour block ;-))
    3. Download throttling doesn't equalize, it just blindly drops. As I wrote above, I tried that. So if you have 25Mbps and 25 connections, it's unfortunately not 1Mbps each. A big fast bulk downloader still grabs all the bandwidth, and new connections don't cause a redistribution of the bandwidth; their SYN doesn't even get through...
    4. I can't do anything on the external interface: due to SNAT on the return traffic, I can't select traffic from that one server anymore, as at that point the SNAT to the external interface's public IP has already happened.

    So, the only option I see is a bandwidth pool on the external interface, as that uses queuing and shaping (tc) instead of iptables drops, but that requires traffic marking before SNAT happens. Which probably means a code change: "if selector source is internal, mark in the MANGLE table, else mark in the POSTROUTING table". Not something I have under control.
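
For illustration, a self-contained mark-then-shape setup would look something like this (a hypothetical sketch outside the UTM's configuration, which would conflict with its generated tc setup; interface names, rates and class IDs are assumptions):

```shell
# Mark the server's HTTP replies before SNAT, while the internal
# source address is still visible.
iptables -t mangle -A FORWARD -s 172.18.22.30 -p tcp --sport 80 \
         -j MARK --set-mark 1

# HTB hierarchy on the external interface: unshaped default class,
# plus a 50Mbit class for the update downloads.
tc qdisc add dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:10 htb rate 1000mbit
tc class add dev eth0 parent 1: classid 1:20 htb rate 50mbit ceil 50mbit

# SFQ inside the 50Mbit class gives per-flow fairness, so one fast
# bulk downloader can't starve the SYNs of new connections.
tc qdisc add dev eth0 parent 1:20 handle 20: sfq perturb 10

# Steer the marked packets into the shaped class.
tc filter add dev eth0 parent 1: protocol ip handle 1 fw flowid 1:20
```

That is queuing plus fair scheduling rather than policing, which is exactly the drop-free behavior the throttle rule can't deliver.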

    My conclusion at the moment is that it can't be done with the current UTM version.

    I have now installed mod_bw on the webserver that serves the downloads, to see if that does better. If that doesn't work either, perhaps the project has to look for some cheap CDN or a few VPSes to offload their downloads.
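
For reference, the mod_bw setup I'm trying is roughly the following (a sketch based on mod_bw's directives as I understand them; the rate is an example, in bytes per second):

```apacheconf
<IfModule mod_bw.c>
    BandWidthModule On
    ForceBandWidthModule On
    # Cap all clients together at ~50Mbps (6,250,000 bytes/s)...
    BandWidth all 6250000
    # ...and drop the per-connection minimum so concurrent downloads
    # share the cap instead of new ones being refused.
    MinBandWidth all -1
</IfModule>
```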

  • In reply to Harro Verton:

    "So if you have 25Mbps and 25 connections, it's not 1Mbps each"

    I assume that you're not using WAF.

    Cheers - Bob

  • In reply to BAlfson:

    Bob,

    That is not what is required. That would limit a connection to 1Mbps even if it is the only connection, which is silly if there is 50Mbps available. It would also use 100Mbps with 100 concurrent connections (unless you combine it with a shared limit, but then I'm not sure what is going to drop where).

    And yes, WAF is in use. Does that have an alternative solution I've missed?

  • In reply to Harro Verton:

    You're right, Harro, that there's no per-connection option for Bandwidth Pools.  I know there's been a feature request for that at Ideas for several years.

    Since WAF proxies the traffic to the web servers, the response packets from the servers have a destination of the IP of the "(Address)" object.  That makes it impossible to use a Download Throttling rule on the internal interface.

    In this situation, the only thing that might work is disabling all QoS rules and just selecting 'Upload optimizer' on the External interface.  Any luck with that?

    Cheers - Bob

  • In reply to BAlfson:

    Ah, stupid me, with the WAF the source is the firewall, not the server.

    But anyway, the problem remains the same, I can't select the flow from that particular server to apply any QoS rules. Disabling everything doesn't help, as the uplink is a full 1Gbps, so any flow will be able to use that, optimized or not. The goal was to limit the flow to 50Mbps, and equally distribute that bandwidth over the connections made.

    I've given up on the UTM at the moment, I don't want to manually start adding iptables rules, and pinned my hopes on mod_bw now...