RPM install sometimes fails

We are running SEC 5.5.2, and the updates are configured to be downloaded from an HTTP URL.
For Linux agents we created an RPM deployment package as described in:

Testing the RPM on some servers, the installation sometimes fails. The error found is:
DownloadException: DownloadException for <url> with subexception: [Errno 104] Connection reset by peer

Doing some checks, we discovered the following:
Quite a lot of files need to be downloaded from the URL to install the Linux agent.
With tcpdump we saw that a new connection is set up and closed for every single file download, so many connections are opened and closed within the same second.

Is there an option to configure keep-alive, so that more than one file can be downloaded over a single connection? If this is not a configuration option, can it be built into the application, or is there another way to achieve this?
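
For comparison, here is a minimal sketch of what keep-alive would buy: many files fetched over one persistent HTTP/1.1 connection instead of one connection per file. This is not the SAV updater's actual code; the throwaway local server below exists only so the sketch is self-contained.

```python
# Sketch only (not Sophos code): download many "files" over ONE HTTP/1.1
# keep-alive connection, using a throwaway local test server.
import http.client
import http.server
import threading

class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"      # HTTP/1.1 => keep-alive by default
    def do_GET(self):
        body = b"file contents"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):
        pass                           # silence per-request logging

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# One persistent connection for all fetches: the TCP handshake and the
# eventual TIME_WAIT happen once, not once per file.
conn = http.client.HTTPConnection("127.0.0.1", port)
for i in range(10):
    conn.request("GET", f"/file{i}")
    resp = conn.getresponse()
    data = resp.read()                 # drain the body before reusing
conn.close()
server.shutdown()
```

With per-file connections the same loop would create and tear down ten sockets; with keep-alive it creates one.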

  • Hi Shweta,


    Yeah, I'm sure I can check this issue from a network point of view. I expect to find one of these causes:

    • there are too many CLOSE_WAIT connections on the Sophos console server, so no new connections are possible until that number decreases
    • an infrastructure component blocks the connections, because so many connections being opened and closed in a short time looks suspicious


    Both possible causes would be addressed by a keep-alive setting, so that fewer connections are opened and closed in a short time.

  • Hi Shweta,


    Today I ran a check on the Sophos console server with the following command:

    netstat -ano | find /c "WAIT"

    The number is normally somewhere between 60 and 100. Then I started the RPM install on a Linux server; after the install the number was 998. So one install on a Linux server opens and closes around 900 connections. Changing the command slightly to netstat -ano | findstr "WAIT" shows many rows like:

      TCP    <ip address of Sophos console server>:80       <ip address of linux server>:<XXXX>       TIME_WAIT       0

    where XXXX is the port number the Linux server connects from (in this case in the range 16385 to 36860)


    That is why I think a keep-alive setting is important to reduce this number.
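
As a cross-platform aside, the same count can be derived from any netstat-style listing. `count_wait` below is a hypothetical helper mirroring `netstat -ano | find /c "WAIT"`, shown against made-up sample output:

```python
# Hypothetical helper: count sockets in a *_WAIT state (TIME_WAIT,
# CLOSE_WAIT) in netstat-style output, as `find /c "WAIT"` does on Windows.
sample = """\
TCP  10.0.0.1:80  10.0.0.2:16385  TIME_WAIT    0
TCP  10.0.0.1:80  10.0.0.2:16386  TIME_WAIT    0
TCP  10.0.0.1:80  10.0.0.2:16387  ESTABLISHED  0
"""

def count_wait(netstat_text):
    return sum(1 for line in netstat_text.splitlines() if "WAIT" in line)

print(count_wait(sample))  # prints 2
```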

  • Hello Jan Jansen,

    I don't have a Linux box at hand to test with, but I'm pretty sure a capture can tell how it works: whether HTTP updates are indeed performed as a sequence of individual open-fetch connections without keep-alive (which is implicit in HTTP/1.1), and that the server initiates the close (otherwise the TIME_WAIT would be on the client side).


  • Hi Jan,


    I'm afraid SAV Linux's HTTP code doesn't attempt anything clever like keep-alive, so you'll have to fix your network so that it doesn't reset connections. Maybe stagger your deployments?
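
The "stagger your deployments" suggestion could be sketched like this. The wrapper below is hypothetical (not part of any Sophos tooling): each host waits a random delay before starting its install, so the console server never sees hundreds of near-simultaneous short-lived connections.

```python
# Hypothetical sketch: spread RPM install starts across a time window so
# the update server sees fewer simultaneous connection bursts.
import random
import time

def staggered_start(install, max_delay_seconds=300.0,
                    sleep=time.sleep, rng=random.random):
    """Wait a random delay in [0, max_delay_seconds) before installing."""
    delay = rng() * max_delay_seconds
    sleep(delay)                 # each host starts at a different moment
    install()
    return delay

# Example with the timing faked out, so it returns immediately:
delay = staggered_start(install=lambda: None,
                        sleep=lambda d: None, rng=lambda: 0.5)
print(delay)  # 150.0
```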