SEC connections problem

There are 1 SEC server, 1 DB server and 3 SUM & RMS servers providing the Sophos endpoint service for about 30000 clients (all in one network range) in my company.

Perhaps because of performance bottlenecks on the servers, only 10000 clients are displayed as connected on the SEC server.

Is there a good way to increase the number of client connections to SEC?

I have an idea:

I have 3 SUM servers. I would use 1 of them as a SUM (with the RMS service configured, but with the other RMS servers' IP addresses set in its mrinit.conf file), and use the other 2 as RMS servers only (no SUM service). Do you think this would increase the number of client connections on the SEC server?

regards,

Benkit

Performance bottlenecks



This thread was automatically locked due to age.
  • Hello Benkit,

    according to Enterprise Console: configuring message relay computers, a correctly configured message relay with sufficient resources should be capable of supporting 8000 endpoints. With an even distribution you'll get 7500 endpoints per server; 8000 is not a hard limit, so you might or might not encounter issues. What's more, if an endpoint does not take down its connection correctly, it leaves a dead connection on the server, and eventually resources will be exhausted.

    I'd try to get the number of direct endpoint connections to a server (management or relay) to 5000 or below. Thus I'd recommend at least six relays (it'd anyway be a good idea to reduce the load on the management server). Just my personal opinion though, I'm not Sophos.

    Christian
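The sizing arithmetic in this reply can be sketched as follows. This is only a rough illustration: it assumes the 7500 figure comes from splitting the 30000 endpoints evenly across the three relays plus the management server, and the function names are made up for the example.

```python
import math


def endpoints_per_server(total_endpoints: int, servers: int) -> float:
    """Average number of direct connections per server with an even split."""
    return total_endpoints / servers


def servers_needed(total_endpoints: int, target_per_server: int) -> int:
    """Smallest server count that keeps an even split at or below the target."""
    return math.ceil(total_endpoints / target_per_server)


total = 30_000
# Assumption: 3 relays + 1 management server all accept endpoint connections.
print(endpoints_per_server(total, 4))
# Target of 5000 direct connections per server, as suggested above.
print(servers_needed(total, 5_000))
```

With these inputs the first call gives 7500 endpoints per server and the second gives six servers, matching the numbers in the reply.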

  • Thank you QC, I think you are very familiar with Sophos products.

    I find it hard to do my daily Sophos maintenance work :( Do you have any suggestions for me?

    1. Every day I need to move clients from the Unassigned group to the groups I created. I don't know why Sophos has no function to assign clients to different groups automatically by IP range or by name match.

    2. We have 100000+ clients in our company. We set up 100 SUM+RMS servers and 2 SEC servers to serve them, but only 50000 clients can be connected at the same time. Lots of clients cannot be updated in time, and many are still on anti-virus version 10.3.15 (the newest version is 10.6.3).

  • Hello BenkitShi,

    don't ask me why there is no automation interface for SEC (I can surmise a number of reasons), it might or might not come.

    Have you ever considered Sophos Professional Services? I know, management is usually not fond of extra support fees in addition to the license costs. I'd at least request some sizing information from the seller (Sophos or the partner/reseller) for free considering the volume of your license.

    Christian

  • Hello,

    I know that pro-services have an IP range to SEC group mapper you could enquire about.

    Also it's possible for the client to "inform" the management server what SEC group it should be in and therefore what policies the client should have.  This can be done at install time of the client with the -G switch to setup.exe (https://community.sophos.com/kb/en-us/12570). You can also do it post install with the same registry key setup.exe sets before it is read by the Sophos Agent service at initial install/start.  Given this information it would be possible to run a script at the client with some logic on what SEC groups are available and under what conditions the client should be in them, e.g. IP range, name contains, domain, user, etc...
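As a minimal sketch of the kind of client-side logic described above, the following picks a group from the machine's IP address and name. The rules, networks, and group names are all hypothetical placeholders; the exact group string expected by the -G switch is described in the linked article and is not reproduced here.

```python
import ipaddress

# Hypothetical rules, checked in order:
# (IP network or None, hostname fragment or None, group name)
RULES = [
    (ipaddress.ip_network("10.0.0.0/16"), None, "Group-Office-A"),
    (None, "SRV", "Group-Servers"),
]
DEFAULT_GROUP = "Group-Unsorted"


def choose_group(ip: str, hostname: str) -> str:
    """Return the first group whose conditions all match this client."""
    addr = ipaddress.ip_address(ip)
    for network, fragment, group in RULES:
        if network is not None and addr not in network:
            continue  # IP range condition not met
        if fragment is not None and fragment.lower() not in hostname.lower():
            continue  # name condition not met
        return group
    return DEFAULT_GROUP


print(choose_group("10.0.12.34", "PC-0001"))
print(choose_group("192.168.1.10", "tw-srv-01"))
```

The returned value would then be handed to whatever mechanism sets the group on the client (the install switch or the registry value mentioned above); that step is deliberately left out.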

    As for the connection state, I assume that the 30K clients are split evenly between the 3 relays and the clients are fixed to the relays.  I.e. the client 'parentaddress' registry value for the client router is not configured with a comma separated list of relays for redundancy?  The client to message relay is a 1-1 mapping?

    You don't mention the OS of the relays and the management server, what are they? The number of ephemeral ports that can be assigned by the OS is important.  Are there any warnings in the Windows event log from TCP about the number of ports being exhausted at any time?

    Could you export and paste here the \router\ registry key for the 3 relays and the management server?  Just to check they are configured optimally.

    My post in this thread may also provide some help at understanding a little more about RMS: https://community.sophos.com/products/endpoint-security-control/f/16/p/9148/18158#18158 

    Regards,

    Jak

  • Hello Jak,

    I don't know about Pro-services; do they have an email address I can contact?

    Regarding your suggestions:

    The install package with the -G parameter may help reduce the group assignment workload; I will try it next time.

    My company's Sophos architecture:

    We use 3 SEC servers to provide the Sophos service:

    1) The 1st SEC is in the DMZ: it only downloads sophosupdate from Sophos and distributes it to the other 2 SEC servers (Windows Server 2008 R2 & SEC 5.2.2);

    2) The 2nd SEC is on the intranet. It provides the sophosupdate service to 100 SUM&RMS servers and manages about 70k clients (all servers run Server 2008 R2, SEC version 5.2.2), but only 30k clients can be connected at the same time;

    3) The 3rd SEC is in the intranet VPN network zone. It provides the sophosupdate service to 3 SUM&RMS servers and manages 30k clients (all servers run Server 2008 R2, SEC version 5.3.1), but only 10k clients can be connected at the same time.

    The \router\ registry keys for the servers in 3) are as follows:

    SEC:

    [HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Sophos\Messaging System\Router]
    "NumORBThreads"=dword:00000010
    "ServiceDescription"="Message router for Sophos applications"
    "ServiceHomeDir"="D:\\Program Files\\Sophos\\Enterprise Console\\Remote Management System\\"
    "ParentAddress"=""
    "ParentPort"=dword:00002000
    "ConnectionCache"=dword:00005020
    "NumSenderThreads"=dword:00000008
    "ConnectRetriesPause"=dword:00000064
    "TotalConnectRetryTimeSecs"=dword:0000000a
    "GetterInterval"=dword:00000078
    "GetterShortInterval"=dword:00000078
    "NumNotificationThresholdThreads"=dword:00000004
    "IORSenderPort"=dword:00002000
    "ServiceArgs"="-ORBListenEndpoints iiop://:8193/ssl_port=8194"
    "NotifyRouterUpdate"="EM"
    "NotifyClientUpdate"="Router$TW17VW0308.Agent"
    "HostIPToParent"=dword:00000000

    The 3 SUM&RMS servers are all the same:

    [HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Sophos\Messaging System\Router]
    "ServiceDescription"="Message router for Sophos applications"
    "ServiceHomeDir"="C:\\Program Files (x86)\\Sophos\\Remote Management System\\"
    "ParentAddress"="10.116.218.61,fe80::24cb:b3a7:521d:1ef3,TW17VW0308.TW.com,TW17VW0308"
    "ParentPort"=dword:00002000
    "ConnectionCache"=dword:0000000a
    "NumSenderThreads"=dword:00000003
    "NumORBThreads"=dword:00000004
    "IORSenderPort"=dword:00002000
    "ServiceArgs"="-ORBListenEndpoints iiop://:8193/ssl_port=8194"
    "NotifyRouterUpdate"="EM"
    "HostIPToParent"=dword:0a76bc06
    "NotifyClientUpdate"="Router$TW16vw0010:1117342.Agent"

     

    Now I wonder whether this Sophos product can support so many clients (100K).

    If we upgrade our server OS and SEC to Server 2012 R2 and SEC 5.4, will it improve the performance of the Sophos service in my company? Thank you.

     

  • Hi,

    Thank you for the information.  Based on that, the reason for the problem is the registry values of the SUM and RMS servers:

    And the 3 SUM&RMS servers are the same :

    [HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Sophos\Messaging System\Router]
    "ConnectionCache"=dword:0000000a    should be dword:00005020
    "NumSenderThreads"=dword:00000003   should be dword:00000008
    "NumORBThreads"=dword:00000004      should be dword:00000010

    See the SEC server for the values a message relay should have.
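To make the gap between the two exports easier to see, here is a small sketch that decodes the hexadecimal dword values posted earlier in the thread and lists the ones that differ. The dictionaries simply copy the three values from the SEC and relay exports; the helper names are invented for the example.

```python
# Values copied from the registry exports earlier in this thread (hex dwords).
SEC_SERVER = {
    "ConnectionCache": "00005020",
    "NumSenderThreads": "00000008",
    "NumORBThreads": "00000010",
}
RELAY = {
    "ConnectionCache": "0000000a",
    "NumSenderThreads": "00000003",
    "NumORBThreads": "00000004",
}


def dword_to_int(hex_text: str) -> int:
    """Convert the hex text after 'dword:' to an integer."""
    return int(hex_text, 16)


def differences(current: dict, wanted: dict) -> dict:
    """Map each differing value name to (current, wanted) as decimal integers."""
    result = {}
    for name, wanted_hex in wanted.items():
        now = dword_to_int(current[name])
        target = dword_to_int(wanted_hex)
        if now != target:
            result[name] = (now, target)
    return result


for name, (now, target) in differences(RELAY, SEC_SERVER).items():
    print(f"{name}: {now} -> {target}")
```

Note that the export shows hexadecimal, so dword:00005020 is 20512 in decimal and dword:00000010 is 16.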

    Essentially the relays have the router configuration of a client.  The RMS software is the same, the configuration makes it a server class or client class router.

    ConnectionCache controls how many connections the router sustains before it recycles old connections to make room for new ones. On a client, the only things connecting to the router are the local agent and the upstream router, hence 10. On a relay/SEC server, the number of connections is potentially in the thousands, depending on the number of clients.
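The effect of an undersized cache can be illustrated with a toy model. This is not how the RMS router is implemented; the least-recently-used eviction policy, the class, and the simulation are all assumptions made purely to show why a limit of 10 causes constant connection churn once many clients talk to the same router.

```python
from collections import OrderedDict


class ConnectionCache:
    """Toy bounded cache: when full, the oldest connection is recycled."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._open = OrderedDict()  # client id -> placeholder connection
        self.recycled = 0

    def touch(self, client_id: int) -> None:
        if client_id in self._open:
            self._open.move_to_end(client_id)  # connection reused
            return
        if len(self._open) >= self.capacity:
            self._open.popitem(last=False)  # close the oldest connection
            self.recycled += 1
        self._open[client_id] = True


def simulate(capacity: int, clients: int, rounds: int) -> int:
    """Each client sends one message per round; return connections recycled."""
    cache = ConnectionCache(capacity)
    for _ in range(rounds):
        for client_id in range(clients):
            cache.touch(client_id)
    return cache.recycled


print(simulate(capacity=10, clients=100, rounds=3))
print(simulate(capacity=200, clients=100, rounds=3))
```

In this model, a cache smaller than the client population recycles a connection on almost every message, while a cache larger than the population recycles none.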

    The number of ORB threads is also very important; these are the workhorses of the router, if you like, that deal with message handling. The same goes for NumSenderThreads.

    With client-class values the relay will "work", but message delivery will be slow and connections will be closed constantly, and the results will be something similar to the symptoms you have.

    How were these relays installed?  It's the job of the installer helper tool clientmrinit.exe to set these values appropriately based on the values in mrinit.conf.

    When setting up relays, you typically create a new distribution point for the relay (or install a SUM on it to generate a local distribution point) and the clients that are going to message through it.  You then copy the mrinit.conf to the rms sub directory of the distribution point and update the ParentRouterAddress value to be the address of the relay computer.  You then run configcid on the distribution point to add the custom mrinit.conf to the cidsync.upd and create a custom manifest file.

    When the clients (AutoUpdate) pull down this new mrinit.conf file from the rms sub directory of the distribution point, RMS is reinstalled. Clientmrinit.exe runs, sees the change to mrinit.conf, and backs up the original mrinit.conf (pulled down from the root of the distribution point by setup) as mrinit.conf.orig.

    On the relay machine, clientmrinit.exe sees that the new ParentRouterAddress value contains its own address, so the computer is configured as a relay with the 'upgraded' registry values. It uses the value in MRParentAddress as its ParentAddress to find the upstream router, which is typically the management server.

    For the clients updating from the same distribution point, the ParentRouterAddress does not match their own details, so they configure themselves just to point at the relay: their ParentAddress is set to the ParentRouterAddress. Their registry keys match those of a client router.
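The relay-or-client decision described above can be modelled roughly like this. It is a simplified reading of the explanation in this post, not the actual behaviour of clientmrinit.exe; the function name, the comma-separated parsing, and the example addresses are assumptions.

```python
def configure_router(own_addresses, parent_router_address, mr_parent_address):
    """Decide the router role and ParentAddress from the mrinit.conf values.

    own_addresses: names/IPs this computer is known by (hypothetical input).
    parent_router_address: ParentRouterAddress from the custom mrinit.conf.
    mr_parent_address: MRParentAddress from mrinit.conf (upstream router).
    """
    # Assumption: the address value may hold several comma-separated entries.
    listed = {a.strip().lower() for a in parent_router_address.split(",") if a.strip()}
    own = {a.strip().lower() for a in own_addresses}

    if listed & own:
        # The relay recognises itself: server-class settings, parent is upstream.
        return {"role": "relay", "ParentAddress": mr_parent_address}
    # Everyone else points at the relay and keeps client-class settings.
    return {"role": "client", "ParentAddress": parent_router_address}


relay_conf = "192.0.2.10,relay01.example.com,RELAY01"
print(configure_router(["192.0.2.10", "RELAY01"], relay_conf, "sec01.example.com"))
print(configure_router(["192.0.2.77", "PC-0001"], relay_conf, "sec01.example.com"))
```

If a relay ends up with client-class registry values, one thing to check under this model is whether the ParentRouterAddress in its mrinit.conf really lists one of the relay's own addresses.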

    See: https://community.sophos.com/kb/en-us/14635 for guidance.

    As a test you can just update the registry keys on the SUM/Relay routers with the correct values for a server class router and restart the Router service. This will work until RMS is updated, at which point they will be reverted. RMS doesn't get updated very often (I'd say every 6 months on average??), so for the short term I would suggest just manually changing the 3 values highlighted above and seeing how things behave for the next couple of weeks. If all is well, configure the relays as above to ensure the settings are permanent and can survive an RMS update.
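For the short-term test, one way to avoid typing the values by hand on each relay is to generate a .reg fragment and import it. The sketch below only builds the text; the key path and the three values are taken from the SEC export earlier in the thread, while the function name is made up. Importing the file and restarting the Router service are left as manual steps.

```python
# Key path and server-class values as shown in the SEC export in this thread.
ROUTER_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Sophos\Messaging System\Router"
SERVER_CLASS_VALUES = {
    "ConnectionCache": 0x5020,
    "NumSenderThreads": 0x8,
    "NumORBThreads": 0x10,
}


def reg_fragment(key: str, values: dict) -> str:
    """Build .reg file text that sets the given dword values under one key."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name, value in values.items():
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\n".join(lines) + "\n"


print(reg_fragment(ROUTER_KEY, SERVER_CLASS_VALUES))
```

Export the existing Router key first so the original values can be restored, and keep in mind that an RMS update will revert these changes, as noted above.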

    Hope it helps.

    Regards,
    Jak
