Sophos Enterprise Console Server Not listening on 8194 so clients are reporting as offline

Hello All

 

Since attempting to upgrade our customer's pre-production Sophos 5.5.1 SEC, SUM and DB to 5.5.2, I've been having no end of problems. I was finally able to access the SEC by creating the IORSenderPort registry key and unticking "Enable Auditing" under "Manage Auditing", after re-running the installer on the SEC with reset credentials for the service accounts (we didn't seem to have the correct credentials for their pre-prod, only for live). However, I still have a situation where only the SEC server itself shows as online. All clients and the additional SUM server are reporting as offline (red cross next to the client name).

On the SUM, the report in the Remote Management System > Network report folder displays the following. I believe that where it says "not available" for RMS it should show the SEC IP address or FQDN:

 

In the logs under Remote Management System on the SUM I see this constantly reported:

09.09.2020 07:02:56 12A0 I Getting parent router IOR from "SEC IP Address" :8192
09.09.2020 07:02:56 12A0 I This computer is part of the domain "Domain Name"

09.09.2020 07:02:56 12A0 I Getting a new router certificate...
09.09.2020 07:03:38 12A0 E Router::GetCertificate: Caught CORBA system exception, ID 'IDL:omg.org/CORBA/TRANSIENT:1.0'
OMG minor code (2), described as '*unknown description*', completed = NO

09.09.2020 07:03:38 12A0 W Failed to get certificate, retrying in 600 seconds

Looking at the same report and logs on a client/endpoint I see the following reported:

 

09.09.2020 06:01:00 2AD0 I Getting parent router IOR from "SUM IP Address":8192
09.09.2020 06:01:21 2AD0 I Getting parent router IOR from "SUM FQDN":8192
09.09.2020 06:01:42 2AD0 I Getting parent router IOR from"SUM HOSTNAME":8192
09.09.2020 06:02:03 2AD0 E Failed to get parent router IOR
09.09.2020 06:02:03 2AD0 W Failed to get certificate, retrying in 600 seconds
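
The Remote Management System log lines quoted above share one layout: date, time, thread ID, a severity letter (I, W, E) and the message. A minimal Python sketch, assuming that layout holds for the whole file, can parse the entries and flag long pauses between consecutive lines, such as the wait before the CORBA TRANSIENT error in the SUM log. The names `parse_log` and `find_gaps` are made up for this illustration and are not part of any Sophos tooling.

```python
import re
from datetime import datetime

# Layout assumed from the quoted lines: "DD.MM.YYYY HH:MM:SS <thread> <level> <message>"
LINE_RE = re.compile(
    r"^(\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}) ([0-9A-Fa-f]+) ([A-Z]) (.*)$"
)


def parse_log(text):
    """Return a list of dicts with time, thread, level and message."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        match = LINE_RE.match(line)
        if match:
            entries.append({
                "time": datetime.strptime(match.group(1), "%d.%m.%Y %H:%M:%S"),
                "thread": match.group(2),
                "level": match.group(3),
                "message": match.group(4),
            })
        elif line and entries:
            # Lines without a timestamp (e.g. "OMG minor code ...") continue
            # the previous entry.
            entries[-1]["message"] += " " + line
    return entries


def find_gaps(entries, min_seconds=10):
    """Return (seconds, previous message, next message) for long pauses."""
    gaps = []
    for prev, cur in zip(entries, entries[1:]):
        delta = (cur["time"] - prev["time"]).total_seconds()
        if delta >= min_seconds:
            gaps.append((delta, prev["message"], cur["message"]))
    return gaps
```

Feeding it the SUM log excerpt would report the pause between "Getting a new router certificate..." and the Router::GetCertificate error, which can help judge whether a timeout is involved.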

The client looks to be pointing to the SUM rather than the SEC, judging by the ParentAddress registry value under HKLM>Software>Sophos>Messaging System>Router. I also notice that the ServiceArgs value uses SSL port 8190 rather than 8194, like the SUM: "-ORBListenEndpoints iiop://:8193/ssl_port=8190"
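
To read the ports out of a ServiceArgs string without eyeballing it, a small sketch like the following may help. It assumes the value always has the shape quoted above (`iiop://[host]:<port>/ssl_port=<port>`); the function name `parse_service_args` is hypothetical.

```python
import re


def parse_service_args(args):
    """Extract the IIOP and SSL ports from an -ORBListenEndpoints argument.

    Returns None if no iiop:// endpoint is found. The SSL port is None
    when the endpoint carries no ssl_port option.
    """
    match = re.search(r"iiop://[^:/\s]*:(\d+)(?:/ssl_port=(\d+))?", args)
    if match is None:
        return None
    ssl_port = int(match.group(2)) if match.group(2) else None
    return {"iiop_port": int(match.group(1)), "ssl_port": ssl_port}
```

For the value quoted in this post, `parse_service_args("-ORBListenEndpoints iiop://:8193/ssl_port=8190")` yields IIOP port 8193 and SSL port 8190.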

From what I have read, you should be able to connect to ports 8192 and 8194 on the SEC. While I can connect to 8192, I cannot connect to 8194 because nothing is listening on that port. I have confirmed that the firewall rule allowing incoming connections to 8190-8194 is enabled.
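
A connectivity check like the one described can be scripted. The sketch below uses only the Python standard library and attempts a plain TCP connection to each port. A failure does not distinguish "nothing listening" from "blocked by a firewall", so it is best run both on the SEC itself and from the SUM. The function name `check_ports` is made up for this example.

```python
import socket


def check_ports(host, ports, timeout=3.0):
    """Return {port: bool}; True means a TCP connection was accepted."""
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout or
            # name-resolution failure.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results
```

Calling `check_ports("<SEC address>", range(8192, 8195))` covers the three ports discussed in this thread; replace the placeholder with the real server name or IP.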

Firewall rule

It appears to me that, for some reason, the SUM server cannot connect to the SEC server, and because the clients communicate with the SUM server rather than the SEC server, they are not reporting in.

  • Hello SimpleTechie,

    there seems to be a lot of things that aren't correct.

    Question: "client looks to be pointing to the SUM rather than SEC" - so there is a "child" SUM in addition to the SEC server. Why, what is the intended setup? Are all endpoints updating from a share/WebCID on the SUM instead of the SEC, and should endpoints communicate via the SUM, i.e. is it set up as a message relay?

    Anyway, the SUM should be able to communicate with the SEC. Apparently the SUM gets a valid IOR when using the HOSTNAME but neither with the IP nor the FQDN. This is a little bit strange and suggests that the MRParentAddress values in mrinit.conf are not ideal (from the SUM's POV).
    There's a delay of roughly 40 seconds after the "Getting a new router certificate..." entry. Not sure if this is a timeout. Please check the Router and the CertificationManager logs on the SEC server, they should give some insight into what's going on.

    As the (apparent) message relay does not have a certificate its router is not in a state to commence communication, therefore the endpoints are unable to connect to the SUM's Router. BTW: They similarly fail to obtain an IOR using IP or FQDN.
    Someone must have deliberately changed the port to 8190 for whatever reason, either by means of mrinit.conf or "manually" by modifying the registry keys.

    Last but not least: RouterNT.exe should listen on 8192-8194 (unless someone has modified the ports, but in any case there should be three).

    Christian       

  • Hi

    Thank you for the advice. I had a look at the CertificationManager and Router logs on the SEC but didn't find anything of use. While the errors occur on the child SUM all the time, the last entry in the SEC CertificationManager log is from 08/09:

    26.08.2020 12:44:49 09D0 I SOF: C:\ProgramData/Sophos/Remote Management System/3/CertificationManager/Logs/CertManager-20200826-114449.log
    26.08.2020 12:44:49 09D0 I [CertMgr]Certification Manager starting...
    26.08.2020 12:44:52 09D0 I [CertMgr]Certification Manager started
    26.08.2020 12:44:52 09D0 I [CertMgr]Enabling request processing
    26.08.2020 12:44:52 0A14 I InitialiseClientLibraryLocal CM, SOFTWARE\Sophos\Certification Manager\MessengerStore, CMConfig.reg, 0, ...
    26.08.2020 12:44:53 0A14 I Initializing ...
    26.08.2020 12:44:53 0A14 I [Msgr:RM]Logged on to Message Router
    08.09.2020 08:28:10 0A14 I [Msgr:RM]Lost session with Message Router:err=system exception, ID 'IDL:omg.org/CORBA/TRANSIENT:1.0'
    OMG minor code (2), described as '*unknown description*', completed = NO
    08.09.2020 08:28:12 0A14 N [Msgr:RM]Logged off Message Router
    08.09.2020 08:28:17 0A14 I Initializing ...
    08.09.2020 08:28:17 0A14 I [Msgr:RM]Logged on to Message Router
    08.09.2020 16:28:53 0A14 I [Msgr:RM]Lost session with Message Router:err=system exception, ID 'IDL:omg.org/CORBA/TRANSIENT:1.0'
    OMG minor code (2), described as '*unknown description*', completed = NO
    08.09.2020 16:28:55 0A14 N [Msgr:RM]Logged off Message Router
    08.09.2020 16:29:00 0A14 I Initializing ...
    08.09.2020 16:29:01 0A14 I [Msgr:RM]Logged on to Message Router
    08.09.2020 16:40:33 0A14 I [Msgr:RM]Lost session with Message Router:err=system exception, ID 'IDL:omg.org/CORBA/TRANSIENT:1.0'
    OMG minor code (2), described as '*unknown description*', completed = NO
    08.09.2020 16:40:35 0A14 N [Msgr:RM]Logged off Message Router
    08.09.2020 16:40:40 0A14 I Initializing ...
    08.09.2020 16:40:41 0A14 I [Msgr:RM]Logged on to Message Router

    The Router log on the SEC isn't showing any issues either, and the RouterNT.exe service really does look to be listening on only one port, which seems strange.

    I'm trying to query our teams on how this was all set up. From what I can see, and looking at their production environment, the child SUM would pull updates from the SEC.

     

    Looking at the mrinit on the SUM, which would have been pulled from the SEC when it could still connect (which I presume was before the upgrade), the ports differ from those in the mrinit on the SEC. The mrinit on a client appears to match the SUM one rather than the SEC one.

    SEC mrinit 

    Sophos\Update Manager\CIDS\S000\SAVSCFXP


    "ClientIIOPPort"=dword:00002001
    "ClientSSLPort"=dword:00002002
    "ClientIORPort"=dword:00002000
    "MRParentAddress"="'SEC IP','SEC FQDN,'SEC HOSTNAME'"
    "ParentRouterAddress"="'SEC IP','SEC FQDN,'SEC HOSTNAME'"

    00002001 - 8193
    00002002 - 8194
    00002000 - 8192

    SUM mrinit (appears to be from before the upgrade):

    Sophos\Update Manager\CIDS\S000\SAVSCFXP


    "ClientIIOPPort"=dword:00002001
    "ClientSSLPort"=dword:00001FFE
    "ClientIORPort"=dword:00002000
    "IORSenderPort"=dword:00002000
    "MRParentAddress"="'SEC IP','SEC FQDN,'SEC HOSTNAME'"
    "MRParentAddress"="'SEC IP','SEC FQDN,'SEC HOSTNAME'"

    00002001 - 8193
    00001FFE - 8190
    00002000 - 8192

    Client side


    "ClientIIOPPort"=dword:00002001
    "ClientSSLPort"=dword:00001FFE
    "ClientIORPort"=dword:00002000
    "IORSenderPort"=dword:00002000
    "MRParentAddress"="'SEC IP','SEC FQDN,'SEC HOSTNAME'"
    "ParentRouterAddress"="'SUM IP','SUM FQDN,'SUM HOSTNAME'"

    00002001 - 8193
    00001FFE - 8190
    00002000 - 8192
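
The dword values in the mrinit listings above are hexadecimal, so converting them by hand is easy to get wrong. A short Python sketch, using the SEC and SUM values exactly as quoted in this post, converts them to decimal ports and reports where the two files disagree. The helper names are made up for this example.

```python
def dword_to_port(value):
    """Convert 'dword:00002002' (or a bare hex string) to an integer."""
    hex_part = value.split(":", 1)[-1]
    return int(hex_part, 16)


def differing_ports(first, second):
    """Return {name: (first_port, second_port)} for settings that differ."""
    diffs = {}
    for name in sorted(first.keys() & second.keys()):
        a, b = dword_to_port(first[name]), dword_to_port(second[name])
        if a != b:
            diffs[name] = (a, b)
    return diffs


# Port settings copied from the SEC and SUM mrinit listings in this post.
sec = {
    "ClientIIOPPort": "dword:00002001",
    "ClientSSLPort": "dword:00002002",
    "ClientIORPort": "dword:00002000",
}
sum_relay = {
    "ClientIIOPPort": "dword:00002001",
    "ClientSSLPort": "dword:00001FFE",
    "ClientIORPort": "dword:00002000",
}

print(differing_ports(sec, sum_relay))
```

With these inputs only ClientSSLPort differs: 0x2002 is 8194 on the SEC and 0x1FFE is 8190 on the SUM.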

    It does seem that 8190 may have been used instead of 8194, but as RouterNT.exe isn't even listening on 8194, I'm guessing that is the point to focus on first. I'm not sure how to get it to start listening on that port.

    Also, I was able to compare HKLM>SOFTWARE>WOW6432Node>Sophos>Messaging System>Router on the pre-prod SEC (the one having the issue) with our customer's production SEC (working, and not connected to pre-prod). Many registry values present in production are missing from pre-prod:

    Not in pre-prod

    ConnectRetriesPause
    GetterInterval
    GetterShortInterval
    HostIPToParent
    LegacyProtocolSupport
    NotifyClientUpdate
    NumNotificationThresholdThreads
    ServiceArgs
    TotalConnectRetryTimeSecs
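
A comparison like this can be repeated from two registry exports rather than by eye. The sketch below assumes each export is the text of a .reg file covering only the Router key, and it compares value names only, not their data. The function names `value_names` and `missing_values` are hypothetical.

```python
import re

# Value lines in a .reg export start with a quoted name followed by "=".
VALUE_RE = re.compile(r'^\s*"([^"]+)"\s*=')


def value_names(reg_text):
    """Collect the value names found in the text of a .reg export."""
    names = set()
    for line in reg_text.splitlines():
        match = VALUE_RE.match(line)
        if match:
            names.add(match.group(1))
    return names


def missing_values(reference_text, candidate_text):
    """Names present in the reference export but absent from the candidate."""
    return sorted(value_names(reference_text) - value_names(candidate_text))
```

Exports written by regedit are normally UTF-16, so the files would be read with something like `open("prod-router.reg", encoding="utf-16").read()` (the file name is a placeholder) before being passed to `missing_values(prod_text, preprod_text)`.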

    I have Support involved, but I'm wondering if the best course of action right now would be to wipe everything completely and reinstall the SEC and SUM afresh.

  • Hello SimpleTechie,

    "the best course of action right now would be to wipe everything completely and reinstall"
    While one can learn quite a lot by trying to get this right, an install from scratch isn't a bad idea. I wonder how CM could have "[Msgr:RM]Logged on to Message Router" when the Router isn't listening on 8194. Normally RouterNT should listen on ports 8192-8194 (and no other). Just for completeness: several management processes establish loopback (127.0.0.1) connections, both intra- and interprocess, on ephemeral ports.

    Christian
