Not receiving messages from endpoints after server migration

Hi,

Recently I completed a server to server migration of the Enterprise Console from Server 2003 32 bit to 2012 R2 64 bit as per this guide: https://www.sophos.com/en-us/support/knowledgebase/28276.aspx 

Everything looks to have gone well. However, a few days down the line, around half of the endpoints are reporting "Up to date: not since <2 days ago>". This time and date match the "Last message received from computer" column, but when I look at the computers in question locally, I can see in Endpoint Security and Control that they are receiving updates from their local SUM and are indeed up to date.

I'm not quite sure which logs to check here apart from the local updating log which shows everything is ok. 

I presume that I didn't allow enough time for one of the policies to propagate to all of the endpoints before decommissioning the old server.

Any help would be much appreciated!

Thanks

  • Hello hardies,

    "they are receiving updates"

    Updates are downloaded from a UNC or HTTP location (set by the policy), whereas messaging (the RMS component) uses its own mechanism (including its own specification of the server location), so one might work while the other fails.

    "a server to server migration"

    How did you redirect the endpoints to the new server? Is their local SUM also a message relay, is it updating "from itself" and is it messaging?

    "which logs to check"

    As said, AutoUpdate and RMS are different components. For RMS please check the router logs (or start with opening ReportData.xml in the ...\Router\NetworkReport\ folder).

    Christian

  • Hi Christian

    Thank you for the reply.

    After checking the logs I'm getting lots of errors regarding communication to the message router. 

    Here is part of the RMS log: 

    02.07.2015 15:17:43 20A0 I SDDM:SCAPI Calling Connect...
    02.07.2015 15:17:43 20A0 I SDDMA: An uninitialized socket was created.
    02.07.2015 15:17:43 20A0 I SDDM:SCAPI: Connect succeeded.
    02.07.2015 15:17:43 20A0 I SDDMA: Logon key written successfully.
    02.07.2015 15:17:43 20A0 I SDDMA: Logon key sent.
    02.07.2015 15:17:43 20A0 I SDDMA: Socket connection authenticated.
    02.07.2015 15:17:43 10C0 I SDDMA: The adapter is connected to SDDM.
    02.07.2015 15:17:43 10C0 I SDDMA: Sending a Status Report upstream (forced)...
    02.07.2015 15:17:43 10C0 I SDDM state observer notified that SDDM is running
    02.07.2015 15:17:43 10C0 I SDDM state observer received a status: <?xml version="1.0" encoding="utf-8" ?><status xmlns="com.sophos\mansys\status" xmlns:csc="com.sophos\msys\csc" xmlns:xsi="www.w3.org/.../XMLSchema-instance" type="sddm"><csc:CompRes policyType="9" Res="NoRef"/><csc:CompRes policyType="10" Res="NoRef"/><csc:CompRes policyType="11" Res="NoRef"/><csc:CompRes policyType="12" Res="NoRef"/><csc:CompRes policyType="13" Res="NoRef"/><version number="1"/><updateManager xmlns="www.sophos.com/.../common.xsd" status="OK" softwareVersion="1.5.0"><updateOperation id="programsUpdate" lastNonNullFinishedAt="" lastFinishedAt="" /><updateOperation id="supplementsUpdate" lastNonNullFinishedAt="" lastFinishedAt="" /><defaultShare user="HARDIES\SophosUpdateMgr" password="redacted"/><currency></currency></updateManager></status>
    02.07.2015 15:17:43 10C0 I SDDMA: Status report dispatched.
    02.07.2015 15:17:43 10C0 I SDDMA: Sending a Status Report upstream (unthrottled)...
    02.07.2015 15:17:43 10C0 I SDDM state observer notified that SDDM is running
    02.07.2015 15:17:43 10C0 I SDDM state observer received a status: <?xml version="1.0" encoding="utf-8" ?><status xmlns="com.sophos\mansys\status" xmlns:csc="com.sophos\msys\csc" xmlns:xsi="www.w3.org/.../XMLSchema-instance" type="sddm"><csc:CompRes policyType="9" Res="NoRef"/><csc:CompRes policyType="10" Res="NoRef"/><csc:CompRes policyType="11" Res="NoRef"/><csc:CompRes policyType="12" Res="NoRef"/><csc:CompRes policyType="13" Res="NoRef"/><version number="1"/><updateManager xmlns="www.sophos.com/.../common.xsd" status="OK" softwareVersion="1.5.0"><updateOperation id="programsUpdate" lastNonNullFinishedAt="" lastFinishedAt="" /><updateOperation id="supplementsUpdate" lastNonNullFinishedAt="" lastFinishedAt="" /><defaultShare user="HARDIES\SophosUpdateMgr" password="redacted="/><currency></currency></updateManager></status>
    02.07.2015 15:17:43 10C0 I SDDMA: Status report dispatched.
    02.07.2015 15:17:44 1C3C W MSClient::Connect: failed to get router's IOR from supplied address and port.
    02.07.2015 15:17:44 1C3C E NoRouterIORException: Caught MSClient::Connect: failed to get router's IOR from supplied address and port.
     ClientConnection::Reconnect()
    
    02.07.2015 15:17:52 1C3C W MSClient::Connect: failed to get router's IOR from supplied address and port.
    02.07.2015 15:17:52 1C3C E NoRouterIORException: Caught MSClient::Connect: failed to get router's IOR from supplied address and port.
     ClientConnection::Reconnect()
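
In a long RMS log, the informational (I) lines tend to bury the few warnings and errors that matter. A minimal Python sketch for pulling out only the W and E entries follows. The line layout (date, time, a hex identifier, a one-letter level, then the message) is an assumption inferred solely from the excerpt above, and `problem_entries` is a hypothetical helper name, not part of any Sophos tool.

```python
import re

# Assumed layout, inferred only from the excerpt above:
# "DD.MM.YYYY HH:MM:SS <hex id> <level letter> <message>"
LINE_RE = re.compile(
    r"^\s*(\d{2}\.\d{2}\.\d{4}) (\d{2}:\d{2}:\d{2}) ([0-9A-Fa-f]+) ([A-Z]) (.*)$"
)


def problem_entries(lines, levels=("W", "E")):
    """Return (date, time, level, message) tuples for lines at the given levels."""
    found = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            # Continuation lines such as "ClientConnection::Reconnect()" are skipped.
            continue
        date, time, _hex_id, level, message = match.groups()
        if level in levels:
            found.append((date, time, level, message))
    return found


# Lines copied from the log excerpt in this post.
sample = [
    "    02.07.2015 15:17:43 10C0 I SDDMA: Status report dispatched.",
    "    02.07.2015 15:17:44 1C3C W MSClient::Connect: failed to get router's IOR from supplied address and port.",
    "    02.07.2015 15:17:44 1C3C E NoRouterIORException: Caught MSClient::Connect: failed to get router's IOR from supplied address and port.",
    "     ClientConnection::Reconnect()",
]

for entry in problem_entries(sample):
    print(*entry)
```

To run it against a real log, pass an open file object instead of `sample`; the location of the router log is not given in this thread, so the path is left to the reader.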

     And from the ReportData.xml log:

    State of name resolution (DNS)  
    
    Problem description :
    There is a problem communicating with the server.  
    
    Overview :
    Failed to determine the IP address of the computer from its name. Communication cannot start until this problem is resolved.  
    
    Possible cause :
    DNS is misconfigured or the information is missing or incorrect.  
    
    Action to repair :
    Verify that the client can resolve the name of the server. Alternatively, use a static IP address on the server (this is the configuration recommended by Sophos).  
    
    More information can be found in the Sophos knowledgebase :
     Access the Sophos knowledgebase  
    
    State of Sophos security framework 
     
    No problems detected. 
     
    State of incoming communications from server 
     
    No problems detected. 
     
    State of outgoing communications to server 
     
    Problem description :
    Communication failure.  
    
    Overview :
    Failed to communicate with the server.  
    
    Possible cause :
    "Sophos Message Router" service may be stopped on the server, or the server may be disconnected from the network, or a firewall may be blocking communications from the client to the server.  
    
    Action to repair :
    Verify that the Sophos Message Router ports (by default 8192 and 8194) on the server are accessible by the computer with the problem. Also check networking and services on the server.  
    
    More information can be found in the Sophos knowledgebase :
     Access the Sophos knowledgebase  
    
    Computer details 
     
    Report generation time ( local time )
    02 July 2015 15:18:38 
    
    Report generation time ( GMT )
    02 July 2015 14:18:38 
    
    Computer name :
    HCSDUNFERMLINE  
    
    Windows domain :
    HARDIES  
    
    RMS router name :
    Not available  
    
    IOR port number :
    8192 
    
    SSLIOP port number :
    Not available  
    
    Parent addresses :
    192.168.20.3,fdbf:74b:b62:3333::1,eternium.hardies.local,eternium  
    
    Current parent address :
    Not available  
    
    RMS router type :
    endpoint  

    There are 8 SUMs connecting to the new server's Enterprise Console. Interestingly, the local one is working correctly, which suggests a networking issue; however, I have tried completely disabling the firewall on both servers. I also tried installing a new temporary SUM, which has never appeared in the console.

    I feel we're in the right area but I'm a little stuck now.

    Again, any help is very much appreciated.

    Thanks

  • Hello hardies,

    According to the log, it expects to be able to connect to port 8192 on any of these addresses/names and get an appropriate response. Apparently there is none, and furthermore there seems to be no (DNS) resolution for the names.

    Parent addresses :
    192.168.20.3,fdbf:74b:b62:3333::1,eternium.hardies.local,eternium  
    
    Current parent address :
    Not available  

    Is or was eternium your server?

    Christian

  • eternium is the new server which is part of the domain and resolves fine from anywhere on the network. How can I tell if those ports are open/correct?

    Thanks

  • Hello hardies,

    When you "telnet eternium 8192" you should get the IOR: as the response.
    The interesting part of the RMS log is the startup, up to the point where it complains about not getting an IOR:. Just restart the service and it will create a new log.
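
Where telnet is not installed on an endpoint, the same check can be approximated with a few lines of Python. This is only a sketch of the telnet test described above: it connects to port 8192, reads whatever the router sends first, and checks whether that begins with "IOR:". The function names and the 5-second timeout are illustrative choices, not anything defined by Sophos.

```python
import socket


def read_router_banner(host, port=8192, timeout=5.0):
    """Connect to host:port and return the first chunk the peer sends, as text.

    Raises OSError (including name-resolution errors and timeouts) if the
    connection cannot be made or nothing arrives in time.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        data = conn.recv(4096)
    return data.decode("ascii", errors="replace")


def looks_like_ior(banner):
    """True if the response starts with the 'IOR:' prefix mentioned above."""
    return banner.lstrip().startswith("IOR:")
```

Run from an affected endpoint, `looks_like_ior(read_router_banner("eternium"))` should come back true if the router is reachable and answering. An `OSError` instead points to name resolution, routing, or a firewall, which can be narrowed down by repeating the call with the server's IP address in place of its name.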

    Christian
  • In reply to hardies:

    I know this is an old thread, but I wanted to reply in case somebody else runs into this issue:

    My colleague and I were able to resolve this issue.  The following is what we identified as the cause of the issue:

    1.  The ReportData.xml file did not include the management server's correct DNS suffix.  The management server was listed as Server.Contoso.com instead of Server.it.Contoso.com.  Manually correcting the ReportData.xml file allowed the Sophos client to check into the management console.

    2.  Alternatively, we found that updating the HOSTS file on the affected client with an entry for the Server tying it to its IP address allowed the client to check into the Sophos management console.

    Our "permanent" fix was to update the mrinit.conf file in the Sophos installation package so that the proper FQDN for the management server is listed.  Originally, the incorrect FQDN (Server.Contoso.com) was listed.
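
Since the root cause here was a server name that did not resolve from the client, a quick way to spot it is to try resolving every entry of the "Parent addresses" line from ReportData.xml on an affected machine. The Python sketch below does only that; `resolve_parents` is a hypothetical helper, and it takes the comma-separated string as shown in the report rather than parsing ReportData.xml or mrinit.conf, whose exact layouts are not shown in this thread.

```python
import socket


def resolve_parents(parent_addresses, resolver=socket.getaddrinfo):
    """Try to resolve each comma-separated parent address.

    Returns a dict mapping each entry to a sorted list of resolved IP
    addresses, or to an "unresolved: ..." string if the lookup fails.
    The resolver argument exists only so the lookup can be swapped out.
    """
    results = {}
    for name in (part.strip() for part in parent_addresses.split(",")):
        if not name:
            continue
        try:
            infos = resolver(name, None)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            results[name] = f"unresolved: {exc}"
        else:
            # Each getaddrinfo entry ends with a sockaddr tuple; item 0 is the IP.
            results[name] = sorted({info[4][0] for info in infos})
    return results
```

Calling `resolve_parents("Server.Contoso.com,Server.it.Contoso.com")` on an affected client would be expected to show the first name as unresolved and the second with an IP address, matching the DNS suffix problem described above.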

  • In reply to snoopfrog:

    Hi All,

    I am seeing similar behavior on a multitude of clients. ReportData.xml has exactly the same entries for:

    Windows Domain

    IOR port

    SSLIOP port

    Parent Address (ip of server, FQDN of server, NetBIOS name of server)

    Current Parent Address (ip of server)

    RMS router type: endpoint

    Yet it reports an incoming communication problem. All machines are in the same AD group and even have the same group policy. All are updating fine, and I can telnet from them to the server and get the IOR back.

    Article 17134 just says to open the ports, although I cannot telnet to either the clients that have the error or the ones that do not report any errors in the XML...

    Is it safe to ignore, or would I need to keep an eye on something else?

    Many thanks,

    DanZi

  • In reply to Daniel Gebler:

    Hello DanZi,

    "it reports"
    Who's "it"? Is there a TCP connection from the endpoint to the server's port 8194? You re-initialised RMS on these endpoints and on others as well, but the others send data?

    The Router logs should provide some insight.

    Christian

  • In reply to QC:

    Hi Christian,

    By "it reports" I mean that the machines having this error do report to the console okay.

    Actually, the RMS re-init was dropped; I simply reinstalled the client. There is indeed a TCP connection from the client to the server: ESTABLISHED.

    The Router log has only I (informational) entries: Heartbeat calls with success, RouterSystemCheck - ports 5, etc.

    DanZi

  • In reply to Daniel Gebler:

    Hello DanZi,

    Now you've lost me, completely.
    In your previous post you said you see a similar behavior, but the OP is about endpoints not reporting and a failure of outgoing communications. Arguably "Yet it reports Incoming communication problem" is a similar behavior. Insofar as there is an error I'd agree, but IMO it's too broad.

    I haven't seen this error in a NetworkReport and can't say what triggers it, though I'd assume that the Message Router writes a corresponding message to the log when it indicates the problem in the NetworkReport. As the Router writes the report at startup, I'd restart the service, verify that the problem is still indicated, and check the Router log.

    Christian

  • In reply to QC:

    Hi Christian,

    That's a good one, I didn't know that the XML was written by the Message Router service. A remote restart from PowerShell does eliminate the issue in a lot of cases. I think RMS is just a little picky.

    Many thanks for the tip! :-)

    DanZi