Replacing SUM server in SEC

 Hi. Would appreciate any advice on the following.

We have a setup with a SEC server and approximately 6 remote SUM servers. One of these SUM servers was a 2008 server, and as we are migrating to 2012 we decided to replace it. We did this by creating a new 2012 server with a temporary name and IP. We then renamed and re-IP'd the old 2008 server and gave the original name and IP to the new 2012 server. This seems to have worked and the clients appear to be taking their updates from the new 2012 server.

However, I think I still have an issue with the messaging service. I have noticed a large build-up of messages in the Envelopes folder on the SEC. Messages still seem to be moving through the folder (possibly from the existing SUM servers) but the folder keeps growing. I think a lot of these could be messages for either the old 2008 physical box or the new 2012 server's temporary name, neither of which exists any more (if this makes sense!).

I have tried rebooting both the SEC and the SUM with no effect. I have also tried stopping and starting the Update Manager service on the SEC as well as the Message Router service, with no effect. Finally I have tried renaming the Envelopes folder but the size builds up again. I was going to delete the servers from the SEC and add the correct one again but, having read other posts on this forum, have held off from doing this. I am confident this is a messaging issue rather than an actual updating issue. I would be grateful if you could confirm whether my assumptions are right and suggest a course of action.



This thread was automatically locked due to age.
  • Hello Ian Withers,

    do I understand correctly that your SUMs are also Message Relays?
    Please take one of the "stuck" messages (it should have [Originator] Router$yourSECserver.EM in it) and check the [Destination], a relayed message looks like this: Router$relay:number1.Router$endpoint:number2.Agent. Guess it's obvious what relay should be. The next steps depend on what you find.
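
    As a rough illustration of how such an address breaks down into hops, here is a minimal Python sketch. The function name and the sample host names are made up for this example, and the format is inferred only from the pattern quoted above, not from any Sophos specification. It also assumes router names contain no dots or colons.

```python
import re

# One "Router$name" or "Router$name:number" hop, as in the pattern quoted above.
# Assumption: router names contain no '.', ':' or whitespace.
HOP = re.compile(r"Router\$([^:.\s]+)(?::(\d+))?")

def parse_rms_address(address):
    """Return (hops, component) for an address such as
    'Router$relay:1.Router$endpoint:2.Agent'.

    hops is a list of (router_name, number_or_None) in routing order;
    component is the trailing part, e.g. 'Agent' or 'EM'.
    """
    hops = [(name, int(num) if num else None) for name, num in HOP.findall(address)]
    last = address.rsplit(".", 1)[-1]
    component = "" if last.startswith("Router$") else last
    return hops, component

# Hypothetical sample: the first hop is the relay, the second the endpoint.
hops, component = parse_rms_address("Router$RELAY01:123456.Router$PC042:654321.Agent")
print(hops, component)
```

    In a relayed message the first entry in hops is the relay, so a stuck message whose first hop is a name that no longer exists would stand out immediately.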

    Christian

  • Hi Christian.

    You are indeed right in that the SUM servers are also message relay servers. I have checked the config on the new SUM server and it shows as a Message Relay OK. In the stuck message on the SEC server I can see in the [Destination] Router$<servername>:123456.Router$<clientname>:123456.Agent.

    I notice when checking the Sophos Network Communications reports on the SUM server it gives the new server name but the RMS Router name is the temporary one we 'christened' the server with originally. I read in one of your other articles that this probably is not an issue but I can't help wondering if the messaging system is confused. The RMS>3>Router>message logs on the SEC all show messages going to all the SUM servers and in the case of this particular SUM server the name is the old original one it was created with and not the current name.

     

    Thanks again for your help.

    With best wishes,

    Ian

  • Hello Ian,

    show messages going to all the SUM servers
    you mean the Routing to Router$<servername>:123456, id=0123ABCD, ...., dest=Router$<servername>:123456.Router$<clientname>:987654.Agent? Please note that only a subsequent Supplying message (id=0123ABCD) to Router$<servername>:123456 indicates that the message has been sent; otherwise you'll find the corresponding .msg in the \Envelopes folder (in it the MessageID is the decimal representation of the id=).

    the RMS Router name is the temporary one
    correct, it shouldn't matter.
    When an endpoint (I prefer the term endpoint to client) logs on to the management server (EM) through a relay, the relay prepends its router name to the endpoint's address. Subsequently, when EM sends a message to the endpoint, it uses the prepended name to choose the appropriate relay.
    You say that messages build up. Normally EM doesn't send many messages to endpoints, especially disconnected ones, unless you perform some action (e.g. updating/assigning policies, requesting a scan) in the console. Are the destinations in the messages endpoints (specifically endpoints behind this MR), is the relay address the new SUM/MR, and what's the [Type] of these messages? What's the Policy compliance and Up to date status of these endpoints?
    Even if the old server is still online, configured as an MR, and chosen by some endpoints this shouldn't cause problems as the path is updated when the connection is initiated by the endpoints and the backward path should subsequently be available. Only when EM sends a message to a disconnected endpoint (see above) it uses the last-known path which might not be available when the endpoint later connects again.
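
    The Routing/Supplying check described above can be sketched in Python as follows. This is only an illustration: the regular expressions are guesses based on the log lines quoted in this thread, and the function name and sample host names are hypothetical. Entries are keyed on the next hop plus the id, on the assumption that an id alone is not guaranteed to be unique.

```python
import re

# Patterns inferred from the Router log lines quoted in this thread (assumption).
ROUTING = re.compile(
    r"Routing to (?P<hop>.+?): id=(?P<id>[0-9A-Fa-f]+),.*?dest=(?P<dest>[^,\s]+)"
)
SUPPLYING = re.compile(r"Supplying message \(id=(?P<id>[0-9A-Fa-f]+)\) to (?P<hop>\S+)")

def unsent_messages(log_lines):
    """Return Routing entries to a router that have no later Supplying entry.

    Each result is (next_hop, hex_id, decimal_id, destination); decimal_id is
    the hex id converted to decimal, i.e. the value to look for as MessageID.
    """
    pending = {}
    for line in log_lines:
        routed = ROUTING.search(line)
        if routed and routed["hop"].startswith("Router$"):
            pending[(routed["hop"], routed["id"].upper())] = routed["dest"]
            continue
        supplied = SUPPLYING.search(line)
        if supplied:
            pending.pop((supplied["hop"], supplied["id"].upper()), None)
    return [(hop, mid, int(mid, 16), dest) for (hop, mid), dest in pending.items()]

# Hypothetical log excerpt: the first message is supplied, the second is not.
SAMPLE = [
    "I Routing to Router$RELAY01:123456: id=0123ABCD, origin=Router$SEC01.EM, "
    "dest=Router$RELAY01:123456.Router$PC042:987654.Agent, type=EM-SetConfiguration",
    "I Supplying message (id=0123ABCD) to Router$RELAY01:123456",
    "I Routing to Router$RELAY01:123456: id=0123ABCE, origin=Router$SEC01.EM, "
    "dest=Router$RELAY01:123456.Router$PC043:987655.Agent, type=EM-SetConfiguration",
]
for entry in unsent_messages(SAMPLE):
    print(entry)
```

    Anything this reports should, if the reading of the log is right, still have a matching .msg file waiting in the \Envelopes folder.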

    Christian   

  • Hi Christian,

    Thanks for this. Sorry about the delay but we had a separate issue last night with messages sending, which gave us a big backlog of messages to send. This appears to be resolved now so I have been able to look at your latest.

    Most of the stuck messages are for endpoints behind the SUM in question, but a minority go direct to endpoints. Some messages are for an endpoint behind two SUMs, i.e. Router$<SUM1>:123456.Router$<SUM2>:123456.Router$<ENDPOINT>.Agent. The message type is EM-SetConfiguration. There are over 20,000 messages in the Envelopes folder so I have only looked at a selection. However, the above seems to apply to all the stuck messages I checked.
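
    With this many messages, a tally by first hop is quicker than sampling by eye. The Python sketch below assumes the [Destination] values have already been extracted into a list of strings (how to read them out of the .msg files is not covered here); the function name and sample names are hypothetical.

```python
import re
from collections import Counter

# Router names are assumed to contain no '.', ':' or whitespace.
ROUTER_NAME = re.compile(r"Router\$([^:.\s]+)")

def count_by_first_hop(destinations):
    """Count destinations per relay (first hop); single-hop ones are '(direct)'."""
    counts = Counter()
    for dest in destinations:
        names = ROUTER_NAME.findall(dest)
        if len(names) >= 2:
            counts[names[0]] += 1      # relayed: first router is the relay
        else:
            counts["(direct)"] += 1    # endpoint addressed without a relay
    return counts

# Hypothetical destinations: relayed, chained through two relays, and direct.
SAMPLE = [
    "Router$SUM1:123456.Router$PC001:111111.Agent",
    "Router$SUM1:123456.Router$SUM2:222222.Router$PC002:333333.Agent",
    "Router$PC003:444444.Agent",
]
print(count_by_first_hop(SAMPLE))
```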

     

    Hope this helps.

  • Hello Ian,

    if the SUM in question isn't one of the chained SUMs I'll disregard this for the moment.

    So it seems to be an issue with this SUM. IIRC EM-SetConfiguration messages are policy messages. How many endpoints are behind this relay?

    • is the SUM (in the Endpoints view) connected and its Last message time recent? Same question for the endpoints behind it
      or search the Router log on the SEC server for relayed messages (using origin=Router$SUM:123456.Router$)
    • does netstat -n| find "<MR-IP>"  show the expected two connections

    on the SUM/MR

    • does the Router log on the SUM/MR show traffic
    • does netstat -n|find ":8194" show connected endpoints

    If there are no endpoint connections it might be a missing firewall rule
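
    For anyone who wants to count the connections rather than eyeball them, here is a small Python sketch that parses saved netstat -n output. It assumes the usual Windows column layout (Proto, Local Address, Foreign Address, State); the function name and the sample addresses are made up for illustration.

```python
def established_peers(netstat_output, local_port=8194):
    """Return the set of remote addresses with an ESTABLISHED TCP
    connection to the given local port."""
    peers = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue  # skip headers, blank lines and UDP rows
        local, remote, state = fields[1], fields[2], fields[3]
        if state == "ESTABLISHED" and local.rsplit(":", 1)[-1] == str(local_port):
            peers.add(remote.rsplit(":", 1)[0])
    return peers

# Hypothetical output as it might look on the SUM/MR.
SAMPLE = """
  Proto  Local Address          Foreign Address        State
  TCP    192.0.2.10:8194        192.0.2.21:50123       ESTABLISHED
  TCP    192.0.2.10:8194        192.0.2.22:50999       ESTABLISHED
  TCP    192.0.2.10:8194        192.0.2.23:51000       TIME_WAIT
  TCP    192.0.2.10:49210       192.0.2.1:8194         ESTABLISHED
"""
print(sorted(established_peers(SAMPLE)))
```

    The last sample row is an outbound connection from the relay to port 8194 on another host, so it is deliberately not counted as a connected endpoint.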

    Christian

  • Hi Christian,

    Just double-checking everything again, I notice the mrinit.conf that I put in the RMS folder for each CID is missing. However, the endpoints do seem to be pointing at the SUM OK.

    I'll re-do the mrinit files for each CID and run Confcid.exe again and let you know how it goes.

     

    With thanks,

  • Hi again Christian,

    To answer these questions:

    The SUM is connected(green pin), Showing same as Policy and Up to Date (Yes). 

    There are many endpoints behind this SUM, including my laptop, so I shall use that. It is showing as connected, awaiting policy transfer and Up to Date - Yes.

    In the logs on the SEC server there are multiple lines as shown below (please note SUM Server 2 is not having any issues)

    12.09.2017 13:08:05 1E68 I Routing to Router$<SUM Server>:684156: id=09B51902, origin=Router$<SEC Server>.EM, dest=Router$<SUM Server>:684156.Router$<SUM server 2>:605119.Router$<endpoint>:604816.Agent, type=EM-SetConfiguration

     

    I can see numerous connections on the SUM server using port 8194.

     

    Finally there is traffic in the logs on the SUM/MR. Mainly two different entries as shown below:

    12.09.2017 13:33:54 0DF4 I Routing to parent: id=01B7CE22, origin=Router$<endpoint>:609354.Agent, dest=EM, type=EM-GetStatus-Reply
    12.09.2017 13:33:54 0DF4 I Routing to parent: id=07B7CF2C, origin=Router$<SUM Server>:611235, dest=EM, type=EM-RouterLogon

    Hope this helps. Much appreciated.

  • Hello Ian,

    my laptop ... connected
    the Last message time fairly recent I assume - otherwise it wouldn't show as Up to date.

    Routing to ... id=09B51902
    Is there a corresponding Supplying message (id=09B51902) entry?

    type=EM-GetStatus-Reply
    are status messages from the endpoints to the management server (EM)
    type=EM-RouterLogon
    informs EM that an endpoint has established a connection

    You probably have lots of traffic but nevertheless it should be possible to track communication with your laptop. From the console request Update now; a message of type=EM-DoAction should get routed via the SUM to your laptop. Check that it is followed by a corresponding Supplying. Eventually your laptop should check for updates and send a status back to EM.
    If all this works as expected you should simply delete all messages older than, say, an hour. Any messages that build up subsequently should be inspected.
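
    Picking out the messages older than an hour can be sketched in Python as below. The function only lists candidates; the folder path is a placeholder and the function name is hypothetical. The actual deletion is left commented out so that the list can be reviewed first.

```python
import time
from pathlib import Path

def stale_messages(folder, max_age_seconds=3600, now=None):
    """Return .msg files in folder last modified more than max_age_seconds ago."""
    now = time.time() if now is None else now
    return sorted(
        path
        for path in Path(folder).glob("*.msg")
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds
    )

# Usage sketch (placeholder path, substitute the real Envelopes folder):
# for path in stale_messages(r"C:\path\to\Envelopes"):
#     print(path)       # review the list first
#     # path.unlink()   # then uncomment to delete
```

    Using the file modification time as the message age is an assumption; it seems reasonable for files that are written once and then wait to be delivered.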

    If there's no Supplying then, well, frankly I'd have to think about it.

    Christian 

  • Hi Christian,

    Trying to get my head round these logs.

    On the SEC I can see some Supplying messages. However, there don't seem to be that many compared with the number of Routing to messages. The ones I can trace seem to be communicating from the SEC server to the SUM (both the Routing to and the Supplying messages), plus they seem to be over an hour apart.

    I also tried an update from my laptop and could not see any corresponding message on either the SUM or SEC. On closer inspection most of the messages on the SUM are inter-SUM messages. There are a lot of RouterLogon and GetStatus-Reply messages. I have also seen a bunch of sent messages which only have an id and a destination of the SEC server so I am about to start tracing those.

    I will come back to you with the results of this but it looks like the messages are not flowing as they should

  • Hi Christian,

    Apologies for the delay but I wanted to be sure what was happening and wanted to let things run over the weekend to see if the issue had cleared. 

    Firstly I had previously tried stopping and starting the following 3 services:

     

    • Sophos Management Service
    • Sophos Certification Manager
    • Sophos Message Router

    However a colleague restarted the following 6 services:

    • Update Manager
    • Sophos Message Router
    • Sophos Management Service
    • Sophos Management Host
    • Sophos Agent

    It looked like this had been effective, as the messages started to empty consistently from the Envelopes folder. I left it over the weekend and it went from 43,000 to just over 2,000. I then deleted some of the older messages, which left it at around 200 messages. All the time we could see messages entering and leaving the folder. However, I am not convinced even now that all is well. You may recall I was going to test my own endpoint, so I did a 'comply with all group policies' on the console against my own endpoint. The result is that it now shows Policy Compliance as Comparison failure. I could see the messages on the SEC server and have attached the full exchange (sanitised) in the hope it would make sense to you. There do seem to be some errors. The notation I have used is fairly obvious: SECSERVER, NEWSUMSERVER and MY_ENDPOINT. Where other endpoints or SUM servers feature I have indicated it accordingly. You can see Routing to, Supplying and Sent messages.

     

    18.09.2017 14:29:26 05CC I Routing to Router$NEWSUMSERVER:611235: id=01BFCA36, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:26 08A8 I Supplying message (id=01BFCA36) to Router$NEWSUMSERVER:611235

    18.09.2017 14:29:26 05CC I Routing to Router$NEWSUMSERVER:611235: id=03BFCA36, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:26 0890 I Supplying message (id=03BFCA36) to Router$NEWSUMSERVER:611235

    18.09.2017 14:29:26 05CC I Routing to Router$NEWSUMSERVER:611235: id=05BFCA36, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:26 08A8 I Supplying message (id=05BFCA36) to Router$NEWSUMSERVER:611235

    18.09.2017 14:29:27 05CC I Routing to EM: id=01BFCA37, origin=Router$NEWSUMSERVER:611235.Router$OTHER_ENDPOINT:711468.Agent, dest=EM, type=EM-GetStatus-Reply

    18.09.2017 14:29:27 17DC I Sent message (id=01BFCA37) to EM

    18.09.2017 14:29:27 05CC I Routing to EM: id=01BFCA37, origin=Router$NEWSUMSERVER:611235.Router$OTHER_ENDPOINT_2:607677.Agent, dest=EM, type=EM-GetStatus-Reply

    18.09.2017 14:29:27 17E4 I Sent message (id=01BFCA37) to EM

    18.09.2017 14:29:28 05CC I Routing to Router$NEWSUMSERVER:611235: id=01BFCA37, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:28 05CC I Routing to Router$NEWSUMSERVER:611235: id=01BFCA38, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:28 05CC I Routing to Router$NEWSUMSERVER:611235: id=03BFCA38, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:28 05CC I Routing to Router$NEWSUMSERVER:611235: id=05BFCA38, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:28 05CC I Routing to Router$NEWSUMSERVER:611235: id=07BFCA38, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:28 0890 I Supplying message (id=01BFCA37) to Router$NEWSUMSERVER:611235

    18.09.2017 14:29:28 05CC I Routing to Router$NEWSUMSERVER:611235: id=09BFCA38, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:28 08A8 I Supplying message (id=01BFCA38) to Router$NEWSUMSERVER:611235

    18.09.2017 14:29:28 05CC I Routing to EM: id=01BFCA38, origin=Router$NEWSUMSERVER:611235.Router$OTHER_ENDPOINT_02:675266.Router$OTHER_ENDPOINT_02:609488.Agent, dest=EM, type=EM-GetStatus-Reply

    18.09.2017 14:29:28 17DC I Sent message (id=01BFCA38) to EM

    18.09.2017 14:29:28 05CC I Routing to EM: id=01BFCA38, origin=Router$NEWSUMSERVER:611235.Router$$OTHER_SUMSERVER_01.Router$OTHER_ENDPOINT_3.Agent, dest=EM, type=EM-GetStatus-Reply

    18.09.2017 14:29:28 17E4 I Sent message (id=01BFCA38) to EM

    18.09.2017 14:29:28 08A8 I Supplying message (id=03BFCA38) to Router$NEWSUMSERVER:611235

    18.09.2017 14:29:28 08A8 I Supplying message (id=05BFCA38) to Router$NEWSUMSERVER:611235

    18.09.2017 14:29:28 0890 I Supplying message (id=07BFCA38) to Router$NEWSUMSERVER:611235

    18.09.2017 14:29:28 08A8 E ACE_SSL (2108|2216) error code: 336027900 - error:140760FC:SSL routines:SSL23_GET_CLIENT_HELLO:unknown protocol

    18.09.2017 14:29:28 08A8 E ACE_SSL (2108|2216) error code: 336462231 - error:140E0197:SSL routines:SSL_shutdown:shutdown while in init

    18.09.2017 14:29:28 0890 E ACE_SSL (2108|2192) error code: 336027900 - error:140760FC:SSL routines:SSL23_GET_CLIENT_HELLO:unknown protocol

    18.09.2017 14:29:28 0890 E ACE_SSL (2108|2192) error code: 336462231 - error:140E0197:SSL routines:SSL_shutdown:shutdown while in init

    18.09.2017 14:29:28 0890 I Supplying message (id=09BFCA38) to Router$NEWSUMSERVER:611235

     

    I would appreciate it if you could cast an eye over it. If you are happy enough, I can then upload sanitised logs from the new SUM server if you like.

     

  • Hello Ian,

    as to the SSL errors - these might be endpoints with outdated installations, unless you keep an eye on every endpoint there are always some with installation or updating issues. Can't remember if a verbose router log would reveal the offending IP.

    Comparison failure
    is IIRC the result of some internal communication issue on the endpoint. You can turn on verbose logging for the Sophos Agent service on your laptop, dunno if it will give any insight though. After setting the registry key and restarting the Agent service request Comply with for the policy (often AV) with the failure, then check the Agent log. As you see, communication with the management server is working.

    Christian

  • Hi Christian,

     

    Quick update to this.

    Firstly, I have removed the error against my endpoint simply by initiating a check for updates from my laptop. It therefore seems there is at least communication upward.

    Just for completeness I have extracted and sanitised the logs on our new SUM server that correspond to the communication provided in my previous post. There seems to be no sign of the previous errors.

     

    18.09.2017 14:29:26 12A8 I Routing to Router$MY_ENDPOINT:608190: id=01BFCA36, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:26 0660 I Supplying message (id=01BFCA36) to Router$MY_ENDPOINT:608190

    18.09.2017 14:29:26 12A8 I Routing to Router$MY_ENDPOINT:608190: id=03BFCA36, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:26 0660 I Supplying message (id=03BFCA36) to Router$MY_ENDPOINT:608190

    18.09.2017 14:29:26 12A8 I Routing to Router$MY_ENDPOINT:608190: id=05BFCA36, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:26 043C I Supplying message (id=05BFCA36) to Router$MY_ENDPOINT:608190

    18.09.2017 14:29:27 12A8 I Routing to parent: id=01BFCA37, origin=Router$OTHER_ENDPOINT_01:711468.Agent, dest=EM, type=EM-GetStatus-Reply

    18.09.2017 14:29:27 12A4 I Sent message (id=01BFCA37) to Router$SECSERVER

    18.09.2017 14:29:27 12A8 I Routing to parent: id=01BFCA37, origin=Router$OTHER_ENDPOINT_02:607677.Agent, dest=EM, type=EM-GetStatus-Reply

    18.09.2017 14:29:27 06C0 I Sent message (id=01BFCA37) to Router$SECSERVER

    18.09.2017 14:29:28 12A8 I Routing to Router$MY_ENDPOINT:608190: id=01BFCA37, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:28 12A8 I Routing to Router$MY_ENDPOINT:608190: id=01BFCA38, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:28 12A8 I Routing to parent: id=01BFCA38, origin=Router$$OTHER_SUMSERVER_01:675266.Router$$OTHER_ENDPOINT_03:609488.Agent, dest=EM, type=EM-GetStatus-Reply

    18.09.2017 14:29:28 0F5C I Sent message (id=01BFCA38) to Router$SECSERVER

    18.09.2017 14:29:28 12A8 I Routing to parent: id=01BFCA38, origin=Router$$OTHER_SUMSERVER_02:604092.Router$$OTHER_ENDPOINT_04:608532.Agent, dest=EM, type=EM-GetStatus-Reply

    18.09.2017 14:29:28 0984 I Sent message (id=01BFCA38) to Router$SECSERVER

    18.09.2017 14:29:28 12A8 I Routing to Router$MY_ENDPOINT:608190: id=03BFCA38, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:28 12A8 I Routing to Router$MY_ENDPOINT:608190: id=05BFCA38, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:28 12A8 I Routing to Router$MY_ENDPOINT:608190: id=07BFCA38, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:28 12A8 I Routing to Router$MY_ENDPOINT:608190: id=09BFCA38, origin=Router$SECSERVER.EM, dest=Router$NEWSUMSERVER:611235.Router$MY_ENDPOINT:608190.Agent, type=EM-SetConfiguration

    18.09.2017 14:29:28 099C I Supplying message (id=01BFCA37) to Router$MY_ENDPOINT:608190

    18.09.2017 14:29:28 12A8 I Routing to parent: id=01BFCA39, origin=Router$$OTHER_SUMSERVER_01:675266, dest=EM, type=EM-RouterLogoff

    18.09.2017 14:29:29 12A8 I Routing to parent: id=01BFCA39, origin=Router$CAH-ST011779:609679.Agent, dest=EM, type=EM-GetStatus-Reply

    18.09.2017 14:29:29 12A8 I Routing to parent: id=01BFCA38, origin=Router$$OTHER_SUMSERVER_02:604092.Router$$OTHER_ENDPOINT_05:606434.Agent, dest=EM, type=EM-GetStatus-Reply

    18.09.2017 14:29:29 0930 I Sent message (id=01BFCA39) to Router$SECSERVER

    18.09.2017 14:29:29 0660 I Supplying message (id=01BFCA38) to Router$MY_ENDPOINT:608190

    18.09.2017 14:29:29 0930 I Sent message (id=01BFCA39) to Router$SECSERVER

    18.09.2017 14:29:29 0AAC I Supplying message (id=03BFCA38) to Router$MY_ENDPOINT:608190

    18.09.2017 14:29:29 0930 I Sent message (id=01BFCA38) to Router$SECSERVER

    18.09.2017 14:29:29 0AAC I Supplying message (id=05BFCA38) to Router$MY_ENDPOINT:608190

    18.09.2017 14:29:29 0660 I Supplying message (id=07BFCA38) to Router$MY_ENDPOINT:608190

    18.09.2017 14:29:29 0660 I Supplying message (id=09BFCA38) to Router$MY_ENDPOINT:608190

     

    Once again your help is appreciated.