This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

SEC missing computer details and unable to apply policies

We recently had a hard drive failure on our Windows 2003 Sophos Enterprise Console 4.5 server and discovered that our backup was corrupt.  The server has been rebuilt (same host/domain and IP address) and I've installed SEC 4.7.

Existing client computers are unable to pick up definition updates.  I can see them in SEC, but am not able to collect computer details or send them policies (greyed out).

On one client computer I uninstalled the Antivirus, Updater and Management components, restarted, then reinstalled using a command line pointing to the Sophos server share.  It is able to pick up updates, but no computer details are showing and I am unable to send it policies.

Windows Group Policy is opening the three standard Sophos ports (UDP & TCP) on the client computers.

:15357


  • Hello CFB,

    and discovered that our backup was corrupt  

    Bad - whether the backup is corrupt or the restore just can't make sense of it ... Although I often say "back up your data", I usually leave out "and make sure you can restore it" for the sake of brevity. It pays to make application-specific backups in addition, using a mechanism independent of the server backup. In the case of SEC that means some (static) registry settings (see the Enterprise Console migration guide, chapter 6.1.3, for details) and the database. Just putting them on another (physical) server gives some extra safety (some time ago a finger slip zilched most of the database contents - thanks to the daily database backup all I lost was a few hours' worth of status messages/alerts, and the rest was restored within minutes).  But I digress ...

    Did you import your clients from AD (as you obviously "see" them)? If you didn't have the registry keys for the Certification Manager available, the new server created new certificates for use with RMS. As the clients expect a different certificate, they refuse to talk to the server (if you import the mentioned keys before installing, the server would in terms of RMS be "the same" - even if you change its name and/or IP).

    Now it is not necessary to uninstall and reinstall on the clients (and that it doesn't work might be because there's a leftover mrinit.conf.orig). The migration guide shows one possible way in chapter 6.1.5. I don't know, as I haven't tested it, whether reinit.bat is called only when at least one of the ScriptFiles is updated and copied, or with every update. In the latter case (as you don't have an old CID with these files and a new CID without them) you'll want to add some code so it performs the reinit only once (it'd be a good idea to set a version-specific marker in case you ever need it again).
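
The "only once, with a version-specific marker" idea could be sketched roughly as below. This is only an illustration in Python, not the migration guide's reinit.bat; the names `marker_path` and `run_once`, the marker file name and the version string are all hypothetical.

```python
from pathlib import Path
from typing import Callable


def marker_path(directory: Path, version: str) -> Path:
    """Build a marker file name that is specific to one reinit version (hypothetical naming)."""
    return directory / f"rms-reinit-{version}.done"


def run_once(marker: Path, action: Callable[[], None]) -> bool:
    """Run `action` only if `marker` does not exist yet.

    Returns True if the action ran, False if it was skipped.
    The marker is written only after the action finished without raising,
    so a failed attempt is retried on the next call.
    """
    if marker.exists():
        return False
    action()
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text("done\n")
    return True
```

Bumping the version string gives a fresh marker, so the same guard can trigger one more reinit later without touching the old marker files.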

    RMS uses TCP - no need to open UDP.

    HTH

    Christian

    :15359
  • HI,

    If you want to try it, I wrote an HTA tool which can create a VBScript to re-initialise clients in terms of RMS.  This will get the clients a new cac.pem and force them to re-request their certificates.

    /search?q= 8939

    Any questions, please write back.

    Regards,

    Jak

    :15363
  • Hi Jak,

    I used your RMSReinit.hta to generate Reinit.vbs.  Cac.pem and Mrinit.conf come from "C:\Program Files\Sophos\Enterprise Console\SUMInstaller".  I then ran "cscript RMSReinit.vbs", using "Run as administrator" on a client (UAC off), and see that Cac.pem and Mrinit.conf have been replaced with newer versions.  Unfortunately, even after restarting the client, "Comply with" policies is still greyed out in the console and the client isn't picking up updates.

    Likewise, a client computer that's had the 3 client apps removed and reinstalled is now picking up updates, but I'm also not able to make it comply with policies, before or after running your script.

    Colin

    :15643
  • Hi,

    After running the script, has the client received its certificates from the server?

    Note: ensure that the Certification Manager service on the server is started.

    The client should have the following 4 registry keys:

    Router:
    HKEY_LOCAL_MACHINE\SOFTWARE\[Wow6432Node]\Sophos\Messaging System\Router\Private \pkc
    HKEY_LOCAL_MACHINE\SOFTWARE\[Wow6432Node]\Sophos\Messaging System\Router\Private \pkp 

    Agent:
    HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Sophos\Remote Management System\ManagementAgent\Private \pkc

    HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Sophos\Remote Management System\ManagementAgent\Private \pkp

    Without these keys the client will be unable to message the server.
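
A rough way to check for those four values from Python is sketched below. The key paths are the ones listed above; reading through the 32-bit registry view is my assumption for covering the optional Wow6432Node part, and the function names `read_hklm_value` and `missing_certificate_values` are made up for this example. The default reader only works on Windows.

```python
from typing import Callable, List, Optional, Tuple

ROUTER_KEY = r"SOFTWARE\Sophos\Messaging System\Router\Private"
AGENT_KEY = r"SOFTWARE\Sophos\Remote Management System\ManagementAgent\Private"

# (key path, value name) pairs taken from the list in the post above.
EXPECTED_VALUES: List[Tuple[str, str]] = [
    (ROUTER_KEY, "pkc"),
    (ROUTER_KEY, "pkp"),
    (AGENT_KEY, "pkc"),
    (AGENT_KEY, "pkp"),
]


def read_hklm_value(key_path: str, value_name: str) -> Optional[object]:
    """Read one value under HKLM using the 32-bit view; None if it is missing.

    Assumption: the 32-bit view resolves to Wow6432Node on 64-bit Windows
    and to the plain SOFTWARE key on 32-bit Windows.
    """
    import winreg  # Windows only, imported lazily so the module loads elsewhere

    access = winreg.KEY_READ | winreg.KEY_WOW64_32KEY
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path, 0, access) as key:
            value, _value_type = winreg.QueryValueEx(key, value_name)
            return value
    except OSError:
        return None


def missing_certificate_values(
    reader: Callable[[str, str], Optional[object]] = read_hklm_value,
) -> List[Tuple[str, str]]:
    """Return the expected (key, value) pairs for which the reader finds nothing."""
    return [(k, n) for (k, n) in EXPECTED_VALUES if reader(k, n) is None]
```

On a client, `missing_certificate_values()` returning an empty list would mean all four values exist; it says nothing about whether their contents are valid.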

    Also as a check, on the server, the registry key values for:

    HKLM\software\[wow6432node]\sophos\Certification Manager\CertAuthStore\DelegatedManagerKey

    HKLM\software\[wow6432node]\sophos\Certification Manager\CertAuthStore\ManagedAppKey

    HKLM\software\[wow6432node]\sophos\Certification Manager\CertAuthStore\RouterKey


    should match the strings in the mrinit.conf that you selected.  That is also worth checking to ensure the server and client are in sync with regard to the certificate identity keys.

    Whilst you're in this server key (certauthstore), the cac key value should match that on the client, which is stored in:
    HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Sophos\Messaging System \cac

    Regards,

    Jak

    :15645
  • The "Sophos Certification Manager" service is running on the server.

    All 4 registry keys are missing on the client, after running the script.

    The three server registry key values match the strings in the mrinit.conf that I used in generating the script.  The cac key value matches on the client and server.

    Thanks for your assistance.

    Colin

    :15655
  • Hi,

    The lack of certificates is the problem then.

    Are you able to paste here a router log from the client?

    In an ideal world, the corresponding router log from the server would help too, so we can see whether or not the client is talking to the server.

    Restarting the router will create a new log, which keeps the size down. If you could restart the router and wait maybe 45 seconds, that should be enough.

    I assume that the parentaddress value in the registry of the client is valid for the server? I.e.

    HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Sophos\Messaging System\Router \ParentAddress

    RouterNT.exe on the client can connect to TCP ports 8192 and 8194 on the server?

    Ideally the server can connect to TCP port 8194 on the client?

    Can you telnet to port 8192 and 8194 of the server from the client?
    If so can you paste here the string returned from 8192? 
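
If telnet isn't handy, the same checks can be approximated with a few lines of Python. This is just a sketch using the standard `socket` module; `can_connect` and `read_banner` are names invented here, and the timeouts are arbitrary.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def read_banner(host: str, port: int, timeout: float = 5.0, max_bytes: int = 4096) -> str:
    """Connect and return whatever the peer sends first (for example the string on 8192).

    Reading stops when the peer closes the connection, when max_bytes
    is reached, or when the timeout expires.
    """
    chunks = []
    total = 0
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            while total < max_bytes:
                data = sock.recv(max_bytes - total)
                if not data:
                    break
                chunks.append(data)
                total += len(data)
        except socket.timeout:
            pass  # keep what was received before the timeout
    return b"".join(chunks).decode("ascii", errors="replace")
```

From the client, `can_connect("IPOfServer", 8192)` and `can_connect("IPOfServer", 8194)` mirror the two telnet tests, and `read_banner("IPOfServer", 8192)` would show the returned string if the port is reachable.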


    Regards,

    Jak 

    :15659
  • Router log from client;
    15.08.2011 15:23:26 09FC I SOF: C:\ProgramData/Sophos/Remote Management System/3/Router/Logs/Router-20110815-222326.log
    15.08.2011 15:23:26 09FC I Sophos Messaging Router 3.3.0.2059 starting...
    15.08.2011 15:23:26 09FC I Setting ACE_FD_SETSIZE to 138
    15.08.2011 15:23:26 09FC I Initializing CORBA...
    15.08.2011 15:23:26 09FC I Setting connection cache limit to 10
    15.08.2011 15:23:26 09FC I Creating ORB runner with 4 threads
    15.08.2011 15:23:26 09FC I Getting parent router IOR from ServerIP:8192
    15.08.2011 15:23:26 09FC I This computer is part of the domain DOMAIN
    15.08.2011 15:23:49 09FC I This computer is part of the domain DOMAIN
    15.08.2011 15:23:49 09FC I Getting parent router IOR from SERVERHOST.IPDOMAIN:8192
    15.08.2011 15:24:12 09FC I Getting parent router IOR from SERVERHOSTNAME:8192
    15.08.2011 15:24:35 09FC E Failed to get parent router IOR
    15.08.2011 15:24:35 09FC E Failed to get certificate, retrying in 600 seconds
    15.08.2011 15:34:37 09FC I Getting parent router IOR from ServerIP:8192
    15.08.2011 15:35:00 09FC I Getting parent router IOR from SERVERHOST.IPDOMAIN:8192
    15.08.2011 15:35:23 09FC I Getting parent router IOR from SERVERHOSTNAME:8192
    15.08.2011 15:35:46 09FC E Failed to get parent router IOR
    15.08.2011 15:35:46 09FC E Failed to get certificate, retrying in 600 seconds

    Router log from server;
    15.08.2011 14:47:16 092C I Sent message (id=004993E4) to EM
    15.08.2011 14:47:36 0950 I Routing to EM: id=004993F8, origin=Router$PSYT0589.Agent, dest=EM, type=EM-GetStatus-Reply
    15.08.2011 14:47:36 0930 I Sent message (id=004993F8) to EM
    15.08.2011 14:55:35 0950 I Routing to EM: id=004995D7, origin=Router$PSYT0589.Agent, dest=EM, type=EM-GetStatus-Reply
    15.08.2011 14:55:35 0934 I Sent message (id=004995D7) to EM
    15.08.2011 15:05:33 0950 I Routing to EM: id=0049982D, origin=Router$PSYT0589.Agent, dest=EM, type=EM-GetStatus-Reply
    15.08.2011 15:05:33 0938 I Sent message (id=0049982D) to EM
    15.08.2011 15:06:10 088C I RouterSystemCheck::onInfoPortsUsed() - number of user ports 33, max number of user ports 3976
    15.08.2011 15:15:35 0950 I Routing to EM: id=00499A87, origin=Router$PSYT0589.Agent, dest=EM, type=EM-GetStatus-Reply
    15.08.2011 15:15:35 092C I Sent message (id=00499A87) to EM
    15.08.2011 15:25:35 0950 I Routing to EM: id=00499CDF, origin=Router$PSYT0589.Agent, dest=EM, type=EM-GetStatus-Reply
    15.08.2011 15:25:35 0930 I Sent message (id=00499CDF) to EM
    15.08.2011 15:35:33 0950 I Routing to EM: id=00499F35, origin=Router$PSYT0589.Agent, dest=EM, type=EM-GetStatus-Reply
    15.08.2011 15:35:33 0934 I Sent message (id=00499F35) to EM

    Yes the parent address value, in the registry of the client, is valid for the server.

    No, I can't telnet from the server to client or client to server, on those ports, yet the firewall settings on both machines show that the ports are open, as I have set in Group Policy.  Nothing has changed on the single switch that sits between the test client and server.  I manually started the telnet service on the server and had to enable telnet on my Windows 7 client and used the following syntax "telnet IPOfServer 8192".  It says "Could not open connection to the host, on port 8192: Connect failed".

    :15661
  • HI,

    Well the problem is definitely :

    "E Failed to get parent router IOR "

    The client router is unable to connect to port 8192 on the server to read the IOR string to tell it where to connect back to.  I assume you've edited the client log to anonymize the address the client is using to connect as:

    • ServerIP
    • SERVERHOST.IPDOMAIN
    • SERVERHOSTNAME

    don't look quite right.  Typically if the server has a static IP address the client will attempt to connect using a parentaddress string of:

    <IPAddress>, <FQDN>, <NETBIOS> 

    and it will try them in order if it is unable to resolve each in turn.  If the server is DHCP, the IP value would not exist.
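
As an illustration of that ordering, the sketch below splits a ParentAddress-style string and returns the first entry that a probe accepts. It assumes the entries are comma-separated as written above; the function names and the example addresses are hypothetical, and this is not how the router itself is implemented.

```python
from typing import Callable, List, Optional


def parent_candidates(parent_address: str) -> List[str]:
    """Split "<IPAddress>, <FQDN>, <NETBIOS>" into an ordered list of names."""
    return [part.strip() for part in parent_address.split(",") if part.strip()]


def first_reachable(parent_address: str, probe: Callable[[str], bool]) -> Optional[str]:
    """Try each candidate in order and return the first one the probe accepts."""
    for host in parent_candidates(parent_address):
        if probe(host):
            return host
    return None
```

For example, `first_reachable("192.0.2.10, sec.example.local, SEC", probe)` calls `probe` with the IP first, then the FQDN, then the NetBIOS name, and returns `None` if none of them answers.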

    The router log on the server shows that it is sending messages to EM, which leads me to believe the server router is working and I would expect that it is hosting an IOR on port 8192 for the client to read.
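
For longer logs, the error lines can be picked out with a small parser. The layout below (date, time, a four-character hex id, a one-letter level, then the message) is inferred only from the excerpts quoted in this thread, and treating the hex field as a thread id is a guess; `parse_line` and `error_entries` are made-up names.

```python
import re
from datetime import datetime
from typing import Iterable, List, NamedTuple, Optional

# Matches lines such as:
# 15.08.2011 15:24:35 09FC E Failed to get parent router IOR
LINE_RE = re.compile(
    r"^\s*(\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}) ([0-9A-Fa-f]{4}) ([A-Z]) (.*)$"
)


class Entry(NamedTuple):
    when: datetime
    thread_id: str  # assumed meaning of the hex field
    level: str      # "I" and "E" appear in the quoted logs
    message: str


def parse_line(line: str) -> Optional[Entry]:
    """Parse one router log line; return None if it does not match the layout."""
    match = LINE_RE.match(line.rstrip())
    if match is None:
        return None
    stamp, thread_id, level, message = match.groups()
    when = datetime.strptime(stamp, "%d.%m.%Y %H:%M:%S")
    return Entry(when, thread_id, level, message)


def error_entries(lines: Iterable[str]) -> List[Entry]:
    """Return only the entries whose level is "E"."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry is not None and entry.level == "E"]
```

Run over the client log above, this would return the "Failed to get parent router IOR" and "Failed to get certificate" lines and skip the informational ones.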

    I would think if on the server you ran:

    telnet  SERVERHOST.IPDOMAIN 8192

    that must return the IOR string?

    For some reason though, the client is unable to connect to port 8192 of the server using SERVERHOST.IPDOMAIN .  If you've tried IP, it can't just be a resolution problem.  The only thing I can suggest is to confirm you can connect to port 8192 locally on the server (just to prove that it is there and ready) and after that, the firewall on the server does seem the most likely culprit.

    Can you check the firewall logs, or turn the firewall off temporarily, until you are able to telnet to port 8192 from the client successfully?  I'm sure that once you can, communication will spring to life.

    Regards,

    Jak

    Note: you will also need the client to be able to connect to port 8194 of the server. Telnetting to port 8194 should also connect, but it will not display anything.  This is also a valid test, but the router connects to port 8194 only once it has read the IOR from 8192. As it's failing to do that, I would start with 8192.

    :15669
  • Hello Colin,

    telnet IPOfServer 8192 is supposed to work - on the server you should see (using for example TCPView) RouterNT.exe listening on ports 8192-8194 so the telnet service is not required. If you can't get a connection using the IP then the request is blocked somewhere on the way. Note that the Win7 firewall has both IN and OUT rules and can't be controlled with a GPO set on the server. As a quick test I suggest you just turn off the firewall on the client.

    Christian

    :15673