This discussion has been locked.

UTM restarts on its own

Good morning,

a few minutes ago one of our UTMs decided on its own that it could use a restart.

The High Availability log only shows me information about the handover to Node 2 and about the sync process.

Where can I look to find out why Node 1 suddenly decided to restart itself?

0 scorpionking over 10 years ago
System log, kernel log, possibly the fallback log.
The UTM may have been overloaded. What do your hardware, configuration and load look like?
----------
Sophos user, admin and reseller.
Private Setup:

XG: HPE DL20 Gen9 (Core i3-7300, 8GB RAM, 120GB SSD) | XG 18.0 (Home License) with: Web Protection, Site-to-Site-VPN (IPSec, RED-Tunnel), Remote Access (SSL, HTML5)

UTM: 2 vCPUs, 2GB RAM, 50GB vHDD, 2 vNICs on vServer (KVM) | UTM 9.7 (Home License) with: Email Protection, Webserver Protection, RED-Tunnel (server)
0 Revan over 10 years ago

The hardware is listed in my signature. CPU load is negligible, RAM is at 75%.
But even under full load the UTM shouldn't restart on its own, should it?

I installed the latest update the day before yesterday and (of course) rebooted at that point as well.
0 scorpionking over 10 years ago
In an HA cluster that is apparently how it works. If the master is overloaded and can no longer keep up, it apparently restarts and lets the slave take over. At least that is how our distributor described it to me.

What exactly caused it in your case can usually be found by analyzing the kernel and system messages.
You may need to involve your reseller or, if you have Premium Support, contact Sophos directly.
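As a rough aid for that kind of analysis, the Python sketch below looks for the "System was restarted" marker and collects everything logged in the minutes before it. It assumes the line format seen in the log excerpts posted in this thread (`YYYY:MM:DD-HH:MM:SS host process[pid]: message`). The function names and the five-minute window are illustrative choices, not part of any Sophos tooling.

```python
import re
from datetime import datetime, timedelta

# Assumed line format, taken from the excerpts in this thread, e.g.:
#   2014:09:19-08:46:59 utm-1 system: System was restarted
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<proc>[^:\[]+)(?:\[\d+\])?: (?P<msg>.*)$"
)

RESTART_MARKER = "System was restarted"


def parse_line(line):
    """Return (timestamp, host, process, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y:%m:%d-%H:%M:%S")
    return ts, m.group("host"), m.group("proc"), m.group("msg")


def entries_before_restart(lines, window_minutes=5):
    """For each restart marker, list the entries logged in the window before it."""
    # The excerpts are not strictly chronological, so sort by timestamp first.
    entries = sorted(e for e in (parse_line(l) for l in lines) if e is not None)
    window = timedelta(minutes=window_minutes)
    result = []
    for restart in entries:
        if restart[3] != RESTART_MARKER:
            continue
        before = [
            e for e in entries
            if restart[0] - window <= e[0] <= restart[0] and e != restart
        ]
        result.append((restart, before))
    return result


# Three lines copied from the log excerpt in this thread.
SAMPLE_LINES = [
    "2014:09:19-08:45:35 utm-2 syslog-ng[5084]: Syslog connection broken; "
    "fd='0', server='AF_INET(0.0.0.0:514)', time_reopen='60'",
    "2014:09:19-08:46:59 utm-1 system: System was restarted",
    "2014:09:19-08:47:01 utm-2 /usr/sbin/cron[28209]: (root) CMD (  nice -n19 "
    "/usr/local/bin/gen_inline_reporting_data.plx)",
]

for restart, before in entries_before_restart(SAMPLE_LINES):
    print("restart on", restart[1], "at", restart[0])
    for ts, host, proc, msg in before:
        print("  ", ts, host, proc, msg)
```

Entries from both nodes are kept on purpose, since in an HA pair the surviving node may log the first sign of trouble.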

0 Revan over 10 years ago

If that is the case, it did it today for the first time.

Could it have started with database errors?

2014:09:19-08:46:29 utm-1 postgres[5583]: [2-1] LOG:  database system was interrupted; last known up at 2014-09-19 06:45:37 GMT

2014:09:19-08:46:29 utm-1 postgres[5583]: [3-1] LOG:  entering standby mode

2014:09:19-08:46:29 utm-1 postgres[5583]: [4-1] LOG:  redo starts at 39/AE000020

2014:09:19-08:46:29 utm-1 postgres[5583]: [5-1] LOG:  record with zero length at 39/AE036F10

2014:09:19-08:46:29 utm-1 postgres[5590]: [2-1] LOG:  streaming replication successfully connected to primary

2014:09:19-08:46:29 utm-1 postgres[5583]: [6-1] LOG:  consistent recovery state reached at 39/AE038BC4

2014:09:19-08:46:29 utm-1 postgres[5567]: [2-1] LOG:  database system is ready to accept read only connections

2014:09:19-08:45:35 utm-2 syslog-ng[5084]: Syslog connection broken; fd='0', server='AF_INET(0.0.0.0:514)', time_reopen='60'

2014:09:19-08:46:59 utm-1 system: System was restarted

2014:09:19-08:47:01 utm-2 /usr/sbin/cron[28209]: (root) CMD (  nice -n19 /usr/local/bin/gen_inline_reporting_data.plx)

2014:09:19-08:47:01 utm-1 /usr/sbin/cron[6850]: (root) CMD (  nice -n19 /usr/local/bin/gen_inline_reporting_data.plx)

2014:09:19-08:47:29 utm-2 dns-resolver[26960]: DNS server failed to contact!

2014:09:19-08:48:29 utm-2 dns-resolver[26960]: DNS server failed to contact!

2014:09:19-08:49:29 utm-2 dns-resolver[26960]: DNS server failed to contact!

2014:09:19-08:50:01 utm-2 /usr/sbin/cron[28553]: (root) CMD (/var/mdw/scripts/pmx-blocklist-update)

2014:09:19-08:50:01 utm-2 /usr/sbin/cron[28552]: (root) CMD (   /usr/local/bin/reporter/system-reporter.pl)

2014:09:19-08:50:01 utm-1 /usr/sbin/cron[7049]: (root) CMD (/var/mdw/scripts/pmx-blocklist-update)

2014:09:19-08:50:01 utm-1 /usr/sbin/cron[7050]: (root) CMD (   /usr/local/bin/reporter/system-reporter.pl)

2014:09:19-08:50:30 utm-2 dns-resolver[26960]: DNS server failed to contact!

2014:09:19-08:50:36 utm-1 postgres[5567]: [3-1] LOG:  received fast shutdown request

2014:09:19-08:50:36 utm-1 postgres[5567]: [4-1] LOG:  aborting any active transactions

2014:09:19-08:50:36 utm-1 postgres[5590]: [3-1] FATAL:  terminating walreceiver process due to administrator command

2014:09:19-08:50:36 utm-1 postgres[5587]: [2-1] LOG:  shutting down

2014:09:19-08:50:36 utm-1 postgres[7173]: [5-1] FATAL:  the database system is shutting down

2014:09:19-08:50:36 utm-1 postgres[5587]: [3-1] LOG:  database system is shut down

2014:09:19-08:50:37 utm-1 postgres[7217]: [1-1] LOG:  loaded library "pg_stat_statements"

2014:09:19-08:50:37 utm-1 postgres[7234]: [2-1] LOG:  database system was shut down in recovery at 2014-09-19 06:50:36 GMT

2014:09:19-08:50:37 utm-1 postgres[7234]: [3-1] LOG:  database system was not properly shut down; automatic recovery in progress

2014:09:19-08:50:37 utm-1 postgres[7234]: [4-1] LOG:  redo starts at 39/AE000020

2014:09:19-08:50:37 utm-1 postgres[7234]: [5-1] LOG:  consistent recovery state reached at 39/AF129218

2014:09:19-08:50:37 utm-1 postgres[7234]: [6-1] LOG:  unexpected pageaddr 39/4312A000 in log file 57, segment 175, offset 1220608

2014:09:19-08:50:37 utm-1 postgres[7234]: [7-1] LOG:  redo done at 39/AF129188

2014:09:19-08:50:37 utm-1 postgres[7234]: [8-1] LOG:  last completed transaction was at log time 2014-09-19 06:50:36.106467+00

2014:09:19-08:50:38 utm-1 postgres[7294]: [2-1] FATAL:  the database system is starting up

2014:09:19-08:50:39 utm-1 postgres[7335]: [2-1] FATAL:  the database system is starting up

2014:09:19-08:50:42 utm-1 dns-resolver[7578]: starting...

2014:09:19-08:50:42 utm-1 postgres[7217]: [2-1] LOG:  database system is ready to accept connections

2014:09:19-08:50:42 utm-1 postgres[7579]: [2-1] FATAL:  the database system is starting up

2014:09:19-08:50:42 utm-1 postgres[7584]: [2-1] LOG:  autovacuum launcher started

2014:09:19-08:50:43 utm-2 postgres[27007]: [3-1] LOG:  received fast shutdown request

2014:09:19-08:50:43 utm-2 postgres[27007]: [4-1] LOG:  aborting any active transactions

2014:09:19-08:50:43 utm-2 postgres[28333]: [3-1] FATAL:  terminating connection due to administrator command

Further on there are more database-related errors:

2014:09:19-08:52:47 utm-2 syslog-ng[5084]: Configuration reload request received, reloading configuration;

2014:09:19-08:52:46 utm-2 postgres[30055]: [3-1] ERROR:  cannot execute INSERT in a read-only transaction

2014:09:19-08:52:46 utm-2 postgres[30055]: [3-2] CONTEXT:  SQL statement "insert into auth (

2014:09:19-08:52:46 utm-2 postgres[30055]: [3-3] 			logtime, logday, srcip, username, facility, authresult

2014:09:19-08:52:46 utm-2 postgres[30055]: [3-4] 		) values (

2014:09:19-08:52:46 utm-2 postgres[30055]: [3-5] 			date_trunc('seconds', ts), day, ip,

2014:09:19-08:52:46 utm-2 postgres[30055]: [3-6] 			auth_user, auth_facility, auth_result

2014:09:19-08:52:46 utm-2 postgres[30055]: [3-7] 		)"

2014:09:19-08:52:46 utm-2 postgres[30055]: [3-8] 	PL/pgSQL function ins_auth(timestamp without time zone,text,text,text,inet) line 7 at SQL statement

2014:09:19-08:52:46 utm-2 postgres[30055]: [3-9] STATEMENT:  select ins_auth($1, $2, $3, $4, $5)

Unfortunately the kernel log only has entries from after the UTM had already restarted.
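To get an overview of excerpts like the ones above, one option is to tally the PostgreSQL messages per node and severity. The following Python sketch does that under the assumption that the lines follow the format shown in this post. The function name is made up for illustration, and only the first line of each message is counted, so `CONTEXT` and `STATEMENT` continuation lines are skipped.

```python
import re
from collections import Counter

# Matches the first line of a postgres message as it appears in this thread, e.g.:
#   2014:09:19-08:52:46 utm-2 postgres[30055]: [3-1] ERROR:  cannot execute INSERT ...
# LOG, WARNING, ERROR, FATAL and PANIC are PostgreSQL severity levels.
PG_RE = re.compile(
    r"^\S+ (?P<host>\S+) postgres\[\d+\]: \[\d+-\d+\] "
    r"(?P<level>LOG|WARNING|ERROR|FATAL|PANIC):"
)


def count_postgres_levels(lines):
    """Count postgres messages per (host, severity) pair."""
    counts = Counter()
    for line in lines:
        m = PG_RE.match(line.strip())
        if m:
            counts[(m.group("host"), m.group("level"))] += 1
    return counts


# Lines copied from the excerpts in this post.
SAMPLE_LINES = [
    "2014:09:19-08:50:36 utm-1 postgres[5590]: [3-1] FATAL:  terminating "
    "walreceiver process due to administrator command",
    "2014:09:19-08:50:36 utm-1 postgres[5587]: [2-1] LOG:  shutting down",
    "2014:09:19-08:50:42 utm-1 dns-resolver[7578]: starting...",
    "2014:09:19-08:52:46 utm-2 postgres[30055]: [3-1] ERROR:  cannot execute "
    "INSERT in a read-only transaction",
    '2014:09:19-08:52:46 utm-2 postgres[30055]: [3-2] CONTEXT:  SQL statement '
    '"insert into auth (',
]

for (host, level), n in sorted(count_postgres_levels(SAMPLE_LINES).items()):
    print(host, level, n)
```

A tally like this only summarizes what was logged. It does not say whether the database messages caused the restart or merely followed from the failover.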

0 scorpionking over 10 years ago
"If that is the case, it did it today for the first time."
That doesn't normally happen either... ;)

Please open a ticket with Sophos through your reseller; something seems to be wrong there.
0 Revan over 10 years ago

Quote from Sophos:

Hello Mr. ***,

a kernel panic can currently occur in HA/cluster mode, and we are already working on a solution. The developer told me this morning that he may be able to build a kernel with debug options enabled.