
Problems with Log-Disk --> [INFO-153] Log Disk is filling up - please check

Hello!

We have two Sophos SG330 nodes running in a cluster, and today we got this message:

 [INFO-153] Log Disk is filling up - please check

 

After logging in, I saw in the "Resource Usage" tab on the Dashboard that the log disk had 0 % free space.

Moreover, there were large pgsql files in the path /var/log/reporting/pgsql
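
To see which directories account for most of the space, something like the following could be run as root. This is only a sketch: `largest_entries` is a hypothetical helper name, and the example path is taken from the post rather than from any Sophos documentation.

```shell
#!/bin/sh
# Hypothetical helper: list the biggest top-level entries under a directory.
# Sizes are reported in KiB; -x keeps du on a single filesystem.
largest_entries() {
    dir=$1
    count=${2:-10}   # how many entries to show (default 10)
    du -skx "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Example call (path as mentioned in the post):
#   largest_entries /var/log/reporting 5
```

The output is one line per entry, largest first, which makes it easier to decide what to back up and delete.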

In the system.log I saw that no database entries could be written:

 

2020:08:03-13:14:21 fw-1 postgres[32344]: [3-2] DETAIL:  Continuing anyway, but there's something wrong.
2020:08:03-13:14:41 fw-1 postgres[32617]: [3-1] FATAL:  could not write init file
2020:08:03-13:14:51 fw-1 dns-resolver[12696]: No change to REF_NetDnsInetnutani2 :: ntnx-portal.s3.amazonaws.com
2020:08:03-13:14:52 fw-1 dns-resolver[12696]: Updating REF_NetDnsInetsuppor :: proxy.rubrik.com
2020:08:03-13:14:52 fw-1 dns-resolver[12696]: Updating REF_NetDnsInetboehri :: boehringer-my.sharepoint.com
2020:08:03-13:15:01 fw-2 /usr/sbin/cron[17988]: (root) CMD ( /usr/local/bin/rpmdb_backup )
2020:08:03-13:15:01 fw-2 /usr/sbin/cron[17987]: (root) CMD (   /usr/local/bin/reporter/system-reporter.pl)
2020:08:03-13:15:01 fw-1 /usr/sbin/cron[468]: (root) CMD (   /usr/local/bin/reporter/system-reporter.pl)
2020:08:03-13:15:01 fw-1 /usr/sbin/cron[472]: (root) CMD ( /usr/local/bin/rpmdb_backup )
2020:08:03-13:15:01 fw-1 /usr/sbin/cron[481]: (httpproxy) CMD (/var/chroot-http/usr/bin/virus_feedback_uploader)
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-1] ERROR:  could not extend file "pg_tblspc/16774/PG_9.2_201204301/16775/17229518": No space left on device
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-2] HINT:  Check free disk space.
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-3] CONTEXT:  SQL statement "with upsert (reqno) as (
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-4] 			update websecurity set
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-5] 				requests = requests + reqs,
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-6] 				pages = pages + reqpages,
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-7] 				transfer = transfer + bytes
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-8] 			where
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-9] 				logday = day and userid = uid and
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-10] 				protocol = reqproto and domain = reqdomain and site = reqsite and
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-11] 				path = reqpath and action = reqaction and reason = reqreason and
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-12] 				info = reqinfo
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-13] 			returning _rowno
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-14] 		), insweb (reqno) as (
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-15] 			insert into websecurity (
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-16] 				logday, userid, protocol, domain, site, path,
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-17] 				action, reason, info, requests, pages, transfer
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-18] 			) select
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-19] 				day, uid, reqproto, reqdomain, reqsite, reqpath,
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-20] 				reqaction, reqreason, reqinfo, reqs, reqpages, bytes
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-21] 			where not exists (select 1 from upsert)
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-22] 			returning _rowno
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-23] 		), inscat (reqno) as (
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-24] 			insert into websec_reqcat (
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-25] 				request, category
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-26] 			) select
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-27] 				insweb.reqno, unnest(categories)
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-28] 			from
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-29] 				insweb
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-30] 			returning request
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-31] 	  ), inssand as (
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-32] 	    insert into websec_req_sandbox (
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-33] 	      logday, sandbox
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-34] 	    ) select
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-35] 	      day, reqsandbox
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-36] 	      from insweb
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-37] 	      where reqsandbox IS NOT NULL
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-38] 	  ), upsand as (
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-39] 	    insert into websec_req_sandbox (
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-40] 	      logday, sandbox
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-41] 	    ) select
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-42] 	      day, reqsandbox
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-43] 	      from upsert
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-44] 	      where reqsandbox IS NOT NULL
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-45] 	  ), insquota as (
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-46] 	    insert into websec_req_quota (
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-47] 	      quotatime, _rowno
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-48] 	    ) select
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-49] 	      reqquotatime, insweb.reqno from insweb
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-50] 			where reqquotatime > 0
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-51] 	  ), upquota as (
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-52] 	    update websec_req_quota set quotatime = quotatime + reqquotatime
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-53] 	    from upsert
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-54] 	    where _rowno = upsert.reqno
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-55] 		)insert into websec_reqdpt (
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-56] 			func, request, dptid
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-57] 		) select
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-58] 			'requests', insweb.reqno, get_departmentid(unnest(departments))
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-59] 		from
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-60] 			insweb"
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-61] 	PL/pgSQL function ins_websecurity(date,text,inet,text,text,text,text,text,text,text,integer[],text[],text,bigint,bigint,bigint,bigint) line 9 at SQL statement
2020:08:03-13:15:12 fw-1 postgres[32344]: [4-62] STATEMENT:  select ins_websecurity($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14::int4, $15::int4, $16::int4, $17::int8)
2020:08:03-13:15:12 fw-1 postgres[736]: [3-1] WARNING:  could not create relation-cache initialization file "pg_tblspc/16774/PG_9.2_201204301/16775/pg_internal.init.736": No space left on device
2020:08:03-13:15:12 fw-1 postgres[736]: [3-2] DETAIL:  Continuing anyway, but there's something wrong.





My steps then:

1. I deleted old packetfilter and webfilter logs (after backing them up first!)

2. Then I ran this command (console, root user): /etc/init.d/postgresql92 rebuild


3. After the rebuild, the file list was built:

2020:08:03-16:41:08 fw-1 [local7:err] waf-reporter[18147]: Transaction error: server closed the connection unexpectedly
2020:08:03-16:41:08 fw-1 [user:notice] This probably means the server terminated abnormally
2020:08:03-16:41:08 fw-1 [user:notice] before or while processing the request.
2020:08:03-16:41:08 fw-1 [local7:err] waf-reporter[18147]: Failed database transaction
2020:08:03-16:41:42 fw-1 [daemon:info] rsyncd[27295]: connect from node2 (198.19.250.2)
2020:08:03-16:41:42 fw-1 [daemon:info] rsyncd[27295]: rsync on postgres-default/postgres.default from node2 (198.19.250.2)
2020:08:03-16:41:42 fw-1 [daemon:info] rsyncd[27295]: building file list
2020:08:03-16:41:42 fw-1 [daemon:info] rsyncd[27295]: sent 104 bytes received 39 bytes total size 45
2020:08:03-16:41:42 fw-1 [daemon:info] rsyncd[27296]: connect from node2 (198.19.250.2)
2020:08:03-16:41:42 fw-1 [daemon:info] rsyncd[27296]: rsync on pg_default/global/pg_control from node2 (198.19.250.2)
2020:08:03-16:41:42 fw-1 [daemon:info] rsyncd[27296]: building file list
2020:08:03-16:41:42 fw-1 [daemon:info] rsyncd[27296]: sent 681 bytes received 124 bytes total size 8192
2020:08:03-16:41:42 fw-1 [daemon:info] rsyncd[27298]: connect from node2 (198.19.250.2)
2020:08:03-16:41:43 fw-1 [daemon:info] rsyncd[27298]: rsync on reporting/ from node2 (198.19.250.2)
2020:08:03-16:41:43 fw-1 [daemon:info] rsyncd[27298]: building file list
2020:08:03-16:41:43 fw-1 [daemon:info] rsyncd[27298]: sent 571379 bytes received 19561 bytes total size 7889412
2020:08:03-16:41:43 fw-1 [daemon:info] rsyncd[27300]: connect from node2 (198.19.250.2)
2020:08:03-16:41:43 fw-1 [daemon:info] rsyncd[27300]: rsync on pg_default/ from node2 (198.19.250.2)
2020:08:03-16:41:43 fw-1 [daemon:info] rsyncd[27300]: building file list
2020:08:03-16:41:44 fw-1 [daemon:info] rsyncd[27300]: sent 86561983 bytes received 26620 bytes total size 151789207
2020:08:03-16:41:45 fw-2 [daemon:debug] rrdcached[4194]: flushing old values
2020:08:03-16:41:45 fw-2 [daemon:debug] rrdcached[4194]: rotating journals
2020:08:03-16:41:45 fw-2 [daemon:debug] rrdcached[4194]: started new journal /var/log/reporting/rrd/rrd.journal.1596465705.579905
2020:08:03-16:41:45 fw-2 [daemon:debug] rrdcached[4194]: removing old journal /var/log/reporting/rrd/rrd.journal.1596458505.579930
 
 

4. Then, after rebuilding and resyncing and reaching a good system state, I upgraded to the latest UTM version: 9.703-3

5. Now the HA system is fully functional.






Now I have the following question: in the "Resource Usage" tab I see that 89 % of 80.6 GB is still in use.



If I run the "du" command on the command line, I get the following:



=> If I add the values up, I get ~46 GB. Where are the remaining gigabytes up to 69 GB?
=> Can someone tell me why the confd-debug log is so big? Can I set a debug level for confd?
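
One way to narrow down such a gap is to compare what the filesystem itself reports as used (df) with what is reachable through the directory tree (du). In general, du only counts files it can still find by name, so space held by files that were deleted while a process still has them open shows up in df but not in du. The sketch below assumes the log partition is mounted at /var/log; the function names are hypothetical, and how the dashboard computes its own figure is not known from this thread.

```shell
#!/bin/sh
# Used space on the filesystem that holds the given path, in KiB.
# -P forces the one-line POSIX output format, column 3 is "Used".
df_used_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $3 }'
}

# Space occupied by files visible under the directory, in KiB.
du_used_kb() {
    du -skx "$1" 2>/dev/null | awk '{ print $1 }'
}

# Print both figures and the difference between them.
report_gap() {
    dir=$1
    df_kb=$(df_used_kb "$dir")
    du_kb=$(du_used_kb "$dir")
    echo "df used: ${df_kb} KiB, du total: ${du_kb} KiB, gap: $((df_kb - du_kb)) KiB"
}

# Example call (assumed mount point):
#   report_gap /var/log
```

A large positive gap would suggest looking for deleted but still open files or for data outside the directories that were summed by hand; a small gap would suggest the manual sum simply missed some directories.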


Thank you so far!




  • Hello,

     

    Now we have the problem of the log disk filling up again.

     

    The problem is that our http.log is very big: 43 GB (from today alone).

     

    Has anyone had the same problem?

  • Hallo Bepo,

    Rather than a picture, please copy and paste about 10 lines from the middle of the Web Filtering log.

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • Hello Bob,

    Because of the problems, I did the following:

     

    On Node #1

    1. Renamed "http.log" to "http_old.log"

    2. Created a new, empty "http.log" with touch

    3. Ran the command: "chown root:log http.log"

    4. Rebooted Node #1

    On Node #2

    5. Renamed "http.log" to "http_old.log"

    6. Created a new, empty "http.log" with touch

    7. Ran the command: "chown root:log http.log"

    8. Rebooted Node #2 after the HA sync completed

     

    Then I deleted "http_old.log" on both nodes (41 GB).
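
The per-node steps above (rename, recreate, restore ownership) could be sketched as a small shell function. This is an illustration only: `rotate_log` is a hypothetical name, the log path in the example is an assumption, and the reboot and HA sync are left out on purpose. Note that, in general, a process that already has the old file open keeps writing to the renamed file until it is restarted, which is presumably why a reboot was part of the procedure.

```shell
#!/bin/sh
# Hypothetical helper: rotate a log file by hand.
#   $1 = path to the log file, e.g. /var/log/http.log (assumed path)
#   $2 = owner:group to set on the new file, e.g. root:log as in the post;
#        pass an empty string to skip the chown step
rotate_log() {
    log=$1
    owner=$2
    old="${log%.log}_old.log"        # http.log -> http_old.log

    mv "$log" "$old" || return 1     # keep the old contents under a new name
    touch "$log" || return 1         # create a fresh, empty log file
    if [ -n "$owner" ]; then
        chown "$owner" "$log" || return 1   # usually requires root
    fi
}

# Example call (run as root, path is an assumption):
#   rotate_log /var/log/http.log root:log
```

Once the renamed file is confirmed to be no longer in use, it can be backed up or deleted to free the space.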