ips.log filling at a high rate - is this normal, and OK for the SSD lifetime?

Today we've had a partial outage due to high /var partition usage.

It was flapping between 70% and over 90% in a short time.

/dev/var                179.3G    138.6G     40.7G  77% /var
/dev/var                179.3G    138.8G     40.5G  77% /var
/dev/var                179.3G    138.9G     40.4G  77% /var
/dev/var                179.3G    139.0G     40.3G  77% /var
/dev/var                179.3G    131.0G     48.3G  73% /var
/dev/var                179.3G    135.7G     43.6G  76% /var
/dev/var                179.3G    141.5G     37.8G  79% /var
/dev/var                179.3G    141.5G     37.8G  79% /var
/dev/var                179.3G    141.6G     37.7G  79% /var
/dev/var                179.3G    141.6G     37.7G  79% /var
/dev/var                179.3G    141.7G     37.6G  79% /var
/dev/var                179.3G    141.7G     37.6G  79% /var
/dev/var                179.3G    141.8G     37.5G  79% /var
/dev/var                179.3G    144.4G     34.8G  81% /var
/dev/var                179.3G    137.8G     41.5G  77% /var
/dev/var                179.3G    137.9G     41.4G  77% /var
/dev/var                179.3G    137.9G     41.4G  77% /var
/dev/var                179.3G    138.0G     41.3G  77% /var
/dev/var                179.3G    139.5G     39.8G  78% /var
/dev/var                179.3G    146.0G     33.3G  81% /var
/dev/var                179.3G    143.0G     36.3G  80% /var
/dev/var                179.3G    144.3G     35.0G  80% /var
/dev/var                179.3G    124.5G     54.8G  69% /var
/dev/var                179.3G    124.6G     54.7G  69% /var
/dev/var                179.3G    124.7G     54.6G  70% /var
/dev/var                179.3G    124.8G     54.5G  70% /var
/dev/var                179.3G    124.9G     54.4G  70% /var
/dev/var                179.3G    125.1G     54.2G  70% /var
/dev/var                179.3G    125.4G     53.9G  70% /var
/dev/var                179.3G    171.1G      8.1G  95% /var
/dev/var                179.3G    148.8G     30.5G  83% /var
/dev/var                179.3G    155.9G     23.4G  87% /var
/dev/var                179.3G    146.0G     33.3G  81% /var
/dev/var                179.3G    144.5G     34.8G  81% /var
/dev/var                179.3G     55.2G    124.1G  31% /var
/dev/var                179.3G     55.3G    124.0G  31% /var
/dev/var                179.3G     56.4G    122.9G  31% /var
/dev/var                179.3G     56.4G    122.9G  31% /var
/dev/var                179.3G     56.8G    122.5G  32% /var
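
For anyone who wants to watch the partition the same way, a minimal sketch using standard shell tools (the 10-second interval is arbitrary, not from the original capture):

# Sample /var usage every 10 seconds; Ctrl-C to stop.
while true; do
    df -h /var | tail -1
    sleep 10
done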

The firewall was rotating the IPS logs at a high rate, and compressing them caused additional CPU load.

As a secondary issue, the heartbeat service failed, causing heartbeat-based rules to stop working.

We also had a huge amount of these IPS detections:

20583: BROWSER-FIREFOX Mozilla multiple location headers malicious redirect attempt

Together with Support on the phone, we flushed the report partition and rebooted the firewall after stopping IPS.

/var was at 31% then.

IPS was manually started again after the reboot.

Now it seems to me that IPS is still logging at a high rate: about 1 GB per minute to the text file /log/ips.log.
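
To quantify that rate, something like this can sample the file size once per minute (a sketch with standard tools; the size is column 5 of ls -l, as used further below):

# Print a timestamp and the ips.log size once per minute;
# the delta between two lines is the bytes written per minute.
while true; do
    echo "$(date +%T) $(ls -l /log/ips.log | awk '{print $5}')"
    sleep 60
done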

Support could not find issues related to IPS.

I'm unsure.

I wonder whether such a high log rate is normal for IPS, and whether it is OK for the SSD.

XGS4500_AM02_SFOS 21.0.0 GA-Build169 HA-Primary# ls -lhS ips.*
-rw-r--r--    1 root     root       24.4G Nov 21 11:22 ips.log-20241121_111359
-rw-r--r--    1 root     root        1.0G Nov 21 13:16 ips.log
-rw-r--r--    1 root     root      123.4M Nov 21 13:06 ips.log-20241121_130655.gz
-rw-r--r--    1 root     root      116.9M Nov 21 13:11 ips.log-20241121_131102.gz
-rw-r--r--    1 root     root      109.3M Nov 21 13:14 ips.log-20241121_131454.gz
-rw-r--r--    1 root     root      105.8M Nov 21 13:13 ips.log-20241121_131303.gz
-rw-r--r--    1 root     root      102.8M Nov 21 13:09 ips.log-20241121_130901.gz
XGS4500_AM02_SFOS 21.0.0 GA-Build169 HA-Primary# head /log/ips.log
2024-11-21T12:15:07.934223Z [25339] daq_metadata session id 5158 rev 36361 appid 106  hbappid 0 idp 14 appfltid 0
2024-11-21T12:15:07.935265Z [25340] daq_metadata session id 19208 rev 51831 appid 106  hbappid 0 idp 14 appfltid 0
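
Nearly every line is a daq_metadata or verdict record, so a quick count shows what is flooding the file (a sketch; the counts keep moving while the log is being written):

# Rough breakdown of the noisy line types vs. the total line count
grep -c 'daq_metadata' /log/ips.log
grep -c 'verdict' /log/ips.log
wc -l /log/ips.log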

The log keeps filling even while IPS is still disabled (it was disabled during today's case debugging).

Support case is 02034074 



  • The 24 GB stale ips.log file has been archived and then deleted manually.

  • @Sophos do we have confirmation that ips.log should be as noisy as it is? How would an ips.log look on a random mid-size firewall?

    Those tiny CPU peaks every one or two minutes are only there because the firewall is compressing new IPS logs into the .gz files all the time.

    Lines from our ips.log; look at the timestamps:

    2024-11-22T08:32:13.206874Z [27132] verdict 0 pktnum 246732 flowoff [APP|AV] appcat [22245:25] dsize 1460 [xxx.xxx.xxx.xxx:55752 -> xxx.xxx.xxx.xxx:49524]
    2024-11-22T08:32:13.206877Z [27132] daq_metadata session id 10238 rev 9840 appid 22245  hbappid 22245 idp 6 appfltid 3
    2024-11-22T08:32:13.206883Z [27132] verdict 0 pktnum 246733 flowoff [APP|AV] appcat [22245:25] dsize 1460 [xxx.xxx.xxx.xxx:55752 -> xxx.xxx.xxx.xxx:49524]
    2024-11-22T08:32:13.206885Z [27132] daq_metadata session id 10238 rev 9840 appid 22245  hbappid 22245 idp 6 appfltid 3
    2024-11-22T08:32:13.206891Z [27132] verdict 0 pktnum 246734 flowoff [APP|AV] appcat [22245:25] dsize 362 [xxx.xxx.xxx.xxx:55752 -> xxx.xxx.xxx.xxx:49524]
    2024-11-22T08:32:13.207020Z [27128] daq_metadata session id 16708 rev 44417 appid 100  hbappid 0 idp 6 appfltid 3
    2024-11-22T08:32:13.207031Z [27128] verdict 3 pktnum 40 flowoff [APP|WEB|AV] appcat [100:5] dsize 0 [xxx.xxx.xxx.xxx:57041 -> xxx.xxx.xxx.xxx:443]
    2024-11-22T08:32:13.207187Z [27132] daq_metadata session id 186 rev 30620 appid 0  hbappid 0 idp 18 appfltid 0
    2024-11-22T08:32:13.207194Z [27132] verdict 0 pktnum 1904446 flowoff [IPS|WEB|AV] appcat [0:0] dsize 186 [xxx.xxx.xxx.xxx:43153 -> xxx.xxx.xxx.xxx:514]
    2024-11-22T08:32:13.207387Z [27128] daq_metadata session id 19369 rev 24179 appid 22245  hbappid 22245 idp 6 appfltid 3
    2024-11-22T08:32:13.207403Z [27128] verdict 0 pktnum 42303 flowoff [APP|AV] appcat [22245:25] dsize 76 [xxx.xxx.xxx.xxx:55752 -> xxx.xxx.xxx.xxx:62628]
    2024-11-22T08:32:13.207418Z [27132] daq_metadata session id 11766 rev 21901 appid 100  hbappid 0 idp 5 appfltid 4
    2024-11-22T08:32:13.207426Z [27132] verdict 3 pktnum 677 flowoff [APP|WEB|AV] appcat [100:5] dsize 0 [xxx.xxx.xxx.xxx:40222 -> xxx.xxx.xxx.xxx:443]
    2024-11-22T08:32:13.207460Z [27128] daq_metadata session id 13230 rev 4211 appid 22245  hbappid 22245 idp 6 appfltid 3
    2024-11-22T08:32:13.207490Z [27128] verdict 0 pktnum 590486 flowoff [APP|AV] appcat [22245:25] dsize 92 [xxx.xxx.xxx.xxx:55752 -> xxx.xxx.xxx.xxx:57352]
    2024-11-22T08:32:13.207505Z [27128] daq_metadata session id 1827 rev 52159 appid 22245  hbappid 22245 idp 6 appfltid 3
    2024-11-22T08:32:13.207522Z [27132] daq_metadata session id 15394 rev 58396 appid 22245  hbappid 22245 idp 6 appfltid 3
    2024-11-22T08:32:13.207522Z [27127] daq_metadata session id 19396 rev 48441 appid 22245  hbappid 22245 idp 6 appfltid 3
    2024-11-22T08:32:13.207527Z [27128] verdict 0 pktnum 48110 flowoff [APP|AV] appcat [22245:25] dsize 140 [xxx.xxx.xxx.xxx:55752 -> xxx.xxx.xxx.xxx:62620]
    2024-11-22T08:32:13.207538Z [27127] verdict 0 pktnum 48051 flowoff [APP|AV] appcat [22245:25] dsize 140 [xxx.xxx.xxx.xxx:55752 -> xxx.xxx.xxx.xxx:61487]
    2024-11-22T08:32:13.207543Z [27132] verdict 0 pktnum 48182 flowoff [APP|AV] appcat [22245:25] dsize 140 [xxx.xxx.xxx.xxx:55752 -> xxx.xxx.xxx.xxx:49507]
    2024-11-22T08:32:13.207589Z [27127] daq_metadata session id 17528 rev 26228 appid 20019  hbappid 20019 idp 14 appfltid 0
    2024-11-22T08:32:13.207600Z [27127] verdict 0 pktnum 49464 flowoff [APP|WEB|AV] appcat [20019:25] dsize 1460 [xxx.xxx.xxx.xxx:60979 -> xxx.xxx.xxx.xxx:3268]
    2024-11-22T08:32:13.207603Z [27127] daq_metadata session id 17528 rev 26228 appid 20019  hbappid 20019 idp 14 appfltid 0
    2024-11-22T08:32:13.207612Z [27127] verdict 0 pktnum 49465 flowoff [APP|WEB|AV] appcat [20019:25] dsize 1460 [xxx.xxx.xxx.xxx:60979 -> xxx.xxx.xxx.xxx:3268]
    2024-11-22T08:32:13.207614Z [27132] daq_metadata session id 11766 rev 21901 appid 100  hbappid 0 idp 5 appfltid 4
    2024-11-22T08:32:13.207615Z [27127] daq_metadata session id 17528 rev 26228 appid 20019  hbappid 20019 idp 14 appfltid 0
    2024-11-22T08:32:13.207625Z [27127] verdict 0 pktnum 49466 flowoff [APP|WEB|AV] appcat [20019:25] dsize 1460 [xxx.xxx.xxx.xxx:60979 -> xxx.xxx.xxx.xxx:3268]
    2024-11-22T08:32:13.207629Z [27127] daq_metadata session id 17528 rev 26228 appid 20019  hbappid 20019 idp 14 appfltid 0
    2024-11-22T08:32:13.207633Z [27132] verdict 3 pktnum 678 flowoff [APP|WEB|AV] appcat [100:5] dsize 0 [xxx.xxx.xxx.xxx:40222 -> xxx.xxx.xxx.xxx:443]
    2024-11-22T08:32:13.207639Z [27127] verdict 0 pktnum 49467 flowoff [APP|WEB|AV] appcat [20019:25] dsize 1460 [xxx.xxx.xxx.xxx:60979 -> xxx.xxx.xxx.xxx:3268]
    2024-11-22T08:32:13.207642Z [27127] daq_metadata session id 17528 rev 26228 appid 20019  hbappid 20019 idp 14 appfltid 0
    2024-11-22T08:32:13.207646Z [27132] daq_metadata session id 7348 rev 8853 appid 2792  hbappid 0 idp 5 appfltid 4
    2024-11-22T08:32:13.207647Z [27128] daq_metadata session id 1875 rev 27197 appid 0  hbappid 0 idp 6 appfltid 0
    2024-11-22T08:32:13.207653Z [27127] verdict 0 pktnum 49468 flowoff [APP|WEB|AV] appcat [20019:25] dsize 1460 [xxx.xxx.xxx.xxx:60979 -> xxx.xxx.xxx.xxx:3268]
    2024-11-22T08:32:13.207656Z [27127] daq_metadata session id 17528 rev 26228 appid 20019  hbappid 20019 idp 14 appfltid 0
    2024-11-22T08:32:13.207655Z [27132] verdict 3 pktnum 94 flowoff [APP|WEB|AV] appcat [2792:12] dsize 0 [xxx.xxx.xxx.xxx:33666 -> xxx.xxx.xxx.xxx:443]
    2024-11-22T08:32:13.207660Z [27132] daq_metadata session id 11766 rev 21901 appid 100  hbappid 0 idp 5 appfltid 4
    2024-11-22T08:32:13.207665Z [27127] verdict 0 pktnum 49469 flowoff [APP|WEB|AV] appcat [20019:25] dsize 1460 [xxx.xxx.xxx.xxx:60979 -> xxx.xxx.xxx.xxx:3268]
    2024-11-22T08:32:13.207658Z [27128] verdict 0 pktnum 62572 flowoff [IPS|APP] appcat [0:0] dsize 168 [xxx.xxx.xxx.xxx:42078 -> xxx.xxx.xxx.xxx:9997]
    2024-11-22T08:32:13.207668Z [27127] daq_metadata session id 17528 rev 26228 appid 20019  hbappid 20019 idp 14 appfltid 0
    2024-11-22T08:32:13.207669Z [27132] verdict 3 pktnum 679 flowoff [APP|WEB|AV] appcat [100:5] dsize 0 [xxx.xxx.xxx.xxx:40222 -> xxx.xxx.xxx.xxx:443]
    2024-11-22T08:32:13.207677Z [27127] verdict 0 pktnum 49470 flowoff [APP|WEB|AV] appcat [20019:25] dsize 729 [xxx.xxx.xxx.xxx:60979 -> xxx.xxx.xxx.xxx:3268]
    2024-11-22T08:32:13.207682Z [27132] daq_metadata session id 11766 rev 21901 appid 100  hbappid 0 idp 5 appfltid 4
    2024-11-22T08:32:13.207689Z [27132] verdict 3 pktnum 680 flowoff [APP|WEB|AV] appcat [100:5] dsize 0 [xxx.xxx.xxx.xxx:40222 -> xxx.xxx.xxx.xxx:443]
    2024-11-22T08:32:13.207788Z [27128] daq_metadata session id 1875 rev 27197 appid 0  hbappid 0 idp 6 appfltid 0
    2024-11-22T08:32:13.207793Z [27127] daq_metadata session id 19396 rev 48441 appid 22245  hbappid 22245 idp 6 appfltid 3
    2024-11-22T08:32:13.207798Z [27128] verdict 0 pktnum 62519 flowoff [IPS|APP] appcat [0:0] dsize 0 [xxx.xxx.xxx.xxx:9997 -> xxx.xxx.xxx.xxx:42078]
    2024-11-22T08:32:13.207805Z [27127] verdict 0 pktnum 35805 flowoff [APP|AV] appcat [22245:25] dsize 0 [xxx.xxx.xxx.xxx:61487 -> xxx.xxx.xxx.xxx:55752]
    2024-11-22T08:32:13.207816Z [27132] daq_metadata session id 11766 rev 21901 appid 100  hbappid 0 idp 5 appfltid 4
    2024-11-22T08:32:13.207824Z [27132] verdict 3 pktnum 681 flowoff [APP|WEB|AV] appcat [100:5] dsize 0 [xxx.xxx.xxx.xxx:40222 -> xxx.xxx.xxx.xxx:443]
    2024-11-22T08:32:13.207843Z [27128] daq_metadata session id 13230 rev 4211 appid 22245  hbappid 22245 idp 6 appfltid 3
    2024-11-22T08:32:13.207853Z [27128] verdict 0 pktnum 545486 flowoff [APP|AV] appcat [22245:25] dsize 0 [xxx.xxx.xxx.xxx:57352 -> xxx.xxx.xxx.xxx:55752]
    2024-11-22T08:32:13.207865Z [27128] daq_metadata session id 1827 rev 52159 appid 22245  hbappid 22245 idp 6 appfltid 3
    2024-11-22T08:32:13.207874Z [27128] verdict 0 pktnum 33047 flowoff [APP|AV] appcat [22245:25] dsize 0 [xxx.xxx.xxx.xxx:62620 -> xxx.xxx.xxx.xxx:55752]
    2024-11-22T08:32:13.207877Z [27128] daq_metadata session id 19369 rev 24179 appid 22245  hbappid 22245 idp 6 appfltid 3
    2024-11-22T08:32:13.207878Z [27127] daq_metadata session id 17528 rev 26228 appid 20019  hbappid 20019 idp 14 appfltid 0
    2024-11-22T08:32:13.207885Z [27128] verdict 0 pktnum 22436 flowoff [APP|AV] appcat [22245:25] dsize 0 [xxx.xxx.xxx.xxx:62628 -> xxx.xxx.xxx.xxx:55752]
    2024-11-22T08:32:13.207887Z [27127] verdict 0 pktnum 32526 flowoff [APP|WEB|AV] appcat [20019:25] dsize 0 [xxx.xxx.xxx.xxx:3268 -> xxx.xxx.xxx.xxx:60979]
    2024-11-22T08:32:13.207908Z [27127] daq_metadata session id 17528 rev 26228 appid 20019  hbappid 20019 idp 14 appfltid 0
    2024-11-22T08:32:13.207915Z [27127] verdict 0 pktnum 32527 flowoff [APP|WEB|AV] appcat [20019:25] dsize 0 [xxx.xxx.xxx.xxx:3268 -> xxx.xxx.xxx.xxx:60979]
    2024-11-22T08:32:13.207926Z [27127] daq_metadata session id 6610 rev 8910 appid 19433  hbappid 19433 idp 6 appfltid 0
    2024-11-22T08:32:13.207936Z [27127] verdict 0 pktnum 93598 flowoff [APP|WEB|AV] appcat [19433:25] dsize 10 [xxx.xxx.xxx.xxx:49761 -> xxx.xxx.xxx.xxx:3389]
    2024-11-22T08:32:13.207942Z [27132] daq_metadata session id 7348 rev 8853 appid 2792  hbappid 0 idp 5 appfltid 4
    2024-11-22T08:32:13.207950Z [27132] verdict 3 pktnum 95 flowoff [APP|WEB|AV] appcat [2792:12] dsize 0 [xxx.xxx.xxx.xxx:33666 -> xxx.xxx.xxx.xxx:443]
    2024-11-22T08:32:13.207991Z [27128] daq_metadata session id 1875 rev 27197 appid 0  hbappid 0 idp 6 appfltid 0
    2024-11-22T08:32:13.207999Z [27128] verdict 0 pktnum 62573 flowoff [IPS|APP] appcat [0:0] dsize 43 [xxx.xxx.xxx.xxx:42078 -> xxx.xxx.xxx.xxx:9997]
    2024-11-22T08:32:13.208085Z [27128] daq_metadata session id 1875 rev 27197 appid 0  hbappid 0 idp 6 appfltid 0
    2024-11-22T08:32:13.208092Z [27128] verdict 0 pktnum 62520 flowoff [IPS|APP] appcat [0:0] dsize 0 [xxx.xxx.xxx.xxx:9997 -> xxx.xxx.xxx.xxx:42078]
    

  • There is not much of an example in that thread, but I think they had an issue with log lines repeating the "jumbogram" string.

    They were also talking about a few MB per minute, not GB like here on my XGS:

    ls -lhS --full-time /log/ips.log | awk '{print $5,$7,$9}'
    9.2G 10:17:13 /log/ips.log
    ls -lhS --full-time /log/ips.log | awk '{print $5,$7,$9}'
    9.3G 10:17:17 /log/ips.log
    ls -lhS --full-time /log/ips.log | awk '{print $5,$7,$9}'
    9.4G 10:17:21 /log/ips.log
    ls -lhS --full-time /log/ips.log | awk '{print $5,$7,$9}'
    9.6G 10:17:25 /log/ips.log
    ls -lhS --full-time /log/ips.log | awk '{print $5,$7,$9}'
    9.8G 10:17:29 /log/ips.log

    A few minutes later:

    ls -lhS --full-time /log/ips.log | awk '{print $5,$7}'
    2.4G 10:20:20
    ls -lhS --full-time /log/ips.log | awk '{print $5,$7}'
    7.0G 10:22:59
    ls -lhS --full-time /log/ips.log | awk '{print $5,$7}'
    14.2G 10:26:31

    Our log file looks like the above. It looks like this even when IPS scanning is disabled but the IPS service is running.

    ips.log contains entries for every connection, plus some other events too.

  • And now I observe that it does not log-rotate correctly at the moment; the files are growing bigger and bigger.
    This must be the same situation as yesterday.

    ls -lhS --full-time /log/ips.* | awk '{print $5,$7,$9}'
    25.7G 10:32:36 /log/ips.log
    17.3G 10:32:36 /log/ips.log-20241122_102838
    745.0M 10:16:12 /log/ips.log-20241122_101612.gz
    557.8M 09:52:04 /log/ips.log-20241122_095204.gz
    479.4M 09:58:58 /log/ips.log-20241122_095858.gz
    461.3M 10:04:36 /log/ips.log-20241122_100436.gz
    436.6M 10:10:14 /log/ips.log-20241122_101014.gz

    An absolutely dangerous situation.
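
    To see how much space the IPS logs occupy in total, the sizes can be summed (a sketch reusing the ls/awk pattern from above; column 5 is the size in bytes):

    # Sum the sizes of all ips.* files under /log and print the total in GB
    ls -l /log/ips.* | awk '{ sum += $5 } END { printf "%.1f GB total\n", sum/1024/1024/1024 }'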

  • IPS debugging is currently enabled. You can toggle debugging with the command below.

    service ips:debug -ds nosync

    This won't be enabled by default. Was it enabled manually?

  • No, sorry, the service is not running in debug mode; this was triple-confirmed by Sophos Support and me.

    Meanwhile, even the zipped .gz files are getting bigger and bigger, now reaching about 2 GB, while they were at around 300 MB yesterday:

    /dev/var                179.3G    112.3G     67.0G  63% /var
    ls -lhS --full-time /log/ips.* | awk '{print $5,$7,$9}'
    41.2G 14:48:56 /log/ips.log-20241122_143858
    28.5G 15:09:24 /log/ips.log
    2.3G 12:36:43 /log/ips.log-20241122_123643.gz
    1.9G 13:06:46 /log/ips.log-20241122_130646.gz
    1.8G 15:09:24 /log/ips.log-20241122_143858.gz
    1.7G 14:13:53 /log/ips.log-20241122_141353.gz
    1.7G 13:30:50 /log/ips.log-20241122_133050.gz
    1.5G 13:52:49 /log/ips.log-20241122_135249.gz
    XGS4500_AM02_SFOS 21.0.0 GA-Build169 HA-Primary# service -S |grep -i ips
    ips                  RUNNING
    ipsec-monitor        RUNNING

    Debug mode would show:


    service -S |grep -i ips
    ips                  RUNNING,DEBUG
    ipsec-monitor        RUNNING

    GES is working on the issue. They confirmed this logging is not normal.

  • We observed that IPS in debug mode logs far less than when running in non-debug mode.

    The factor was about 1:4.

    In non-debug mode, all the connection information was in the logs.

    GES believes there is some kind of low-level default debug enabled, which is switched off when the service is put into debug mode.

    NON-DEBUG:
    2024-11-22T15:03:20.480455Z [27128] daq_metadata session id 5694 rev 6056 appid 0  hbappid 0 idp 18 appfltid 0
    2024-11-22T15:03:20.480461Z [27128] verdict 0 pktnum 21011930 flowoff [IPS|WEB|AV] appcat [0:0] dsize 388 [xxx.xxx.xxx.xxx:46620 -> xxx.xxx.xxx.xxx:514]
    2024-11-22T15:03:20.480464Z [27128] daq_metadata session id 5694 rev 6056 appid 0  hbappid 0 idp 18 appfltid 0
    2024-11-22T15:03:20.480468Z [27128] verdict 0 pktnum 21011931 flowoff [IPS|WEB|AV] appcat [0:0] dsize 401 [xxx.xxx.xxx.xxx:46620 -> xxx.xxx.xxx.xxx:514]
    2024-11-22T15:03:20.480470Z [27128] daq_metadata session id 5694 rev 6056 appid 0  hbappid 0 idp 18 appfltid 0
    2024-11-22T15:03:20.480475Z [27128] verdict 0 pktnum 21011932 flowoff [IPS|WEB|AV] appcat [0:0] dsize 400 [xxx.xxx.xxx.xxx:46620 -> xxx.xxx.xxx.xxx:514]
    2024-11-22T15:03:20.480477Z [27128] daq_metadata session id 5694 rev 6056 appid 0  hbappid 0 idp 18 appfltid 0
    2024-11-22T15:03:20.480482Z [27128] verdict 0 pktnum 21011933 flowoff [IPS|WEB|AV] appcat [0:0] dsize 389 [xxx.xxx.xxx.xxx:46620 -> xxx.xxx.xxx.xxx:514]
    2024-11-22T15:03:20.480523Z [27128] daq_metadata session id 12985 rev 11175 appid 2712  hbappid 0 idp 5 appfltid 4
    2024-11-22T15:03:20.480533Z [27128] verdict 0 pktnum 7752 flowoff [APP|WEB|AV] appcat [2712:5] dsize 25 [xxx.xxx.xxx.xxx:443 -> xxx.xxx.xxx.xxx:42054]
    2024-11-22T15:03:20.480604Z [27128] daq_metadata session id 12985 rev 11175 appid 2712  hbappid 0 idp 5 appfltid 4
    2024-11-22T15:03:20.480612Z [27128] verdict 0 pktnum 7753 flowoff [APP|WEB|AV] appcat [2712:5] dsize 1250 [xxx.xxx.xxx.xxx:443 -> xxx.xxx.xxx.xxx:42054]
    2024-11-22T15:03:20.480650Z [27128] daq_metadata session id 12985 rev 11175 appid 2712  hbappid 0 idp 5 appfltid 4
    2024-11-22T15:03:20.480656Z [27128] verdict 0 pktnum 7754 flowoff [APP|WEB|AV] appcat [2712:5] dsize 1250 [xxx.xxx.xxx.xxx:443 -> xxx.xxx.xxx.xxx:42054]
    
    
    DEBUG:
    2024-11-22T15:00:19.086120Z [27127/0x57ea000049e5] [nsg_request_fsm.c:542:request_fsm_body_bytes] Event body_bytes in state RESP_PARSE_BODY_SKIP
    2024-11-22T15:00:19.086192Z [27127/0x57ea000049e5] [acl/acl_lookup.c:608:check_access_list] ACL check: USER matched for WEB rule 0 (user-id=24)
    2024-11-22T15:00:19.086196Z [27127/0x57ea000049e5] [acl/acl_lookup.c:611:check_access_list] checking policy [000], if matched, allow
    2024-11-22T15:00:19.086199Z [27127/0x57ea000049e5] [acl/acl_lookup.c:392:__check_acl] checking 'all type 1'
    2024-11-22T15:00:19.086202Z [27127/0x57ea000049e5] [acl/acl_lookup.c:623:check_access_list] All ACLs matched: rule action = allow
    2024-11-22T15:00:19.086206Z [27127/0x57ea000049e5] [nsg_request_fsm.c:542:request_fsm_body_bytes] Event body_bytes in state RESP_PARSE_BODY_SKIP
    2024-11-22T15:00:19.086209Z [27127/0x57ea000049e5] [acl/acl_lookup.c:608:check_access_list] ACL check: USER matched for WEB rule 0 (user-id=24)
    2024-11-22T15:00:19.086211Z [27127/0x57ea000049e5] [acl/acl_lookup.c:611:check_access_list] checking policy [000], if matched, allow
    2024-11-22T15:00:19.086213Z [27127/0x57ea000049e5] [acl/acl_lookup.c:392:__check_acl] checking 'all type 1'
    2024-11-22T15:00:19.086216Z [27127/0x57ea000049e5] [acl/acl_lookup.c:623:check_access_list] All ACLs matched: rule action = allow
    2024-11-22T15:00:19.086218Z [27127/0x57ea000049e5] [nsg_request_fsm.c:542:request_fsm_body_bytes] Event body_bytes in state RESP_PARSE_BODY_SKIP
    2024-11-22T15:00:19.086230Z [27127/0x57ea000049e5] [acl/acl_lookup.c:608:check_access_list] ACL check: USER matched for WEB rule 0 (user-id=24)
    2024-11-22T15:00:19.086232Z [27127/0x57ea000049e5] [acl/acl_lookup.c:611:check_access_list] checking policy [000], if matched, allow
    

  • service -S |grep -i ips is showing the opposite status; hence you are seeing fewer logs when you enable debug (which actually disabled the debug). We are looking into this issue.

  • The debugp mask was configured in /var/tmp/debugp.conf, which enabled debugging. To disable it, you might have run the command service ips:debugp -ds nosync. However, existing connections will still use the mask that was set when they first arrived.

    I have updated the mask to 0 in /var/tmp/debugp.conf and restarted the IPS, which has fully disabled the WIS debug mode. The IPS restart also resolved the CSC status issue.

    No debug logs are being generated after the IPS restart. Please confirm.

  • Confirmed - logs are back to normal! Great, and thank you very much!

    And LuCar was on the right track from the beginning!

    I created /var/tmp/debugp.conf for the case mentioned here: Webfilter HTTPS decryption breaks ChatGPT: HTTP parsing error encountered.

    That file contained my own IP address for the ChatGPT debug.

    LuCar mentioned https://community.sophos.com/sophos-xg-firewall/f/discussions/145539/ips-log-filling-up-disk, where it was suggested to search for and delete the debugp file. I missed that.

    Debug was enabled/disabled 3 times using service -ds nosync ips:debugp.

      I ran it 4 times, but found only 6 status code 200 responses.

    service -ds nosync ips:debugp
    200 OK

    I expected the file to be ignored when service -ds nosync ips:debugp was used to disable debug again.

    Does it eventually enable itself again when the firewall is rebooted? That could be checked. Nilesh wrote to me that when IPS starts, it loads the debugp file. That was the case here, because the firewall was rebooted one week ago for the v21 upgrade.

    Sooner or later this would have happened anyway.

    So it's very important to delete the debugp file after the tests, or at least have it contain only:

    mask=0
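
    A minimal sketch of that cleanup (paths as given in this thread; support additionally restarted IPS so that existing connections drop the old mask):

    # Either neutralize the flag file ...
    echo "mask=0" > /var/tmp/debugp.conf
    # ... or delete it entirely so IPS cannot pick it up on the next start:
    rm /var/tmp/debugp.conf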

    Maybe that is something for the Doc Team @R to add to the article https://support.sophos.com/support/s/article/KBA-000006577?language=en_US

    Thanks a lot everyone involved here! It was an exciting experience!

  • This has occurred in a few cases recently where support/GES/partner/??? did not delete debug flag files.

    awarrenhttp:
    /sdisk/tmp/debug.cfg

    ips:
    /var/tmp/debugp.conf


    Both awarrenhttp and ips have special debug flag files that allow very fine-tuned levels of logging to be enabled and disabled (a mask of bits that turns areas of logging on and off). When awarrenhttp and ips are started, they look for the presence of the file. If found, they turn on DEBUG logging. However, there is no way of telling csc/service that DEBUG is on, so the output of the "service -S" command is incorrect.

    It is critical to do two things:
    1) Delete the flag files, ideally immediately after enabling debugging, so you do not forget.
    2) Do not rely on "service -S" to tell you if DEBUG is on. Rely on whether the log file is filling up with debug lines; see the sketch below.
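
    A quick check covering both points could look like this (a sketch; flag-file paths as listed above):

    # 1) Are any of the known debug flag files present?
    ls -l /sdisk/tmp/debug.cfg /var/tmp/debugp.conf 2>/dev/null

    # 2) Is the log actually filling up? Compare the size one minute apart
    #    instead of trusting "service -S".
    ls -l /log/ips.log | awk '{print $5}'; sleep 60; ls -l /log/ips.log | awk '{print $5}'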

    I did not know that we are publishing GES MER publicly.

  • Just for reference after the issue is fixed:

    This is how a regular ips.log looks:

    2024-11-25T08:06:52.172001Z [25865/0x0] [nsg_nse_policy.c:1570:__nsg_error] xxx.xxx.xxx.xxx:55909 to xxx.xxx.xxx.xxx:443: Error from nse: NSE:Stream [0xbe00683e;code:62;sub:104] Flow reset
    2024-11-25T08:06:52.172116Z [25865/0x0] [nsg_tcphold.c:1568:tcp_hold_process_control] Failed to get tcp_hold_session ssnptr 0x1dcb7b10 (hold_state: 3).
    2024-11-25T08:06:52.189184Z [25863/0x0] [nsg_nse_policy.c:1570:__nsg_error] xxx.xxx.xxx.xxx:55951 to xxx.xxx.xxx.xxx:443: Error from nse: NSE:Stream [0xbe00683e;code:62;sub:104] Flow reset
    2024-11-25T08:06:52.354643Z [25862/0x0] [nsg_nse_policy.c:1570:__nsg_error] xxx.xxx.xxx.xxx:55911 to xxx.xxx.xxx.xxx:443: Error from nse: NSE:Stream [0xbe00683e;code:62;sub:104] Flow reset
    2024-11-25T08:06:52.354802Z [25862/0x0] [nsg_tcphold.c:1568:tcp_hold_process_control] Failed to get tcp_hold_session ssnptr 0x1de148f0 (hold_state: 3).
    2024-11-25T08:06:52.422138Z [25863/0x0] [nsg_nse_policy.c:1570:__nsg_error] xxx.xxx.xxx.xxx:57702 to xxx.xxx.xxx.xxx:443: Error from nse: NSE:Internal [0xb000058f;code:143;sub:5] Flow timeout
    UST sessiontbl_get_tuple API returned -1
    UST sessiontbl_get_tuple API returned -1
    2024-11-25T08:06:52.654064Z [25865/0x0] [nsg_nse_policy.c:1570:__nsg_error] xxx.xxx.xxx.xxx:57704 to xxx.xxx.xxx.xxx:443: Error from nse: NSE:Internal [0xb000058f;code:143;sub:5] Flow timeout
    UST sessiontbl_get_tuple API returned -1
    UST sessiontbl_get_tuple API returned -1
    2024-11-25T08:06:55.510853Z [25865/0x0] [nsg_nse_policy.c:1570:__nsg_error] xxx.xxx.xxx.xxx:57725 to xxx.xxx.xxx.xxx:443: Error from nse: NSE:Internal [0xb000058f;code:143;sub:5] Flow timeout
    UST sessiontbl_get_tuple API returned -1
    UST sessiontbl_get_tuple API returned -1
    2024-11-25T08:06:55.558649Z [25865/0x0] [nsg_nse_policy.c:1570:__nsg_error] xxx.xxx.xxx.xxx:57723 to xxx.xxx.xxx.xxx:443: Error from nse: NSE:Internal [0xb000058f;code:143;sub:5] Flow timeout
    UST sessiontbl_get_tuple API returned -1
    UST sessiontbl_get_tuple API returned -1
    2024-11-25T08:06:55.559531Z [25862/0x0] [nsg_nse_policy.c:1570:__nsg_error] xxx.xxx.xxx.xxx:57722 to xxx.xxx.xxx.xxx:443: Error from nse: NSE:Internal [0xb000058f;code:143;sub:5] Flow timeout
    UST sessiontbl_get_tuple API returned -1
    UST sessiontbl_get_tuple API returned -1
    2024-11-25T08:06:55.618653Z [25862/0x0] [nsg_nse_policy.c:1570:__nsg_error] xxx.xxx.xxx.xxx:57721 to xxx.xxx.xxx.xxx:443: Error from nse: NSE:Internal [0xb000058f;code:143;sub:5] Flow timeout
    UST sessiontbl_get_tuple API returned -1
    UST sessiontbl_get_tuple API returned -1
    2024-11-25T08:06:55.683107Z [25863/0x0] [nsg_nse_policy.c:1570:__nsg_error] xxx.xxx.xxx.xxx:57720 to xxx.xxx.xxx.xxx:443: Error from nse: NSE:Internal [0xb000058f;code:143;sub:5] Flow timeout
    UST sessiontbl_get_tuple API returned -1
    UST sessiontbl_get_tuple API returned -1
    

    File growth is minimal: less than 10 MB per minute.
    92.5M 09:14:37 /log/ips.log
    92.5M 09:15:37 /log/ips.log
