
Unhappy about WAF logs that cannot be parsed by either man or machine!

Imagine that you want to check your WAF logs to figure out whether you have any false positives that need to be corrected or any chronic offenders that need to be blacklisted. Doing so is a nightmare.

There are three types of WAF log records:

  • Regular Log Entries, which are similar to Web Proxy log entries, with this format:
    • Standard Header with space delimiters:
      • Timestamp
      • UTMName
      • Function name (reverseproxy)
    • Message details, using a space-delimited list with entries of the form keyword="value"
  • Detail entries, with a different format:
    • Standard Header (same format as above)
    • Message details, with some space-delimited tokens of the form [keyword value], mixed in with undelimited text strings
  • Continuation lines, which are inserted whenever one of the lines is too long. These can be identified because the beginning of the line does not look like a timestamp.  (I have read that this happens at 1000 characters.)   Because the Regular log entries include cookie text, and detail lines can contain OWASP details, continuation lines are pretty common.

The logs need to be parsed in sequence, because the continuation lines need to be appended to the lines that precede them whenever they appear. Depending on the data, multiple continuation lines are possible.
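
Since everything ends up in SQL anyway, here is a minimal T-SQL sketch of one way to do the consolidation. It assumes the raw file has already been loaded line by line into a staging table RawLine(LineNo, LineText) (my own names), and STRING_AGG needs SQL Server 2017 or later:

    -- A line that starts with a timestamp like 2017:09:05-22:49:58 begins a new record;
    -- every other line inherits the line number of the nearest header above it.
    WITH Tagged AS (
        SELECT LineNo, LineText,
               CASE WHEN LineText LIKE '[0-9][0-9][0-9][0-9]:[0-9][0-9]:[0-9][0-9]-%'
                    THEN LineNo END AS HeaderNo
        FROM RawLine
    ),
    Grouped AS (
        SELECT LineNo, LineText,
               MAX(HeaderNo) OVER (ORDER BY LineNo ROWS UNBOUNDED PRECEDING) AS RecordNo
        FROM Tagged
    )
    SELECT RecordNo,
           STRING_AGG(CAST(LineText AS VARCHAR(MAX)), '')
               WITHIN GROUP (ORDER BY LineNo) AS FullRecord
    FROM Grouped
    GROUP BY RecordNo;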

The Detail entries lack important information, so they must be logically connected to the previous Regular Log Entry. There can be multiple detail entries for one Regular Log Entry. After consolidating the continuation lines, I loaded the file into two SQL tables, with a one-to-many relationship between the Regular line and the Detail lines. There is no unique identifier, so I use the SQL NEWID() function to generate a unique identifier that can be used as a foreign key to link the data tables.
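
For illustration, a sketch of that two-table layout (table and column names are my own invention):

    CREATE TABLE RegularEntry (
        EntryId  UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID() PRIMARY KEY,
        LogTime  DATETIME2        NOT NULL,
        UtmName  VARCHAR(64)      NOT NULL,
        Message  VARCHAR(8000)    NOT NULL
    );

    CREATE TABLE DetailEntry (
        DetailId UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID() PRIMARY KEY,
        EntryId  UNIQUEIDENTIFIER NOT NULL
                 REFERENCES RegularEntry (EntryId),  -- many details per regular entry
        LogTime  DATETIME2        NOT NULL,
        Message  VARCHAR(8000)    NOT NULL
    );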

I am not going to elaborate on the Regular lines, because of their similarity to the Web Proxy log. Custom code is needed to parse them, but they at least follow a pattern that makes parsing with custom code possible.

The detail lines are a different story altogether. After the standard header, the Detail lines have a second header in this format:

  • [timestamp] (essentially the same data as the Standard Header timestamp, but in a different format)
  • [keyword1:keyword2] (an indication of the alarm type)
  • [pid 99999] (the process id, where 99999 is a process number, 4 or 5 digits long in my sample data)

After the second header, the contents become pretty random, and can include the following sequence of information:

  • Sometimes: a first message text, which has embedded spaces even though it is space-delimited. In the examples that I have seen, there is a trailing colon before the final space character.
  • Sometimes: [client 999.999.999.999:9999], to show the client IP and Port.  The word client acts as an identifying token.
  • Sometimes: [username], the name of the user, without an identifying token; or [99999], the pid number repeated, again without an identifying token; sometimes [-], whose significance I have not ascertained; and sometimes a code of the form AH99999. The AH appears to be a constant, the number seems to be exactly 5 digits, and it seems to represent an error code.
  • Finally: another message text, undelimited, which may contain embedded tokens of the form [keyword "value"]. Note that although these token values are quote-delimited, the previous token values for pid and client were not quoted.

This is based on a small sample.   I parsed 6 days of data, identified the 12 different [keyword1:keyword2] pairs in the data, and then examined one log entry of each type.

My conclusion is that we have a log file format which is unreadable to the human eye, and also unreadable to the computer.  One would be inclined to think that it was designed to be unusable, because it is hard to imagine anything being this bizarre by accident.

Of course, if you are subject to most any regulatory scheme, you are probably under a requirement to read or parse these log files at least daily, to verify that the defenses are tuned optimally, working correctly, and defending successfully.



  • Hello,


    I use a program called FireGen Firewall Log Analyzer 4.


    This is the regex they gave me:


    Log parsing syntax: (\d\d\d\d):(\d\d):(\d\d)-(\d\d:\d\d:\d\d) (\S+).+?\[(\w+:.+?)\] (.+)

    Firewall type: Sophos UTM

    This is the regex I made:

    (\d\d\d\d):(\d\d):(\d\d)-(\d\d:\d\d:\d\d) (\S+) .+"(\d{6})".*\[msg."(.*?)"

    Firewall type: Sophos UTM

  • Overnight, I concluded that I need another record separator, for lines that do not include a Standard Header, but do start with "[", or possibly "[keyword1:keyword2] ".  Currently they are treated as continuation lines, but I think splitting them into their own data table will simplify parsing somewhat.

    StealthyM, your regular expression will pick up lines with a standard header, but not continuation lines or the lines starting with "[".   Beyond that, the tricky part is learning anything useful from the message body that follows the standard header.

  • After a lot of new code, I have successfully parsed the WAF Logs.  Here is representative data:

    2017:09:05-22:49:58 utmname reverseproxy:
    [Tue Sep 05 22:49:58.058504 2017]
    [authnz_aua:error] [pid 16394] [client xxx.xxx.xxx.xxx:62046]
    [userid] user not allowed to access /_value, referer: server.example.com/_value

    2017:09:05-22:49:58 utmname reverseproxy:
    id="0299" srcip="xxx.xxx.xxx.xxx" localip="xxx.xxx.xxx.109" size="206" user="userid"
    host="xxx.xxx.xxx.xxx" method="POST" statuscode="302" reason="-" extra="-"
    exceptions="value" time="509793" url="/_value" server="server.example.com"
    referer="server.example.com/_value" cookie="-" set-cookie="value"


    2017:09:05-22:56:16 utmname reverseproxy:
    [Tue Sep 05 22:56:16.777243 2017]
    [security2:error] [pid 18567] [client xxx.xxx.xxx.27]
    ModSecurity: Rule be32510
    [id "970003"][file "/usr/apache/conf/waf/modsecurity_crs_outbound.conf"][line "123"]
    - Execution error - PCRE limits exceeded (-8): (null).
    [hostname "server.example.com"] [uri "/default.aspx"] [unique_id "value"]

    2017:09:05-22:56:16 utmname reverseproxy:
    id="0299" srcip="xxx.xxx.xxx.27" localip="xxx.xxx.xxx.113" size="162590" user="-"
    host="xxx.xxx.xxx.27" method="GET" statuscode="200" reason="-" extra="-"
    exceptions="-" time="182169" url="/default.aspx" server="server.example.com"
    referer="-" cookie="-" set-cookie="value"

    Notes:

    • All lines, including continuation lines, have the standard syslog header of timestamp, hostname, functionname.
    • WAF Alarms have a secondary header containing [timestamp] [keyword1:keyword2] [pid value] [client value]
    • If a WAF Alarm appears, it is listed first, and the Web Result line follows.
    • The Alarm and Web Result messages are not necessarily adjacent, and timestamps may differ.  In my data, I have seen the timestamps differ by up to 2 seconds.
    • The WAF Alarm must be matched to the Web Result to know whether the alarm blocked the traffic or not.
    • In the WAF Alarm messages, tokens can have these formats:  [value], [label value], or [label "value"]
    • Free text strings can be found by looking for a "] " sequence which is not followed by "["
    • WAF has a [client value] token which often includes the port. The port needs to be stripped off to match the alarm to the srcip="value" entry in the Web Result line (a matching sketch follows these notes).
    • [security:error] entries have extra tokens, including hostname and uri.  These tokens can be used to match more reliably to the Web Result log, and its tokens for servername and url.  So I match the Security entries first, then match the non-security entries.
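
    To make that matching concrete, here is a hypothetical query. It assumes the alarms and results were parsed into tables WafAlarm and WebResult (all names are mine), and that the client value is IPv4, so any colon introduces a port:

    SELECT a.AlarmId, r.ResultId, r.StatusCode
    FROM WafAlarm AS a
    JOIN WebResult AS r
      ON r.SrcIp = CASE WHEN CHARINDEX(':', a.ClientValue) > 0    -- strip :port if present
                        THEN LEFT(a.ClientValue, CHARINDEX(':', a.ClientValue) - 1)
                        ELSE a.ClientValue END
     AND ABS(DATEDIFF(SECOND, a.LogTime, r.LogTime)) <= 2         -- timestamps can differ
     AND (a.HostName IS NULL OR r.Server = a.HostName)            -- extra keys on security entries
     AND (a.Uri IS NULL OR r.Url = a.Uri);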

    With the data parsed into a SQL database, I was able to determine that one of my WAF sites was ready to switch from Monitor to Reject, and another site needed to enforce URL hardening.

    If anyone wants more detail, send questions via Private Message.

  • I would like more information. PM sent

  • For the sake of the community, I will provide general information on my approach.

    I do everything with SQL Stored Procedures and SQL Scalar Functions.

    1) The most important discovery was the OPENROWSET BULK command, which turns a text file into a SQL table.  It is most often used for Bulk Insert, for example:

    INSERT INTO tablename (field1, field2)
    SELECT field1, field2
    FROM OPENROWSET( BULK 'file.log', FORMATFILE = 'format.xml' ) AS A

    but it can be used anywhere a table is used, so you can test your parsing process with a SELECT statement, like this:

    SELECT field1, field2
    FROM OPENROWSET( BULK 'file.log', FORMATFILE = 'format.xml' ) AS A

    The general syntax is this:

    OPENROWSET( BULK 'LogFilePathAndName.log',
    FORMATFILE = 'FormatFilePathAndName.xml' ) AS A

    The XML format file syntax is documented on the Microsoft support site, and is pretty easy to learn.   The first block of the file defines the parsing rules to break the text file into fields, and the second block of the file maps those fields to names and SQL data types.
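
    For illustration, a skeleton format file in that style. It breaks each line into the three space-delimited header fields plus the message; the names, lengths, and terminators here are mine, not taken from a production file:

    <?xml version="1.0"?>
    <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
     <RECORD>
      <!-- Block 1: parsing rules to break each line into fields -->
      <FIELD ID="1" xsi:type="CharTerm" TERMINATOR=" "    MAX_LENGTH="32"/>
      <FIELD ID="2" xsi:type="CharTerm" TERMINATOR=" "    MAX_LENGTH="64"/>
      <FIELD ID="3" xsi:type="CharTerm" TERMINATOR=" "    MAX_LENGTH="64"/>
      <FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="8000"/>
     </RECORD>
     <ROW>
      <!-- Block 2: map the fields to names and SQL data types -->
      <COLUMN SOURCE="1" NAME="LogTime"      xsi:type="SQLVARYCHAR"/>
      <COLUMN SOURCE="2" NAME="UtmName"      xsi:type="SQLVARYCHAR"/>
      <COLUMN SOURCE="3" NAME="FunctionName" xsi:type="SQLVARYCHAR"/>
      <COLUMN SOURCE="4" NAME="Message"      xsi:type="SQLVARYCHAR"/>
     </ROW>
    </BCPFORMAT>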

    Note that the filenames in OPENROWSET are text constants, which is a nuisance. If you want the filename to be a parameter of your stored procedure, you have to build a dynamic SQL command into a text variable, and then use EXEC(@variablename) to execute the command.
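
    For example, a sketch of that workaround inside a stored procedure (the table, column names, and format-file path are assumptions):

    CREATE PROCEDURE dbo.LoadWafLog @LogFile NVARCHAR(260)
    AS
    BEGIN
        DECLARE @cmd NVARCHAR(4000) =
            N'INSERT INTO StagingTable (LogTime, UtmName, FunctionName, Message) ' +
            N'SELECT LogTime, UtmName, FunctionName, Message ' +
            N'FROM OPENROWSET( BULK ''' + @LogFile + N''', ' +
            N'FORMATFILE = ''C:\waf\format.xml'' ) AS A;';
        EXEC (@cmd);  -- OPENROWSET wants literal filenames, hence the dynamic SQL
    END;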

    2) Because most of the log file is unstructured, my XML file is configured so that the message, everything after the first few space-delimited fields, is loaded into a single data field.  SQL text strings can be 8000 characters long.

    3) I use SQL Functions to parse the message.   

    (a) For log entries with the format keyword="value", I have a function where the keyword is the parameter and the value is the returned result. It uses the charindex() function to find the keyword.
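
    A sketch of such a function (my own naming, with error handling kept minimal):

    CREATE FUNCTION dbo.GetKeywordValue (@msg VARCHAR(8000), @kw VARCHAR(100))
    RETURNS VARCHAR(8000)
    AS
    BEGIN
        -- Find keyword=" and return everything up to the closing quote
        DECLARE @start INT = CHARINDEX(@kw + '="', @msg);
        IF @start = 0 RETURN NULL;                        -- keyword not present
        SET @start = @start + LEN(@kw) + 2;               -- skip past keyword="
        DECLARE @end INT = CHARINDEX('"', @msg, @start);
        IF @end = 0 RETURN NULL;
        RETURN SUBSTRING(@msg, @start, @end - @start);
    END;

    So dbo.GetKeywordValue(Message, 'statuscode') would return 200 for the second Web Result line shown earlier.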

    (b) For the WAF log entries that use brackets, I have a function that counts tokens and extracts a token by position number, assuming a generic pattern of

    [token1] token2 [token3] token4 [token5] token6 ...

    where any of the above tokens can have embedded spaces. In reality, there is not always a token between the brackets, but it makes the code simpler to allow for the possibility. For the tokens in brackets, I also discard the brackets.

    (c) Since some of the bracketed tokens have an embedded keyword, I have another function that searches for [keyword value].  It takes keyword as a parameter and returns value as the result.   Since some values have quotes and some do not, I also check for and discard the quotes if present. 
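
    And a sketch of that bracketed variant (again, my own naming):

    CREATE FUNCTION dbo.GetBracketValue (@msg VARCHAR(8000), @kw VARCHAR(100))
    RETURNS VARCHAR(8000)
    AS
    BEGIN
        -- Find [keyword and return everything up to the closing bracket
        DECLARE @start INT = CHARINDEX('[' + @kw + ' ', @msg);
        IF @start = 0 RETURN NULL;                        -- token not present
        SET @start = @start + LEN(@kw) + 2;               -- skip past [keyword plus the space
        DECLARE @end INT = CHARINDEX(']', @msg, @start);
        IF @end = 0 RETURN NULL;
        RETURN REPLACE(SUBSTRING(@msg, @start, @end - @start), '"', '');  -- drop quotes if present
    END;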

    4) Right now, I don't care about keeping log files forever, so periodically I want to discard old data. The SQL "TRUNCATE TABLE tablename" command sets a table to zero records in one step. The "DELETE FROM tablename" command deletes records one at a time, including all of the transaction log activity that goes with each delete. If you have 10 million Web Proxy log records to delete, this will be very slow and may consume all your log space. As a result, instead of deleting individually, I truncate the table and start over. If you only want to discard some data, the best approach would be to copy the data that you intend to keep to a new table, truncate the primary table, then copy the surviving data back to its original location.
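
    A sketch of that keep-some pattern, with assumed names and a 30-day cutoff:

    SELECT * INTO KeepRows
    FROM WebProxyLog
    WHERE LogTime >= DATEADD(DAY, -30, GETDATE());  -- rows that survive

    TRUNCATE TABLE WebProxyLog;                     -- fast, minimally logged

    INSERT INTO WebProxyLog SELECT * FROM KeepRows;
    DROP TABLE KeepRows;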

    5) For similar reasons, my bulk insert procedure loads data into a temporary table first, which is truncated before the load begins. If the procedure fails after loading only some data, my primary table is unaffected and I don't have to worry about cleanup. After the data is successfully loaded into the temporary table, I copy it to the primary table, because I can now be sure that it will transfer cleanly. Transaction rollback might make this approach unnecessary, but I did not want the risk.
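
    In outline, that staged load looks like this (table names assumed):

    TRUNCATE TABLE StagingTable;  -- start every load from empty

    INSERT INTO StagingTable (LogTime, UtmName, FunctionName, Message)
    SELECT LogTime, UtmName, FunctionName, Message
    FROM OPENROWSET( BULK 'LogFile.log', FORMATFILE = 'format.xml' ) AS A;

    -- The primary table is touched only after the whole file loaded cleanly
    INSERT INTO PrimaryTable (LogTime, UtmName, FunctionName, Message)
    SELECT LogTime, UtmName, FunctionName, Message
    FROM StagingTable;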