
Debugging Web Proxy URL parser / request handler

Hello!

Does anyone know if it is possible to somehow debug URL parse problems?

What I see in the web proxy logs (with SSL inspection enabled) are errors like the ones below:

2016:08:28-10:42:47  httpproxy[20653]: id="0003" severity="info" sys="SecureWeb" sub="http" request="(nil)" function="http_parser_context_execute" file="http_parser_context.c" line="97" message="Unable to parse a http message of 342 bytes (HPE_INVALID_METHOD: invalid HTTP method)"
2016:08:28-10:42:47  httpproxy[20653]: id="0003" severity="info" sys="SecureWeb" sub="http" request="0x30287800" function="read_request_headers" file="request.c" line="1550" message="unable to parse a http message on handler 88 (Success)"

or 

2016:08:28-11:35:00 httpproxy[20653]: id="0003" severity="info" sys="SecureWeb" sub="http" request="(nil)" function="read_request_headers" file="request.c" line="1581" message="Read error on the http handler 72 (Input/output error)"

I would like to find out which URL is causing the parse problem so that I can exclude it from SSL scanning.

So far the only method available to me is to temporarily disable SSL scanning in the web protection profile, grab the logs, and try adding exceptions one by one, but this is a real nightmare! I tried enabling various debugging options via cc http debug (even 'all'), but that gives nothing really valuable... I hope someone knows a better method to investigate web proxy issues! Thank you in advance for your help!



  • First of all - do you know for sure that this is causing a problem with browsing?  Most of the time these errors can be ignored.

    Determining which request is the culprit from the logs might not be easy - however Support can help by turning on more logging.

    If you have requests that are suspect (for example, the timestamp is from the same second) you may be able to test them yourself.

    From the UTM do something like the following, using the URL that you suspect.

    curl -v http://www.example.com -o /dev/null

    You will see a line like:
    < HTTP/1.1 200 OK

    Most likely the error is associated with this line being out-of-spec; however, it can also occur with other parts of the header.

    Another method would be to put all the suspect URLs into a file and then use wget -i filename.urllist, proxying through the UTM.
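
    For example - a rough sketch only, assuming the proxy runs in standard mode on the default port 8080 (adjust the host and port for your deployment; filename.urllist is just a placeholder name):

    http_proxy=http://<UTM-IP>:8080 https_proxy=http://<UTM-IP>:8080 wget -i filename.urllist -O /dev/null

    Any URL that fails when fetched through the proxy but works with the proxy variables unset is a good candidate for an exception.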

  • Michael,

    Thanks for the answer. I have found that most of the cases in which the proxy has problems parsing requests are those that come not from browsers but from applications that use HTTPS as their communication protocol. A perfect example is Microsoft OneDrive, which recently started to generate a massive amount of errors in http.log like the ones I listed in my post. And yes, in most cases those errors mean that the application either cannot connect to the server or works incorrectly.

    The problem is that finding which URL the application requests and which one causes the issue is really a nightmare right now. In http.log we can see only that the parser had a problem parsing a request, but without any information about the URL. The only method I have found so far is the following (a rough command sketch follows the list):

    1. enable proxy debugging by executing

    echo -e "RAW\nset([1,'hdr','req','scan'],'http','debug')" | cc --batch

    2. in the debug log, it is possible to spot the IP address of the server to which the request was sent, but still without any information about the URL requested by the application

    3. disable proxy debugging

    4. temporarily disable SSL scanning in the profile and watch http.log. With SSL scanning disabled, the proxy won't crash and will log all the URLs requested by the application. By looking into the log and comparing with the IP address found previously in the debug output, it is possible to find the problematic URL.

    5. add the exception for the URL and enable SSL scanning again
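
    In command form, the workflow looks roughly like this (a sketch only - the debug log path and the http.log fields may differ on your installation; <server-IP> stands for the address spotted in step 2):

    # 1. enable proxy debugging
    echo -e "RAW\nset([1,'hdr','req','scan'],'http','debug')" | cc --batch
    # 2. watch the debug output and note the server IP address
    tail -f /var/storage/chroot-http/var/tmp/httpout
    # 3. disable proxy debugging by clearing the debug array
    echo -e "RAW\nset([],'http','debug')" | cc --batch
    # 4. with SSL scanning temporarily disabled, look for that IP in http.log
    grep '<server-IP>' /var/log/http.log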

    But as I said, it is a really time-consuming method! It would be much easier if the proxy could log the requested URL before attempting to parse it. Then, if the operation failed, we could instantly see the problematic URL. As of now, this info is not visible even if I enable ALL in http debug...

    regards!

    Adam

  • Since I don't have a URL to reproduce with, I cannot prove this out myself.  But the 'all' http debug logging should give you enough to match it up to the culprit.

    For debug logging, use all and ~epoll (that is a tilde; it turns off the super noisy epoll logging).
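
    Assuming the same RAW set syntax used elsewhere in this thread (a guess on my part, not a documented command), that combination might be enabled with something like:

    echo -e "RAW\nset([1,'all','~epoll'],'http','debug')" | cc --batch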

    One trick that often helps with matching is looking at the thread and request handler: these are used dynamically from a pool and will not necessarily stay the same for a given transaction; however, in a non-busy environment they can be stable enough to help with tracking.

    2016:08:25-16:37:21 [ 5994/0xa0380400]

    [thread / request handle]

    So what happens if you grep for the request from the message in http.log?

    2016:08:28-10:42:47  httpproxy[20653]: id="0003" severity="info" sys="SecureWeb" sub="http" request="0x30287800" function="read_request_headers" file="request.c" line="1550" message="unable to parse a http message on handler 88 (Success)"

    grep 30287800 /var/storage/chroot-http/var/tmp/httpout | grep '2016:08:28-10:42'

    If that doesn't give you enough, then look at the thread that the request uses and expand the grep.

    egrep '(30287800|5994)' /var/storage/chroot-http/var/tmp/httpout | grep '2016:08:28-10:42'

  • Thanks a lot, Michael! That's really great advice to check on the request id!

    I have found that the server name for the URL is visible in the ssl.c log lines, so now I enable only ssl in the debug settings and can see the URL causing the issue by simply watching both logs in two terminal windows.

    in the 1st window:

    # echo -e "RAW\nset([1,'ssl'],'http','debug')" | cc --batch

    # tail -f /var/chroot-http/var/tmp/httpout | grep ssl_new

    in the 2nd window:

    # tail -f /var/log/http.log | grep handler
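
    A possible single-terminal variant of the same idea (same paths and filters, just one tail watching both files):

    # tail -f /var/chroot-http/var/tmp/httpout /var/log/http.log | egrep 'ssl_new|handler'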

    Once again thank you for your support!

    regards,

    Adam

  • Hi Adam, hi Michael,

    Thanks for this great workaround. This information is quite useful for me. I'm not a Linux guru - could you tell us how to disable proxy debugging after it has been enabled?

    regards

    mod

  • Let's do some magic:

    # cc get http debug
    [
              'ssl',
              'epoll'
            ]

    i.e. debugging is active with the "ssl" and "epoll" parameters.

    Let's disable it:

    # echo -e "RAW\nset([],'http','debug')" | cc --batch

     

    Let's check what's now set for debugging:

    # cc get http debug
    []

    Debugging is now empty.

     

    If you want to remove just one item, you can do it like this:

     

    # cc

     127.0.0.1 MAIN > http

    ( you'll see lots of lines..)

    127.0.0.1 MAIN http > debug@

       0 ssl
       1 epoll
     

    127.0.0.1 MAIN http/debug (ARRAY) > -1

    result: 1
       0 ssl

     127.0.0.1 MAIN http/debug (ARRAY) > ..

    (some lines here ..)

    127.0.0.1 MAIN http > debug@

       0 ssl

    Now only ssl is set; another "-0" will remove ssl from debugging as well.
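
    Alternatively, the batch form used earlier in this thread overwrites the whole debug array in one step, e.g. to leave only ssl enabled:

    # echo -e "RAW\nset([1,'ssl'],'http','debug')" | cc --batch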

     

    Greetings & have fun!