
Reporting and Logging: we need a big improvement in these 2 fields

With v18, a lot of features and improvements have been added. Kudos to the dev team and the PM. The following improvements are very nice:

  • UI is faster than before (once the Control Center has finished refreshing all gadgets and graphs)
  • DPI. It is not mature yet, but it is a nice step forward
  • Radius timeout
  • Enterprise NAT
  • SD-WAN policies
  • and many other things

But dear devs and PM, logging and reporting are still poor, very poor. Grab a coffee and follow me:

  • drop-packet-capture command: for each packet, the output takes at least 9 lines on a 13" screen. See the example:
    console> drop-packet-capture
    2020-02-24 08:27:15 0102021 IP 94.177.193.151.443 > 192.168.1.101.58612 : proto TCP: R 1337150070:1337150070(0) checksum : 19633
    0x0000:  4500 0028 ed0b 4000 3406 776e 5eb1 c197  E..(..@.4.wn^...
    0x0010:  c0a8 0165 01bb e4f4 4fb3 4a76 0000 0000  ...e....O.Jv....
    0x0020:  5004 0000 4cb1 0000                      P...L...
    Date=2020-02-24 Time=08:27:15 log_id=0102021 log_type=Firewall log_component=Invalid_Traffic log_subtype=Denied log_status=N/A log_priority=Alert duration=N/A in_dev=Port2 out_dev= inzone_id=0 outzone_id=0 source_mac=74:da:da:f4:18:6f dest_mac=00:e0:b6:14:b4:21 bridge_name= l3_protocol=IPv4 source_ip=94.177.193.151 dest_ip=192.168.1.101 l4_protocol=TCP source_port=443 dest_port=58612 fw_rule_id=N/A policytype=0 live_userid=0 userid=0 user_gp=0 ips_id=0 sslvpn_id=0 web_filter_id=0 hotspot_id=0 hotspotuser_id=0 hb_src=0 hb_dst=0 dnat_done=0 icap_id=0 app_filter_id=0 app_category_id=0 app_id=0 category_id=0 bandwidth_id=0 up_classid=0 dn_classid=0 nat_id=0 cluster_node=0 inmark=0x0 nfqueue=0 gateway_offset=0 connid=0 masterid=0 status=0 state=0, flag0=0 flags1=0 pbdid_dir0=0 pbrid_dir1=0
    Too long, too many lines. In my opinion, you could: remove all the variables that are =0; print source_ip, dest_ip and dest_port in a different colour; report the zone name instead of the zone ID (do we really need to run an SQL command to find the zone corresponding to an ID?). I already have some customers with 4 additional zones, and troubleshooting with drop-packet-capture is impossible there: the customer has to keep track of zone_id = zone name in Notepad. For in_dev and out_dev, please report the interface name and not the port number (same reason as the previous point).
  • Web exceptions: try this exercise. Install Skype on your computer and try to work out, from the logging alone, which domains you need to allow. A nightmare! To understand why XG was blocking file transfer via Skype, I had to turn to another brand to find out which domains to unblock. With XG logging alone, the live logs report nothing, drop-packet-capture does not show all the domains, and with tcpdump you go mad with so many connections.
  • Please separate the logs for VPN, Wi-Fi, DHCP and DNS requests (at least). Everything is inside SYSTEM. A mess. You can use filters, OK, but that is not straightforward for many customers. You expect to find VPN logs under VPN, not under SYSTEM with a filter on the component.
  • There are still many Linux command-line tools to use, conntrack for example.
  • Ability to put services in debug mode via UI
  • Ability to understand from the UI what each service does. For example, to troubleshoot WAF, the service is reverseproxy, so in the UI (once we can put all services in debug mode) please add a column with a description, such as: this service gives you more information about the WAF module.
  • Ability to search logs by time range, not only the last 10 minutes and so on
  • Customized Control Center: one of the most frequent requests I receive is: how can I see the current bandwidth utilization from the Control Center? This is a basic feature
  • Proper logging when you cannot delete the objects (Where they are used is a good help)
  • Proper logging for CA and Certificate upload issue.
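To make the drop-packet-capture request above concrete, here is a minimal sketch of the kind of post-processing it asks for: it parses the key=value summary line, drops fields that are empty, 0, 0x0 or N/A, and replaces zone IDs and port names with readable names. The two lookup tables are hypothetical (the real mapping lives on the firewall and is not shown in this thread), and the sample line is a shortened copy of the one above.

```python
import re

# Hypothetical lookup tables, made up for illustration only.
ZONE_NAMES = {"1": "LAN", "2": "WAN"}
INTERFACE_NAMES = {"Port2": "WAN"}

NOISE = {"", "0", "0x0", "N/A"}      # values treated as "nothing to report"
PAIR = re.compile(r"(\w+)=(\S*)")   # key=value, value may be empty


def compact(line):
    """Return only the informative key=value pairs of one summary line."""
    fields = {}
    for key, value in PAIR.findall(line):
        value = value.rstrip(",")    # the sample contains "state=0,"
        if value in NOISE:
            continue
        if key in ("inzone_id", "outzone_id"):
            value = ZONE_NAMES.get(value, value)
        elif key in ("in_dev", "out_dev"):
            value = INTERFACE_NAMES.get(value, value)
        fields[key] = value
    return fields


sample = (
    "Date=2020-02-24 Time=08:27:15 log_id=0102021 log_type=Firewall "
    "log_subtype=Denied in_dev=Port2 out_dev= inzone_id=0 outzone_id=0 "
    "source_ip=94.177.193.151 dest_ip=192.168.1.101 source_port=443 "
    "dest_port=58612 fw_rule_id=N/A nat_id=0 inmark=0x0 state=0, flag0=0"
)
print(" ".join(f"{k}={v}" for k, v in compact(sample).items()))
```

This only shows that the output can shrink to one readable line per packet. Whether 0 always means "not set" for every field is an assumption that would need checking per field.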

Reporting:

  • Reports based on zone and on port. We need to find bytes sent/received per interface and per zone, not only per firewall ID.
  • Concurrent connection list. This is a straightforward report to compare and understand anomalies in the network.
  • Web usage report. Currently you cannot get graphs by bytes (even if you select sort by: byte)


  • Just to add a real case scenario:

    I am using XG v18 with HTTPS enabled. Even with v17+ I had issues downloading files via the Skype application: I am able to send files to my contacts but not to download them. A few weeks ago I fixed the issue by looking at a competitor's forum to find out which domains to put into the exception list, because from the logging alone you are completely blind. After I created the proper web exception list, it worked for a few weeks, and now downloads do not work anymore.

    I am sure that another domain is missing from the exception list. So let's start looking for the issue as a normal user:

    • I opened the log viewer and found nothing in the IPS, Application filter, Web or TLS logs.

    So I proceeded as an advanced user:

    • I enabled debug on the awarrenhttp service and nothing helpful came out.
    • Drop-packet-capture is completely empty.

    Then as an expert:

    • tcpdump src 192.168.0.8 and port 443 | grep -v amazon

    I found several IPs in the list and then ran whois on them to learn more about each one, since DNS is not able to resolve them. After adding 3 IPs (maybe only one was the correct one), I was able to download the file sent yesterday by my colleague.
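The manual step described above (reading tcpdump output and collecting the remote IPs) could be sketched roughly as follows. This assumes IPv4 and numeric output, i.e. tcpdump run with `-n`, which the command in the post does not use; the sample lines are invented for illustration and only reuse addresses that appear elsewhere in this thread.

```python
import re
from collections import Counter

# Matches the "IP src.port > dst.port:" part of a tcpdump text line.
# Assumes IPv4 and numeric addresses (tcpdump -n); hostnames will not match.
FLOW = re.compile(
    r"\bIP (\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+):"
)


def remote_peers(lines, local_ip, port=443):
    """Count packets per remote IP that local_ip exchanged on the remote port."""
    peers = Counter()
    for line in lines:
        m = FLOW.search(line)
        if not m:
            continue
        src, sport = m.group(1), int(m.group(2))
        dst, dport = m.group(3), int(m.group(4))
        if src == local_ip and dport == port:
            peers[dst] += 1
        elif dst == local_ip and sport == port:
            peers[src] += 1
    return peers


# Invented sample lines in the usual tcpdump text layout.
SAMPLE = [
    "08:27:15.000001 IP 192.168.0.8.58612 > 40.67.254.36.443: Flags [S], length 0",
    "08:27:15.000002 IP 40.67.254.36.443 > 192.168.0.8.58612: Flags [S.], length 0",
    "08:27:16.000003 IP 192.168.0.8.58613 > 13.107.136.9.443: Flags [S], length 0",
]

for ip, packets in remote_peers(SAMPLE, "192.168.0.8").most_common():
    print(ip, packets)
```

Each address in the resulting short list can then be checked with whois, as in the post. It is still a workaround for information the log viewer should provide directly.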

    Is this possible? Isn't the logging bad enough for both normal and advanced users?

    Also:

    I wanted to check which IPs I had added to the web exception list, so I went to the Admin log in the Log viewer, and it only says:

    "Web Filter Exception 'HTTPS URL Exceptions' was updated by 'admin' from '192.168.0.8' using 'GUI'"

    OK, but which modifications did "admin" perform on the object?

    A mystery!

  • Let me add another case scenario. My OneDrive is not syncing on my Mac.

    Here is some configuration:

    console> show http_proxy
    HTTP add_via_header: on
    HTTP core_dump: off
    HTTP relay_invalid_http_traffic: on
    HTTP connect_timeout: 60
    HTTP tunnel_timeout: 300
    HTTP client_timeout: 60
    HTTP response_timeout: 60
    HTTP proxy_tlsv1_0: on
    HTTP captive_portal_tlsv1_0: on
    HTTP captive_portal_x_frame_options: off
    HTTP block_proxy_loop: off
    HTTP disable_tls_url_categories: off

    I am using proxy mode on v18 (a bug in DPI prevents me from using DPI). Here are the steps I took to find the issue, although the logs do not say why the domain is blocked.

    drop-packet-capture is empty. Firewall logs are green and everything is green inside web filtering and:

    some packet capture. And going ahead:

    I found the following in awarrenhttp_access.log:

    1583135089.494173940 [ 9607/0x7fe7f42f1400] fwid=1 fwflag="VS" iap=12 aap=4 conn_id=995654848 id="0001" name="http access" action="pass" method="<unknown>" srcip="192.168.0.8" dstip="40.67.254.36" user="" statuscode=200 cached=0 trxlen=3379 rxlen=11470 url="skydrive.wns.windows.com/" referer="" type="" upload_file_name="" upload_file_type="" download_file_name="" download_file_type="" authtime=0 dnstime=11 cattime=56924 avscantime=0 fullreqtime=2684425629 ua="" activity="" av_transaction_id="" categoryname="Skype web exception list" category="1026,29" app_id=0 app_name="None" app_cat="None"  exceptions=""
    1583135091.444673254 [ 9607/0x7fe7f42f1400] fwid=7 fwflag="VS" iap=12 aap=4 conn_id=91327936 id="0001" name="http access" action="pass" method="POST" srcip="192.168.0.8" dstip="13.107.136.9" user="lferrara" statuscode=200 cached=0 trxlen=2237 rxlen=2409 url="mydomain-my.sharepoint.com/.../Authenticate referer="" type="application/json" upload_file_name="" upload_file_type="" download_file_name="" download_file_type="" authtime=0 dnstime=63919 cattime=465 avscantime=2162 fullreqtime=212511 ua="Microsoft SkyDriveSync 19.232.1124.0008 ship; Mac OS X 10.15.3" activity="" av_transaction_id="" categoryname="Skip HTTPS Checks Web Categorh" category="1025,1026,29" app_id=1500 app_name="Sharepoint" app_cat="General Business"  exceptions=""

    The logs do not say why https://skydrive.wns.windows.com/ and mydomain-my.sharepoint.com/personal/luciano_ferrara_mydomain_it are blocked. These 2 domains were the reason OneDrive was not syncing.
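For anyone who wants to dig through awarrenhttp_access.log in the same way, here is a minimal sketch that splits one line into its key="value" fields so that action, url, categoryname and exceptions can be read side by side. The sample is a shortened excerpt of the first log line above. A line with a missing closing quote, like the url field in the second line, would shift the following fields, so treat the output with care.

```python
import re

# key="quoted value" or key=bare_value
FIELD = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_access_line(line):
    """Split one access-log line into a dict of its key=value fields."""
    fields = {}
    for key, quoted, bare in FIELD.findall(line):
        # Exactly one of the two alternatives matched; the other is "".
        fields[key] = bare if bare else quoted
    return fields


# Shortened excerpt of the first log line quoted in the post.
sample = (
    '1583135089.494173940 [ 9607/0x7fe7f42f1400] fwid=1 fwflag="VS" '
    'action="pass" srcip="192.168.0.8" dstip="40.67.254.36" statuscode=200 '
    'url="skydrive.wns.windows.com/" '
    'categoryname="Skype web exception list" exceptions=""'
)

entry = parse_access_line(sample)
for key in ("action", "url", "categoryname", "exceptions"):
    print(f"{key:13} {entry[key]!r}")
```

On this excerpt the parsed entry shows action "pass" and an empty exceptions field, which is exactly the complaint: nothing in the record says that anything was blocked.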

    After I added them to the web exception list, OneDrive synchronized again, as you can see from the screenshot above. I then searched for the domain in all files inside the /log folder, and the result is in the screenshot below:

    Very HELPFUL! So tell me again that XG logging is clean and understandable! The competition points out in the UI why the domains are blocked; on XG the reason is not even logged inside the advanced shell!

  • Hi folks,

    For the last couple of days I have been trying to use logging to debug some issues I thought I had with SD-WAN, and now with an IoT device.

    The SD-WAN policies would pass traffic for some ports in the firewall but not others. Nothing I could locate in the logs provided any useful debugging material, so I asked on the forums and received lots of answers saying I had the configuration wrong. After a restore to do additional testing, nothing worked with SD-WAN policies anymore, and again nothing in the logs helped. PCAP indicated there was no NAT rule by showing NAT = 0, but that could also be the default NAT rule, which is or was labelled 0.

    During EAP I had an issue with one of my IoT devices, and with Michael D's help we debugged it. I have now performed a restore from the 26th, which according to the controller is the last day this IoT controller talked to the world; according to the XG logs there were transactions all the time. I have spent the last two hours trying to determine why the controller is not online. The logs are no help; they show 443 as being used. I added 443 to the firewall rule and the controller failed. I removed the application filter and IPS: no progress, with the logs still showing connections. I added allow all to the web policy: still no functioning connection.

    I added the application allow all and IPS back and removed HTTPS from the rule. The controller is online again. The conclusion is that the device is using port 443, but not correctly as HTTPS. The next step is to improve the network security for the controller; at the moment it has its own rule using DPI.

    The logs were not much use.

    Ian

     

    A small issue with the SD-WAN video: it shows a troubleshooting tab, which I could not find on my v18 GA.

    XG115W - v20.0.2 MR-2 - Home

    XG on VM 8 - v21 GA

    If a post solves your question please use the 'Verify Answer' button.

  • Another area requiring urgent attention is Intrusion Protection. I will not repeat in detail what I posted during EAP. The IP tab shows activity, but when you click on one of the tabs you are shown a report with no details.

    Ian


  • Let me add some more.

    I disabled the IoT device using the clientless active/not active function. Then, with the connection re-enabled, there was still nothing in the log viewer.

    I found what was wrong with the IoT device, not through the log viewer but through the connection list, which showed a port being used. Once I applied a refined firewall rule with the missing service and HTTPS, the device worked correctly.

    The port went missing in one of my restores.

    Ian


  • Let me add another real scenario that happened today.

    A customer wanted to replace the appliance's built-in certificate with a new one bought from a trusted external CA.

    The customer generated the CSR on XG and uploaded the request to the external CA, then received the CA and the corresponding certificate. The customer then uploaded the CA in the corresponding CA section and the certificate in the Certificate section.

    All good: the certificate's CA issuer was green and everything was OK. Here comes the pain:

    The customer went to the SSL VPN settings to change the certificate, selected the new certificate, clicked Apply, and voilà: the SSL VPN went into stopping and then the SSLVPN service was dead.

    The reason? Who knows? Nothing in the UI, as expected. The only way to get proper debug output was to go into the advanced shell and read the sslvpn.log file. It said the private key was incorrect or missing.

    This was not true, and in the end the problem was something else. The intermediate CA was missing, but nothing in the log said so. I know only from experience that if you do not upload the whole CA chain, the certificate is invalid, yet XG says nothing about it. It should even refuse the certificate upload altogether; instead it accepted it.
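The missing check described above can be illustrated with a small sketch: given the subject and issuer names of everything that was uploaded, report any issuer that no uploaded certificate provides. All names below are hypothetical, and the function compares names only. It is not how XG works internally and not a real validation, which must also verify signatures (for example with `openssl verify -CAfile root.pem -untrusted intermediate.pem server.pem`, where the file names are placeholders).

```python
def missing_issuers(certs):
    """Return issuer names that no uploaded certificate provides.

    certs: list of (subject, issuer) name pairs, one per uploaded certificate.
    A self-signed root has subject == issuer, so it never counts as missing.
    """
    subjects = {subject for subject, _ in certs}
    return sorted({issuer for _, issuer in certs if issuer not in subjects})


# Hypothetical upload: server certificate and root, but no intermediate.
uploaded = [
    ("vpn.example.com", "Example Intermediate CA"),
    ("Example Root CA", "Example Root CA"),
]
print(missing_issuers(uploaded))   # ['Example Intermediate CA']

# Same upload with the intermediate added: the chain links up by name.
uploaded.append(("Example Intermediate CA", "Example Root CA"))
print(missing_issuers(uploaded))   # []
```

A warning of this kind at upload time, or at least in sslvpn.log, would have pointed straight at the intermediate CA instead of at the private key.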

    I have another example, where adding an extra alias IP failed too, but I still have to collect the corresponding logs. I will do so.

  • Hi folks,

    still working on the security camera access.

    The security camera works very well when you use the same network or the check short display function.

    The issue is getting the application to work on other networks behind the XG.

    1/. The XG did not and still does not log all connections or connection attempts.

    2/. I have added the extra firewall rule as per the bug advice for firewall 0.

    3/. I have found by experimentation some of the range of ports used by the application.

    4/. Some of the ports try to connect to servers in China when they show in the log; I have China blocked for outgoing.

    5/. If I reduce the port range from over 800 ports permitted, then the application fails to connect.

    6/. What I do see in the logs is lots of broadcast messages and failed ICMP errors. I know what the broadcast messages are for, but why do they appear when the other, more useful ports don't?

    7/. When I reduce the port range, the application takes longer to log in or fails, maybe 2 times out of 3, or just plain fails, with no entries in the log viewer regarding additional ports.

    8/. I suspect that the application does a port search, possibly linear.

    9/. Why do I want to reduce the ports assigned? Because when running real-time viewing of the camera, each port passes 10-20MB of data, which appears in the daily reports as unclassified.

    10/. It does not matter whether I use DPI or proxy; I prefer DPI because it covers a larger range of TLS use.

    11/. For some reason I do not understand, the application tries to connect to the firewall using the range of ports it uses to connect to the external camera server.

    12/. This issue happens on iPad and MBP; the application is not available for MS machines. There is an Android version.

    Ian
