Bad URL Categorization Errors on Web

Hi Sophos

Customers with XG Firewalls are reporting bad URL classification. For example, these sites are detected as Porn/Sexuality/Nudity in other Sophos products, but the classification in XG Firewall is wrong.

Support tells me to "Send a URL Request", but the trouble is that these URLs are porn.

Please check.


  • In reply to Big_Buck:

    I don't want to get into a deep discussion on whether or not it is better to categorize that site as a Download site or Information Technology.
    My point is that it is easy to understand why it might be categorized as one versus the other, especially by automatic categorizers that crawl through websites.
    However, when I think about it, I doubt this was an error in the automatic categorizer.  When we categorize a website, we generally have no reason to categorize subdomains differently.  Because at that point the subdomain is already categorized (it inherits from the parent), we don't run the automatic categorizer on it.  There is little reason we would look at changing the category of a subdomain unless there was a specific complaint from a customer about it being wrong.  So my guess (and it is only a guess) is that Sophos now has two customers: one who in the past saw it as Information Technology and put in a request to recategorize it as Download, and another who sees it now as Download and says it should be Information Technology.
    Therefore the statement "If you can't categorize Intel properly, you have no business to do whatsoever in WEB filtering" is unfair - we are juggling multiple customer complaints here and there are perfectly valid arguments for both sides.
    In any case, it looks like that site has been set to Information Technology now.
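For readers following along, the inheritance behaviour described above can be sketched roughly as below. This is only an illustration under assumptions, not how the Sophos categorizer is actually implemented: the `CATEGORIES` table, the `lookup_category` function and the example hostnames are all hypothetical.

```python
from typing import Optional

# Hypothetical category table; hostnames and categories are made up for illustration.
CATEGORIES = {
    "example.com": "Information Technology",
    # Explicit override on a subdomain, e.g. after a customer recategorization request.
    "downloads.example.com": "Download",
}


def lookup_category(hostname: str, table: dict[str, str]) -> Optional[str]:
    """Return the category of the most specific matching domain, or None."""
    labels = hostname.lower().rstrip(".").split(".")
    # Try the full hostname first, then each parent domain in turn.
    # The bare top-level label (e.g. "com") is never tried.
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in table:
            return table[candidate]
    return None


if __name__ == "__main__":
    for host in ("www.example.com", "downloads.example.com", "unknown.test"):
        print(host, "->", lookup_category(host, CATEGORIES))
```

In this sketch a subdomain only gets its own category when someone has added an explicit entry for it; otherwise it silently takes the parent's category, which is why two customers can see the same subdomain categorized differently over time.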
  • In reply to Michael Dunn:

    I am generally satisfied with web categorization.  In the event I do run into false positives or whatnot, I have found that the categorizations are usually corrected within 24-48 hours after submitting them for recategorization.  I also do not seem to have the "Uncategorized" problem reported earlier in this thread, for instance yesterday 0.26% of my requests were categorized as "None."  I'm perfectly ok with that.  

    I'm also unclear why you would want your users downloading and installing their own updates from Intel but to each his own I guess.

  • In reply to rfcat_vk:

    Please note that this chart DOES NOT show that your users visited 22k distinct uncategorized websites.

    What this chart shows is that across your network, there were 22.37k requests to URLs that were not categorized. Those requests could well have been to a very small number of different sites.

    It also shows that the average size of the responses to those requests was a little over 1kb. 

    So this is just as likely to be a large number of small requests to a single site that was uncategorized - for example, an app that does many polling requests over time.

    Please note also that any web traffic that is subject to a Web Exception excluding it from policy checks may also appear as Uncategorized in the logs and reports. The product tends not to do category lookups for sites where it knows that the category will not be required for making policy decisions. This will include any Sophos endpoint products running on machines on your network that do various types of lookups for data updates, URL security lookups, management or policy updates and product version upgrades.

    You should be able to get a better picture of the actual number of uncategorized sites by going to Reports > Dashboards, scrolling down to the Web Categories chart, then clicking on the 'None' label in the table below it. This will take you to a filtered report, which includes a table called 'Web Domains'. This will show you a list of the domains that make up the Uncategorized block and how many Hits (HTTP Requests/HTTPS connections) to each were recorded.
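The difference between total uncategorized hits and distinct uncategorized domains can be illustrated with a small sketch. It assumes you have exported log records as (domain, category) pairs, with `None` standing for an uncategorized request; that record format, the `summarize_uncategorized` function and the sample domains are hypothetical and do not reflect the actual XG log or report format.

```python
from collections import Counter


def summarize_uncategorized(records):
    """Summarize uncategorized traffic.

    records: iterable of (domain, category) tuples, where category is None
    for uncategorized requests (a hypothetical export format).
    Returns (total_hits, distinct_domains, hits_per_domain_sorted).
    """
    hits = Counter(domain for domain, category in records if category is None)
    return sum(hits.values()), len(hits), hits.most_common()


if __name__ == "__main__":
    # Made-up sample: many small polling requests to very few domains.
    sample = (
        [("updates.example.net", None)] * 5
        + [("poll.example.org", None)] * 2
        + [("news.example.com", "News")]
    )
    total, distinct, per_domain = summarize_uncategorized(sample)
    print(f"{total} uncategorized hits across {distinct} distinct domains")
    for domain, count in per_domain:
        print(f"  {domain}: {count} hits")
```

A large hit count with a small number of distinct domains points to a reporting artefact (for example, one polling application) rather than thousands of uncategorized sites.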

    I agree we may have a problem, but I think that the problem here is more likely to do with how the data is recorded and presented in reports. If your further investigations suggest that there really are thousands of different domains being uncategorized, then we will look into it further.

  • In reply to Michael Dunn:

    In XG we are not able to override URLs and put them in another category. This would be very useful for reporting.

    XG's web filtering catch rate is still far behind UTM's. I hope you will improve it quickly, because for the moment we spend time submitting URLs to the Sophos website.


  • In reply to RichBaldry:

    Hi Rich,

    You are correct, the sites are all Sophos sites, which have default exceptions in web.

    The above screenshot is from today's activity from one device.


  • In reply to Michael Dunn:

    Michael Dunn, are you serious?  You seriously think that Sophos is doing OK at failing to categorize IBM, banks, Intel, etc.?

    I have a Checkpoint appliance at home that cost a fraction of our Sophos solution and has never failed to categorize properly in months.  I had it side by side with a Sophos XG105W so I could compare them for months.  Checkpoint has not shown a single mis-categorization in months.  Compare that to the Sophos XG at the office, where users are calling me many times a day.  I could also compare to Symantec web filtering: Symantec's old "SpyWall" and Symantec's new Blue Coat.

    It is not a matter of personal taste.  Intuition. Or guesses.  I know XG WEB filtering does not match the "average" performance of the industry.  Has nothing to do with being fair or not.

    Paul Jr 

  • In reply to Big_Buck:

    Hi Paul,

    What percentage of Support calls does Sophos get about bad categorization?  Almost none.
    When Sophos goes to partners and asks what features they want, categorization barely gets mentioned.
    In the forums, what percentage of posts are about categorization?  More than other areas, but still low.

    I think that, in the grand scheme of things, Sophos does pretty well at categorization.  I know that it is very hard for me to tell the other parts of the company "categorization is an issue we need to solve", since they come back with "give us the evidence" and there is none except a few people complaining in the forums.  Sophos is like any company - it will spend time and money on the issues that have the most complaints and that will make the biggest sales.  Right now, categorization isn't that.

    I am not saying you are wrong or that you shouldn't complain.  I'm saying that you should gather all your complaints about XG, prioritize them, then complain in every feedback mechanism you can.  And include evidence, statistics, examples.

    Let's take two instances from this thread where evidence, statistics, examples and investigation have worked:

    rfcat_vk has been complaining about 20,000 uncategorized websites a day for a long time, when it is actually 20,000 hits to a single website that is uncategorized because there is a "do not categorize" exception in place for it.

    The Intel site was (most likely) categorized like that deliberately because a customer complained and wanted it that way.  Now you complained and want it another way, and it was changed.

    So when you say "you seriously think that Sophos is doing OK at failing to categorize [...] Intel?", I reject your premise that we are failing to categorize it.  Show me evidence and examples that we can investigate and fix, not rants.  And show me the numbers of admins who are complaining about it so that it can receive higher priority.

  • In reply to Michael Dunn:

    Yes Michael,

    You are correct, I have been complaining for a while about the high count. What do other admins see from their servers talking to Sophos Central about anti-x updates, and how do they stop the count?

    I suspect my issue is caused by the endpoint protection installed on the server when I had the UTM running and managing endpoints. I suspect I will have to fire up the UTM, when I can find the disk, to update or remove the software, even though the server shows it as being up to date. Of course this leads to another question: why does the server talk to the Sophos sites so often? This server does DNS and photo management, all internally.


  • In reply to Michael Dunn:

    Hello M. Dunn

    I do not have numbers to reach a statistical conclusion.  And that's not close to being an argument anyway.  If you want numbers that turned out wrong, 88 million Germans followed Hitler.  No.  I just have permanent access to WebSense, Symantec and Checkpoint web filtering.  And a lot of experience and common sense.  I can compare Sophos every hour of the day if I want.

    The screenshot I posted last week is crystal clear to whoever wants to see.  Intel was wrongfully categorized.  Now if one, two, or even one hundred persons complain to have a web site category changed for something as large as Intel, and they succeed, that simply means the mechanisms behind Sophos categorization are flawed and dead broken.  WEB categorization shall not be based on a popularity contest!!!  And particularly for WEB sites as gigantic as Intel.  That's how categorization was done 15 years ago.

    Paul Jr 

  • In reply to Big_Buck:


    We understand you're not satisfied and we have allowed you to express your dissatisfaction. We have had professional services and support managers attempt to work with you on a professional level. We would like to help and assist you regarding the issues you are experiencing. However, your ranting is not helping others in a constructive manner. If you would like to have a dialogue, please raise a support case and we will work with you. Work with us to make it better for you and everyone else.


    FloSupport | Community Support Engineer

  • In reply to FloSupport:

    Dear Sophos Team,

    We are expecting a solution for this issue. This is a genuine problem in XG 17, and Sophos should look into it very seriously.

    Nowadays many sites are not categorized properly, and categories change randomly; this is what we have observed.

    Sample case:

    This site is shown in the “Job” category in Sophos, but I believe it should be in the educational web category.

    I cross-checked with the Bluecoat, Fortinet and SonicWALL website category portals, and they show this website in the educational category. Why does Sophos show it in the Job category? Is this a mistake, or is there some logic behind it?

    Similar case with: also. It should be in social media or media sharing, but in Sophos it is shown in the photo gallery category.

    Many more are there…

  • In reply to Dilip Patel:

    I'm sorry but I also think that categorization has a lot of false entries. I just took a look at category 'Jobs Search':

    The first 5 entries are definitely wrong. Number 6 is about finding new staff. 7 and 8 are again completely wrong.

    I see this every day, and currently looking at categories and the reports based on them makes no sense.

    From my point of view this feature is completely broken.

  • In reply to Jelle:

    Hi  and 

    I would agree that some of those websites need re-assessment.

    Specifically: -> educational -> general business   -> general business -> general business -> advertising is debatable as it is a photo gallery, but more of a social-media and social networking type category.

    The other items in the list seem to be valid Job sites.

    Please submit these and any future websites for re-assessment via our submission page here. Please PM me if you run into issues submitting via our page.


    FloSupport | Community Support Engineer

  • In reply to FloSupport:


    We can submit sites for re-assessment only if we know a site is wrongly categorized (reported cases).

    I just went through one day's logs, and the list of sites is clearly very big. Do you think submitting this many websites every day for re-assessment is a feasible solution?

    You should think from the user's perspective: many site categories are allowed by policy, but because of wrong categorization a site does not open, the user assumes it is blocked by IT as per the policy, and they get their work done through some other workaround.

    Here is a list of sites that are wrongly categorized.