Non-human traffic is the generation of online page views and clicks by automated bots rather than by humans. This automated bot traffic often acts with malicious intent: to steal your content, gain access to confidential data about your company and your users, or skew your analytics.
However, not all non-human traffic is bad. Search engine bots and other legitimate crawlers are generally seen as good bots.
Putting a stop to bad bots generating non-human traffic can be a challenge. Newer and more sophisticated bots are much harder to detect. In most cases, a bot mitigation solution is the most efficient and accurate way to detect and mitigate malicious non-human traffic.
Frequently Asked Questions about Non-Human Traffic
How does non-human traffic affect my website?
Malicious non-human traffic can slow down your site, waste bandwidth and server resources, and even prevent legitimate crawlers from accessing it. If an overloaded server starts answering with error responses such as a 503 status code, search engine crawlers see those errors too, and that has a real impact on SEO rankings.
How does non-human traffic affect my SEO rankings?
Heavy non-human traffic degrades the signals search engines rely on. If bots overload your server, crawlers such as Googlebot encounter slow responses and error pages in place of your content, and persistent crawl problems like these can significantly hurt your SEO rankings in organic search results.
How do I know what source causes non-human traffic on my website?
A good bot mitigation solution can detect the type of bad bot accessing your site from the user-agent information included in each request. Many modern solutions also look at behavioral indicators, such as an IP address or domain that sends an unusually high volume of requests or omits a user agent entirely.
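As a toy illustration of user-agent triage, the snippet below sorts requests into rough categories. The bot names and regex patterns are illustrative placeholders, not a real signature set; production solutions rely on large, frequently updated databases and never trust the User-Agent header alone.

```python
import re

# Illustrative placeholders; real solutions use large signature databases.
KNOWN_GOOD_BOTS = {"Googlebot", "Bingbot"}
KNOWN_BAD_PATTERNS = [re.compile(p, re.I) for p in (r"python-requests", r"scrapy", r"curl")]

def classify_user_agent(ua: str) -> str:
    """Very rough triage of a request's User-Agent header."""
    if not ua:
        return "suspicious"      # many simple bots omit the header entirely
    if any(name in ua for name in KNOWN_GOOD_BOTS):
        return "good-bot"        # verify via reverse DNS in production
    if any(p.search(ua) for p in KNOWN_BAD_PATTERNS):
        return "bad-bot"
    return "human"

print(classify_user_agent("Mozilla/5.0 (compatible; Googlebot/2.1)"))  # good-bot
print(classify_user_agent("python-requests/2.31.0"))                   # bad-bot
```

Note that the User-Agent string is trivially spoofed, which is exactly why the behavioral indicators mentioned above matter.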
This information can be invaluable in understanding the impact that bad bots are having on your site and whether you have a malicious problem to deal with.
If you’re seeing an increase in traffic or requests from search engines or other legitimate crawlers, this may not be a bad bot issue at all. In fact, some vendors even offer tools for crawling and indexing content at scale as part of their bot mitigation offerings.
The best way to know for sure is to run tests and compare your current traffic levels with historical records over time, so you can spot changes that indicate non-human activity before it becomes a serious problem.
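One simple way to operationalize that comparison is a standard-deviation check of the current hour's request count against a historical baseline. A minimal sketch, where the threshold and sample numbers are assumptions rather than recommendations:

```python
from statistics import mean, stdev

def is_anomalous(current: int, history: list, threshold: float = 3.0) -> bool:
    """Flag the current request count if it deviates from the historical
    baseline by more than `threshold` standard deviations."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return current != mu
    return abs(current - mu) / sigma > threshold

# Hypothetical hourly request counts from past records.
baseline = [980, 1010, 995, 1005, 990, 1000, 1015]
print(is_anomalous(5200, baseline))  # sudden spike -> True
print(is_anomalous(1002, baseline))  # within normal range -> False
```

Real traffic is seasonal (day/night, weekday/weekend), so a production check would compare against the same hour in prior weeks rather than a flat baseline.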
How do I know how much non-human traffic I receive?
Your analytics or webmaster tools dashboard will usually display the number of requests per hour (in Google Analytics, under the Referrers section). Make sure that you have enabled tracking of requests from crawlers; note that JavaScript-based analytics miss many bots that never execute tracking scripts, so your server's access logs give a more complete count.
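If you want counts straight from the source, you can tally raw access-log lines yourself. The log format below is the common Apache/Nginx "combined" layout and the Mozilla-substring heuristic is deliberately crude; both are illustrative assumptions, not a real detection rule:

```python
import re
from collections import Counter

# Minimal pattern for the Apache/Nginx "combined" log format (illustrative;
# real log layouts vary with server configuration).
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<req>[^"]*)" \d+ \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

sample_log = [
    '203.0.113.7 - - [01/Jan/2024:10:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '198.51.100.9 - - [01/Jan/2024:10:00:02 +0000] "GET /page HTTP/1.1" 200 1024 "-" "python-requests/2.31"',
]

counts = Counter()
for line in sample_log:
    m = LOG_RE.match(line)
    if m:
        ua = m.group("ua")
        # Crude split: many bots also claim "Mozilla", so treat this as a floor.
        counts["bot" if "Mozilla" not in ua else "likely-human"] += 1

print(counts)
```

Grouping the same tallies by the timestamp field would give you the per-hour view your dashboard shows.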
How do I know if one source is generating non-human traffic on my website?
It’s important to break down traffic into its different sources (keywords, referring domains, IP addresses). A good bot mitigation solution reports on a per-request basis, so the “Top Sources” report in your console or dashboard will show which pages or domains account for most of your bad bot traffic. This makes it easy for you to focus on specific issues and assess the risk they may pose.
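A “Top Sources” style aggregation is easy to sketch: group per-request records by source and count the bad-bot verdicts. The record fields and verdict labels below are hypothetical, standing in for whatever your mitigation console actually exports:

```python
from collections import Counter

# Hypothetical per-request records as a bot mitigation console might export them.
requests = [
    {"ip": "198.51.100.9", "path": "/login", "verdict": "bad-bot"},
    {"ip": "198.51.100.9", "path": "/login", "verdict": "bad-bot"},
    {"ip": "203.0.113.7",  "path": "/",      "verdict": "human"},
    {"ip": "192.0.2.44",   "path": "/api",   "verdict": "bad-bot"},
]

# "Top Sources": which IPs account for most of the bad-bot traffic.
top_sources = Counter(r["ip"] for r in requests if r["verdict"] == "bad-bot")
print(top_sources.most_common(2))  # [('198.51.100.9', 2), ('192.0.2.44', 1)]
```

The same grouping works for referring domains or targeted paths; pick whichever dimension makes the risk easiest to act on.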
Non-human traffic detection tools are becoming more sophisticated every day as new challenges arise. For example, sophisticated bots can spoof browser fingerprints to bypass traffic filtering rules and remain undetected.
Making sure that you have the best solution available to deal with these issues can help you stay ahead of the game.
How does non-human traffic affect my revenue?
Non-human traffic is often used by hackers and attackers to generate revenue via pay-per-click advertising and other exploits, for example by directing fraudulent clicks at ads to inflate the return on their efforts. There are also cases where malicious actors hijack users’ computers or devices and have them visit websites for commercial purposes, such as generating clicks on ads. You’ll need a bot mitigation solution that can detect and stop these threats before they become a problem.
Do I need to block all non-human traffic on my website?
Some websites, especially those run by businesses with no special features or forms for visitors to fill out, don’t mind search engine bots or even price-scraping software accessing their content, because the resulting visibility can improve their own rankings and offerings.
Non-human traffic doesn’t always have a negative effect on your site — in fact, it may be boosting your SEO performance right now. But you should only allow crawlers and other trusted sources to access your website, since tolerating malicious behavior from scrapers and other bad bots could lead to more serious problems down the line.
If you don’t want search engines indexing certain pages on your website, you can exclude them with robots.txt or noindex directives; and if your content is copied elsewhere without authorization, you can request that the copies be removed as soon as possible.
Can I use another tool besides a bot mitigation solution to stop non-human traffic?
If you are using standard web server security measures, such as a firewall, you can block bot requests at the network level. However, this is a blunt approach: it usually blocks all bots and crawlers, which can leave your site without sufficient content indexing and hurt your SEO efforts in the long run.
Moreover, blocking non-human traffic at the network level can also affect real human visitors and legitimate services; many websites depend on automated clients for tasks such as checking links or generating search previews with images. If you block too many types of requests based on crude criteria, you could damage your own visitors’ experience and even lose legitimate business opportunities with your clients.
The best way to block malicious traffic is on a per-request basis, which means that you can allow crawlers and bots access to your content while blocking other sources at the same time.
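A minimal sketch of that per-request approach, assuming a hypothetical allowlist of crawler user agents and a sliding-window rate limit for everything else (a production system would also verify crawler identity via reverse DNS rather than trusting the User-Agent string):

```python
import time
from collections import defaultdict, deque

ALLOWED_BOT_UAS = ("Googlebot", "Bingbot")  # hypothetical allowlist
RATE_LIMIT = 10      # max requests per source...
WINDOW = 60.0        # ...within a 60-second sliding window

_hits = defaultdict(deque)  # per-IP timestamps of recent requests

def allow_request(ip, user_agent, now=None):
    """Per-request decision: always admit allowlisted crawlers,
    rate-limit everything else per source IP."""
    if any(bot in user_agent for bot in ALLOWED_BOT_UAS):
        return True
    now = time.monotonic() if now is None else now
    window = _hits[ip]
    while window and now - window[0] > WINDOW:
        window.popleft()             # drop timestamps outside the window
    if len(window) >= RATE_LIMIT:
        return False                 # source is over its budget
    window.append(now)
    return True
```

In practice this logic lives in middleware or a reverse proxy in front of the application, so the decision is made before the request touches your content.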
Non-human traffic detection tools profile incoming data (e.g., IP addresses, HTTP headers) to detect non-human behavior and known bad actors before they reach your website. This enables granular, per-request filtering: malicious requests can be blocked in real time or logged and mitigated after the fact, while genuine human visitors pass through unaffected.
These solutions also track trends in order to predict future attempts and adjust their blocking rules accordingly, minimizing false positives.
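Header profiling of the kind described above can be sketched as a simple scoring heuristic. The signals and point values here are illustrative assumptions, not rules from any real product:

```python
def bot_score(headers: dict) -> int:
    """Toy heuristic: add points for header patterns often associated
    with automated traffic (illustrative weights only)."""
    score = 0
    if "User-Agent" not in headers:
        score += 3   # missing UA is a strong automation signal
    if "Accept-Language" not in headers:
        score += 1   # real browsers almost always send this header
    ua = headers.get("User-Agent", "")
    if any(tool in ua.lower() for tool in ("curl", "wget", "python")):
        score += 3   # common scripting-tool user agents
    return score

print(bot_score({"User-Agent": "curl/8.0"}))                                  # 4
print(bot_score({"User-Agent": "Mozilla/5.0", "Accept-Language": "en-US"}))   # 0
```

A real platform would combine dozens of such signals with behavioral and reputation data, and use the aggregate score to decide whether to block, challenge, or simply log a request.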