Is your website traffic real, or bots?
One overseas datacenter was sending us almost as many visitors as the entire United States. None of them were real. Here is how we caught it, and how you can check your own site in about two minutes.
Some of the traffic in your analytics is not people. It is bots: automated scripts, scrapers and scanners running out of datacenters. On many sites they make up a large share of recorded visitors, and they quietly inflate your numbers until your reports no longer match reality. The good news: you can prove it in about two minutes, and you can block the bad bots without losing the good ones.
We know because it happened to us.
How we found that close to 40% of our traffic was bots
We were reviewing Google Analytics for our own site and one country jumped out. Singapore was our number two country by users, almost tied with the United States. We are a US marketing agency. We have no business in Singapore. That is a red flag, not a growth signal.
So we ran one quick test that settled it. It is the same test you can run right now.
The two-minute test you can run yourself
Real visitors engage with a page. The bots that reach your analytics load it, fire your tracking script, then leave in under a second. That gap is the tell. In Google Analytics, put two reports side by side:
- Top countries by users. Note which countries send the most visitors.
- Average engagement time by country. Note how long each country actually stays.
Then compare. If a country ranks high by user count but sits near the bottom of the engagement-time report, or does not appear in it at all, that traffic is almost certainly bots.
In our case Singapore was number two by users and did not even make the top ten by engagement time. Its sessions averaged close to zero seconds. That is not a person reading your page. That is a machine.
A country that sends thousands of visitors who each stay zero seconds is not an audience. It is noise.
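If you export both reports, the comparison can be scripted. The sketch below is a minimal illustration, not a Google Analytics integration: the `CountryStats` type, the `suspicious_countries` helper, the five-second threshold and every figure in `SAMPLE` are our assumptions, invented to show the logic.

```python
from dataclasses import dataclass


@dataclass
class CountryStats:
    country: str
    users: int                      # users recorded in the reporting window
    avg_engagement_seconds: float   # average engagement time for that country


def suspicious_countries(rows, top_n=10, min_engagement_seconds=5.0):
    """Flag countries that rank in the top_n by users but either miss the
    top_n by engagement time or fall below the engagement threshold."""
    top_by_users = sorted(rows, key=lambda r: r.users, reverse=True)[:top_n]
    top_by_engagement = sorted(
        rows, key=lambda r: r.avg_engagement_seconds, reverse=True
    )[:top_n]
    engaged = {r.country for r in top_by_engagement}
    return [
        r.country
        for r in top_by_users
        if r.country not in engaged
        or r.avg_engagement_seconds < min_engagement_seconds
    ]


# Invented figures for illustration only; they are not our real analytics data.
SAMPLE = [
    CountryStats("United States", 5000, 62.0),
    CountryStats("Singapore", 4800, 0.4),
    CountryStats("Canada", 600, 55.0),
    CountryStats("United Kingdom", 500, 48.0),
]

print(suspicious_countries(SAMPLE, top_n=3))
```

With these invented numbers, only Singapore is flagged: it ranks second by users but misses the top three by engagement time. The threshold is a judgment call, so tune it against what real visitors to your own site look like.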
What that bot traffic actually is
When we looked at the raw requests hitting the site, the source was obvious:
- Datacenter scripts. One server sent around 3,000 requests in a single day using curl, a command-line tool. No real person browses with curl.
- Scrapers. Automated crawlers harvesting content, many running out of cheap cloud providers overseas.
- Vulnerability scanners. Bots probing for WordPress login pages and exposed config files, on a site that does not even run WordPress.
None of it was a customer. All of it was counting as traffic.
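A rough version of that log review can be sketched in a few lines. The example below assumes each request is already parsed into a dictionary with `ip`, `user_agent` and `path` keys, which is a hypothetical format rather than any particular host's log schema. The marker lists are illustrative, not exhaustive, and the IP addresses come from ranges reserved for documentation.

```python
from collections import Counter

# Illustrative markers only; real traffic will need a longer, tuned list.
SCRIPTED_AGENT_PREFIXES = ("curl/", "wget/", "python-requests/", "go-http-client/")
SCANNER_PATH_PREFIXES = ("/wp-login.php", "/wp-admin", "/xmlrpc.php", "/.env")


def classify(entry):
    """Label one request as 'scanner', 'scripted' or 'other'."""
    agent = entry["user_agent"].lower()
    path = entry["path"].lower()
    if path.startswith(SCANNER_PATH_PREFIXES):
        return "scanner"
    if agent.startswith(SCRIPTED_AGENT_PREFIXES):
        return "scripted"
    return "other"


def summarize(entries):
    """Count requests per (ip, label) pair so repeat offenders stand out."""
    counts = Counter()
    for entry in entries:
        counts[(entry["ip"], classify(entry))] += 1
    return counts


# Invented sample requests for illustration.
SAMPLE_LOG = [
    {"ip": "203.0.113.7", "user_agent": "curl/8.4.0", "path": "/"},
    {"ip": "203.0.113.7", "user_agent": "curl/8.4.0", "path": "/services"},
    {"ip": "203.0.113.7", "user_agent": "curl/8.4.0", "path": "/contact"},
    {"ip": "198.51.100.22", "user_agent": "Mozilla/5.0", "path": "/wp-login.php"},
    {"ip": "192.0.2.10", "user_agent": "Mozilla/5.0", "path": "/services"},
]

for (ip, label), total in summarize(SAMPLE_LOG).most_common():
    print(ip, label, total)
```

Sorting by count puts the heaviest sources first, which is how a single server sending thousands of curl requests becomes obvious at a glance.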
Do not block the good bots
Here is the part most people get wrong. There is a category of bot you absolutely want: the AI crawlers behind ChatGPT, Perplexity, Google's AI answers and Claude. Getting cited by those engines is one of the most valuable outcomes a site can earn in 2026.
Those crawlers identify themselves and reach your site from known US datacenters. They were not part of our junk traffic at all. So the fix is not a blunt block. It is a rule that lets the good bots through first, then filters the rest. If you want the deeper version, we wrote about how AI search decides who gets cited.
How to fix it
If your site runs on a modern edge host, you can act on this in minutes. The approach we use, in order:
- Allow the AI crawlers first. Add a top-priority rule that permits GPTBot, PerplexityBot, ClaudeBot and the Google, Bing and Apple crawlers, so nothing below it can ever block them. This protects your AI-search visibility.
- Block the noise. Challenge scripted tools like curl, block regions where you have no legitimate business and bots clearly dominate, and block the worst repeat offenders by address.
- Use your platform's built-in protection. Modern hosts already stop a lot of automated abuse. Tune around it instead of reinventing it.
Everything above went live on our sites the same afternoon, with no effect on real visitors and no effect on the AI crawlers we want.
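The ordering is what matters, so here is a minimal sketch of first-match-wins rule evaluation. It is not our host's firewall syntax: the `decide` function, the `XX` country code and the blocked address are placeholders, and matching on the user-agent string alone is a simplification made for illustration. The order of the three blocking rules is also our choice for this example.

```python
# Tokens that appear in the user agents of the crawlers we want to keep.
ALLOWED_CRAWLER_TOKENS = (
    "gptbot", "perplexitybot", "claudebot", "googlebot", "bingbot", "applebot",
)
SCRIPTED_TOOL_PREFIXES = ("curl/",)
BLOCKED_COUNTRIES = {"XX"}        # placeholder region code, not a real country
BLOCKED_IPS = {"203.0.113.7"}     # placeholder address from a documentation range


def decide(user_agent, country, ip):
    """Return 'allow', 'block' or 'challenge'. The first matching rule wins."""
    agent = user_agent.lower()

    # Rule 1: allowed crawlers pass before any blocking rule is evaluated.
    if any(token in agent for token in ALLOWED_CRAWLER_TOKENS):
        return "allow"

    # Rule 2: block the worst repeat offenders by address.
    if ip in BLOCKED_IPS:
        return "block"

    # Rule 3: block regions with no legitimate business.
    if country in BLOCKED_COUNTRIES:
        return "block"

    # Rule 4: challenge scripted tools such as curl.
    if agent.startswith(SCRIPTED_TOOL_PREFIXES):
        return "challenge"

    # Default: everyone else is treated as a normal visitor.
    return "allow"


# Simplified, invented user-agent strings for illustration.
print(decide("Mozilla/5.0 (compatible; GPTBot/1.0)", "XX", "203.0.113.7"))
print(decide("curl/8.4.0", "US", "198.51.100.5"))
print(decide("Mozilla/5.0", "XX", "198.51.100.5"))
```

Because the allow rule is checked first, the crawler in the first call is let through even though its region and address both appear on the block lists. That is the property the allow-first ordering is meant to guarantee.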
Frequently asked questions
How much of website traffic is bots?
It varies by site, but automated traffic commonly makes up a large share of recorded visitors, especially when it comes from datacenters. On our own site, close to 40% of recorded users in one week were zero-engagement bot sessions from a single overseas datacenter.
How can I tell if my website traffic is bots?
Compare your top countries by users against average engagement time by country in Google Analytics. A country that ranks high by user count but has near-zero engagement time is almost certainly bot traffic, not real visitors.
Will blocking bots hurt my SEO or AI search visibility?
Not if you do it correctly. The crawlers that drive Google rankings and AI citations, such as Googlebot, GPTBot, PerplexityBot and ClaudeBot, identify themselves and should be explicitly allowed first, before any blocking rules. Done right, you remove the junk and keep every crawler that matters.
Why is bot traffic a problem if it is just numbers?
Bots corrupt your reporting, so decisions end up based on fiction. They waste hosting resources and bandwidth, and scanners actively probe your site for vulnerabilities. Removing them gives you accurate data and a smaller attack surface.
What is the difference between good bots and bad bots?
Good bots are search and AI crawlers that index your content so people can find you. Bad bots are scrapers, spam bots and vulnerability scanners that provide no value and often cause harm. The goal is to welcome the first group and block the second.
The takeaway
Traffic going up is not automatically good news. Before you celebrate a jump, check whether those visitors engage. Clean analytics is the foundation of every good marketing decision, and many businesses are making decisions on numbers that are part fiction.
If you want us to run this check on your site, it is one of the audits we do every month for the sites we manage. Get in touch and we will tell you exactly how much of your traffic is real.
How we checked this
- Google Analytics 4. Compared active users by country against average engagement time by country over a representative one-week window on our own property.
- Edge request logs. Reviewed top networks, IP addresses, user agents and request paths on our host's firewall to identify datacenter, scraper and scanner traffic.
- AI-crawler verification. Confirmed the crawlers behind ChatGPT, Perplexity, Google and Claude reach the site from known US datacenters and were not part of the blocked traffic.
- Figures. From our own live properties, illustrative of the audit method we apply to client sites.