How to make decisions when your data signal is not “clean”?
The second part of the series "Taking back control in a more complex eCommerce world".
In this article, let's figure out how to make decisions when you cannot fully trust the data.

In the previous article of our series on "Taking back control in a more complex eCommerce world," we talked about how AI is reshaping where and how buying decisions happen, and why store owners and eCommerce operators should focus not just on scaling traffic, but on optimizing decision-making moments as well.
But here is the next problem: every optimization you make depends on data. What if that data is compromised?
If you have worked in eCommerce long enough, you have probably lived through some of these:
You redesigned a product page based on engagement data, then wondered if half those sessions were even real shoppers.
You scaled a campaign because ROAS looked strong, then noticed the traffic source was flooded with bot sessions.
You sent cart recovery emails to hundreds of abandoned checkouts. Half of them bounced. The emails were fake.
You ran an A/B test and picked a winner. But you have no idea how many bots were served both variants.
The reports feel "off," but you cannot prove why. Someone says, "It's a bot issue." Unless you are at a large enterprise with a strong technical team, that issue probably never gets fixed completely.
No one owns the problem. The responsibility is still on you. And you still need to make decisions.
So, how do you move forward with data you cannot fully trust?
This article is for non-technical operators, marketers, sales teams, and eCommerce managers. We are not going to give you any Cloudflare bot prevention setup or hand you a step-by-step filter creation guide.
You already know bots exist. You already know your data is rarely 100% clean. We are not going to re-explain that here, either.
Instead, we are solving a different problem: how to make decisions when the lines are blurred. Which data is trustworthy enough to act on? Which needs skepticism? And how do you build a practical framework so imperfect numbers do not paralyze you?
By the end, you will have a clearer view of how bot traffic distorts eCommerce decisions, and a practical way to keep making confident moves despite it.
In this article, we also asked for insights from eCommerce expert Nahar Geva. Nahar has been an eCommerce entrepreneur since 2015 and is the Founder and CEO of ZIK Analytics.
His hands-on experience in the industry eventually led him to build one of the most recognized product research platforms in eCommerce.
Today, ZIK Analytics has empowered over 300,000 sellers worldwide. Beyond the platform, Nahar is also a respected speaker and mentor who is deeply passionate about helping everyday people achieve financial independence through eCommerce, regardless of their background or experience level.
1. Light “Bots 101” and market changes
Before we get to the practical framework, you need a quick foundation: what types of bots exist, and which ones mess with your data.
There are three categories of bots, sorted by intent:
| Category | Examples | What they do |
|---|---|---|
| Beneficial | Search engine indexers, accessibility checkers, and authorized integrations | Help your store get discovered and function properly. You want these. |
| Undesirable | Price scrapers, restock monitors, unknown crawlers, unauthorized load testers | Not directly harmful, but they pollute your data and waste resources. Gray area. |
| Harmful | DDoS attacks, payment fraud bots, account takeover bots, counterfeit scrapers, fake signup bots | Actively damage your business, steal data, or commit fraud. |
In less technical terms:
The helpful guests: Search engines (like Google) that help people find you. You want these.
The rude neighbors: Competitors scraping your prices or inventory. They are not stealing money, but they are clogging your data and wasting your resources.
The intruders: Fraudsters trying to steal accounts or test stolen credit cards. These are the ones that cause real financial damage.
💡Fun fact: Do you know how CAPTCHA tells humans apart from bots?
Humans are messy. Bots are precise. That is exactly what CAPTCHA is built on.
We hesitate, we scroll back up, and we click 'the wrong way.' Bots are perfect—they spend exactly 11.5 seconds on a page, every single time. If you see a cluster of visitors with 'perfectly identical' behavior, you aren't looking at a group of similar shoppers; you're looking at a script.
But this is also why sophisticated bots are harder to catch, since they are learning to add randomness.
But what about AI crawlers? Do they behave like shoppers?
Short answer: no.
Legitimate AI crawlers identify themselves via user agent, fetch content, and follow robots.txt rules. They do not simulate shopping behavior. They behave like classic search engine crawlers, not shoppers.
So do not worry about legitimate AI crawlers adding fake shopping actions to your store.
The AI-related threat worth watching is different: AI-assisted bots built by bad actors that use AI to mimic human behavior. We will cover those in Section 3.
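If you are ever curious whether a given visit identified itself, a few lines of code are enough to check the user agent. This is only a sketch: the token list is illustrative, not exhaustive, and a bad actor can fake any of these strings.

```python
# A minimal sketch: check whether a visit comes from a self-identifying crawler.
# The token list is illustrative, not exhaustive, and can be spoofed.
KNOWN_CRAWLER_TOKENS = {
    "Googlebot": "Search engine indexer (Google)",
    "bingbot": "Search engine indexer (Bing)",
    "GPTBot": "AI platform crawler (OpenAI)",
    "ClaudeBot": "AI platform crawler (Anthropic)",
    "PerplexityBot": "AI platform crawler (Perplexity)",
}

def classify_user_agent(user_agent: str) -> str:
    """Return a rough label for a self-identifying crawler, or 'unknown / possibly human'."""
    for token, label in KNOWN_CRAWLER_TOKENS.items():
        if token.lower() in user_agent.lower():
            return label
    return "unknown / possibly human"

# Example user-agent string, for illustration only
print(classify_user_agent("Mozilla/5.0 (compatible; GPTBot/1.1; +https://openai.com/gptbot)"))
# -> AI platform crawler (OpenAI)
```

A visit that identifies itself this way is usually the polite kind. The ones to worry about are the visits that pretend to be a regular browser.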
The quick summary:
Good bots: search engines, legitimate AI crawlers, monitoring tools. You want them.
Bad bots: data scrapers, scalpers, payment fraud bots, account takeover bots, fake signups, and bots that abuse your checkout, authentication, and data endpoints.
You do not need to identify every bot. You do not need to prevent all of them. That is the IT team's job.

Nahar Geva:
Bot traffic has always been around, but when it comes to real conversions, it honestly doesn't matter all that much. If you're focused on the right data, bot traffic becomes irrelevant. Bots can't actually buy anything from you. Sure, they can inflate your visitor counts and view numbers, but at the end of the day, "Add to Cart," "Initiate Checkout," and "Purchase" events are only triggered by real people. So if you're optimizing around purchases and ROAS, you're already cutting through the noise and the bot traffic isn't a problem anymore.
2. Diagnose the impact: When / Where / Which metrics
Some of you might want to isolate the impact of bot traffic completely. For most eCommerce businesses, the honest answer is: you cannot.
But that does not mean being passive. It means learning to make decisions whether your data is clean or dirty.
According to DataDome's 2024 Global Bot Security Report, only about 10% of eCommerce websites are fully protected against bad bots. Over 65% have no protection against even simple bot attacks. If you are running an online store, bot traffic is almost certainly already in your data.
So the smartest move is not to chase perfectly clean data. It is to diagnose the impact: when does bot traffic spike, where does it show up, and which metrics does it distort?
2.1 When does it spike?
There is no fixed "attack season." But bot traffic tends to spike around major shopping events such as Black Friday, holiday sales, and other high-traffic periods.
The reason is simple. During these periods, everyone is active: entrepreneurs are using tools to research markets, monitor competitors, and track trends. Shoppers are flooding in. And attackers take advantage of that volume to blend in.
That said, attacks are year-round now, not just seasonal. Peak shopping events are simply the most sensitive times, when higher traffic gives bots more cover.
Be extra careful with your metrics around these periods. A number that looks like growth might just be noise.
2.2 Where and which data is affected?
Not all your metrics are equally vulnerable. Here are the ones that tend to get distorted, mapped along the customer journey:
Sessions by traffic source. Bot traffic often clusters under "Direct" or "None," inflating that channel disproportionately.
Product page views and engagement. Fake visits make it harder to know which products genuinely interest customers.
Add-to-cart rate. Bots simulating shopping behavior inflate this metric, making your funnel look healthier than it is.
Abandoned checkout rate. Bot-initiated checkouts that never complete make this number unreliable.
Email capture and signup forms. Fake submissions pollute your list and trigger automation workflows that cost you money.
Campaign attribution. If bots arrive through your ad links, your ROAS calculations will be wrong.
The key point: some of these will be clean, some will be noisy. They are not all distorted equally.

Example of the Purchase journey view in Google Analytics
Nahar Geva:
Inflated views, visitor counts, and clicks paired with low conversion rates are a sign of bot traffic. Of course, in some cases the bot traffic isn't significant and it's a lot harder to determine whether it's bot or human, but what matters at the end of the day is the buying activity. So if a campaign is performing well and generating a positive ROAS, we can completely ignore the other "traffic" data.
So how can you tell whether bots are actually affecting a specific metric? Here is how I would approach that question:
If you have a strong IT team, they can filter the data and give you clean numbers. That is the ideal scenario.
If you do not, compare against your own benchmarks. Here is what to look for:
A spike in sessions? Break it down. Check whether the bounce rate stays in its normal range, or if it has shifted to something ridiculously low or high.
A spike in product page views? Look at engagement time. Remember: humans are messy, bots are precise. If engagement time is nearly identical across sessions (same duration, same patterns), there is a strong chance those are bots.
A spike in abandoned checkouts? Watch the session replays. Real shoppers drop off at different steps, at different times. Bots drop off at the same step, at the same time.
The pattern is the same every time: compare the anomaly against what you know is normal for your store. Your experience and knowledge of your business are legitimate diagnostic tools.
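To make the "compare against normal" idea concrete, here is a minimal sketch. It assumes you can export engagement time per session from your analytics tool; the thresholds are illustrative, not a standard.

```python
import statistics

# A minimal sketch of the "humans are messy, bots are precise" check.
# Assumes you exported engagement time per session (in seconds) for the
# spiky period and for a normal baseline week. Thresholds are illustrative.
def looks_automated(engagement_times: list[float], baseline_times: list[float]) -> bool:
    """Flag a spike whose engagement times are suspiciously uniform vs. your baseline."""
    if len(engagement_times) < 30:          # too few sessions to judge
        return False
    spike_spread = statistics.pstdev(engagement_times)
    baseline_spread = statistics.pstdev(baseline_times)
    # Real shoppers vary a lot; scripts cluster tightly around one duration.
    return spike_spread < 0.2 * baseline_spread

normal_week = [12, 95, 40, 230, 8, 61, 180, 33, 75, 500] * 5
suspicious_spike = [11.4, 11.5, 11.6, 11.5, 11.5, 11.4, 11.6, 11.5] * 6
print(looks_automated(suspicious_spike, normal_week))   # True
```

If this flags a spike, it does not prove bots. It just tells you to verify against orders and revenue before acting on that number.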
And one more thing worth remembering: if the decisions you need to make right now do not involve the distorted metrics, you can breathe. Not every noisy number demands immediate action.
You are not saying "ignore everything." You are just saying "interpret carefully."
⚠️ Worth knowing: The analytics blind spot
Your analytics dashboard only sees visitors who load JavaScript and accept cookies. Think of it as guests who walk through the front door and sign the guestbook.
Your server sees everything. Every request, every knock on the door, including bots that bypass your tracking pixels entirely.
This is why your hosting logs might show a traffic storm while your dashboard looks perfectly calm. If you notice a gap between the two, it is a visibility issue, and most likely a sign that something automated is hitting your store under the radar.
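If you want to see the size of that gap on your own store, one day of hosting logs is enough. Here is a minimal sketch, assuming a standard combined-format access log; the file name and the dashboard number are placeholders you would fill in yourself.

```python
from collections import Counter

# A minimal sketch of the blind-spot check: compare raw server requests with
# the sessions your analytics dashboard reports for the same day.
def summarize_access_log(path: str) -> tuple[int, Counter]:
    """Count GET requests and tally user agents from a combined-format access log."""
    total, agents = 0, Counter()
    with open(path, encoding="utf-8", errors="ignore") as log:
        for line in log:
            if '"GET ' not in line:
                continue
            total += 1
            # In the combined log format, the user agent is the last quoted field.
            agent = line.rsplit('"', 2)[-2] if line.count('"') >= 2 else "unknown"
            agents[agent] += 1
    return total, agents

requests, top_agents = summarize_access_log("access.log")   # hypothetical log file
dashboard_sessions = 1_200                                   # copied from your dashboard
print(f"Server saw {requests} GET requests; dashboard saw {dashboard_sessions} sessions.")
print("Most common user agents:", top_agents.most_common(5))
```

A large gap between the two numbers, or a top user agent you do not recognize, is exactly the kind of under-the-radar traffic this callout is about.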
3. A data trust hierarchy for your metrics
You have done the diagnosis. You know when bot traffic spikes. You know which metrics are most likely distorted. You have a sense of the blind spots in your analytics tools.
Now you are sitting in front of your dashboard. The question has not changed: which of these numbers can I actually use?
This is where most operators get stuck. Not because they lack data but because they have no way to sort trustworthy signals from unreliable ones.
Most resources do not help, either. They collapse all bot traffic into one bucket, treating every non-human visit as the same threat. That is misleading.
Different bots create different distortions. Different distortions affect different parts of your data. Once you see that, the question shifts:
You stop asking "Is my data wrong?" and start asking something more useful: "Is this data good enough for this specific decision?"
That is what we will build in this section.
3.1 Different bots, different distortion
Earlier, we introduced three categories of bots by intent: beneficial, undesirable, and harmful.
Here is a more concrete breakdown: the specific types of bots behind those categories, and how each one distorts your data differently.
| Bot type | Who runs them | Purpose | Traffic scale |
|---|---|---|---|
| AI platform crawlers | OpenAI, Google, Anthropic | Indexing, model training | Moderate |
| SEO / data bots | Agencies, vendors, competitors | Scraping, price monitoring, rank tracking | High |
| Attack bots | Hackers | Fraud, abuse, account takeover | Very high |
| AI-assisted bots | Anyone | Anything, from scraping to fraud | Explosive growth |

To be more specific:
AI platform crawlers are indexing and training their models. They typically identify themselves through their user agent, follow robots.txt rules, and do not simulate shopping behavior. They fetch your page content and move on. The data distortion they create is mostly limited to inflated page views on content and product pages. If you see a spike in views but no corresponding change in engagement or add-to-cart behavior, this is likely what you are seeing.
SEO and data bots are scraping your site for pricing, inventory levels, keyword rankings, and product information. These are the ones that crawl your product pages repeatedly, and they are a major source of inflated engagement metrics. They make it harder to tell which products are genuinely attracting customer interest versus which ones are simply being monitored by your competitors.
Attack bots are there to abuse your systems. Payment fraud, account takeovers, fake signups, and checkout exploitation. These are the bots that distort your checkout and transaction data most heavily. They inflate abandoned cart rates, create failed transaction noise, and pollute your email lists with fake addresses.
AI-assisted bots are the newest and fastest-growing category. They can be run by anyone, for any purpose, and they are powered by the same large language models that make AI tools so capable. What makes them different is that they are learning to behave more like humans, adding randomness to their timing, rotating IP addresses, and mimicking real browsing patterns. Their growth is explosive, and they are the hardest to detect because they do not follow the predictable patterns that traditional bot detection relies on.
Practical takeaway: the distortion tells you the source
Pageview spikes with no conversion changes? Likely beneficial crawlers or undesirable SEO bots. The distortion is in your traffic volume and page view metrics.
Add-to-cart noise, checkout abandonment spikes, failed payments? Not AI platforms. These point to harmful bots targeting your transaction funnel.
Human-like patterns with suspiciously consistent timing? AI-assisted attackers. The distortion blends into your real customer data, making it the hardest to isolate.
Different bots. Different distortions. And as we will see next, different levels of trust in the data they touch.
⏰Quick break with some interesting updates on AI bots:
AI bots are becoming a major source of web traffic (the State of Bots Report of Q3 & Q4 2025 by Toiibit)
Automated traffic has now surpassed human traffic, largely driven by AI and LLM-powered bots — with malicious bots accounting for 37% of total traffic (Imperva report)
DataDome’s Report (Q4 2025) highlights the least protected industries against bad bots and unwanted AI traffic: telecom, technology/software, and gaming.
3.2 The data trust model
This model sorts your eCommerce metrics into three levels, based on one question: how hard is it for bots to fake this data?
Each level maps to the types of decisions it can reliably support.
High trust signals
Completed orders. Revenue. Refund rates. Payment data.
These are the hardest metrics for bots to manipulate. Because they require real money to change hands. A bot can inflate your page views or spam your signup form. It cannot generate a legitimate completed purchase with a valid payment method.
These are your foundation. When you are making decisions about revenue performance, channel ROI, inventory restocking, or budget allocation — lean on these numbers.
Example: A product seems to be trending. Do not base that call on page views alone. Look at completed orders. If the orders are strong, the trend is real, regardless of how inflated the page views might be.
Medium trust signals
Intermediate checkout actions, such as reaching the shipping page, entering an email, and selecting a payment method. Logged-in customer actions. Post-purchase behavior, like repeat purchases, review submissions, and support interactions.
These involve some level of authentication or commitment, which makes them harder for basic bots to generate. But we believe that nowadays, sophisticated bots can still distort them, especially checkout steps and login activity.
Use medium trust signals for funnel optimization and customer segmentation. But treat them with skepticism when the stakes are high.
Example: You are testing a new checkout flow. Medium trust data can guide you, but cross-reference with completed orders and revenue before committing to a permanent change.
💡 Expert tip: Use checkout validation to reduce funnel noise
If your store uses multi-page checkout, adding validation rules is a practical way to reduce fake or incomplete checkout actions from reaching deeper funnel steps.
Through checkout customization apps on Shopify, you can set requirements around cart contents, order value, and contact or shipping information before a session advances. This does not eliminate bot activity entirely, but it raises the bar enough to make your medium trust signals meaningfully cleaner.
Checkout validations also help filter out unqualified orders and reduce the need to follow up with customers over missing or incorrect information.

Example of the checkout validations from Qikify Checkout Customizer app
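For stores running a custom or headless checkout, the same idea can be expressed directly in code. The sketch below is generic, not the Qikify app's implementation or Shopify's API; field names, the domain list, and thresholds are purely illustrative.

```python
import re

# A generic sketch of checkout validation rules, not any specific app's API.
DISPOSABLE_DOMAINS = {"mailinator.com", "tempmail.com"}   # extend with your own list
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def checkout_issues(order: dict) -> list[str]:
    """Return the reasons a checkout attempt should not advance (empty list = OK)."""
    issues = []
    email = order.get("email", "")
    if not EMAIL_PATTERN.match(email):
        issues.append("email looks invalid")
    elif email.split("@")[-1].lower() in DISPOSABLE_DOMAINS:
        issues.append("disposable email domain")
    if order.get("subtotal", 0) < 1:
        issues.append("order value below minimum")
    if not order.get("shipping_address", {}).get("zip"):
        issues.append("missing postal code")
    return issues

print(checkout_issues({"email": "bot@mailinator.com", "subtotal": 0, "shipping_address": {}}))
# -> ['disposable email domain', 'order value below minimum', 'missing postal code']
```

The point is not the specific rules. It is that every fake checkout you stop at this step keeps your medium trust signals that much cleaner.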
Low trust signals
Sessions. Anonymous page views. Add-to-cart events. Email signups. Engagement metrics like time on page and scroll depth.
These are the easiest for bots to generate and the hardest to verify.
A single scraping bot can create hundreds of sessions. A form spam bot can add thousands of fake emails to your list overnight. A competitor's price monitoring tool can make a product look wildly popular based on page views alone.
Use low trust signals directionally only. They tell you something is happening, but not what or why with enough confidence to justify a major decision.
Three rules to remember:
Never increase ad spend based on a traffic spike without checking whether conversions follow.
Never feature a product on your homepage based on page views without confirming actual purchase behavior.
Never trust email signup numbers at face value without monitoring bounce rates and engagement after the first send.
A quick reference for the model:
| Trust level | Signals | Use for | Example decision |
|---|---|---|---|
| High trust | Completed orders, revenue, refund rate, payment data | Revenue decisions, channel ROI, restocking, and budget allocation | "Should we reorder this product?" Look at completed orders, not page views. |
| Medium trust | Intermediate checkout actions, logged-in actions, post-purchase behavior | Funnel optimization, customer segmentation (with caveats) | "Is our new checkout flow better?" Check completion rates, then verify with actual revenue. |
| Low trust | Sessions, page views, add-to-cart, email signups, engagement metrics | Directional signals only. Cross-reference before acting. | "Is this product trending?" Page views say yes, but check orders before featuring it. |
The goal is not to make you paranoid about your data. It is to give you a consistent way to evaluate how much weight a number deserves before you act on it.
Think of it like lead scoring. Before you approach your best prospects, you evaluate and rank your leads. This is the same step applied to your data before making decisions.
Here is what it looks like in practice: Someone on your team says, "Our traffic is up 30% this month. We should increase the budget and scale up."
You do not need to launch a bot investigation. You just ask one question: Is traffic a decision-ready metric or a verification-required one?
It is verification-required. So before anyone adjusts a budget, pair it with revenue or completed orders. If the growth shows up there too, act on it. If it does not, then dig deeper before committing.
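If it helps to keep that question in one place, the entire model fits in a few lines. This is a minimal sketch with illustrative metric names, not a tool you need to deploy.

```python
# A minimal sketch of the trust model as a lookup; metric names are
# illustrative, so adapt them to whatever your dashboard calls things.
TRUST_LEVELS = {
    "completed_orders": "high", "revenue": "high", "refund_rate": "high",
    "checkout_steps": "medium", "logged_in_actions": "medium", "repeat_purchases": "medium",
    "sessions": "low", "page_views": "low", "add_to_cart": "low", "email_signups": "low",
}

def decision_readiness(metric: str) -> str:
    """Answer the one question: can I act on this metric alone?"""
    level = TRUST_LEVELS.get(metric, "low")   # unknown metrics default to low trust
    if level == "high":
        return "decision-ready: act on it"
    if level == "medium":
        return "verification-required: cross-check with orders or revenue"
    return "directional only: never act on it alone"

print(decision_readiness("sessions"))           # -> directional only: never act on it alone
print(decision_readiness("completed_orders"))   # -> decision-ready: act on it
```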
4. What Shopify is doing behind the scenes (and the gaps)
This article is about making better decisions with imperfect data — not handing you a technical fix.
But it helps to know what your platform is already doing in the background. That context changes how you read your numbers.
If you are on Shopify, three recent changes are worth knowing:
"Human or Bot Session" dimension: A new report filter that lets you separate real customer sessions from automated traffic.
Robots.txt boundaries: New default rules that block autonomous AI agents from completing payments without human review.
Shopify Flow automation: Custom rules to hold suspicious orders or trigger alerts for unusual activity.
You do not need to understand the technical details. What matters is knowing that these exist and how they relate to how you evaluate your data.
The bot filtering update
The most relevant change for operators is the "Human or Bot Session" dimension, introduced October 7, 2025. It lets you filter your Shopify analytics to separate real customer sessions from automated traffic, directly inside your dashboard.
You can now:
See your actual conversion rate by removing non-converting bot sessions.
Compare traffic sources to identify which channels bring real shoppers versus which are inflated by bots.
Use the filter across all standard session-related metrics such as sessions, conversion rates, and visitor counts.

How Shopify defines human and bot activities
Shopify has also announced plans to make this available for headless and Hydrogen stores. To learn how to apply the filter, check Shopify's help center article on bot filtering.
There are limitations worth knowing. The filter only applies to session-related metrics and only covers data from October 7, 2025, onward. It also takes 24 to 48 hours to verify a session, so your data will look "dirty" in real time and "clean" two days later.
The biggest gap for operators running paid campaigns:
The filter does not prevent bots from firing your marketing pixels in real time. Your ad algorithms on Meta and Google learn from every signal they receive, including the ones generated by bots. And judging by discussions across Reddit and the Shopify community, many stores run into bot traffic problems especially during paid ad promotions, when higher traffic volumes give bots more cover to blend in.
If you have faced these issues with paid campaigns, I suggest spending time creating advanced targeting and exclusion rules for your campaigns. You cannot completely prevent bots, but you can absolutely tighten your filters and save your budget.
Third-party tools for deeper protection and analysis
Shopify's native tools are a meaningful step forward. But they will not solve everything. And there are plenty of complaints on community channels about whether the filter catches enough, even on the Shopify Plus plan.
If that becomes a problem, third-party tools can help. They fall into two categories:
Protection apps: built to block bad traffic before it reaches your data. Popular options include Disable Right Click & NoSpy (protects your store against spying tools, bots, VPN traffic, and similar threats) and Blockify (broader store protection with IP, country, and VPN blocking).
Advanced analytics platforms: like PostHog and Mixpanel. These integrate with Shopify and offer session replays, custom event tracking, funnel analysis, and user segmentation. They do not block bots, but they let you dig into your data, spot suspicious patterns, and cross-reference signals across trust levels. Both offer free tiers for smaller stores.

PostHog's configuration to filter bot events
The difference between Shopify plans
Basic and Standard plans give you the pre-built reports and the "Human or Bot" session filter. For most operators applying the data trust model, this covers the essentials.
Shopify Plus adds ShopifyQL Notebooks, letting technical users or agencies run advanced queries to identify complex bot patterns and build custom reports.
Across all Shopify plans, Shopify has updated its default storefront rules to block autonomous AI agents from completing purchases without human review. This protects your transaction and checkout data, the decision-ready metrics at the top of the trust model. No configuration needed. It is handled at the platform level.
For most merchants on standard plans, the built-in filter plus the trust model from Section 3 will get you a long way.
Nahar Geva's final advice to all eCommerce merchants:
As eCommerce sellers, we take decisions based on data in many different areas of the business. But I'd say the most significant data-driven decision happens right at the first step, when a seller decides what they want to sell. A lot of new sellers rush because they want to get started fast, and they fall into the false belief that the first product they thought of is their golden ticket. Unfortunately, the reality is that the average seller needs to test around 10 different products before finding a winner, and that's exactly where most people fail and give up.
Like any opportunity in life, we need to learn to distinguish a great opportunity from a bad one. What does that mean? Simply put, consistently keep looking for a winning product that checks all the criteria, without compromising on anything, even if it means spending a week searching before you even get started. On top of that, don't rely on just one product. We can't guarantee that our first product will perform well, so we need to be ready to test multiple products.
To do this right, sellers must use an analytics tool that provides sales history, traffic data, and ads data and combine all those data points with some common sense to make their picks.
It doesn't matter what marketplace you're selling on, whether you have your own Shopify store, whether you're an eBay seller, a dropshipper, a brand owner, or an inventory seller, you need to use data to decide what to sell. That's how you increase your chances of finding winning products that generate real revenue and make you a successful eCommerce seller. The leading product research tool for eCommerce sellers and dropshippers as of today is ZIK Analytics.
5. End with confidence
That was a lot of ground to cover. Let's do a quick recap so you can walk away with the full picture:
Bots are already in your data, and that is okay. Not all bots mess with the same things. Once you know which type is affecting which metric, you are already in a much better position to respond.
Diagnose before you react and have a data trust model to use for your business. When in doubt, cross-reference before committing to budget, inventory, or strategy.
Your platform helps, but it has gaps. Third-party tools can fill the gaps if you need them.
Now, the one mindset shift that ties it all together:
Stop asking "Is my data clean?"
Start asking "Is this data reliable enough for this specific decision?"
You do not need to block every bot. You do not need perfect analytics. You just need to know which numbers matter for the decision in front of you — and whether they are trustworthy enough to act on.
That is what control looks like for operators.
This is the second part of our 7-part series, "Taking Back Control in a More Complex eCommerce World."
Next in the series: we are heading to where all of this converges — the checkout. That is where your data becomes revenue, your customer experience gets tested, and the gap between your dashboard and reality becomes real. See you there.


