The Hidden Traffic in Your Search Console: Detecting Synthetic Queries
The problem: What are synthetic queries?
Synthetic queries are search terms generated artificially rather than from real users typing into Google.
You have probably seen unusually long search queries in Google Search Console, such as “comprehensive analysis of sustainable marketing strategies B2B SaaS companies 2025” or “detailed comparison between traditional SEO approaches and modern content optimization techniques 2025”. They don’t sound like how real users typically search, and that is because they are synthetic queries.
AI tools and chatbots are one major source of synthetic queries. When you ask ChatGPT, Claude, or Perplexity a question, these tools rewrite and optimize it into their own queries before searching the web.

In addition to AI tools, synthetic queries also come from:
- SEO monitoring tools, such as Peec.ai and aiscope.pro, that automatically check rankings
- Research bots that collect data for analysis
- Automated testing systems that verify website functionality
- Competitive intelligence tools tracking market information
- Quality assurance systems testing search features
The key differences between human and synthetic search patterns are clear:
| Aspect | Human users | Synthetic queries |
| --- | --- | --- |
| Query length | Shorter, conversational queries | Longer, more structured queries |
| Language style | Natural language with typos and informal expressions | Robotic, reads like a list of keywords |
| Search intent | Immediate, specific needs | Systematic gathering of comprehensive information |
| Browsing patterns | Unpredictable | Predictable, automated |
Why synthetic queries matter for your marketing analytics
Synthetic queries may trigger interactions that look like real user engagement, but they don’t represent actual potential customers or genuine interest in your products or services. These could be testing tools clicking through to your site, monitoring systems triggering page views, and research bots interacting with your content. None of them represents real commercial intent. As a result, you may misinterpret your Search Console data in terms of conversion tracking and user behavior analysis.
For marketing SEO dashboards, this means the performance metrics might include traffic that doesn’t represent your true audience. Synthetic queries can create noise in your data that makes it harder to identify what’s actually working with real users.
How to spot synthetic queries in your Search Console data
In Google Search Console, you can do a quick manual check by filtering your queries with a regular expression. The most obvious indicator of an artificially generated query is its length. Synthetic queries are usually much longer than typical user searches and often contain 6 or more words. Here is a regex you can use to find them:
^\s*\S+(\s+\S+){5,}\s*$
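If you want to sanity-check the pattern before pasting it into Search Console, you can run it locally. The snippet below is a minimal Python sketch, not part of the Search Console setup: the function name and the sample queries are made up for illustration, and it assumes Python's `re` module treats this simple pattern the same way Search Console's regex filter does.

```python
import re

# Same pattern as above: one word followed by at least five more words.
LONG_QUERY = re.compile(r"^\s*\S+(\s+\S+){5,}\s*$")

def is_long_query(query: str) -> bool:
    """Return True when the query has six or more whitespace-separated words."""
    return LONG_QUERY.match(query) is not None

# Hypothetical sample queries, not real Search Console data.
samples = [
    "best crm for startups",
    "comprehensive analysis of sustainable marketing strategies B2B SaaS companies 2025",
    "  how   to connect search console to bigquery  ",
]

for q in samples:
    print(is_long_query(q), "|", q.strip())
```

The first sample has four words and is not flagged. The other two have ten and seven words, so both are flagged, including the one with irregular spacing.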

The vocabulary is usually more sophisticated, with industry jargon and technical terms that regular users might not commonly search for. Here are a few regex strings that might be useful to detect synthetic queries:
With formal descriptive words
.*(comprehensive|detailed|effective|optimal|strategic|systematic|implementation).*

With multiple technical terms in one query
(?i).*(strategy|optimization|analysis|performance|implementation).*\s.*(digital|marketing|SEO|technical|advanced).*

Questions that are too structured for typical users
^(how to|what are the|what is the best way to).{30,}$
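To see which of these signals a given query trips, you can try the patterns outside Search Console first. The following Python sketch is only an illustration: the dictionary keys, the `matched_signals` helper, and the sample queries are hypothetical, and `re.IGNORECASE` is an assumption added here so that `SEO` also matches lowercase `seo`.

```python
import re

# The three patterns from this section, compiled case-insensitively (an assumption).
PATTERNS = {
    "formal_words": re.compile(
        r".*(comprehensive|detailed|effective|optimal|strategic|systematic|implementation).*",
        re.IGNORECASE,
    ),
    "multiple_technical_terms": re.compile(
        r".*(strategy|optimization|analysis|performance|implementation).*\s"
        r".*(digital|marketing|SEO|technical|advanced).*",
        re.IGNORECASE,
    ),
    "structured_question": re.compile(
        r"^(how to|what are the|what is the best way to).{30,}$",
        re.IGNORECASE,
    ),
}

def matched_signals(query: str) -> list[str]:
    """Return the names of all patterns that match the query."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(query)]

# Hypothetical sample queries for illustration only.
samples = [
    "how to fix a flat tire",
    "comprehensive analysis of sustainable marketing strategies B2B SaaS companies 2025",
    "what is the best way to measure organic search performance for a new product launch",
]

for q in samples:
    print(matched_signals(q), "|", q)
```

A query that matches several patterns at once is a stronger candidate than one that matches a single pattern, so the list of matched names can double as a rough score.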

While these manual methods work for spot-checking your data, they quickly become impractical for ongoing monitoring: they are time-consuming and offer no automation or historical tracking. Instead, you can set up an automated system that flags potential synthetic queries, meaning searches with 6 or more words, in just a few clicks.
Automated synthetic query detection system
The logic of this automated detector consists of two data flows built with Coupler.io:
- Your Google Search Console data is extracted to BigQuery and gets refreshed on a schedule.
- Another data flow queries that BigQuery dataset with a filter that identifies long queries and sends the results to a spreadsheet for easy analysis.
Each step is easy to set up. Here are the instructions:
Step 1: Google Search Console to BigQuery
Start by setting up your first data flow in Coupler.io. Use the preset form below and click Proceed.
You’ll be prompted to sign up for Coupler.io for free, with no credit card required.
Then connect your Google Search Console account, select the websites, and choose Search results performance as the report type.
You’ll also need to specify the report period for your data and add Query as a dimension. The SQL query in Step 2 also groups by page, so include Page as a dimension if it isn’t already selected.

Proceed to the next step, where you’ll get a preview of your Search Console data. Here you can hide unnecessary fields to keep only essential metrics and dimensions like query, page URL, clicks, and impressions. No other transformations are needed here.

Then move to the destination setup and connect your BigQuery project following the in-app instructions. Additionally, feel free to check out our blog post about how to connect Google Search Console to BigQuery.
Once the data flow is set up, run it to load data.
Step 2: Export SQL query from BigQuery
In Coupler.io, you’ll need to create a new data flow from scratch and choose BigQuery as a data source. Connect your project and enter the SQL query that applies our synthetic query detection logic.

Here’s the SQL query that does the heavy lifting:
SELECT
  query AS keyword,
  page AS url,
  SUM(clicks) AS sum_clicks,
  SUM(impressions) AS sum_impressions
-- Replace the placeholder below with your project, dataset, and table
FROM `{project.dataset.table}`
-- Collapse repeated whitespace, split on single spaces, and keep queries with 6+ words
WHERE ARRAY_LENGTH(SPLIT(REGEXP_REPLACE(TRIM(query), r'\s+', ' '), ' ')) >= 6
GROUP BY 1, 2
ORDER BY 3 DESC, 4 DESC
This query works by splitting each search query into individual words and counting them. The key part is the ARRAY_LENGTH(SPLIT(...)) expression, which counts the words after TRIM and REGEXP_REPLACE have cleaned up extra spaces. We’re filtering for queries with 6 or more words, which captures most synthetic queries from various sources while keeping shorter user searches out of the report.
Note: Remember to replace the placeholder in the FROM clause with your own project, dataset, and table.
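If the nested functions in the WHERE clause are hard to read, it may help to see the same steps written out one at a time. The Python sketch below mirrors the word-counting logic for illustration only. The detection itself still runs in BigQuery, and the `word_count` helper and the sample queries are hypothetical.

```python
import re

def word_count(query: str) -> int:
    """Count words the same way the WHERE clause does, one step per line."""
    trimmed = query.strip()                    # TRIM(query)
    collapsed = re.sub(r"\s+", " ", trimmed)   # REGEXP_REPLACE(..., r'\s+', ' ')
    return len(collapsed.split(" "))           # ARRAY_LENGTH(SPLIT(..., ' '))

# Hypothetical sample queries for illustration only.
samples = [
    "seo dashboard",
    "  detailed   comparison between traditional SEO approaches and modern content optimization techniques 2025 ",
]

for q in samples:
    count = word_count(q)
    print(count, count >= 6, "|", q.strip())
```

Collapsing whitespace before splitting matters: without it, every extra space between words would produce an empty element and inflate the count.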
At the Transformations step, you’ll see a preview of the selection of your synthetic queries.

Now you can load this report to Google Sheets, Excel, or another destination supported by Coupler.io. To automate this reporting solution, set up a schedule for the first data flow so that it refreshes data from Search Console. Then use webhooks to trigger the second data flow automatically whenever the first one finishes running.
This creates a seamless workflow: the first data flow runs on schedule, pulls new Search Console data to BigQuery, and immediately triggers the data flow to process and filter that data into your report. You get fresh synthetic query analysis without manual intervention.
The automated detection system I’ve built gives you ongoing visibility into this hidden traffic in your Search Console data. With just a few minutes of setup in Coupler.io, you can separate artificially generated searches from real user queries and get a clearer picture of your true organic performance.
Why can’t you omit BigQuery for detecting synthetic queries in Search Console?
With artificial intelligence playing a significant role in today’s workflow optimization, why can’t you just connect Search Console data to Claude or ChatGPT, especially since Coupler.io provides AI integrations for this purpose?
The answer is straightforward: while AI tools excel at conversational data analysis, they’re designed for different use cases than systematic query detection. With an AI integration, you get answers to questions like “What are my top performing keywords this month?” or “Analyze the trend in my organic traffic.” These integrations are built for exploratory analysis and for getting AI-powered insights from your data through natural language conversations.
Synthetic query detection requires automation: Our use case needs scheduled, automated filtering that runs continuously in the background. We want to automatically identify and track synthetic queries over time, build historical datasets, and integrate this data into our regular reporting workflows. This is systematic data processing, not conversational analysis.
Scale and performance limitations: While Claude or ChatGPT can analyze data, they’re not designed to process thousands of search queries with complex regex patterns and array functions repeatedly. The AI models work best with focused datasets and specific questions, not bulk data processing operations.
BigQuery’s specialized capabilities: The synthetic query detection logic requires sophisticated string manipulation, such as splitting queries by spaces, counting array elements, and applying multiple filters simultaneously. BigQuery’s SQL engine handles these operations efficiently with functions like ARRAY_LENGTH(SPLIT(REGEXP_REPLACE(...))) that aren’t available through conversational AI interfaces.
Ongoing monitoring needs: We need a system that can run scheduled updates, maintain data consistency, and reliably process new Search Console data as it arrives. BigQuery provides the infrastructure for this kind of automated, ongoing operation.
Think of it this way: an AI tool is your data analyst for insights and exploration, while BigQuery is your data engineer for systematic processing and automation. For synthetic query detection, we need the engineer.
Analyzing your results and next steps
Once your detection system is running, you’ll start seeing patterns in your data. Look for clusters of similar long-form queries that might indicate automated tools researching related information. You might notice certain pages on your site attracting more synthetic query traffic than others. Quite often, these are comprehensive guides, technical documentation, API references, or detailed resource pages that automated systems find valuable.
Use this information to understand which content performs well for automated searches versus human searches. While synthetic queries don’t convert like human traffic, they can indicate different things:
- That your content is authoritative enough for research tools to reference.
- That your site is being monitored by competitors.
- That testing systems are validating your search presence.
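To find the pages that attract the most synthetic query traffic, you can group the report by URL. The sketch below is one possible way to do that in Python, assuming rows shaped like the SQL output above (keyword, url, sum_clicks, sum_impressions). The `summarize_by_page` helper and the sample rows are hypothetical.

```python
from collections import defaultdict

def summarize_by_page(rows):
    """Count flagged queries and sum clicks and impressions per URL, busiest pages first."""
    summary = defaultdict(lambda: {"queries": 0, "clicks": 0, "impressions": 0})
    for row in rows:
        page = summary[row["url"]]
        page["queries"] += 1
        page["clicks"] += row["sum_clicks"]
        page["impressions"] += row["sum_impressions"]
    # Sort by impressions so the most exposed pages come first.
    return sorted(summary.items(), key=lambda item: item[1]["impressions"], reverse=True)

# Hypothetical rows, not real Search Console data.
rows = [
    {"keyword": "comprehensive guide to search console bigquery export setup",
     "url": "https://example.com/guide", "sum_clicks": 2, "sum_impressions": 40},
    {"keyword": "detailed comparison of seo reporting tools for agencies 2025",
     "url": "https://example.com/guide", "sum_clicks": 0, "sum_impressions": 25},
    {"keyword": "what is the best way to automate seo reports",
     "url": "https://example.com/blog/automation", "sum_clicks": 1, "sum_impressions": 30},
]

for url, stats in summarize_by_page(rows):
    print(url, stats)
```

Pages near the top of this list are the ones to inspect first when you look for clusters of similar long-form queries.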
For ongoing monitoring, check your synthetic query report regularly to spot trends and changes in automated search behavior. You might want to create separate tracking for synthetic query performance in your existing SEO dashboards, allowing you to analyze human and automated traffic separately.
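One way to prepare separate tracking is to split your totals into two buckets before they reach the dashboard. The Python sketch below assumes an unfiltered export with query, clicks, and impressions fields and reuses the 6-word threshold from the detection query. The field names, the `split_totals` helper, and the sample rows are hypothetical.

```python
def split_totals(rows, min_words=6):
    """Sum clicks and impressions separately for long and short queries."""
    totals = {
        "likely_synthetic": {"clicks": 0, "impressions": 0},
        "likely_human": {"clicks": 0, "impressions": 0},
    }
    for row in rows:
        # str.split() with no arguments splits on any run of whitespace.
        is_long = len(row["query"].split()) >= min_words
        bucket = totals["likely_synthetic" if is_long else "likely_human"]
        bucket["clicks"] += row["clicks"]
        bucket["impressions"] += row["impressions"]
    return totals

# Hypothetical rows, not real Search Console data.
all_rows = [
    {"query": "seo dashboard template", "clicks": 12, "impressions": 150},
    {"query": "search console to bigquery", "clicks": 8, "impressions": 90},
    {"query": "comprehensive analysis of sustainable marketing strategies B2B SaaS companies 2025",
     "clicks": 0, "impressions": 35},
]

print(split_totals(all_rows))
```

Because word count is only a heuristic, the bucket names say "likely" rather than claiming a definitive classification.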
Ready to build your synthetic query detection system? Set up your Coupler.io account for free.