With Google BigQuery, you can focus on what matters instead of managing your physical infrastructure. It is essential to manage your cost when you use BigQuery by checking the price of the service that will fulfill your needs. Understanding the cost of Google BigQuery is vital if your company wants to effectively use this data warehouse service.
This article explains BigQuery costs in detail, allowing you to adapt your data budget to your company’s demands and maximize your BigQuery investment.
What are the factors that affect Google BigQuery pricing?
Two major variables affect the cost to the end-user.
- Analysis pricing: This involves expenses incurred for executing SQL commands, user-defined functions, Data Manipulation Language (DML), and Data Definition Language (DDL) statements in BigQuery tables.
- Storage pricing: The amount of data stored in BigQuery determines the cost of storage.
Effect of Storage Cost on Google BigQuery Pricing
- Charges are usually incurred monthly for data stored in BigQuery tables or partitions that are active – meaning that they have been modified in the last 90 days.
- If you haven’t made any changes to your BigQuery tables or partitions in the last 90 days, you may be eligible for a fee reduction of up to 50%.
- Charges may apply depending on the amount of incoming data when utilizing the BigQuery storage APIs.
- Users pay for each 200MB of streaming data consumed by Google BigQuery.
Effect of Query Cost on Google BigQuery Pricing
- With on-demand pricing, you are charged based on the amount of data processed by your queries. You are not charged for failed queries or queries loaded from the cache. Besides, the first 1 TB of query data processed per month is free. Additionally, prices vary by region.
- With flat-rate pricing, you pay a fixed charge regardless of the amount of data scanned by your queries. This price option is ideal for clients that need a predictable monthly cost within a specific budget. Users must purchase BigQuery slots to take advantage of flat-rate pricing—more on that soon.
Google BigQuery storage cost
Storage cost refers to the cost of storing data in BigQuery. You pay for both active and long-term storage.
- Active storage: Any table or partition of a table that has been updated in the past 90 days is considered active storage. At the moment, BigQuery charges a fixed monthly fee of $0.02 per GiB per month for active logical storage. The active physical storage costs $0.04 per GiB per month. The first 10 GiB is free each month.
- Long-term storage: Any table or partition of a table that has not been updated in the last 90 days is considered long-term storage. After 90 days, the price of storage data decreases by 50%. The price for long-term logical storage is $0.01 per GiB per month. Long-term physical storage will cost more – $0.02 per GiB per month. The first 10 GiB is free each month.
Active and long-term storage are equivalent in terms of performance, durability, and availability.
Cost of storing data in BigQuery
You’re charged based on the amount of data you put into BigQuery. To determine the total amount of your data, you must know how many bytes each column’s data type contains. This is the size of BigQuery data types:
BigQuery cost per 1 GiB
| Type of storage | Price | Free tier |
| --- | --- | --- |
| Active logical storage | $0.02 per GiB | The first 10 GiB are free each month |
| Active physical storage | $0.04 per GiB | The first 10 GiB are free each month |
| Long-term logical storage | $0.01 per GiB | The first 10 GiB are free each month |
| Long-term physical storage | $0.02 per GiB | The first 10 GiB are free each month |
It costs $0.02 per GiB per month for BigQuery to keep your data in active storage. Thus, if we keep a 200GiB table for one month, the cost will be (200 x 0.02) = $4.
Note: With the free 10GiB every month, a user will get a total of 210GiB for $4.
When it comes to long-term storage, the cost is much lower than with active storage. For example, long-term storage of a 200 GiB table for one month will cost (200 x 0.01) = $2. If the table is updated, it becomes active storage again and the 90-day period restarts from the beginning.
Also, it is important to note that the price of storage varies by location. For example, selecting Mumbai (asia-south1) as a storage location costs $0.023 per GiB, while using the US multi-region (us) or the EU (europe) costs $0.02 per GiB.
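The storage arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not an official calculator: the function name `monthly_storage_cost` and the `free_gib` parameter are made up for this example, and the rates are simply the ones listed in the table above, so swap in the rate for your own region.

```python
# Prices in USD per GiB per month, copied from the table in this article.
# They vary by region, so treat them as placeholders.
STORAGE_PRICES = {
    ("active", "logical"): 0.02,
    ("active", "physical"): 0.04,
    ("long_term", "logical"): 0.01,
    ("long_term", "physical"): 0.02,
}


def monthly_storage_cost(size_gib, tier="active", billing="logical", free_gib=0.0):
    """Return the monthly storage cost in USD for size_gib of stored data.

    free_gib is the part of the data covered by the free tier
    (hypothetical parameter for this sketch).
    """
    billable_gib = max(size_gib - free_gib, 0.0)
    return billable_gib * STORAGE_PRICES[(tier, billing)]


# 200 GiB kept for one month in active and in long-term storage.
print(round(monthly_storage_cost(200), 2))
print(round(monthly_storage_cost(200, tier="long_term"), 2))
# 210 GiB with the 10 GiB free tier applied.
print(round(monthly_storage_cost(210, free_gib=10), 2))
```

Keeping the rates in one dictionary makes it easy to replace them with the prices for another location, such as the Mumbai rate mentioned above.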
BigQuery cost per 1 TB
The size of your saved data and the data processed by your queries is measured in gibibytes (GiB). 1 GiB is 2^30 bytes or 1,024 MiB. 1 TiB (tebibyte) is 2^40 bytes or 1,024 GiB.
If a GiB of storage costs $0.02 and 1 TB is rounded up to 1,000 GiB (the exact figure is 931.323 GiB), then 1 TB costs approximately $20. With the exact conversion, the cost is about $18.63.
Note: The cost of data changes from location to location. For active storage, the cost of the data is $0.02 and for long-term storage, it is $0.01.
To estimate the cost of 5 TB, we simply multiply the data amount by 1,000 (to convert to GiB, using the same rounding), then multiply the result by $0.02 per GiB.
5 TB * 1000 = 5,000 GiB
5,000 GiB * $0.02 = $100
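To see how much the rounding matters, here is a small sketch that compares the rounded conversion (1 TB taken as 1,000 GiB) with the exact one (1 TB = 10^12 bytes, 1 GiB = 2^30 bytes). The function names are hypothetical, and the $0.02 rate is the active logical storage price quoted in this article.

```python
BYTES_PER_GIB = 2**30   # binary unit used for billing
BYTES_PER_TB = 10**12   # decimal terabyte


def tb_to_gib(tb):
    """Convert decimal terabytes to gibibytes exactly."""
    return tb * BYTES_PER_TB / BYTES_PER_GIB


def storage_cost_for_tb(tb, price_per_gib=0.02, exact=True):
    """Monthly storage cost in USD for tb terabytes of data.

    exact=False uses the article's shortcut of 1 TB = 1,000 GiB.
    """
    gib = tb_to_gib(tb) if exact else tb * 1000
    return gib * price_per_gib


print(round(tb_to_gib(1), 3))                            # GiB in 1 TB
print(round(storage_cost_for_tb(5, exact=False), 2))     # rounded estimate
print(round(storage_cost_for_tb(5), 2))                  # exact conversion
```

The rounded shortcut overstates the bill by roughly 7%, which is fine for a quick budget estimate but worth remembering for larger volumes.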
Query price analysis structure in Google BigQuery
We use analysis pricing in BigQuery to calculate the cost of running queries, including SQL queries, user-defined functions, and scripts, and storage pricing to calculate the cost of storing the data that you load into BigQuery.
BigQuery has two distinct price levels for its users to select from when executing queries. The price levels are:
- On-demand pricing: In on-demand pricing, you pay based on the size of each query and the number of bytes handled by each query. If the query fails, you will not be charged. The first terabyte of query data processed each month is provided free of charge to all users.
- Flat-rate pricing: Under the flat-rate pricing approach, you pay a set fee regardless of how much data your queries scan. This is the best pricing choice for users who want a consistent monthly fee within a set spending limit.
Users may access flat-rate pricing by purchasing BigQuery slots, essentially virtual CPUs used by BigQuery to execute SQL queries. The dedicated slot capacity you buy determines the amount of processing power reserved for all of your queries at any given time, rather than for each query separately. If your requests exceed your dedicated capacity, BigQuery queues individual work units and waits for slots to become available.
As query processing progresses and slots become available, queued work units are dynamically selected for execution, and no additional fees are charged.
Slots are used in both the on-demand and flat-rate pricing, but the flat-rate approach offers you specific control over slots and analytics capacity; for example, in flat-rate pricing, you can choose to reserve slots for:
- 60 seconds: Flex slots
- Monthly: 30 days
- Yearly: 365 days
Location is important to consider as well. For example, a monthly cost of $2,000 will give you 100 slots when purchasing from the EU (multi-region), but in London (europe-west2) 100 slots will cost a monthly fee of $2,500. So the number of slots and the cost on either plan are determined by location.
You can always mix and match the two models to meet your specific requirements. You pay for what you consume with on-demand pricing while with flat-rate pricing, you get an assured capacity at a reduced cost in exchange for a longer-term plan.
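One rough way to compare the two models is to find the monthly query volume at which they cost the same. The sketch below assumes the $2,000 and $2,500 slot prices from the example above, the $6.25 per TB on-demand rate quoted in the FAQ part of this article, and the 1 TB monthly free tier. The function name `break_even_tb` is made up for this illustration, and the comparison ignores differences in query speed and in regional on-demand rates.

```python
def break_even_tb(flat_rate_monthly_usd, on_demand_usd_per_tb, free_tb=1.0):
    """Monthly query volume (in TB) at which on-demand and flat-rate cost the same.

    On-demand cost is modeled as (volume - free_tb) * rate, so the two plans
    meet when volume = flat_rate / rate + free_tb.
    """
    return flat_rate_monthly_usd / on_demand_usd_per_tb + free_tb


# 100 slots in the EU multi-region vs. 100 slots in London (article's figures).
print(break_even_tb(2000, 6.25))
print(break_even_tb(2500, 6.25))
```

If your queries process less than the break-even volume each month, on-demand pricing is likely the cheaper option under these assumptions; above it, reserved slots start to pay off.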
BigQuery cost analysis
Now that you know how much specific BigQuery operations will cost depending on your needs, the next step would be to estimate your expenses for those activities. BigQuery cost analysis begins with calculating your query and storage cost to determine your overall expenses.
How to check Google BigQuery query cost
You may use one of the following techniques as a BigQuery cost estimator to evaluate expenses before executing a query:
- Use the query validator in the cloud console: The query validator is located in the BigQuery console and shows how many bytes the query will process.
- Use the --dry_run option in the BigQuery command-line tool: You can use the --dry_run flag to estimate the number of bytes read when using the bq command-line tool. A dry-run option is also available when working with the API or client libraries. To perform a dry run, add the --dry_run flag to your bq command:
bq query \
  --use_legacy_sql=false \
  --dry_run \
  'SELECT COUNTRY, STATE FROM `project_id`.dataset.shipping LIMIT 1000'
- Use the Google Cloud Pricing (GCP) Calculator: Google offers a price calculator to help you estimate how much money you’ll spend on the resources you might need.
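The dry-run technique from the list above can also be sketched with the Python client library (google-cloud-bigquery). Treat this as an illustration under assumptions rather than a finished tool: it assumes the library is installed and that default credentials and a default project are configured, the helper names are made up, and the $6.25 rate is the on-demand price quoted later in this article, taken here to be per TiB.

```python
def estimate_query_bytes(sql):
    """Dry-run a query and return the number of bytes it would process."""
    # Imported inside the function so the cost helper below can be used
    # even when the client library is not installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes default credentials and project
    job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=job_config)  # a dry run does not execute the query
    return job.total_bytes_processed


def bytes_to_on_demand_cost(num_bytes, usd_per_tib=6.25):
    """Convert a byte count to an on-demand cost estimate in USD.

    usd_per_tib is an assumption; check the rate for your region.
    The free tier is not taken into account.
    """
    return num_bytes / 2**40 * usd_per_tib
```

A typical use would be to pass the query from the bq example to `estimate_query_bytes` and feed the result into `bytes_to_on_demand_cost` before deciding whether to run it.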
How to estimate using the BigQuery cost calculator
The following steps outline how to estimate your Google BigQuery costs using the GCP pricing calculator for clients with the on-demand pricing model:
- Go to the main page of your BigQuery console.
- When you enter a query, the query validator (the green tick) verifies it and estimates how many bytes it will process.
As you can see, this query will use approximately 514.4 GiB.
- The next step is to access the GCP Pricing Calculator.
- Choose BigQuery as your product and on-demand pricing as your pricing method.
- Complete the on-screen form with all the necessary information, as shown in the picture below.
Because we haven’t used up our 1TB free tier for the month yet, executing our 514.4 GiB query costs nothing. However, if you need to run this query every day for the next month, you will use 514.4 * 30 = 15,432 GiB; this will quickly put us above the 1TB free tier.
Our total charge will now be $70.35 per month: 15,432 GiB is about 15.07 TiB, the first 1 TiB is free, and the remaining 14.07 TiB is billed at $5 per TiB in this estimate.
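The $70.35 figure can be reproduced with a short calculation. This is a sketch with a hypothetical function name: the $5 per TiB default is the rate that matches the calculator's result in this example, and the rate is a parameter so that you can substitute the current price for your region.

```python
def monthly_on_demand_cost(gib_per_run, runs_per_month, usd_per_tib=5.0, free_tib=1.0):
    """Monthly on-demand query cost in USD after subtracting the free tier."""
    total_tib = gib_per_run * runs_per_month / 1024
    billable_tib = max(total_tib - free_tib, 0.0)
    return billable_tib * usd_per_tib


# One run of the 514.4 GiB query stays inside the free tier.
print(round(monthly_on_demand_cost(514.4, 1), 2))
# Running it daily for 30 days does not.
print(round(monthly_on_demand_cost(514.4, 30), 2))
# The same workload at the $6.25 rate quoted in the FAQ below.
print(round(monthly_on_demand_cost(514.4, 30, usd_per_tib=6.25), 2))
```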
In flat-rate pricing, you are not billed for bytes processed; rather the queries you run are billed based on your purchased slot. In the example below, we bought 100 slots at the estimated cost of $2,000. Within this limit, we will process the 514.4GiB-worth of queries at no additional charge.
If you send a heavier load of queries, let’s say 1,000 GiB, with the same capacity of 100 slots, the pricing won’t change. You will still only pay $2,000 for the slots you have purchased.
The query processing may take a bit longer, though. If the available slots are insufficient to process a given query, the query will be queued and processed later when slots are available.
The slot capacity you’ve purchased determines the maximum capacity split among all your queries, rather than the processing power for each query separately.
The estimated cost offered by the tools above may differ from the actual expenses for the following reasons:
- A query clause that performs data filtering, like a WHERE clause, may substantially decrease the amount of data read.
- After the estimate is given, additional or deleted data may increase or reduce the number of bytes read when the query is performed.
How to estimate BigQuery storage and query costs?
To calculate BigQuery storage and query cost, we start by collecting the necessary information to estimate our costs:
- Number of Users
- Number of Queries
- Average Data Usage
As an example, imagine we have a dataset that we imported into BigQuery from Xero or Hubspot using Coupler.io. It’s used by 10 users per day, each running five queries per day, with an average data usage of 2 GiB per query. We calculate the cost per month, which we’ll assume has 30 days.
We can then take those parameters and apply a basic calculation to estimate our average monthly cost with BigQuery.
MONTHLY QUERY DATA USED = 10 * 5 * 2 GiB * 30 = 3,000 GiB, or roughly 3 TB
To calculate the BigQuery storage price, assume the stored data is also about 3 TB. At the time of writing, 1 TB of active storage costs around $20 per month (the exact price depends on the region chosen), so we simply multiply 3 TB by $20 to get the storage cost per month.
3 * 20 = $60
To calculate the on-demand query price for the same 3 TB of query data, we use the on-demand rate of $6.25 per TB and multiply it by 3 to get the query price per month.
3 * 6.25 = $18.75
The price may decrease if you haven’t used the 1TB added to your account for free each month.
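The estimate above can be turned into a small reusable calculation. This is only a sketch: the function names are invented for this example, 1 TB is rounded to 1,000 GiB as elsewhere in this article, and the per-TB rate is a parameter because on-demand prices vary by region. The default of $6.25 is the figure quoted in the FAQ part of this article.

```python
def monthly_query_volume_gib(users, queries_per_user_per_day, gib_per_query, days=30):
    """Total data processed per month, in GiB."""
    return users * queries_per_user_per_day * gib_per_query * days


def on_demand_cost(volume_gib, usd_per_tb=6.25, free_tb=0.0, gib_per_tb=1000):
    """On-demand query cost in USD; free_tb is the unused part of the free tier."""
    billable_tb = max(volume_gib / gib_per_tb - free_tb, 0.0)
    return billable_tb * usd_per_tb


volume = monthly_query_volume_gib(users=10, queries_per_user_per_day=5, gib_per_query=2)
print(volume)                               # GiB per month
print(on_demand_cost(volume))               # without the free tier
print(on_demand_cost(volume, free_tb=1.0))  # with the full 1 TB free tier unused
```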
How much does it cost to process 1TB in BigQuery?
For on-demand query pricing, 1TB costs $6.25.
How much does it cost to run a 12 GiB query in BigQuery?
12GiB is approximately 0.01288 TB. Since 1TB costs $6.25, 12 GiB will cost:
6.25 * 0.01288 = $0.08
How much does it cost to run a 100 GiB query in BigQuery?
100 GiB is approximately 0.107 TB. To find out how much it costs to run 100 GiB, we make the following calculation:
6.25 * 0.107 = $0.67
Do views cost extra in BigQuery?
No. Views are virtual tables defined by SQL queries. You can query a view in the same manner that you query a table.
Views may only provide data from the tables and fields that are explicitly requested by the user when queried. The total quantity of data in all table fields referred to directly or indirectly by the top-level query determines how much a query will cost to execute.
There is no fee for adding or removing a view.
BigQuery import data cost
Data ingestion: BigQuery supports two data ingestion mechanisms.
- In a single batch process, you can load the source data into one or more BigQuery tables.
- You can stream a single record at a time or in tiny batches.
Data extraction: BigQuery supports two data extraction mechanisms.
- In bulk, you can do a batch export of data from tables to Cloud Storage.
- Using the Storage Read API, you can stream reads of table data.
Bulk imports into BigQuery and exports of data from BigQuery do not cost anything by default, because these tasks make use of a shared resource pool.
Neither the capacity of this shared pool nor the throughput you will experience is guaranteed by BigQuery. Dedicated slots for running load tasks are also available for purchase.
BigQuery API cost
The pricing model for the Storage Read API is on-demand pricing. On-demand pricing is entirely usage-based, with all customers receiving a complimentary tier of 300TB per month.
However, you will be charged on a per-data-read basis on bytes from temporary tables as they are not considered part of the 300TB free tier. Even if a ReadRows function fails, you pay for all of the data read during a read session.
If you cancel a ReadRows request before the stream’s conclusion, you will be billed for any data read before the cancellation.
Tips for optimizing your BigQuery cost
With on-demand pricing, you can cut costs by reducing the amount of data a query has to process, without affecting performance. The same is true with flat-rate pricing: regardless of how many slots you buy or what kind of commitment plan you choose, you can optimize your queries and cut down on slot use.
BigQuery clustering to reduce cost
The use of clustering and partitioning can help to reduce the amount of data processed by queries.
To limit the number of partitions scanned when querying clustered or partitioned tables, use a predicate filter. You may execute queries on a subset of your data relevant to your query and reduce the query cost by partitioning your data based on the date.
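To make the effect of a date filter concrete, here is a toy model of partition pruning. It does not talk to BigQuery at all: the table, its 90 daily partitions of 2 GiB each, and the function name are all hypothetical, and the point is only to show how the bytes scanned (and therefore the on-demand cost) shrink when a query touches fewer partitions.

```python
from datetime import date, timedelta


def bytes_scanned(partition_sizes, start=None, end=None):
    """Sum the sizes of the daily partitions a query would read.

    With no date filter every partition is scanned; with a filter, only
    partitions whose date falls inside [start, end] are read.
    """
    return sum(
        size
        for day, size in partition_sizes.items()
        if (start is None or day >= start) and (end is None or day <= end)
    )


# Hypothetical table: 90 daily partitions of 2 GiB each.
first_day = date(2024, 1, 1)
partitions = {first_day + timedelta(days=i): 2 * 2**30 for i in range(90)}

full_scan = bytes_scanned(partitions)
last_week = bytes_scanned(partitions, start=first_day + timedelta(days=83))

print(full_scan // 2**30, "GiB scanned without a date filter")
print(last_week // 2**30, "GiB scanned with a filter on the last 7 days")
```

In this model the filtered query reads 7 of the 90 partitions, so it processes less than a tenth of the data that the unfiltered query does.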
BigQuery custom cost control
If you have many BigQuery projects and users, you can control expenses by setting a custom quota: a limit on the quantity of query data users can process each day.
Custom quotas set at the project level restrict the total amount of data that all users within that project may use. Custom user quotas are assigned to individual users or service accounts within a project.
How to stream data into BigQuery without incurring a cost?
Use streaming inserts only when quick access to your data is required. Loading data into BigQuery is entirely free, whereas streaming data into BigQuery incurs a fee. Unless you need instant access to your data, it is better to load it than to stream it.
Other best practices for optimizing BigQuery price
Whether you import data from Google Sheets to BigQuery using Coupler.io or fetch it from any other source using numerous BigQuery integrations, you must follow certain best practices to optimize cost in BigQuery. These include:
- Use SELECT * sparingly in your queries and query only the information you require.
- When you want to see a small sample of your data, use BigQuery’s preview function instead of running a query.
- Before executing any query or storage activity, make sure you understand the associated charges using the GCP Price Calculator.
- Split your query into smaller parts if you’re planning to query a large dataset. It’s better to run smaller queries one by one. It will lower the amount of data to be read, which effectively saves you money.
Thanks for reading!