BETWEEN BigQuery How To Guide
A wide range of queries are available through BigQuery to assist us in getting relevant information from large sources of data. For example, there may be an instance when we need to extract a range of values from all of the columns in a table that contains information on goods shipped to your store. Here is where we would use the Between operator to perform this kind of data operation. In this article, we go further into how to use Between in BigQuery.
BigQuery Between operator
To determine if an expression falls within a certain range of values, the BigQuery Between operator is used. Due to the inclusive nature of this operator, it includes both the beginning and ending values of the range. The values may be of any kind, including text, numeric, date data, etc. This operator can be used in conjunction with the SELECT, WHERE, INSERT, UPDATE, and DELETE statements.
BigQuery Between Syntax
This operator is most commonly used in conjunction with the WHERE command. The syntax for this is as follows:
SELECT column_name(s) FROM table_name WHERE column_name BETWEEN value1 AND value2
SELECTstatement indicates which columns are to be obtained.
- Using the
FROMstatement, you may specify which table contains the columns you want to get.
BETWEENoperator is indicated in the
column_nameargument represents the column where we want to apply the range condition for the start value and the end value.
BigQuery Between Example
To get an idea of how the Between operator works, let’s consider this example: we have this table that contains the name and price of goods.
We will use the Between operator to select all the names of goods between ‘Glossy Pink‘ and ‘Shiny Brown‘. In this instance, the Between operator will return a result with all the names that fall within the range and also include the specified names.
SELECT * FROM `original-glyph-321514.table1.goods` WHERE Name BETWEEN 'Glossy pink' AND 'Shiny brown'
From the example above we can see that the Between operator found the values within the range ‘Glossy Pink‘ and ‘Shiny Brown‘. It doesn’t return ‘black velvet‘ because alphabetically B comes before G and does not fall in the range of G to S. Let’s look into more examples.
BigQuery Between two dates
To show how BigQuery Between works, we used Coupler.io to import an example dataset including date data from Google Sheets to BigQuery.
Coupler.io is a solution for automating data exports from a variety of applications and sources to BigQuery, Excel, or Google Sheets. After import, it appears as follows:
Note: With Coupler.io you can import from many sources to BigQuery so be sure to check out other BigQuery integrations.
With the table above, we can look into more examples of how the Between operator can be used.
To find the BigQuery between two dates let’s check out the statement below. We specified two random dates from the table and we want to return all the dates between them using the Between operator.
SELECT date FROM `original-glyph-321514.table1.contractors` WHERE date BETWEEN '12/11/2021' AND '20/11/2021'
From the result above we get all the dates between the two specified dates.
SQL BigQuery timestamp between dates
Still referencing the data from the previous table, we will select two timestamps: ‘2021-11-01 12:35:43‘ and ‘2021-11-12 14:12:11‘. It is important to note that time and date are both included in the timestamps. For this example, we want to extract only the date from the timestamp range. We can do that using the Between operator as seen below:
SELECT date FROM `original-glyph-321514.table1.contractors` WHERE date BETWEEN ' 2021-11-01 12:35:43 ' AND '2021-11-12 14:12:11'
Google BigQuery between dates inclusive
Both the start and finish values are included when using the Between operator. It is possible to use the Between operator to include or exclude values.
In the example shown below, we will find the dates between ‘01/11/2021‘ and ‘16/11/2021‘. The Between operator will include the specified values in the result that it produces.
SELECT date FROM `original-glyph-321514.table1.Contractors` WHERE date BETWEEN '01/11/2021' AND '16/11/2021'
From the result above, the dates ‘01/11/2021‘ and ‘16/11/2021‘ were returned alongside other values.
BigQuery range between 2 dates
In this example, we will still be referencing our table above. Using the Between operator, we can get a range of values between two specified values. To find the range between the two dates ‘10/11/2021‘ and ‘15/11/2021‘ we will use the following statement below:
SELECT date FROM `original-glyph-321514.table1.contractors` WHERE date BETWEEN '10/11/2021' AND '15/11/2021'
Can I query between datasets for BigQuery?
For numerous tables or datasets, the Between operator is not immediately relevant. However, there are alternative methods to query datasets, such as combining the tables together and querying them using the JOIN clause. Also, you can query the two tables together using the UNION operator. Let’s have a look at an example of how to achieve this below.
BigQuery difference between 2 tables
In this example, we’re going to find the difference between the values in two tables. For this, we will use the two tables below, ‘Class 1‘ and ‘Class 2‘. To find the difference between these two tables we will be using the UNION DISTINCT operator. This will compare the two tables and return all the values except duplicate values.
We will use the UNION DISTINCT operator to find the difference in the ‘age‘ value between the two tables.
SELECT age FROM `original-glyph-321514.table1.Class1` UNION DISTINCT SELECT age FROM `original-glyph-321514.table1.Class2`
BigQuery difference between 2 rows
In this example, we want to calculate the difference between values in different rows. Let us consider this Production Table:
In this table, we have the production value and the year for each value. To determine the difference between the current and previous rows in the production value, we will order the data by year.
BigQuery offers the
first_value function which returns the first row given the order provided by us. For this example, the order we provide is ‘year‘.
Note: Each row that comes after the first_value is relative to the first row. Here is how it works:
SELECT `original-glyph-321514.table1.Prod`.country, `original-glyph-321514.table1.Prod`.year, `original-glyph-321514.table1.Prod`.production, `original-glyph-321514.table1.Prod`.production - first_value(`original-glyph-321514.table1.Prod`.production) OVER (ORDER BY `original-glyph-321514.table1.Prod`.year) AS diff_first FROM `original-glyph-321514.table1.Prod` WHERE `original-glyph-321514.table1.Prod`.country = 'USA' LIMIT 10
From our result, the first value given by BigQuery is 0. To verify our results we can do the simple math of the rows in the production value. So, for the next row, it would be ‘
1045-1039 = 6‘ then ‘
1050-1039 = 11‘ … etc.
From this result, we can see that for each row the results are relative to the first-row value.
Pros and Cons of the BigQuery Between Operator
One of the primary limitations of the Between operator is that it cannot simultaneously query several databases and tables. Also, if you use the Between operator with any other data type, such as boolean, string value range, or date, you will get no result.
Aside from these limitations, the major benefits of the Between operator are:
- It is useful for making comparisons with the present timestamp.
- It enables you to set a test range.
- The SQL Between condition makes it simple to determine if an expression falls inside a specified range of values (inclusive).
From the Between operator syntax, you can customize your queries to retrieve the type of information that you need. Also, you can use Coupler.io to easily import data from any sources to BigQuery. Have fun testing out your queries!Back to Blog