A data store, such as Google BigQuery, is used to consolidate data from various sources into a single location. It is fully operational and maintenance-free, and integrates seamlessly with the Google Cloud Platform. Tableau, with its intuitive, drag-and-drop user interface, will then be used to interpret and visualize the data in a user-friendly format.
Tableau and Google BigQuery’s integration enables businesses of all sizes to reach the highest-quality data visualizations, communications, and research, as well as professionally-organized and flexible scientific data warehousing. This article attempts to answer all of your concerns about integrating Google BigQuery and Tableau.
Different ways to connect Tableau to BigQuery
There are two ways by which Tableau connects to BigQuery.
- Live Connection
- Extract Connection
In a live connection, the loaded Tableau workbook sends a query to the BigQuery API, which returns data to Tableau. All of this is achieved in real-time as the user communicates with Tableau. This type of connection is mostly used when:
- Your data changes frequently, which necessitates real-time reports.
- The data set is large enough to require a long extraction process.
For an extract connection, Tableau can periodically (monthly, weekly, or hourly, for example) query BigQuery for all the underlying details of the tableau report. It then saves this data in a memory analysis engine. When a user loads or interacts with a report, Tableau routes the request to the memory engine rather than to Google’s BigQuery. The extract connection is often used in the following situations:
- When it is important to provide a fast load time for workbooks and filter changes.
- Data is not updated on a daily or continuous basis
- The scope of your report is minimal, as well as the amount of data collected
- Query cost is expensive, as BigQuery charges you per query and the amount of data scanned
We’ll explain how you can choose the desired connection below.
How to set up a Tableau BigQuery integration
Before you start, you will need this necessary information.
- Fundamental knowledge of Google BigQuery. You can check our blogpost on how to setup your BigQuery.
- A data source from BigQuery that you want to connect with Tableau.
- Fundamental knowledge of Tableau.
Tableau and Google BigQuery configuration
- Begin by launching either your Tableau Desktop, Tableau Online, Tableau Prep or Tableau Server.
- Select “Google BigQuery” in the “To a Server” section.
- A default browser window where you need to provide the details for connecting to Google BigQuery opens up.
- Sign in to your Google account with your email and password. If you have more than one account, you will need to sign in with the account that contains the Google BigQuery data that you will want to upload in Tableau.
- Provide Tableau access to your Google BigQuery data by clicking on Accept. Then close the browser.
Data source configuration
After successfully logging in and connecting your Tableau and Google BigQuery, the data source page is loaded. We hope that you already have some data in your dataset, otherwise you can easily import data to BigQuery from different sources.
- Select the name option and assign the database you are using a specific name. It is considered good practice to provide a distinctive name, as it facilitates the identification of the database from which data is retrieved.
- Optionally, you can select Billing project from the billing project drop-down list. If you do not want to select a billing project, then “empty project” will automatically appear in that area by default.
- Pick a project from the Project drop-down column and choose public data. This provides a connection to a sample data set in BigQuery.
- Then, select the desired dataset from the Dataset dropdown list.
- Finally, under Table, find and choose the BigQuery table to connect to.
This is how Google BigQuery can be linked to Tableau. And here is how you can choose between live and extract connections.
Setup live or extract BigQuery to Tableau connection
In the top-right corner, you can change the connection type from Live to Extract. It usually defaults to Live connection.
If you need to select Extract connection, do the following:
- Click the Edit icon to view the Extract Data dialog box.
- You can configure the following options:
- How to store your data:
- Logical Tables
- Physical Tables
- How much data to extract
- Set filters to specify the amount of data to extract
- Aggregate the data in the extract
- Choose the number of rows to extract
- How to store your data:
- Click OK when you have completed the configuration.
- Next, click the sheet tab below to begin the extraction process.
- In the dialog box, select a location to save the extract, give the extract file a name, and then click Save.
That’s it. Now, are you ready to test your BigQuery to Tableau connection?
Testing your connection
To test your connection, you can use Custom SQL to write a new query. Custom SQL features allow you to narrow the scope of your SQL queries, rather than querying the whole database.
- From the left panel, click the New Custom SQL option.
- This opens up a dialog box where you’ll need to input your query.
- Click on OK to run.
- To see your data, click Update Now.
Note: While Google BigQuery supports both Standard SQL and Legacy SQL, while building a connection in Tableau, it defaults to using Standard SQL. However, you can conveniently swap between the type of SQL you need.
Benefits of using Tableau with BigQuery
- Create awesome dashboards that link to your Google BigQuery data in minutes and keep your business up to date.
- Combine Google BigQuery’s cloud-native performance with Tableau’s immersive visualization tools to determine the value of a project.
- Tableau and Google BigQuery make it possible for daily users to examine large numbers of rows without writing a single line of code and with no server maintenance.
- Ease of sharing reports and insights to anyone on different platforms and devices.
- Business users have real-time access to key data used to make decisions quickly, without needing to involve analysts to generate basic insights.
Techniques to optimize Tableau & BigQuery performance
Techniques for Tableau
Use of a Performance Recorder
A Performance Recorder is a robust built-in instrument that allows the monitoring and optimization of sluggish queries and workbooks. This is accomplished by keeping track of the time required to fulfill a request, and computing the structure of a particular workbook. When a sluggish query is identified, it is often possible to address the performance problem by revisiting the data model.
The Workbook, Dashboard, and Worksheet columns in the Timeline display the events while recording performance.
Events with longer lengths will assist you in determining when to begin while attempting to accelerate your workbook.
Use of customization attributes
By configuring parallel queries with customization attributes, you can optimize the output of massive result sets returned from BigQuery to Tableau. These customization attributes may be used in a published workbook or database, as long as they are defined before publishing to Tableau. For more information on this, refer to this guide.
Use of context filters
Setting up context filters improves accuracy when filtering a massive database. If you’re creating filters that substantially minimize the data collection size, and you plan to use them on more than one data view, you can set one or more context filters as this helps to improve performance.
Search for warnings
When you attempt to put a big dimension (with several members) on any shelf, Tableau shows a performance alert dialog box. If you want to include all members, there is a possibility that results may suffer significantly.
Disable automated updates
After placing a field on the shelf, Tableau instantly produces the view by querying the data source. If you create a dense data display, the queries can be lengthy and greatly impact device efficiency. You should, however, instruct Tableau to disable queries when you build the view. When you are happy to see the result, you can then re-enable queries.
Use of aggregate measures
If your views are slow to develop, check that you are using aggregated rather than disaggregated measures. When views are sluggish, that is normally because you are attempting to display a large number of rows of data simultaneously. By aggregating the results, you will reduce the number of rows.
Techniques for Google BigQuery
In BigQuery, you typically want to denormalize the data structure to enable super-fast querying. While JOINs on small datasets are possible with BigQuery, they’re not as performant as a denormalized structure.
Sharding is the process of dividing a table into smaller partitions. This contributes to the simplification of data processing and the improvement of query efficiency. Additionally, BigQuery supports clustering over partitioned tables, which is advantageous if your data is already partitioned by a date or timestamp column or if your queries include filters or aggregation on specific columns.
Certain types of data adapt themselves inherently to date partitioning – for example, log data or other data whose records have an increasing timestamp. In the scenario below, partition the BigQuery tables according to their creation date and use the creation date in the table name. To take advantage of this, you’ll need to use Tableau’s custom SQL.
For example, name your tables something like
mytable_20200502, etc. Then, when you choose to run a query that includes a date filter, call the Wildcard Table feature in BigQuery:
SELECT name FROM 'myProject.myDataSet.mytable_*' WHERE age >= 20
To make use of the wildcard, your tables must be named as follows:
Specify destination table when running similar queries
Although database caching is advantageous if you frequently run equivalent queries, it is ineffective if you frequently run related but slightly different queries (e.g., changing only the values in a WHERE clause between query runs). Execute a query on the source table and copy the records that will be accessed frequently to a new destination table. Then, execute queries against the newly generated destination table.
For instance, suppose you intend to run three queries with three distinct WHERE parameters
WHERE col1 = "x" WHERE col1 = "y" WHERE col1 = "z"
Execute a query against the source table and export the results to a destination table:
SELECT col1 FROM source WHERE col1 = "x" OR col1 = "y" OR col1 = "z"
By concatenating the WHERE clauses, we capture all important records.
Our current destination table could be somewhat smaller than the source table. Since BigQuery costs by the sum of data processed in a query, running additional queries against the current destination table saves money and improves performance in comparison to running them explicitly against the source table.
Best practices to consider when using Tableau and BigQuery
- Use Tableau server: To scale Tableau to maximum functionality, you must enable Tableau Server in your environment as this reduces latency(delay in transferring of data). Data administration, data protection, and data management are part of what Tableau Server offers.
- Use federated or external data sources: You can benefit from the ability of BigQuery to query external data sources for data stored in Google cloud, and point them to Tableau. This reduces the volume of data that must be sent to Tableau for analysis.
- Process massive data with a live connection: It is necessary, when dealing with large data sets, to keep the connection style as a live connection instead of an extract connection unless you have a specific reason to use an extract connection.
Google Sheets to BigQuery to Tableau
On the Tableau community, users share different interesting cases and ask for help. We stumbled upon one of them and decided to offer our solution. So, a user has a Google Sheets spreadsheet that populates a BigQuery table. Then, she needs to connect a specific BigQuery view to Tableau for further data processing. Here is how you can have this flow automated:
- Export data from Google Sheets to BigQuery on a schedule using Coupler.io.
- Connect BigQuery to Tableau as described above using either live or extract connection.
With Coupler.io, a data import tool, you can fetch data from apps, APIs, and other sources, and import it into Google Sheets or BigQuery. It’s very easy to use and needs no coding. The main feature is that the data export can be automated on a custom schedule – for example, every week at 9 a.m. For more on this, refer to our blog on how to Export queries from BigQuery… and get your data into BigQuery.
Numerous data enthusiasts are taking advantage of this BigQuery to Google Sheets integration to streamline their data transfer process. Others benefit from the Tableau to Google Sheets integration that we’ve also blogged about. Hopefully, this method will prove to be valuable to you as well. Best of luck!Back to Blog