Back to Blog

Firestore to BigQuery – Everything You Need to Know

When building a demanding application, it’s important to use all the necessary tools to make it work as optimized as possible. While NoSQL databases provide all the required mechanisms for storage and retrieval of non-relational data, there are some cases where more specialized tools like BigQuery are needed.

This article covers Cloud Firestore, a NoSQL cloud database, and all the ways to move your data from Firestore to BigQuery. Understanding it will make your data analysis processes a lot easier.

What is Firestore?

Cloud Firestore is a cloud NoSQL database service provided by Google and Firebase. It’s used by mobile, web, and server applications to store and sync data. It offers great flexibility and scalability features and it keeps your application data in-sync across multiple client applications.

How Firestore stores data

But how does it work, you may ask? Well, Cloud Firestore is a cloud-hosted database that stores data in documents that contain fields mapping to values.

These documents are stored in collections which can be used to better organize your data. You can think of collections like database tables and documents as the rows within a table. The only difference here is that documents (within a collection) support many different data types and they don’t have to contain the same fields. The Cloud Firestore data model supports whatever data structure works best for your app.

Different methods to load data from Firestore to BigQuery

While Cloud Firestore can support the scalability of your application, it’s not ideal when you want to analyze data. This is where BigQuery comes in, which is designed exactly for business agility.

There are two main approaches to loading data from Firestore to BigQuery:

  • Export Firestore data to Google Cloud storage and then import to Google BigQuery
  • Stream data from Firestore directly to BigQuery.

The first approach can be performed in two ways – via the UI and via the BigQuery command line.

We’ll demonstrate each of these methods. For these examples we will use a Firestore database with two collections:

  • Restaurants: Containing a set of documents with various restaurant information (e.g. city, ratings, price range, etc)
  • Visitors: Containing a set of documents with visitor information (e.g. name, time visited, etc.)

Export Firestore data to BigQuery via UI

The process of exporting data is really straightforward, even though there are a few things we need to pay attention to. Before using the import/export service and export your data, you must first ensure the below:

  • The Google Cloud Project where the Firestore database belongs has enabled and working billing details. Even though the export process doesn’t cost anything, the Google Cloud Storage service requires billing details in case the exports exceed the free tier threshold.
  • Have a Google Cloud Storage bucket where we can store the exports in the same region as the Cloud Firestore database location.
  • Make sure you have the required permissions in both Cloud Firestore (Owner, Cloud Datastore Owner, or Cloud Datastore Import Export Admin) and Cloud Storage (Owner or Storage Admin).

After you ensure the above, then the process gets really simple:

  • Click the Export entire database option.
  • Below Choose Destination, enter the name of a Cloud Storage bucket or use the Browse button to select a bucket.
  • Click Export.

This will create a folder within your selected cloud storage bucket that will contain the below:

  • A file with the name of the folder ending in export_metadata
  • A folder with all the namespaces and kinds.

In order to load something in BigQuery, we will need the export_metadata file. So following the below process, we can import everything:

  • From the Create table from, select Cloud Storage and choose an export_metadata file generated during the export. The overall_export_metadata created outside of the all_namespaces folder cannot be used as it’s not recognized by BigQuery. Make sure the file format is Cloud Datastore backup.
  • Provide a table name for the destination table.
  • Leave everything else as is and click “Create table”.

How to export specific collections to BigQuery

There are some cases where you might need to export only a specific collection. To do that, you can follow the above process, and in the 3rd step select “Export one or more collection groups”: 

Move data from Firestore to BigQuery via the command line

As we mentioned earlier, you can also perform the above via the BigQuery command line. You can import your Firestore data using Google Cloud’s bq command. 

First, you need to gather 4 requirements:

  • LOCATION: The region where your BigQuery dataset belongs to.
  • FORMAT: This should always be DATASTORE_BACKUP. This is because Datastore Backup is the correct option for Firestore.
  • DATASET: This is the dataset where the table which is going to contain the data belongs to.
  • TABLE: This is the table name where you are going to load the data. If you do not have a table, it will be created automatically.
  • PATH_TO_SOURCE: Replace it with the Cloud Storage URI of the export file. This URI has the following format of gs://firestore_example/20211011T1256/default_namespace/kind_restaurants/default_namespace_kind_Restaurant.export_metadata

As soon as you have this information, you can execute the below in Google Cloud Command line:

bq --location=europe-west3 load

--source_format=DATASTORE_BACKUP

firestore_example.restaurant_data

gs://firestore_example/20211011T1256/default_namespace/kind_restaurants/default_namespace_kind_Restaurant.export_metadata

Voilà! You now have all your Firestore database exported in BigQuery.

How to automate Firestore to BigQuery export

The above options are a great way to export your Firestore data and import them back in BigQuery as a one-off. But there might be some cases where you need to automate Firestore to BigQuery export process and continuously update your export with new updates. 

Automating Firestore to BigQuery exports can remove an extra process from your pipeline and keep your BigQuery dataset up to date. Every time you need to perform an analysis, you can do it on your latest version of data.

The process of exporting your data and updates in real-time is called streaming and luckily, there is an extension that helps you stream your Firestore data directly to BigQuery.

How to stream Firestore data to BigQuery

In order to stream Firestore data to BigQuery and update your BigQuery data when a document or a collection changes, you can use the “Export Collections to BigQuery” Firebase extension.

This extension exports the documents in a Cloud Firestore collection to BigQuery in real-time and incremental. It scans for document changes in the collection and automatically sends the action to BigQuery as well (document creation, deletion, or update).

In order to install this extension, you can:

  • Go to the extension page.
  • If you have already set up billing hit “Next”
  • Review which APIs will be enabled and hit “Next”.
  • Review all the access rights provided to the extension and hit “Next”.
  • Enter the configuration information:
    1. Cloud functions location: Usually the same location as your Firestore database and BigQuery dataset
    2. BigQuery dataset location: The location where BigQuery dataset can be found.
    3. Collection path: The collection you want to stream. You may use {wildcard} notation to match a subcollection of all documents in a collection, for example, businesses/{locationid}/restaurants. Read more about wildcards.
    4. Dataset Id: The BigQuery dataset where your table will be placed.
    5. Table Id: The BigQuery table where your data will be streamed.
    6. BigQuery SQL table partitioning option: Add in case you need your BigQuery table partitioned.
  • Click “Install extension”.

How to backfill your BigQuery dataset

The above extension will start sending new information to BigQuery after it’s installed. This means that your full dataset will not be exported. You can only backfill your BigQuery dataset with all the documents in the collection if you are familiar with coding and by running this custom import script created by Google specifically for this case.

Please note that the import script should run after installing the above extension, otherwise if there are any writes to the database during the import, they won’t be included in the export.

Wrapping up – Firestore to BigQuery

In this article, we went through all the available options on how to move your data from Google Firestore to Google BigQuery. We’ve seen what Firestore is, how it stores data, and where it is frequently used. Then, we’ve seen the different methods to move your data as a one-time thing and through streaming.

Now that your Firestore data is already in BigQuery, you may be eager to couple it with more of your business information. 

Coupler.io with its BigQuery integrations offers an easy way to automate the import process into BigQuery. You can fetch data from Airtable, Pipedrive, Hubspot, QuickBooks, Shopify, and many other sources.

The imports run automatically, on a schedule that works best for you. We don’t have a Firestore integration yet but we may add it in the future if there’s an interest in our community. Would you use it? If so, let us know via this form.

Back to Blog

Comments are closed.

Access your data
in a simple format for free!

Start Free