Access Data from Google BigQuery in SAP HANA Cloud, SAP HANA Database
- How to create a remote source in SAP HANA Cloud, SAP HANA database to Google BigQuery
- How to virtualise data from Google BigQuery to SAP HANA Cloud, SAP HANA database
- A running SAP HANA Cloud, SAP HANA database instance in a trial or production environment.
- A Google BigQuery service account, project and dataset containing at least one table.
- Completion of the previous tutorial in this group is recommended.
In this tutorial, you will learn how to add another non-SAP cloud data source to your multi-source data landscape in SAP HANA Cloud, SAP HANA database. If you have data in Google BigQuery, you can create a remote source connection after some initial preparations in Google BigQuery and then access it in SAP HANA Cloud, SAP HANA database via a virtual table.
In addition to the instructions below, you may also watch this video to follow along:
- Step 1
- Open the Google Cloud Platform and sign in with your account. You need to have an existing service account with a project and a dataset containing at least one table for the steps in this tutorial.
- In Google BigQuery, go to IAM & Admin and click on IAM.
Create a Member or use an existing one and assign either the
BigQuery Adminrole to this member or the roles
BigQuery Data Viewerand
Big Query User.
- Go to service accounts and click on the three dots under Actions in the row of the service account you want to use. Then click on Create Key.
A dialogue will open where you need to specify the key type. Select JSON and click on Create.
Open the file that is created (e.g. using Visual Studio Code or a simple text editor). There you will see all the information of the project and client including the credential information you will need in the next steps to create the connection.
- Copy this information from the
- Step 2
Regarding the SQL code
Please keep in mind that since Google BigQuery is using lower case characters, all object names you refer to in the SQL statements need to be put in
- Open the SAP HANA Database Explorer of your SAP HANA Cloud instance and open a SQL console connected to your database.
- Set the schema you want to work in.
- Open https://www.google.com and copy the Global Sign root certificate to a text editor.
Create a certificate store, also called PSE (personal security environment), if you have not done so already.SQLCopy
CREATE PSE <certificate store name>;
- Create a certificate for Google BigQuery and name it
GBQusing this SQL statement, paste the certificate you retrieved in the previous step as a single line without any line-breaks (already included in this statement):SQLCopy
Next, get the certificate ID of this certificate by running this SQL statement:SQLCopy
SELECT CERTIFICATE_ID FROM CERTIFICATES WHERE COMMENT = 'GBQ';
Add this certificate to the certificate store by inserting the certificate ID into the SQL statement:SQLCopy
ALTER PSE SSL ADD CERTIFICATE <certificate_id>;
Now set the PSE purpose as remote source. This way, all remote sources you create will use the certificates stored in the PSE. Please note, that only one PSE can be set as remote source purpose.SQLCopy
SET PSE SSL PURPOSE REMOTE SOURCE;
- Step 3
To now create a remote source to Google BigQuery, insert the information from the JSON file you downloaded from the Google BigQuery service account (i.e. client-email and private key) into the next SQL statement. In this example, we will name the remote source
-- create a remote source CREATE REMOTE SOURCE BIGQUERY ADAPTER bigqueryrest CONFIGURATION 'server=www.googleapis.com' WITH CREDENTIAL TYPE 'OAUTH' USING 'user=<INSERT CLIENT EMAIL>;password=-----BEGIN PRIVATE KEY-----<INSERT PRIVATE KEY>-----END PRIVATE KEY-----\n';
Once that is done, you should see now the new remote source named
BIGQUERYby selecting remote sources in your catalog in SAP HANA Database Explorer.
If you click on the remote source, a new window will open and, by clicking on Search, you can see all the tables listed as remote objects in the remote source.
- Step 4
To create a virtual table, you can use the statement below. Make sure to insert the name of the virtual table you want to create followed by the name of the remote source (
BIGQUERYin this example) as well as the information from the source side in Google BigQuery:
- project id (found in the JSON file)
- and table name
-- create a virtual table create virtual table <TARGET_VIRTUAL_TABLE> at "<REMOTE_SOURCE_NAME>"."<GBQ_PROJECT_ID>"."<GBQ_DATASET>"."<GBQ_SOURCE_TABLE> ";
Your virtual table is now created, as you can see under tables in your catalog.
To check, if the virtual table is working, you can use this statement:SQLCopy
select * from tpch.vt_gbq_supplier;
Remember, that if you want to query the data in Google BigQuery, all columns included in the query that are located in the virtual table pointing to Google BigQuery need to be written in “double quotation marks” since Google BigQuery has lower case characters set as default.
More resources on working with Google BigQuery
- SAP HANA Cloud Smart Data Access for Google BigQuery Blog Post
- Technical documentation on how to Create a Remote Source to Google BigQuery
You have completed the second tutorial of this group! Now you know how to create a connection from Google BigQuery to SAP HANA Cloud and how to create a virtual table accessing the data stored in Google BigQuery in SAP HANA Cloud.
Learn in the next tutorial how to virtualise data from a managed SAP HANA Cloud, data lake to your SAP HANA Cloud instance.
- Step 5
What SAP HANA Cloud tool can be used to virtualize data from Google BigQuery to SAP HANA Cloud, SAP HANA database?