Virtualize Data from a Managed SAP HANA Cloud, Data Lake to an SAP HANA Cloud, SAP HANA Database
- How to create a remote source in SAP HANA Cloud, SAP HANA database to a managed data lake
- How to virtualize data from a managed data lake to SAP HANA Cloud, SAP HANA database
- Two running SAP HANA Cloud, SAP HANA database instances in a production environment, one having a managed data lake.
- Completion of the previous tutorial in this group is recommended.
- Since this tutorial requires two SAP HANA Cloud, SAP HANA database instances, you need a production environment of SAP HANA Cloud, SAP HANA database, because the trial allows only one instance.
In this tutorial, you will learn how you can connect an SAP HANA Cloud, SAP HANA database instance to multiple different data lakes. When you provision an SAP HANA Cloud, SAP HANA database instance, you can provision a managed data lake as well.
If you have multiple SAP HANA database and data lake instances, you can easily virtualize data from one data lake to another or to another SAP HANA database instance. For example, say you have two SAP HANA database instances in SAP HANA Cloud, each with a managed data lake. If you want to virtualize data from one data lake to the SAP HANA database instance that does not manage it, you can connect to it via its associated SAP HANA database instance. The data lake is then reached through that SAP HANA database instance.
Besides the options described in this tutorial, you can also connect directly to a standalone SAP HANA Cloud, data lake. To learn more, we recommend this mission.
The following steps show you how to connect an SAP HANA Cloud, SAP HANA database instance to an SAP HANA Cloud, data lake that is managed via a second SAP HANA Cloud, SAP HANA database.
- Step 1
As with other remote sources, connecting to an SAP HANA Cloud, data lake requires a certificate to be stored in your PSE. In this step, you will obtain the certificate you need.
First, in SAP BTP Cockpit or SAP HANA Cloud Central, navigate to the SAP HANA database instance that corresponds to the SAP HANA Cloud, data lake you want to connect to. Copy the SQL endpoint of this SAP HANA database instance and paste it into a text editor.
In this SAP HANA database instance, make sure to have a user who has permissions to access the database and data lake instances. You can read here how to create users in SAP HANA Cloud, SAP HANA database. You will need the credentials of this user later in this tutorial.
Next, you need to obtain a certificate and put it in a certificate store so it can be used for creating a remote source. Download the DigiCert Global Root CA certificate, for example with curl:
curl -O https://cacerts.digicert.com/DigiCertGlobalRootCA.crt.pem
After the file has been downloaded, open it with a text editor such as Apple TextEdit and copy the certificate string to your clipboard.
- Step 2
- Now, in the SAP HANA Database Explorer that is connected to your SAP HANA database instance, open a SQL console.
Create a certificate store, also called PSE (personal security environment), if you have not done so already:
CREATE PSE <certificate store name>;
- Create a certificate for the SAP HANA Cloud, data lake you want to connect to with a CREATE CERTIFICATE statement, labeling it through its comment (HDL in this example). The certificate string you retrieved in Step 1 must be pasted into the statement as a single line without any line breaks.
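One way to prepare that statement is sketched below in shell. It assumes the statement has the form `CREATE CERTIFICATE FROM '<certificate>' COMMENT '<label>';` (the `COMMENT = 'HDL'` lookup in the next statement suggests the label is stored as the comment). The function names `flatten_pem` and `make_create_certificate_sql` are hypothetical helpers, not part of any SAP tooling.

```shell
#!/bin/sh
# Hypothetical helpers (illustration only, not part of any SAP tooling).

# Print the PEM file given as $1 with all line breaks removed,
# so the certificate becomes a single line.
flatten_pem() {
  tr -d '\r\n' < "$1"
}

# Print a CREATE CERTIFICATE statement for PEM file $1, labeled with comment $2.
# Assumed statement form: CREATE CERTIFICATE FROM '<pem>' COMMENT '<label>';
make_create_certificate_sql() {
  printf "CREATE CERTIFICATE FROM '%s' COMMENT '%s';\n" "$(flatten_pem "$1")" "$2"
}

# Usage: runs only if the file downloaded in Step 1 is in the current directory.
if [ -f DigiCertGlobalRootCA.crt.pem ]; then
  make_create_certificate_sql DigiCertGlobalRootCA.crt.pem HDL
fi
```

The printed statement can then be pasted into the SQL console. Check that the BEGIN and END markers are still present in the flattened string before running it.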
Next, get the certificate ID of this certificate by running this SQL statement:
SELECT CERTIFICATE_ID FROM CERTIFICATES WHERE COMMENT = 'HDL';
Add this certificate to the certificate store by inserting the certificate ID into the SQL statement:
ALTER PSE SSL ADD CERTIFICATE <certificate_id>;
Now set the purpose of the PSE to remote source (unless you already did so when following the instructions in the previous tutorial). This way, all remote sources you create will use the certificates stored in the PSE. Note that only one PSE can be assigned the remote source purpose.
SET PSE SSL PURPOSE REMOTE SOURCE;
- Step 3
To create the remote source, copy and paste the following statement into your console and replace the placeholders in angle brackets with your specific information (see instructions below the code).
-- create a remote source
CREATE REMOTE SOURCE <REMOTE_SOURCE_NAME> ADAPTER hanaodbc
CONFIGURATION 'Driver=libodbcHDB.so;ServerNode=<INSERT_HC_INSTANCE_SERVER_NODE>; encrypt=TRUE;'
WITH CREDENTIAL TYPE 'PASSWORD' USING 'user=<USERNAME>;password=<PASSWORD>';
a. Specify a name for the remote source connection.
c. For CONFIGURATION, paste the endpoint information you copied in step 1.
d. Enter the credentials of the user you prepared in step 1.
e. For more detailed documentation on how to create a remote source to your SAP HANA Cloud, SAP HANA database, click here.
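To see what the statement looks like once the placeholders are replaced, the following shell sketch substitutes values into the template from this step and prints the result. The helper name `make_remote_source_sql` and all example values (remote source name, endpoint, user, password) are made up for illustration; use the endpoint and user from step 1 instead.

```shell
#!/bin/sh
# Hypothetical helper: prints the CREATE REMOTE SOURCE statement from this step
# with the placeholders filled in.
#   $1 = remote source name      $2 = SQL endpoint (host:port) copied in step 1
#   $3 = database user           $4 = that user's password
# Values containing a single quote or a semicolon would break the statement
# and are not handled in this sketch.
make_remote_source_sql() {
  cat <<EOF
CREATE REMOTE SOURCE $1 ADAPTER hanaodbc
  CONFIGURATION 'Driver=libodbcHDB.so;ServerNode=$2; encrypt=TRUE;'
  WITH CREDENTIAL TYPE 'PASSWORD'
  USING 'user=$3;password=$4';
EOF
}

# Example with made-up values; replace all four with your own.
make_remote_source_sql HDL_VIA_HANA "my-instance.example.invalid:443" FEDERATION_USER 'ChangeMe123'
```

Because the output contains the password in clear text, it is best run in a private terminal session and not saved to a shared file.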
Now, when you click on remote sources in your catalog, you should see the other SAP HANA Cloud, data lake in your remote sources in the bottom left panel.
- Step 4
- Click on the new remote source you created in the bottom left panel.
- In the panel that opens, use the dropdown menu at the top to select the schema that contains the virtual tables you want to federate data from. Then click on Search.
- Here, you can now see the virtual tables in your SAP HANA Cloud instance that are pointing to the source tables in the SAP HANA data lake you connected to.
- To federate the data from those virtual tables to the SAP HANA database instance you are connecting it to, you need to create a virtual table over a virtual table. This is called federation on federation. It means that the virtual table will point to the virtual table in the SAP HANA Cloud, SAP HANA database instance which in turn points to the source table in the SAP HANA Cloud, data lake. To create the virtual table, run this statement:
-- create a virtual table on a virtual table (federation over federation)
create virtual table <TARGET_TABLE_NAME> at "<REMOTE_SOURCE_NAME>"."<NULL>"."<SCHEMA>"."<VIRTUAL_SOURCE_TABLE>";
-- you can drop this table using this statement:
-- drop table <TARGET_TABLE_NAME>;
Once this step is finished, the new table appears under Tables in your Catalog. It is a virtual table that points to the virtual table in the other SAP HANA database instance, which in turn pulls data from the source table in its corresponding SAP HANA Cloud, data lake.
To learn more about SAP HANA Cloud, data lake, you can refer to this mission.
You have completed the fourth tutorial of this group! Now you know how to virtualize data from data lakes to your SAP HANA Cloud, SAP HANA database instance.
When you query this virtual table on a virtual table, the results will take a bit more time to load. That’s why the next tutorial will focus on how to improve the performance of queries accessing multi-layered federation tables.
- Step 5
What is a multi-layered federation?