
Getting Started with Data Lake Files HDLFSCLI

Requires Customer/Partner License
Learn how to set up the SAP HANA Data Lake file container command line interface and use it to manage your data files.
You will learn
  • How to install and use the SAP HANA Data Lake file container Command Line Interface (HDLFSCLI)
  • How to use the HDLFSCLI to put, manage, and remove files from an SAP HANA Data Lake file container
Daniel Utvich (sap-daniel), August 5, 2022
Created by sap-daniel, March 7, 2022
Contributors: rnagweka, sap-daniel

Prerequisites

  • A licensed SAP HANA Data Lake instance (non-trial / non-free tier)
  • Access to SAP Software Center
  • Basic understanding of public key infrastructure (PKI)
  • Download the sample TPCH Data
  • Step 1

    The HDLFSCLI is included in the HANA Data Lake Client download from the SAP software center. The first step is to download and install the latest version of the HANA Data Lake client.

    HANA Data Lake client search

    The latest HANA Data Lake Client package can be identified by the most recent release date.

    HANA Data Lake software center client installer

    Once you’ve identified the correct package for your operating system, download it.

  • Step 2
  • Step 3

    To connect the HDLFSCLI to a HANA Data Lake file container, a certificate must be generated to make a secure connection. Below are the steps required to create a self-signed certificate to get started with the HDLFSCLI. You will need an installation of OpenSSL. On Linux, use your preferred package manager to install OpenSSL if it is not already present. On Windows, the Windows Subsystem for Linux (WSL) comes with OpenSSL installed; alternatively, a native OpenSSL build for Windows can be downloaded and installed.
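    The OpenSSL check can be scripted. A minimal sketch; the package names in the hints below are common defaults and may vary by distribution:

    ```shell
    # Check whether OpenSSL is already available; print its version if so.
    if command -v openssl >/dev/null 2>&1; then
      openssl version
    else
      echo "OpenSSL not found. Install it with your package manager, e.g.:"
      echo "  sudo apt-get install openssl   # Debian/Ubuntu"
      echo "  sudo zypper install openssl    # SLES/openSUSE"
    fi
    ```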

    Then, follow these steps to create your self-signed certificate.

    Make sure the certificate fields are not all exactly the same between the Certificate Authority (CA) and client certificates; otherwise the client certificate is treated as self-signed and the certificate validation below will fail.

    Create a private key for the CA (2048 bits).

    openssl genrsa -out ca.key 2048

    Create the CA’s public certificate (valid for 200 days). Provide at least a common name; the other fields can be filled in or skipped as desired.

    openssl req -x509 -new -key ca.key -days 200 -out ca.crt

    Create a signing request for the client certificate.

    Provide at least a common name and fill other fields as desired, but leave the email address field blank.

    openssl req -new -nodes -newkey rsa:2048 -out client.csr -keyout client.key

    Create the client certificate (valid for 100 days).

    openssl x509 -days 100 -req -in client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out client.crt

    Verify the certificate was signed by the CA.

    openssl verify -CAfile ca.crt client.crt

    To obtain the subject string of a certificate in the RFC2253 format used in HDL Files authorizations, run the following command (omit the “subject=” prefix from its output).

    Note: You will need this later when you configure authentication for HDL Files.

    openssl x509 -in client.crt -nameopt RFC2253 -subject -noout
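    The interactive prompts above can also be avoided by passing `-subj` directly. A non-interactive sketch of the same sequence; the subject values (`Example`, `HDLFS Example CA`, `hdlfs-client`) are illustrative — use your own names, keeping the CA and client subjects different from each other as noted above:

    ```shell
    # 1. CA private key (2048 bits) and self-signed CA certificate (200 days).
    openssl genrsa -out ca.key 2048
    openssl req -x509 -new -key ca.key -days 200 \
      -subj "/O=Example/CN=HDLFS Example CA" -out ca.crt

    # 2. Client key and certificate signing request (no email field).
    openssl req -new -nodes -newkey rsa:2048 \
      -subj "/O=Example/CN=hdlfs-client" -out client.csr -keyout client.key

    # 3. Sign the client certificate with the CA (100 days).
    openssl x509 -days 100 -req -in client.csr -CA ca.crt -CAkey ca.key \
      -CAcreateserial -out client.crt

    # 4. Verify the chain and print the RFC2253 subject used for authorizations.
    openssl verify -CAfile ca.crt client.crt
    openssl x509 -in client.crt -nameopt RFC2253 -subject -noout
    ```

    The last command prints something like `subject=CN=hdlfs-client,O=Example`; everything after `subject=` is the string you will need in Step 4.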

  • Step 4

    Navigate to the SAP HANA Cloud Central Cockpit and select “Manage File Container” on the HDL instance.

    SAP HANA Cloud Central Cockpit Manage File Container button

    Select the edit button at the top of the page.

    Manage File Container edit button.

    Here is a full view of the screen showing the location of “Trusts” and “Authorizations”.
    You may need to scroll down to find these.

    Edit file container page displaying the Trusts and Authorization locations.

    Click “Add” under the Trusts configuration, click the “Upload” file button, browse to the location of your ca.crt, upload the file, and click “Apply”.

    The alias can be anything, but the certificate must match the generated ca.crt exactly.

    Add Trust modal.

    Click “Add” under Authorizations, select the role (“Admin” or “User”), and enter the pattern from the output of the following command (exclude the “subject=” prefix):

    openssl x509 -in client.crt -nameopt RFC2253 -subject -noout

    Alternatively, you can use the “Generate Pattern” option: click “Upload”, select the client.crt file, and a pattern like the one above is generated automatically.

    Authorizations Generate Pattern modal.
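    The pattern the dialog generates is simply the client certificate’s subject in RFC2253 form, so you can produce the same string yourself. A self-contained sketch using a throwaway certificate (the subject values are illustrative):

    ```shell
    # Generate a throwaway self-signed cert just to show the subject format.
    openssl req -x509 -new -nodes -newkey rsa:2048 -days 1 \
      -subj "/C=DE/O=Example/CN=hdlfs-client" \
      -keyout demo.key -out demo.crt

    # RFC2253 renders the subject most-specific-first, comma-separated;
    # drop the leading "subject=" before using it as the pattern.
    openssl x509 -in demo.crt -nameopt RFC2253 -subject -noout
    ```

    This prints something like `subject=CN=hdlfs-client,O=Example,C=DE` — note that the RFC2253 order is the reverse of the `-subj` input.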

    Now click save at the bottom of the page.

    Manage file container save button.
  • Step 5

    Next, we will verify that the configuration from Steps 3 and 4 works.

    The <REST API Endpoint> and <Instance ID> can be found in the SAP HANA Cloud Central Cockpit. <PATH> is the path to the corresponding certificate.
    If the file container is empty, the following command will return no output.

    hdlfscli -cert <PATH>\client.crt -key <PATH>\client.key -cacert <PATH>\ca.crt -k -s https://<REST API Endpoint> -filecontainer <Instance ID> ls

    [Optional]: Configure a configuration file to make using the CLI simpler.

    Note: The configuration file is saved as JSON in the user’s home directory and can be modified in any text editor.

    hdlfscli -cert <PATH>\client.crt -key <PATH>\client.key -k -s <REST API Endpoint> -config myconfig -dump-config ls

    Test the configuration that was just created.

    hdlfscli -config myconfig ls

    Upload a file to the SAP HANA Data Lake file container. Ensure you know the path to the TPCH data files that were downloaded in the prerequisites.

    hdlfscli -config myconfig upload <Your Local Path>\TPCH <Target File Path>\TPCH

    Verify that the files have been uploaded.

    hdlfscli -config myconfig lsr

    Now that the TPCH data files are in the Data Lake file container we can use SQL on Files to query the data. Learn how to do this in the tutorial “Use SQL on Files to Query Structured Data Files”.

    [Troubleshoot]: If you receive the following error while verifying the configuration, apply the fix described below.

    Troubleshoot the configuration error.

    Copy the content of the Client field (shown inside the square brackets), then go to your Data Lake instance, click “Edit Configuration”, scroll down to Authorizations, delete the entire value in the “Pattern” field, and paste in the Client field value.

    Now, re-verify the configuration. It should work.

    Where can you find more information on hdlfscli commands?
