
Create a Calculation View with Differential Privacy in SAP HANA Cloud

Use differential privacy to anonymize confidential data in SAP HANA Cloud.
You will learn
  • How to create a Calculation View of type Cube using SAP Business Application Studio
  • How to configure data anonymization for a column in order to protect sensitive data
Thomas Jung (jung-thomas), April 7, 2021
Created by jung-thomas, January 8, 2021
Contributors: jung-thomas

Prerequisites

A video version of this tutorial is also available:

  • Step 1

    Using what you learned in Create an SAP HANA Database Project, create another database table and load sample data into it. This step does not repeat the step-by-step instructions; apply the same procedure you followed in the earlier tutorial.

    1. Create a new database table named TGT_SALARIES using the hdbtable artifact type in the data folder. Use the following table definition.

      SQL
      column table "TGT_SALARIES" (
          ID INTEGER  unique not null COMMENT 'Employee ID',
          SALARY DOUBLE COMMENT 'Salary',
          START_YEAR INTEGER not null COMMENT 'Starting Year',
          GENDER NVARCHAR(1) COMMENT 'Gender',
          REGION NVARCHAR(4) COMMENT 'Region',
          "T-LEVEL" NVARCHAR(200) COMMENT 'T Level',
          PRIMARY KEY ("ID")
      )
      COMMENT 'Employee Salary'
      
    2. Create the configuration for the CSV upload. In the data/loads folder, create a file called salaryload.hdbtabledata with the following content:

      json
      {
        "format_version": 1,
        "imports": [{
          "target_table": "TGT_SALARIES",
          "source_data": {
            "data_type": "CSV",
            "file_name": "salarydata.csv",
            "has_header": true,
            "dialect": "HANA",
            "type_config": {
              "delimiter": ","
            }
          }
        }]
      }
      
    3. Download this CSV file to your computer: https://github.com/SAPDocuments/Tutorials/blob/master/tutorials/hana-cloud-calculation-view-differential-privacy/salarydata.csv. Then upload it into the loads folder using the Upload Files option.

    4. Save and deploy these new artifacts.

      Salary Table
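
    Once the deployment finishes, a quick query can confirm that the CSV rows arrived in the table. This is a minimal sketch, assuming a SQL console connected to the project's HDI container (so the unqualified table name resolves); the tutorial itself does not include this check.

    ```sql
    -- Count the rows imported from salarydata.csv
    SELECT COUNT(*) AS ROW_COUNT FROM "TGT_SALARIES";

    -- Preview a few rows to confirm the CSV columns landed in the expected table columns
    SELECT TOP 5 "ID", "SALARY", "START_YEAR", "GENDER", "REGION", "T-LEVEL"
    FROM "TGT_SALARIES"
    ORDER BY "ID";
    ```

    A row count of zero usually points to a problem with the .hdbtabledata file or the location of the CSV, so it is worth checking before building the calculation view on top of the table.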
  • Step 2
    1. Create a new folder called models under db/src. Create a new Calculation View via the SAP HANA: Create SAP HANA Database Artifact command palette entry.

      New calculation view
    2. Call it SALARIES_ANONYMIZED and make it a CUBE. Press Create.

      New calculation view
    3. Select the new calculation view file in the Explorer. This will load it into the graphical Calculation View editor. Click on the Projection node and then click on the white canvas to drop it.

      New calculation view
    4. Use the plus sign on the node to add a table as a data source. Choose TGT_SALARIES.

      New calculation view
  • Step 3
    1. Double-click on the Projection_1 node. This will open the mapping. Double-click on TGT_SALARIES to add all of the columns to the output.

      New calculation view
    2. Connect the Projection_1 node to the Aggregation node.

      Configure privacy
    3. Double-click on the name of the node to move all the fields into the output columns.

      Configure privacy
    4. Go into the Semantics node and switch START_YEAR and ID to attribute

      Configure privacy
    5. Save and Deploy
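
    After this deployment the calculation view can be queried like any other view, which is a simple way to check the mapping before anonymization is layered on top. The query below is a sketch only; it assumes the column names were carried through unchanged, that SALARY remained a measure, and that the console is connected to the project's HDI container.

    ```sql
    -- A cube calculation view is queried with aggregation:
    -- group by attributes, aggregate the measure.
    SELECT "REGION", "GENDER", SUM("SALARY") AS TOTAL_SALARY
    FROM "SALARIES_ANONYMIZED"
    GROUP BY "REGION", "GENDER"
    ORDER BY "REGION", "GENDER";
    ```

    At this point the results are still the original, un-noised values; the anonymization itself is only defined in the SQL view created in the next step.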

  • Step 4
    1. Create a view (artifact type hdbview) in the src/data/models folder named V_SALARIES.

      Configure privacy
    2. Use the following syntax to create a SQL view that uses the Differential Privacy Anonymization approach for the SALARY column.

      SQL
      VIEW V_SALARIES (
        ID,
        SALARY,
        START_YEAR,
        GENDER,
        REGION,
        "T-LEVEL"
      ) AS
      SELECT ID,
        SALARY,
        START_YEAR,
        GENDER,
        REGION,
        "T-LEVEL"
      FROM "SALARIES_ANONYMIZED"
      WITH READ ONLY
      WITH ANONYMIZATION (ALGORITHM 'DIFFERENTIAL_PRIVACY'
        PARAMETERS '{"data_change_strategy": "qualified"}'
        COLUMN ID PARAMETERS '{"is_sequence": true}'
        COLUMN SALARY PARAMETERS '{"is_sensitive":true, "epsilon":0.1, "sensitivity":15000}')
      

      For more information about these parameters, see the SAP HANA Cloud Data Anonymization Guide.

    3. If you receive an error that the feature is not supported, this can be safely ignored.

      Configure privacy
    4. Save and Deploy
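
    To get a feel for how much the epsilon and sensitivity values in V_SALARIES distort individual salaries, the noise magnitude can be estimated by hand. The sketch below assumes the standard Laplace mechanism, where noise is drawn with scale sensitivity / epsilon; the Data Anonymization Guide remains the authoritative description of what SAP HANA Cloud actually does.

    ```sql
    -- Laplace scale b = sensitivity / epsilon, using the values from V_SALARIES.
    -- The standard deviation of a Laplace distribution is sqrt(2) * b.
    SELECT 15000 / 0.1           AS NOISE_SCALE,
           SQRT(2) * 15000 / 0.1 AS NOISE_STDDEV
    FROM DUMMY;
    ```

    Under that assumption the scale is 150,000 and the standard deviation is roughly 212,000, so single rows in the anonymized view can differ greatly from the original salaries and may even be negative. A larger epsilon or a smaller sensitivity reduces the noise at the cost of weaker privacy.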

  • Step 5
    1. Open the Database Explorer for your project

      Data Preview
    2. Open a SQL Console and enter the following statement:

      SQL
      refresh view V_SALARIES anonymization;
      
    3. Execute that statement in the SQL console

      Data Preview
    4. You can now use the regular Open Data option on the view to preview the Raw Data.

      Data Preview
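
    Besides the data preview, a side-by-side aggregate comparison shows the effect of the anonymization more clearly than scrolling through rows. This is a hedged sketch: it assumes the same SQL console can read both the base table and the anonymized view, and the SOURCE_NAME label is introduced here purely for illustration.

    ```sql
    -- Compare summary statistics of the original table and the anonymized view
    SELECT 'TGT_SALARIES' AS SOURCE_NAME,
           COUNT(*)       AS ROW_COUNT,
           AVG("SALARY")  AS AVG_SALARY,
           MIN("SALARY")  AS MIN_SALARY,
           MAX("SALARY")  AS MAX_SALARY
    FROM "TGT_SALARIES"
    UNION ALL
    SELECT 'V_SALARIES',
           COUNT(*),
           AVG("SALARY"),
           MIN("SALARY"),
           MAX("SALARY")
    FROM "V_SALARIES";
    ```

    The row counts should match. The minimum and maximum will typically diverge strongly because each value carries its own noise, while the averages tend to move closer together as the number of rows grows.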