Skip to Content

Import an Association Rules Sample Dataset

test
0 %
Import an Association Rules Sample Dataset
Details
// Explore More Tutorials

Import an Association Rules Sample Dataset

09/10/2018

Import SAP Predictive Analytics Association Rules Sample Dataset in your SAP HANA, express edition instance

You will learn

In this tutorial, you will learn how to download and import the SAP Predictive Analytics Association Rules sample dataset into your SAP HANA, express edition instance.

Step 1: Sample file structure

The sample structure for the Sample Data for Association Rules is the following:

|-- sample_association_rules_3.3.1_en-us_production.zip
	|-- Association_Rules.zip
	| 	|-- Association_Rules
	| 	|-- customers_references.txt
	| 	|-- customers_references_desc.txt
	| 	|-- customers_transactions.txt
	| 	|-- customers_transactions_desc.txt
	| 	|-- website_references.csv
	| 	|-- website_references_desc.csv
	| 	|-- website_transactions.csv
	| 	|-- website_transactions_desc.csv
	| 	|-- readme.txt / lisezmoi.txt
	|-- metadata.xml

Note: desc and KxDesc files are SAP Predictive Analytics dataset description files and will not be loaded.

Customers orders

This data set presents website purchase, and includes a reference file (customers_references.txt) describing static information on customers and a transaction file (customers_transactions.txt) with the customer purchases.

  • Customers references
Variable Description Example of Values
UserID the customer key Any numerical value
Country the customer country ITALY, FRANCE, …
Site the site domain net, com, .fr
IDProv Level of study A value between 1 and 16
OrdersCount the number of orders posted Any numerical value
  • Customers transactions
Variable Description Example of Values
UserID the customer key Any numerical value
ItemPurchased the item purchased A product category
Date_PutInCaddy the order date A date
Quantity the ordered quantity Any numerical value
TransactionID the order transaction id Any numerical value

Website navigation

This data set presents website navigation and includes a reference file (website_references.txt) including the session identifiers sessions and a transaction file (website_transactions.txt) with the list of page views associated with a session.

  • Session references
Variable Description Example of Values
SessionID the session id Any numerical value
  • Session transactions
Variable Description Example of Values
SessionID the session id Any numerical value
Page the visited page A page name
TransactionID the associated transaction id A date
IPAddress the visitor IP address An IP address
Time the date/time of the visit A date/time
Log on to answer question
Step 2: Create the table structure

Connect to the HXE tenant using the ML_USER user credentials and execute the following SQL statement to create the table structure:

CREATE TABLE PA_DATA.CUSTOMERS_REFERENCES (
 USERID INT,
 COUNTRY VARCHAR(20),
 SITE VARCHAR(3),
 IDPROV INT,
 ORDERS_COUNT INT,
 PRIMARY KEY (USERID)
);
CREATE TABLE PA_DATA.CUSTOMERS_TRANSACTIONS (
 USERID INT,
 ITEMPURCHASED VARCHAR(20),
 DATE_PUTINCADDY TIMESTAMP,
 QUANTITY VARCHAR(5),
 TRANSACTIONID INT,
 PRIMARY KEY (TRANSACTIONID)
);
CREATE TABLE PA_DATA.WEBSITE_REFERENCES (
 SESSIONID INT,
 PRIMARY KEY (SESSIONID)
);
CREATE TABLE PA_DATA.WEBSITE_TRANSACTIONS (
 SESSIONID INT,
 PAGE VARCHAR(50),
 TRANSACTIONID INT,
 IPADDRESS VARCHAR(15),
 VISITE_TIME TIMESTAMP,
 PRIMARY KEY (TRANSACTIONID)
);
Log on to answer question
Step 3: Download the Sample Data

Open the SAP Predictive Analytics documentation page in a browser and click on the View All for the Sample section.

This will display the list of sample dataset available.

Right-click on the Download the Sample Data for Association Rules link and use the Save link address to get the download URL.

Open the download URL and save the sample data archive either:

  • the Eclipse host if you want to use the SAP HANA Tools for Eclipse
  • the SAP HANA, express host if you want the IMPORT FROM SQL command

Extract the files (located in the Association_Rules subdirectory in the embedded Association_Rules.zip archive).

Import Using the SAP HANA Tools for Eclipse

You can extract the sample file anywhere you want on the Eclipse host.

You can now move to Step 3: Import Using the SAP HANA Tools for Eclipse.

Import Using the IMPORT FROM SQL command

Here is an example script that you can reuse to download and extract the dataset directly from the SAP HANA, express edition host:

URL=https://help.sap.com/http.svc/download?deliverable_id=20555031
OUTPUT_FILE=sample_association
OUTPUT_DIR=/usr/sap/HXE/HDB90/work/$OUTPUT_FILE

 # create a new subdirectory for the sample data
mkdir $OUTPUT_DIR

 # download the archive in the sample data directory
 # wget -O $OUTPUT_DIR/$OUTPUT_FILE.zip $URL
curl $URL -o $OUTPUT_DIR/$OUTPUT_FILE.zip

 # switch to the new directory
cd $OUTPUT_DIR

 # extract all archives and embedded archives
while [ "`find . -type f -name '*.zip' | wc -l`" -gt 0 ]; \
 do find -type f -name "*.zip" \
 -exec unzip -o -- '{}' \; \
 -exec rm -- '{}' \;; done

It requires CURL (or WGET) to be installed.

The dataset files should now be located in: /usr/sap/HXE/HDB90/work/sample_association/Association_Rules

You can now move to Step 3: Import Using the IMPORT FROM SQL command.

Log on to answer question
Step 4: Import Using the SAP HANA Tools for Eclipse

You can import the data using the SAP HANA Tools for Eclipse Import feature using the following details:

  • File Details:

  • Set the Field Delimiter value as listed in the table bellow.

  • Check the Header row exists box and set the Header row value to 1
  • Check the Import all data box.
  • Manage Table Definition and Data Mappings:

Using the Mapping menu icon, select the One by One option.

The source files should be mapped with the following target tables:

Source File Target Table Field Delimiter
customers_references.txt PA_DATA.CUSTOMERS_REFERENCES Tab (\t)
customers_transactions.txt PA_DATA.CUSTOMERS_TRANSACTIONS Tab (\t)
website_references.txt PA_DATA.WEBSITE_REFERENCES Semi Colon (;)
website_transactions.txt PA_DATA.WEBSITE_TRANSACTIONS Semi Colon (;)
Log on to answer question
Step 5: Import Using the IMPORT FROM SQL command

The dataset files should be located in: /usr/sap/HXE/HDB90/work/sample_association/Association_Rules

Connect to the HXE tenant using the ML_USER user credentials using your SQL query tool.

Execute the following SQL statement:

IMPORT FROM CSV FILE '/usr/sap/HXE/HDB90/work/sample_association/Association_Rules/customers_references.txt' INTO PA_DATA.CUSTOMERS_REFERENCES
WITH
 RECORD DELIMITED BY '\n'
 FIELD DELIMITED BY '\t'
 OPTIONALLY ENCLOSED BY '"'
 SKIP FIRST 1 ROW
 FAIL ON INVALID DATA
 ERROR LOG '/usr/sap/HXE/HDB90/work/sample_association/Association_Rules/customers_references.txt.err'
;
IMPORT FROM CSV FILE '/usr/sap/HXE/HDB90/work/sample_association/Association_Rules/customers_transactions.txt' INTO PA_DATA.CUSTOMERS_TRANSACTIONS
WITH
 RECORD DELIMITED BY '\n'
 FIELD DELIMITED BY '\t'
 OPTIONALLY ENCLOSED BY '"'
 SKIP FIRST 1 ROW
 FAIL ON INVALID DATA
 ERROR LOG '/usr/sap/HXE/HDB90/work/sample_association/Association_Rules/customers_transactions.txt.err'
;
IMPORT FROM CSV FILE '/usr/sap/HXE/HDB90/work/sample_association/Association_Rules/website_references.csv' INTO PA_DATA.WEBSITE_REFERENCES
WITH
 RECORD DELIMITED BY '\n'
 FIELD DELIMITED BY ';'
 OPTIONALLY ENCLOSED BY '"'
 SKIP FIRST 1 ROW
 FAIL ON INVALID DATA
 ERROR LOG '/usr/sap/HXE/HDB90/work/sample_association/Association_Rules/website_references.csv.err'
;
IMPORT FROM CSV FILE '/usr/sap/HXE/HDB90/work/sample_association/Association_Rules/website_transactions.csv' INTO PA_DATA.WEBSITE_TRANSACTIONS
WITH
 RECORD DELIMITED BY '\n'
 FIELD DELIMITED BY ';'
 OPTIONALLY ENCLOSED BY '"'
 SKIP FIRST 1 ROW
 TIMESTAMP FORMAT 'YYYY-MM-DD:HH24:MI:SS'
 FAIL ON INVALID DATA
 ERROR LOG '/usr/sap/HXE/HDB90/work/sample_association/Association_Rules/website_transactions.csv.err'
;
Log on to answer question
Step 6: Verify the imported data

You can verify that the data was imported properly using the following SQL statement:

SELECT 'CUSTOMERS_REFERENCES' as TABLE_NAME, count(1) as ROW_COUNT from PA_DATA.CUSTOMERS_REFERENCES
UNION
SELECT 'CUSTOMERS_TRANSACTIONS' as TABLE_NAME, count(1) as ROW_COUNT from PA_DATA.CUSTOMERS_TRANSACTIONS
UNION
SELECT 'WEBSITE_REFERENCES' as TABLE_NAME, count(1) as ROW_COUNT from PA_DATA.WEBSITE_REFERENCES
UNION
SELECT 'WEBSITE_TRANSACTIONS' as TABLE_NAME, count(1) as ROW_COUNT from PA_DATA.WEBSITE_TRANSACTIONS
;
Log on to answer question
Step 7: Validation

Provide an answer to the question below then click on Validate.

How many rows were imported from the file in the CUSTOMERS_TRANSACTIONS table?
×

Next Steps

Back to top