Generate Metrics and Compare Models in SAP AI Core
- How to log simple metrics on validation data and training data
- How to log step information along with metrics
- How to log custom metrics structure
Prerequisites
- You have an understanding of using data and generating models in SAP AI Core, from this tutorial
In this tutorial, you’ll use SAP AI Launchpad to compare two models that have been generated using SAP AI Core. This tutorial builds on the previous tutorials on house price prediction and ingesting data.
Important: Comparing models is only available in SAP AI Launchpad, not through the API endpoints. The comparison step is optional.
- Step 1
Create a folder named `hello-aicore-metrics`. Within it, create a file called `main.py`, and paste the following starter code into this file.

```python
import os
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import KFold, train_test_split
from sklearn.inspection import permutation_importance
from datetime import datetime
import pandas as pd
from ai_core_sdk.models import Metric, MetricTag, MetricCustomInfo, MetricLabel
#
# Logging Metrics: SAP AI Core connection (Step 2)
# <PASTE CODE HERE>
#
# Variables
DATA_PATH = '/app/data/train.csv'
DT_MAX_DEPTH = int(os.getenv('DT_MAX_DEPTH'))
MODEL_PATH = '/app/model/model.pkl'
#
# Load Datasets
df = pd.read_csv(DATA_PATH)
X = df.drop('target', axis=1)
y = df['target']
#
# Metric Logging: Basic (Step 3)
# <PASTE CODE HERE>
#
# Partition into Train and test dataset
train_x, test_x, train_y, test_y = train_test_split(X, y, test_size=0.3)
#
# K-fold
kf = KFold(n_splits=5, random_state=31, shuffle=True)
i = 0  # storing step count
for train_index, val_index in kf.split(train_x):
    i += 1
    # Train model on subset
    clf = DecisionTreeRegressor(max_depth=DT_MAX_DEPTH, random_state=31)
    clf.fit(train_x.iloc[train_index], train_y.iloc[train_index])
    # Score on validation data (hold-out dataset)
    val_step_r2 = clf.score(train_x.iloc[val_index], train_y.iloc[val_index])
    # Metric Logging: Step Information (Step 4)
    # <PASTE CODE HERE>
    # Delete step model
    del(clf)
#
# Final Model
clf = DecisionTreeRegressor(max_depth=DT_MAX_DEPTH, random_state=31)
clf.fit(train_x, train_y)
# Scoring over test data
test_r2_score = clf.score(test_x, test_y)
# Metric Logging: Attaching metrics to the generated model (Step 5)
# <PASTE CODE HERE>
#
# Model Explanation
r = permutation_importance(
    clf, test_x, test_y,
    n_repeats=30,
    random_state=0
)
# Feature importances
feature_importances = str('')
for i in r.importances_mean.argsort()[::-1]:
    feature_importances += f"{df.columns[i]}: {r.importances_mean[i]:.3f} +/- {r.importances_std[i]:.3f} \n"
# Metric Logging: Custom Structure (Step 6)
# <PASTE CODE HERE>
#
# Save model
import pickle
pickle.dump(clf, open(MODEL_PATH, 'wb'))
#
# Metric Logging: Tagging the execution (Step 7)
# <PASTE CODE HERE>
```
The snippet includes placeholders marked `# <PASTE CODE HERE>`. You'll complete these entries throughout the tutorial. For clarity, the comments in the code also include the relevant step number. This Python script contains all of the modifications needed for logging metrics, meaning that you can leave your previous workflows as they are.
- Step 2
Add the following code snippet.
```python
from ai_core_sdk.tracking import Tracking

aic_connection = Tracking()
...
```
CAUTION: This code snippet is very similar to the code used to connect the SAP AI Core SDK to your local system. Here, however, it establishes the connection from within the SAP AI Core execution environment.
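For contrast, a minimal sketch of the local connection used in the earlier tutorials is shown below. It uses `AICoreV2Client` with explicit credentials; the placeholder values are illustrative and must be replaced with your own service key details. `Tracking()`, on the other hand, is used inside the running execution, so no credentials are passed to it here.

```python
from ai_core_sdk.ai_core_v2_client import AICoreV2Client

# Sketch of a connection from your local system (placeholders are illustrative)
ai_core_client = AICoreV2Client(
    base_url="<YOUR_AI_API_URL>" + "/v2",          # AI API URL from your service key
    auth_url="<YOUR_AUTH_URL>" + "/oauth/token",   # OAuth token endpoint
    client_id="<YOUR_CLIENT_ID>",
    client_secret="<YOUR_CLIENT_SECRET>"
)
```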
- Step 3
Add the following snippet to log the number of observations in your dataset.
```python
# from ai_core_sdk.models import Metric
aic_connection.log_metrics(
    metrics = [
        Metric(
            name = "N_observations",
            value = float(df.shape[0]),
            timestamp = datetime.utcnow()),
    ]
)
```
After execution, this is shown in SAP AI Launchpad. You can zoom in for details.
- Step 4
Add the following snippet to log metrics with step information. This approach is also useful for tracking metrics across epochs of the training process.
```python
aic_connection.log_metrics(
    metrics = [
        Metric(
            name = "(Val) Fold R2",
            value = float(val_step_r2),
            timestamp = datetime.utcnow(),
            step = i),
    ]
)
```
The variable `i` is already present in your code and is passed to the parameter `step=i`. This is also shown in SAP AI Launchpad. You can zoom in to view.
- Step 5
Add the following snippet to store metrics for artifact information.
```python
aic_connection.log_metrics(
    metrics = [
        Metric(
            name = "Test data R2",
            value = float(test_r2_score),
            timestamp = datetime.utcnow(),
            labels = [
                MetricLabel(name = "metrics.ai.sap.com/Artifact.name", value = "housepricemodel")
            ]
        )
    ]
)
```
The label value `housepricemodel` is the artifact name, which references the model that will be stored in AWS S3. This value must match the output artifact name defined in your workflow YAML (the `housepricemodel` entry under `outputs.artifacts` in the workflow shown in Step 8). Your code should resemble:
After execution, this is shown in SAP AI Launchpad.
- Step 6
Add the following snippet to store metrics based on a customized structure.
```python
aic_connection.set_custom_info(
    custom_info = [
        MetricCustomInfo(name = "Feature Importance (verbose)", value = str(r)),
        MetricCustomInfo(name = "Feature Importance (brief)", value = feature_importances)
    ]
)
```
The structure must be type-cast to `str` (a string). Here, the structure used is permutation feature importance. The variables `r` and `feature_importances` are already created in the starter code. After execution, you can see this in SAP AI Launchpad.
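If you want to log a more structured object, one option is to serialize it to a string yourself before wrapping it in `MetricCustomInfo`. The following is a minimal sketch (the metric name "Feature Importance (JSON)" is a hypothetical example, not part of the tutorial), reusing the `r` and `df` objects from the starter code:

```python
import json

# Serialize per-feature importances into a JSON string so that the whole
# structure can be stored as a single MetricCustomInfo value.
importance_dict = {
    str(df.columns[idx]): round(float(r.importances_mean[idx]), 3)
    for idx in r.importances_mean.argsort()[::-1]
}
aic_connection.set_custom_info(
    custom_info = [
        MetricCustomInfo(name = "Feature Importance (JSON)", value = json.dumps(importance_dict))
    ]
)
```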
Permutation Feature Importance
Sources:
Scikit Learn Python Package
What it is:
- Shows how much a model depends on a given feature for a given target, model, dataset and task.
- Gives an empirical estimate of how much loss is attributed to the removal of a given feature.
What it is not:
- A model, dataset or task-agnostic indication of the importance of a given feature. While the method is agnostic, the results are applicable only to the specific input combination.
- An accurate indication of the importance of a given feature for a specific prediction. Although this is the goal of the method, it does not account for weaknesses in the model.
Advantages:
- Model agnostic.
- Provides global explainability, meaning that it estimates each feature's importance to the prediction task.
- Contributes to model transparency.
- The method or function used to measure "error" can be customized; refer to the scikit-learn package implementation.
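To illustrate the idea independently of the tutorial code, a naive version of permutation importance can be written by shuffling one feature column at a time and measuring the drop in score; scikit-learn's `permutation_importance` does the same with repetitions and proper bookkeeping. This is a sketch, assuming a fitted model with a `score` method and pandas DataFrame inputs:

```python
import numpy as np

def naive_permutation_importance(model, X, y, n_repeats=5, random_state=0):
    """Estimate each feature's importance as the mean drop in model.score
    when that feature's values are randomly shuffled."""
    rng = np.random.RandomState(random_state)
    baseline = model.score(X, y)
    importances = {}
    for col in X.columns:
        drops = []
        for _ in range(n_repeats):
            X_shuffled = X.copy()
            X_shuffled[col] = rng.permutation(X_shuffled[col].values)
            drops.append(baseline - model.score(X_shuffled, y))
        importances[col] = float(np.mean(drops))
    return importances

# Example usage with the objects from the starter code:
# print(naive_permutation_importance(clf, test_x, test_y))
```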
- Step 7
Add the following snippet to tag your execution. The tags are customizable key-value pairs.

```python
aic_connection.set_tags(
    tags = [
        MetricTag(name = "Validation Method Used", value = "K-Fold"),  # your custom name and value
        MetricTag(name = "Metrics", value = "R2"),
    ]
)
```
After execution, you can see this in SAP AI Launchpad.
- Step 8
Check your modified `main.py` by comparing it with the following expected `main.py`.

```python
import os
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import KFold, train_test_split
from sklearn.inspection import permutation_importance
from datetime import datetime
import pandas as pd
from ai_core_sdk.models import Metric, MetricTag, MetricCustomInfo, MetricLabel
#
from ai_core_sdk.tracking import Tracking

aic_connection = Tracking()
#
# Variables
DATA_PATH = '/app/data/train.csv'
DT_MAX_DEPTH = int(os.getenv('DT_MAX_DEPTH'))
MODEL_PATH = '/app/model/model.pkl'
#
# Load Datasets
df = pd.read_csv(DATA_PATH)
X = df.drop('target', axis=1)
y = df['target']
#
# Metric Logging: Basic
aic_connection.log_metrics(
    metrics = [
        Metric(
            name = "N_observations",
            value = float(df.shape[0]),
            timestamp = datetime.utcnow()),
    ]
)
#
# Partition into Train and test dataset
train_x, test_x, train_y, test_y = train_test_split(X, y, test_size=0.3)
#
# K-fold
kf = KFold(n_splits=5, random_state=31, shuffle=True)
i = 0  # storing step count
for train_index, val_index in kf.split(train_x):
    i += 1
    # Train model on subset
    clf = DecisionTreeRegressor(max_depth=DT_MAX_DEPTH, random_state=31)
    clf.fit(train_x.iloc[train_index], train_y.iloc[train_index])
    # Score on validation data (hold-out dataset)
    val_step_r2 = clf.score(train_x.iloc[val_index], train_y.iloc[val_index])
    # Metric Logging: Step Information
    aic_connection.log_metrics(
        metrics = [
            Metric(
                name = "(Val) Fold R2",
                value = float(val_step_r2),
                timestamp = datetime.utcnow(),
                step = i),
        ]
    )
    # Delete step model
    del(clf)
#
# Final Model
clf = DecisionTreeRegressor(max_depth=DT_MAX_DEPTH, random_state=31)
clf.fit(train_x, train_y)
# Scoring over test data
test_r2_score = clf.score(test_x, test_y)
# Metric Logging: Attaching metrics to the generated model
aic_connection.log_metrics(
    metrics = [
        Metric(
            name = "Test data R2",
            value = float(test_r2_score),
            timestamp = datetime.utcnow(),
            labels = [
                MetricLabel(name = "metrics.ai.sap.com/Artifact.name", value = "housepricemodel")
            ]
        )
    ]
)
#
# Model Explanation
r = permutation_importance(
    clf, test_x, test_y,
    n_repeats=30,
    random_state=0
)
# Feature importances
feature_importances = str('')
for i in r.importances_mean.argsort()[::-1]:
    feature_importances += f"{df.columns[i]}: {r.importances_mean[i]:.3f} +/- {r.importances_std[i]:.3f} \n"
# Metric Logging: Custom Structure
aic_connection.set_custom_info(
    custom_info = [
        MetricCustomInfo(name = "Feature Importance (verbose)", value = str(r)),
        MetricCustomInfo(name = "Feature Importance (brief)", value = feature_importances)
    ]
)
#
# Save model
import pickle
pickle.dump(clf, open(MODEL_PATH, 'wb'))
#
# Metric Logging: Tagging the execution
aic_connection.set_tags(
    tags = [
        MetricTag(name = "Validation Method Used", value = "K-Fold"),  # your custom name and value
        MetricTag(name = "Metrics", value = "R2"),
    ]
)
```
Check your modified `requirements.txt` by comparing it with the following expected `requirements.txt`.

```text
sklearn==0.0
pandas
ai-core-sdk>=1.15.1
```
Create a file called `Dockerfile` with the following snippet. This file must not have a file extension or alternative name.

```dockerfile
# Specify which base layers (default dependencies) to use
# You may find more base layers at https://hub.docker.com/
FROM python:3.7
#
# Creates directories within your Docker image
RUN mkdir -p /app/src/
# Don't place anything in the below folders yet, just create them
RUN mkdir -p /app/data/
RUN mkdir -p /app/model/
#
# Copies files from your local system to the path in the Docker image
COPY main.py /app/src/
COPY requirements.txt /app/src/
#
# Installs dependencies within your Docker image
RUN pip3 install -r /app/src/requirements.txt
#
# Enable permission to execute anything inside the folder app
RUN chgrp -R 65534 /app && \
    chmod -R 777 /app
```
Use the following commands to build your Docker image and push the contents to the cloud.
```bash
docker build -t <YOUR_DOCKER_REGISTRY>/<YOUR_DOCKER_USERNAME>/house-price:04 .
docker push <YOUR_DOCKER_REGISTRY>/<YOUR_DOCKER_USERNAME>/house-price:04
```
Paste the following snippet into a file named `hello-metrics.yaml` in your GitHub repository. Edit it with your own Docker registry secret and username. This file is your AI workflow file.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: house-metrics-train # executable ID, must be unique across all your workflows (YAML files)
  annotations:
    scenarios.ai.sap.com/description: "Learning how to ingest data to workflows"
    scenarios.ai.sap.com/name: "House Price (Tutorial)" # Scenario name should be the use case
    executables.ai.sap.com/description: "Generate metrics"
    executables.ai.sap.com/name: "training-metrics" # Executable name should describe the workflow in the use case
    artifacts.ai.sap.com/housedataset.kind: "dataset" # Helps in suggesting the kind of artifact that can be attached.
    artifacts.ai.sap.com/housemodel.kind: "model" # Helps in suggesting the kind of artifact that can be generated.
  labels:
    scenarios.ai.sap.com/id: "learning-datalines"
    ai.sap.com/version: "2.0"
spec:
  imagePullSecrets:
    - name: credstutorialrepo # your Docker registry secret
  entrypoint: mypipeline
  arguments:
    parameters: # placeholder for string-like inputs
      - name: DT_MAX_DEPTH # identifier local to this workflow
  templates:
    - name: mypipeline
      steps:
        - - name: mypredictor
            template: mycodeblock1
    - name: mycodeblock1
      inputs:
        artifacts: # placeholder for cloud storage attachments
          - name: housedataset # a name for the placeholder
            path: /app/data/ # where to copy the dataset into the Docker image
      outputs:
        artifacts:
          - name: housepricemodel # name of the artifact generated, and folder name when placed in S3; the complete directory will be `../<execution_id>/housepricemodel`
            globalName: housemodel # identifier local to the workflow, also used above in the annotation
            path: /app/model/ # folder in the Docker image from which (after running the workflow step) contents are copied to cloud storage
            archive:
              none: # specify not to compress while uploading to cloud
                {}
      container:
        image: docker.io/<YOUR_DOCKER_USERNAME>/house-price:04 # your Docker image name
        command: ["/bin/sh", "-c"]
        env:
          - name: DT_MAX_DEPTH # name of the environment variable inside the Docker container
            value: "{{workflow.parameters.DT_MAX_DEPTH}}" # value to set from the local (to the workflow) variable DT_MAX_DEPTH
        args:
          - "python /app/src/main.py"
```
- Step 9
Create a configuration using the following values. The values are taken from the workflow from previous steps. For help creating a configuration, see step 11 of this tutorial.
| Setting | Value |
| --- | --- |
| Configuration Name | House Price (Jan) metrics |
| Scenario Name | House Price (Tutorial) |
| Version | 2.0 |
| Executable Name | training-metrics |
| Scenario ID | learning-datalines |
| Executable ID | house-metrics-train |
The value for the Input Parameters entry `DT_MAX_DEPTH` is your choice. Until now, this has been set using an environment variable. If no value is specified, the parameter continues to be defined by the environment variable.

Information: This parameter can be set to an integer (passed as a string) to limit the maximum depth of the tree, or to `None`, which means that nodes are expanded until all leaves are pure or until all leaves contain fewer data points than `min_samples_split`, if specified. For more information, see the Scikit-learn documentation.

Attach your registered artifact to Input Artifact by specifying `housedataset` for this value.

Create an execution from this configuration.
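If you prefer to create the configuration and execution from your local system instead of through SAP AI Launchpad, a sketch along the lines of the earlier tutorials might look like the following. It assumes `ai_core_client` is an `AICoreV2Client` connected from your local system, and that `<YOUR_ARTIFACT_ID>` is the ID of your registered house dataset artifact; the resource group `default` is also an assumption.

```python
from ai_api_client_sdk.models.parameter_binding import ParameterBinding
from ai_api_client_sdk.models.input_artifact_binding import InputArtifactBinding

# Create a configuration that binds DT_MAX_DEPTH and the dataset artifact
# to the executable defined in hello-metrics.yaml.
config_response = ai_core_client.configuration.create(
    name = "House Price (Jan) metrics",
    scenario_id = "learning-datalines",
    executable_id = "house-metrics-train",
    parameter_bindings = [ParameterBinding(key = "DT_MAX_DEPTH", value = "3")],
    input_artifact_bindings = [InputArtifactBinding(key = "housedataset", artifact_id = "<YOUR_ARTIFACT_ID>")],
    resource_group = "default"
)

# Start an execution from the new configuration.
execution_response = ai_core_client.execution.create(
    configuration_id = config_response.id,
    resource_group = "default"
)
print(execution_response.id)
```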
- Step 10
- Step 11
Create two configurations: one with `DT_MAX_DEPTH = 3` and the other with `DT_MAX_DEPTH = 6`, then create executions for both configurations. You can then compare metrics for the executions using the two different configurations.
Which of the following templates will best help you locate model files in S3 that have been generated by a workflow?