
Build a House Price Predictor with SAP AI Core

Build portable AI code with Docker and use it with SAP AI Core.
You will learn
  • How to build Docker images from code.
  • How to use Docker images with SAP AI Core
  • How to check and debug execution logs for errors.
Dhrubajyoti Paul (dhrubpaul), November 14, 2022
Created by
helenaaaaaaaaaa
May 24, 2022
Contributors
LunaticMaestro
helenaaaaaaaaaa
maximilianone

Prerequisites

  • You have created your first workflow with SAP AI Core, using this tutorial

By the end of the tutorial, you will have your AI code in the form of a Docker image, connected to your AI workflow. You will also know how to use Docker images together with SAP AI Core and how to debug your code if the execution goes to an unexpected state. This tutorial is a precursor to setting up data pipelines and model generation.

You may still complete this tutorial if you are not familiar with the Python programming language.

  • Step 1

    You need a Docker repository to store your AI code in the cloud in the form of Docker images. SAP AI Core will fetch your code from this Docker repository. The image ensures that your code is bundled with all of its dependencies, its directory structure, and any drivers required when using a GPU.

    INFORMATION You may use your organization’s own Docker registry/repository. In that case, ensure that the repository is reachable from the internet and not blocked by a firewall.

    Sign Up for a Docker account.

    Click on the profile button (your profile name) and then select Account Settings.

    image

    Select Security from the navigation bar and click New Access Token.

    image

    Follow the guided steps, and then store the token that you receive in your local system.

    SECURITY TIP: This access token lets SAP AI Core access the specified Docker repository without you sharing your Docker credentials. It also means that you can revoke access by deleting the token, rather than having to change your credentials.

  • Step 2

    Download and Install Docker Desktop. You will need Docker Desktop to help you build Docker images of your AI code.

    Run Docker Desktop. You will see a whale icon in your system tray when Docker Desktop is running.

    image
  • Step 3

    Create a directory (folder) named hello-aicore-code.

    Create a file main.py. Paste the following snippet in the file.

    PYTHON
    # Load Datasets
    from sklearn import datasets
    data_house = datasets.fetch_california_housing()
    X = data_house['data']
    y = data_house['target']
    #
    # Partition into Train and test dataset
    from sklearn.model_selection import train_test_split
    train_x, test_x, train_y, test_y = train_test_split(X, y, test_size=0.3)
    #
    # Init model
    from sklearn.tree import DecisionTreeRegressor
    clf = DecisionTreeRegressor()
    #
    # Train model
    clf.fit(train_x, train_y)
    #
    # Test model
    test_r2_score = clf.score(test_x, test_y)
    # Output will be available in logs of SAP AI Core.
    # Not the ideal way of storing/reporting metrics in SAP AI Core, but that is not the focus of this tutorial
    print(f"Test Data Score {test_r2_score}")
    
    image

    Create another file named requirements.txt in the same directory. In this file, you list the Python libraries required to execute your code.

    RECOMMENDED In production, generate requirements.txt automatically with the terminal command pip list --format freeze > requirements.txt.

    Paste the following snippet into requirements.txt.

    TEXT
    scikit-learn==1.0.2
    
    image

    The code builds a model using the California Housing Dataset available in Scikit-Learn. Note that the code neither reads a datafile nor stores the model. We will cover both of these in a different tutorial.
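The number that main.py prints comes from DecisionTreeRegressor.score, which returns the coefficient of determination (R²). As a rough illustration of what that value means, the sketch below computes R² by hand in plain Python. The function name r2_score_manual and the sample values are made up for this example and are not part of the tutorial code.

```python
def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant (otherwise SS_tot would be zero).
    """
    mean_true = sum(y_true) / len(y_true)
    # Squared error of the predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of a baseline that always predicts the mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical target values and two sets of predictions
actual = [1.0, 2.0, 3.0, 4.0]
perfect = [1.0, 2.0, 3.0, 4.0]
mean_only = [2.5, 2.5, 2.5, 2.5]

print(r2_score_manual(actual, perfect))    # perfect predictions
print(r2_score_manual(actual, mean_only))  # always predicting the mean
```

A score near 1 means the predictions track the targets closely, a score near 0 means the model does no better than predicting the mean, and negative scores are possible for models that do worse than that baseline.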

  • Step 4

    In the same directory, create a file named Dockerfile with no extension. This file stores the instructions that Docker uses to build an image. Your Docker image is based on a Linux distribution, so the commands in this Dockerfile resemble Linux commands, each prefixed with a Docker instruction such as RUN or COPY. Paste the following content into the file exactly as it is:

    TEXT
    # Specify which base layers (default dependencies) to use
    # You may find more base layers at https://hub.docker.com/
    FROM python:3.7
    #
    # Creates directory within your Docker image
    RUN mkdir -p /app/src/
    #
    # Copies file from your Local system TO path in Docker image
    COPY main.py /app/src/
    COPY requirements.txt /app/src/
    #
    # Installs dependencies within your Docker image
    RUN pip3 install -r /app/src/requirements.txt
    #
    # Enable permission to execute anything inside the folder app
    RUN chgrp -R 65534 /app && \
        chmod -R 777 /app
    
    image

    You may notice that you did not specify the command to run the script main.py in the Dockerfile. This command will be written into the AI workflow and is covered later in this tutorial.

    Open your terminal and navigate to your hello-aicore-code directory. You will use the terminal to build your Docker image.

    image

    Copy and edit the following command to build your Docker image. The command follows the format docker build -t <DOCKER_REGISTRY>/<YOUR_DOCKER_USERNAME>/<IMAGE_NAME>:<TAG_NAME> . For example, if you are using your organization’s registry with the URL myteam.myorg, the command would be docker build -t myteam.myorg/yourusername/house-price:01 .

    BASH
    docker buildx build --load --platform=<YOUR_DOCKER_PLATFORM> -t docker.io/<YOUR_DOCKER_USERNAME>/house-price:01 .
    

    INFORMATION In the command, -t indicates that there is a tag name, followed by a colon and version. The name is your descriptive string, and the version can be in any format, here house-price and 01, respectively. The . (dot) at the end instructs Docker to look for the filename Dockerfile in the present directory.

    INFORMATION The platform specifies the target operating system and CPU architecture of the image, for example linux/amd64.
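The naming format described above can be sketched in Python as follows. This is only an illustration of how the registry, username, image name and tag fit together; the helper names image_reference and build_command are hypothetical, and the default platform linux/amd64 is an assumption taken from the example in the note above.

```python
def image_reference(registry, username, image_name, tag):
    """Join the four parts into <registry>/<username>/<image_name>:<tag>."""
    for part in (registry, username, image_name, tag):
        if not part or " " in part:
            raise ValueError(f"invalid image reference part: {part!r}")
    return f"{registry}/{username}/{image_name}:{tag}"


def build_command(reference, platform="linux/amd64"):
    """Return the build command from this step as an argument list."""
    return ["docker", "buildx", "build", "--load",
            f"--platform={platform}", "-t", reference, "."]


ref = image_reference("myteam.myorg", "yourusername", "house-price", "01")
print(ref)
print(" ".join(build_command(ref)))
```

Keeping the reference in one place makes it easier to reuse exactly the same string when you later push the image and when you name it in your workflow.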

    The result of this command should be:

    image


  • Step 5

    Log in to your Docker account from your terminal. This is a one-time step that stores your Docker account credentials in your local Docker Desktop.

    INFORMATION If you are using your organization’s Docker registry, use the command in the format docker login <URL_YOUR_ORGANIZATIONS_DOCKER_REGISTRY>

    BASH
    docker login docker.io
    

    Copy and paste your generated Docker Access Token to use as your password. For security reasons, your input will not be printed on the screen.
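If you prefer not to paste the token interactively, the login can also be scripted. The following is a minimal sketch, assuming Python is available on your machine: it uses the Docker CLI options --username and --password-stdin so that the token is passed on standard input rather than on the command line. The function names docker_login_command and docker_login are hypothetical.

```python
import subprocess


def docker_login_command(username, registry="docker.io"):
    """Build the argument list for a non-interactive docker login."""
    return ["docker", "login", "--username", username,
            "--password-stdin", registry]


def docker_login(username, token, registry="docker.io"):
    """Run docker login and feed the access token through stdin."""
    return subprocess.run(
        docker_login_command(username, registry),
        input=token,   # token never appears in the process arguments
        text=True,
        check=True,    # raise CalledProcessError if the login fails
    )


# Only prints the command; call docker_login(...) to actually log in.
print(docker_login_command("<YOUR_DOCKER_USERNAME>"))
```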

    image
  • Step 6
    Upload your Docker image to your Docker registry.

    BASH
    docker push docker.io/<YOUR_DOCKER_USERNAME>/house-price:01
    
    image
  • Step 7

    This step is required once. Storing Docker credentials enables SAP AI Core to pull (download) your Docker images from a private Docker repository. Use of a private Docker image prevents others from seeing your content.

    WARNING SAP AI Core does not verify your Docker credentials. Ensure that you are storing the correct credentials.

    1. Name: Enter credstutorialrepo. This becomes the identifier for your Docker credentials within SAP AI Core. This value is your Docker registry secret.
    2. URL: If you have used your organization’s Docker registry then use its URL, otherwise, enter https://index.docker.io.
    3. Username: Your Docker username.
    4. Access Token: The access token generated previously, in the Docker account settings.

    TIP You can store multiple Docker credentials in SAP AI Core.
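The four values listed above can be pictured as one small record. The sketch below assembles them in the Docker config JSON layout that Kubernetes uses for image pull secrets. Whether SAP AI Core stores the secret in exactly this form is an assumption made for illustration, and the helper name docker_config_json and the placeholder values are hypothetical.

```python
import json


def docker_config_json(url, username, access_token):
    """Serialize credentials in the Kubernetes-style Docker config layout."""
    config = {"auths": {url: {"username": username, "password": access_token}}}
    return json.dumps(config)


# Assumed shape of a registry secret: a name plus the serialized credentials
secret = {
    "name": "credstutorialrepo",
    "data": {
        ".dockerconfigjson": docker_config_json(
            "https://index.docker.io",
            "<YOUR_DOCKER_USERNAME>",
            "<YOUR_ACCESS_TOKEN>",
        )
    },
}

print(json.dumps(secret, indent=2))
```

The name is the only part that your workflow refers to later; the credentials themselves stay inside the secret.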


  • Step 8

    This step requires the GitHub folder that you synced in this tutorial. In this folder, create another YAML file called code-pipeline.yaml. (This filename is not used as an identifier within SAP AI Core.)

    image

    Paste the following snippet into your YAML file. Edit the highlighted lines, using the comments and your own Docker image information. Click Commit Changes. The code is also available by following this link

    YAML
    apiVersion: argoproj.io/v1alpha1
    kind: WorkflowTemplate
    metadata:
      name: code-pipeline # executable id, must be unique across all your workflows (YAML files), please modify this to any value (e.g. code-pipeline-12345) if you are not the only user of your SAP AI Core instance.
      annotations:
        scenarios.ai.sap.com/description: "Tutorial to add custom code to SAP AI Core"
        scenarios.ai.sap.com/name: "Code (Tutorial)"
        executables.ai.sap.com/description: "Trains model on median house prices"
        executables.ai.sap.com/name: "House Price (Sklearn Example)"
      labels:
        scenarios.ai.sap.com/id: "learning-code"
        ai.sap.com/version: "1.0"
    spec:
      imagePullSecrets:
        - name: credstutorialrepo # your docker registry secret
      entrypoint: mypipeline
      templates:
      - name: mypipeline
        steps:
        - - name: mypredictor
            template: mycodeblock1
    
      - name: mycodeblock1
        container:
          image: docker.io/<YOUR_DOCKER_USERNAME>/house-price:01 # Your docker image name
          command: ["/bin/sh", "-c"]
          args:
            - "python /app/src/main.py"
    
  • Step 9

    Observe the difference between the hello.yaml (created in the prerequisite tutorial) and code-pipeline.yaml.

    image
    1. imagePullSecrets: A key that specifies which credentials will be used to access the Docker registry. The value credstutorialrepo specifies the Docker registry secret that you created previously to store Docker information in SAP AI Core.
    2. image: A key that specifies which Docker image, and therefore which code, to use in the workflow. The accompanying command and args keys specify which commands to execute within that image.
  • Step 10

    How is a different scenario created even though the application is syncing your existing GitHub folder?

    The application’s only task is to sync your YAML files and check them for syntax errors. By changing the label scenarios.ai.sap.com/id and the annotations scenarios.ai.sap.com/name and scenarios.ai.sap.com/description within the YAML file, you create a new scenario.
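To make the role of these keys concrete, the following sketch mirrors the metadata block of code-pipeline.yaml as a Python dictionary and pulls out the values that identify the scenario and the executable. It illustrates the naming scheme only and is not how SAP AI Core itself processes the file; the function name summarize is hypothetical.

```python
# Metadata copied from code-pipeline.yaml
workflow_metadata = {
    "name": "code-pipeline",
    "annotations": {
        "scenarios.ai.sap.com/description": "Tutorial to add custom code to SAP AI Core",
        "scenarios.ai.sap.com/name": "Code (Tutorial)",
        "executables.ai.sap.com/description": "Trains model on median house prices",
        "executables.ai.sap.com/name": "House Price (Sklearn Example)",
    },
    "labels": {
        "scenarios.ai.sap.com/id": "learning-code",
        "ai.sap.com/version": "1.0",
    },
}


def summarize(metadata):
    """Collect the identifiers that distinguish scenarios and executables."""
    return {
        "scenario_id": metadata["labels"]["scenarios.ai.sap.com/id"],
        "scenario_name": metadata["annotations"]["scenarios.ai.sap.com/name"],
        "executable_id": metadata["name"],
        "executable_version": metadata["labels"]["ai.sap.com/version"],
    }


for key, value in summarize(workflow_metadata).items():
    print(f"{key}: {value}")
```

Two YAML files in the same folder that carry different scenarios.ai.sap.com/id values therefore belong to different scenarios, even though they are synced by the same application.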

  • Step 11

    The execution will go from UNKNOWN to RUNNING and then to the DEAD state. Resolving this is covered in the next steps.

  • Step 12

    The logs show that scikit-learn was unable to create a directory for caching the dataset. Let’s resolve this in the next step.

  • Step 13

    Update the highlighted line in main.py. Set the parameter data_home of the function datasets.fetch_california_housing to /app/src. You already set elevated permissions for this directory in your Dockerfile.

    image
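Because the screenshot of the highlighted line is not reproduced here, the change can be sketched as follows. The essential part is the data_home argument passed to fetch_california_housing; wrapping the call in a function named load_housing and importing scikit-learn inside it are choices made for this sketch only, not part of the original main.py.

```python
# Directory inside the Docker image that the Dockerfile made writable
DATA_HOME = "/app/src"


def load_housing(data_home=DATA_HOME):
    """Fetch the California Housing dataset, caching it under data_home."""
    # Imported here so this module can be loaded without scikit-learn installed
    from sklearn import datasets

    data_house = datasets.fetch_california_housing(data_home=data_home)
    return data_house["data"], data_house["target"]
```

In main.py itself, the equivalent one-line change is to call datasets.fetch_california_housing(data_home='/app/src') instead of calling it without arguments.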

    Build your code again, this time with a new tag. We use 02, i.e. the second version of your Docker image.

    BASH
    docker build -t docker.io/<YOUR_DOCKER_USERNAME>/house-price:02 .
    
    image

    Upload your code to your Docker registry.

    BASH
    docker push docker.io/<YOUR_DOCKER_USERNAME>/house-price:02
    
    image
  • Step 14

    Locate your workflow (YAML file) in GitHub. Click on the Pencil Icon to edit your workflow. The original code is also available by following this link

    image

    Update your workflow by adding the new 02 tag to the ai.sap.com/version and the Docker image name.

    image

    Click Commit Changes after editing.

    Why update the version each time you make changes to your workflow?

    The executable version is denoted by ai.sap.com/version. SAP AI Core syncs your workflows every three minutes, and the version number is easy to observe, so you can be confident that your changes have synced. Alternatively, check that the latest REVISION number from GitHub is reflected in SAP AI Core.


  • Step 15