
Make Predictions for House Prices with SAP AI Core

Deploy AI models and set up serving pipelines to scale a prediction server.
You will learn
  • How to create a deployment server for an AI model
  • How to set up scaling options for your deployment server
  • How to swap a deployed AI model for a different model
Created by Dhrubajyoti Paul · July 27, 2022 (updated November 21, 2022)


Prerequisites

  • You have connected code to the AI workflows of SAP AI Core using this tutorial.
  • You have trained a model using SAP AI Core, such as the house price predictor model in this tutorial, or your own model trained on your local system. If you trained your own local model, follow this tutorial to use it with SAP AI Core.
  • You know how to locate artifacts. This is explained in this tutorial.

You will create a deployment server for AI models to use in online inferencing. It is possible to change the names of components mentioned in this tutorial, without breaking the functionality, unless stated explicitly.

The deployment server demonstrated in this tutorial can only be used in the backend of your AI project. For security reasons, in your real setup you will not be able to make prediction calls directly from your front-end application to the deployment server. Doing so will lead to an inevitable Cross-Origin Resource Sharing (CORS) error. As a resolution, deploy another application between your front-end application and this deployment server. This middle application should use the SAP AI Core SDK (Python package) to make calls to the deployment server.

  • Step 1

    Create a new directory in your local system named hello-aicore-server.

    Create a file named main.py (the serving template later starts it as main:app), and paste the following snippet there:

    import os
    import pickle
    import numpy as np
    from flask import Flask
    from flask import request as call_request

    # Creates Flask serving engine
    app = Flask(__name__)

    model = None

    def init():
        """Load the model, else crash: the deployment will not start."""
        global model
        model = pickle.load(open('/mnt/models/model.pkl', 'rb'))  # All the model files will be read from /mnt/models
        return None

    @app.route("/v2/greet", methods=["GET"])
    def status():
        global model
        if model is None:
            return "Flask Code: Model was not loaded."
        else:
            return "Model is loaded."

    # You may customize the endpoint, but it must have the prefix `/v<number>`
    @app.route("/v2/predict", methods=["POST"])
    def predict():
        """Perform an inference on the model loaded in init.

        Returns:
            The predicted price as a string value.
        """
        global model
        query = dict(call_request.json)
        input_features = list(query.values())  # list of values from the request call, in training feature order
        # Prediction
        prediction = model.predict(
            np.array([list(map(float, input_features)),])  # (trailing comma) <,> to make a batch with 1 observation
        )
        output = str(prediction)
        # Response
        return output

    init()  # Load the model at import time, so gunicorn (which imports `main:app`) also gets it

    if __name__ == "__main__":
        print("Serving Initializing")
        print("Serving Started")
        app.run(host="0.0.0.0", debug=True, port=9001)

    Understanding your code

    Where should you load your model from?

    • Your code reads files from folder /mnt/models. This folder path is hard-coded in SAP AI Core, and cannot be modified.
    • Later, you will dynamically place your model file in the path /mnt/models.
    • You may place multiple files inside /mnt/models as part of your model. These files may have multiple formats, such as .py or .pickle; however, you should not create sub-directories within it.

    Which serving engine to use?

    • Your code uses Flask to create the server; however, you may use another Python library if you prefer.
    • Your format for prediction REST calls will depend on the implementation of this deployment server.
    • You implement the endpoint /v2/predict to make predictions. You may modify the endpoint name and format, but each endpoint must have the prefix /v<NUMBER>. For example, an endpoint to greet your server should be implemented as /v2/greet or /v1/greet.
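To see the request/response shape these endpoints imply, the sketch below exercises a /v2/predict route with Flask's built-in test client. The JSON keys and the stub model are hypothetical stand-ins; a real call would send your training features to the deployed model:

```python
import numpy as np
from flask import Flask
from flask import request as call_request

app = Flask(__name__)

class StubModel:
    """Hypothetical stand-in for the pickled regressor: one value per input row."""
    def predict(self, batch):
        return np.array([4.2] * len(batch))

model = StubModel()

@app.route("/v2/predict", methods=["POST"])
def predict():
    query = dict(call_request.json)
    input_features = list(query.values())  # value order must match the training feature order
    prediction = model.predict(np.array([list(map(float, input_features)),]))
    return str(prediction)

# Exercise the endpoint in-process: no Docker, network, or model file needed
client = app.test_client()
response = client.post("/v2/predict", json={"feature_a": 8.3, "feature_b": 41.0})
print(response.data.decode())  # string form of the stub's prediction array
```

This is handy for checking your serving code locally before building the Docker image.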

    Create file requirements.txt as shown below.
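The package list itself is not reproduced on this page. Based on the imports in main.py and the gunicorn command used later in the serving template, a minimal requirements.txt would cover at least the packages below; pin versions to match your training environment (especially scikit-learn, which must match the version used to pickle the model):

```text
Flask
gunicorn
numpy
scikit-learn
```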


  • Step 2

    Create a file called Dockerfile in the folder hello-aicore-server with the following contents. The Dockerfile contains instructions to package your code files as a single Docker image.

    This filename cannot be changed, and the file has no extension.

    # Base layer (default dependencies) to use
    # You can find other base layers on Docker Hub
    FROM python:3.7.11
    # Custom location to place code files
    RUN mkdir -p /app/src
    COPY main.py /app/src/main.py
    COPY requirements.txt /app/src/requirements.txt
    RUN pip3 install -r /app/src/requirements.txt
    # Required to execute the script
    RUN chgrp -R nogroup /app && \
        chmod -R 770 /app

    Open your terminal and navigate to your hello-aicore-server folder, using the following code.

    cd hello-aicore-server

    Build your Docker image by adapting the following code.

    docker build -t <YOUR_DOCKER_USERNAME>/house-server:01 .

    The period . at the end of your command instructs Docker to look for the Dockerfile in the current directory targeted by your terminal.

    Upload your Docker image to your Docker repository, by adapting the following code.

    docker push <YOUR_DOCKER_USERNAME>/house-server:01
  • Step 3

    Create an executable (YAML file) named house-price-server.yaml in your GitHub repository. You may use the existing GitHub path which is already synced to your SAP AI Core application.

    IMPORTANT The structures (schemas) of workflows and executables differ between training and serving in SAP AI Core. For the available schema options, refer to the official SAP AI Core help guide.

    apiVersion: ai.sap.com/v1alpha1
    kind: ServingTemplate
    metadata:
      name: server-pipeline # executable ID, must be unique across your SAP AI Core instance, for example use `server-pipeline-yourname-1234`
      annotations:
        scenarios.ai.sap.com/description: "Learning to predict house price"
        scenarios.ai.sap.com/name: "House Price (Tutorial)"
        executables.ai.sap.com/description: "Create online server to make live predictions"
        executables.ai.sap.com/name: "server"
        artifacts.ai.sap.com/housepricemodel.kind: "model" # Suggest the kind of artifact to input.
      labels:
        scenarios.ai.sap.com/id: "learning-datalines"
        ai.sap.com/version: "1.0"
    spec:
      inputs:
        artifacts:
          - name: housepricemodel # placeholder name, do not add `-` in value, use only alphanumeric chars
        parameters:
          - name: greetmessage # placeholder name
            type: string # required for every parameter
      template:
        apiVersion: "serving.kserve.io/v1beta1"
        metadata:
          annotations: |
            autoscaling.knative.dev/metric: concurrency   # condition when to scale
          labels: |
            ai.sap.com/resourcePlan: starter # computing power
        spec: |
          predictor:
            imagePullSecrets:
              - name: credstutorialrepo   # your docker registry secret
            minReplicas: 1
            maxReplicas: 5    # how much to scale
            containers:
              - name: kserve-container
                image: "<YOUR_DOCKER_USERNAME>/house-server:01"
                ports:
                  - containerPort: 9001    # customizable port
                    protocol: TCP
                command: ["/bin/sh", "-c"]
                args:
                  - >
                    set -e && echo "Starting" && gunicorn --chdir /app/src main:app -b 0.0.0.0:9001 # filename `main`, Flask variable `app`
                env:
                  - name: STORAGE_URI # Required
                    value: "{{inputs.artifacts.housepricemodel}}" # Required reference to the artifact name, see above
                  - name: greetingmessage # different name to avoid confusion
                    value: "{{inputs.parameters.greetmessage}}"

    Understanding your serving executable

    1. You use an input artifacts placeholder housepricemodel for your model.
    2. You use an input parameters placeholder greetmessage to pass any value in a string.
    3. You use the starter computing resource plan. To start, using a non-GPU resource plan (like starter) for serving is cost effective. Find out more about available resource plans in the help portal.
    4. You set the auto scaling of the server with the parameters: minReplicas and maxReplicas.
    5. You set the serving code to use through a Docker image, and the credentials to access it via imagePullSecrets. If you are using a public Docker registry, make sure your Docker registry secret points to that registry's URL. You may delete and recreate the Docker registry secret; this will not affect training templates running in parallel.
    6. You use the placeholder env to pass your inputs values as environment variables in your Docker image.
    7. You use the model placeholder value (reference to cloud storage) STORAGE_URI through the environment variables. The model files stored in your cloud storage (referenced by the value of your input artifacts placeholder) will be copied to the path /mnt/models inside your Docker image.
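The env entries above surface inside the container as ordinary environment variables. A minimal sketch of reading one from your serving code (the assignment here only simulates the injection SAP AI Core performs at deployment time):

```python
import os

# Simulate what the serving template's `env` section injects into the container
os.environ["greetingmessage"] = "Hello from the tutorial"

# Inside your serving code you can read it back like any environment variable
greeting = os.environ.get("greetingmessage", "<no greeting set>")
print(greeting)
```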

  • Step 4

    IMPORTANT An artifact is a reference to files stored in your cloud storage. A single artifact can refer to a location containing multiple files. For model artifacts, your artifact must not point to a directory which contains a subdirectory. For example, if your artifact points to s3://my/storage/of/house/modelv2, modelv2 must not contain sub-directories.
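For example, with the artifact above pointing to s3://my/storage/of/house/modelv2, these layouts are valid and invalid (file names illustrative):

```text
s3://my/storage/of/house/modelv2/model.pkl           <- valid: file directly in the artifact folder
s3://my/storage/of/house/modelv2/metadata.json       <- valid: multiple files are allowed
s3://my/storage/of/house/modelv2/extras/vocab.txt    <- invalid: `extras` is a sub-directory
```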

  • Step 5
  • Step 6

    The prediction value is expressed in hundreds of thousands of dollars ($100,000) for this specific use case.
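Since the model's raw output is in units of $100,000, converting a prediction to a dollar figure is a single multiplication (the raw value below is a made-up example):

```python
raw_prediction = 2.157  # hypothetical raw output from the deployment server
price_in_dollars = raw_prediction * 100_000  # model units are $100,000
print(f"Predicted price: ${price_in_dollars:,.0f}")
```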

  • Step 7

    Switching between deployed models means that you can update the model used in your deployment server, without affecting the deployment URL.

    The Current Status of your deployment changes to Unknown while your new model is copied to the serving engine. After the deployment has been copied successfully, the status changes to Running and is ready to make new predictions.

  • Step 8

    A running deployment incurs cost because it is allocated cloud resources. Stopping the deployment frees up these resources, so there is no charge for a deployment with status Stopped.

    Note: You cannot restart a deployment. You must create a new deployment, reusing the configuration. Each deployment will have a different URL.
