Make Predictions for House Prices with SAP AI Core

Intermediate

20 min.

SAP AI Core, Intermediate, Tutorial, Machine Learning

Deploy AI models and set up serving pipelines to scale your prediction server.

You will learn

How to create a deployment server for an AI model
How to set up scaling options for your deployment server
How to swap a deployed AI model with a different new model

Dhrubajyoti Paul

April 1, 2026

Created by

July 27, 2022

Contributors

Prerequisites

A BTP global account
If you are an SAP developer or SAP employee, refer to the following links (for internal SAP stakeholders only):
How to create a BTP Account (internal)
SAP AI Core
If you are an external developer, customer, or partner, refer to this tutorial.
You have connected code to the AI workflows of SAP AI Core using this tutorial.
You have trained a model using SAP AI Core, such as the house price predictor model in this tutorial, or your own model trained in your local system. If you trained your own local model, follow this tutorial to use it with SAP AI Core.
You know how to locate artifacts. This is explained in this tutorial.

You will create a deployment server for AI models to use in online inferencing. It is possible to change the names of components mentioned in this tutorial, without breaking the functionality, unless stated explicitly.

The deployment server demonstrated in this tutorial can only be used in the backend of your AI project. For security reasons, in a real setup you cannot make prediction calls directly from your front-end application to the deployment server; doing so results in a Cross-Origin Resource Sharing (CORS) error. As a temporary workaround, deploy another application between your front-end application and this deployment server. This middle application should use the SAP AI Core SDK (Python package) to make calls to the deployment server.

Downloadable sample notebooks for the tutorials are available at the link below. These tutorials are for demonstration purposes only and should not be used in production environments. To execute them properly, you need to set up your own S3 bucket or provision services from BTP, including SAP AI Core with a standard plan for narrow AI and an extended plan for Generative AI Hub. Make sure you enter the service keys of these services into the relevant cells of the notebook.
Link to notebook

Step 1: Write code for serving engine

Create a new directory in your local system named hello-aicore-server.

Create a file named main.py, and paste the following snippet there:

PYTHON

import os
import pickle
import numpy as np
from flask import Flask
from flask import request as call_request

# Creates Flask serving engine
app = Flask(__name__)

model = None

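# Note: before_first_request was removed in Flask 2.3; it works with the Flask 2.0.1 pin in requirements.txt
@app.before_first_request
def init():
    """
    Load the model; if loading fails the process crashes and the deployment will not start
    """
    global model
    # All model files are read from /mnt/models
    with open('/mnt/models/model.pkl', 'rb') as model_file:
        model = pickle.load(model_file)
    return None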

@app.route("/v2/greet", methods=["GET"])
def status():
    global model
    if model is None:
        return "Flask Code: Model was not loaded."
    else:
        return "Model is loaded."

# You may customize the endpoint, but must have the prefix `/v<number>`
@app.route("/v2/predict", methods=["POST"])
def predict():
    """
    Perform an inference on the model loaded in init()

    Returns:
        Predicted price as a string.
    """
    global model
    # Parse the JSON body of the request
    query = dict(call_request.json)
    input_features = [ # list of values from request call
        query['MedInc'],
        query['HouseAge'],
        query['AveRooms'],
        query['AveBedrms'],
        query['Population'],
        query['AveOccup'],
        query['Latitude'],
        query['Longitude'],
    ]
    # Prediction
    prediction = model.predict(
        np.array([list(map(float, input_features))]) # the outer list makes a batch with 1 observation
    )
    output = str(prediction)
    # Response
    return output

if __name__ == "__main__":
    print("Serving Initializing")
    init()
    print(f'{os.environ["greetingmessage"]}')
    print("Serving Started")
    app.run(host="0.0.0.0", debug=True, port=9001)
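
The `predict` function above reads the eight feature values directly from the request body, so a missing key surfaces as an unhandled `KeyError`. As a minimal sketch, assuming you want a clearer error message, the lookup could be moved into a helper. The names `FEATURE_ORDER` and `extract_features` are hypothetical and not part of the tutorial code.

```python
# Feature names in the order used by predict() in main.py.
FEATURE_ORDER = [
    "MedInc", "HouseAge", "AveRooms", "AveBedrms",
    "Population", "AveOccup", "Latitude", "Longitude",
]


def extract_features(query):
    """Return the request values as floats, in model input order.

    Hypothetical helper (an assumption, not tutorial code). Raises
    ValueError naming any missing keys so the caller can reject the request.
    """
    missing = [name for name in FEATURE_ORDER if name not in query]
    if missing:
        raise ValueError(f"Missing features: {', '.join(missing)}")
    return [float(query[name]) for name in FEATURE_ORDER]


# Illustrative values only; strings are accepted because of float().
sample = {
    "MedInc": "8.3252", "HouseAge": 41, "AveRooms": 6.984, "AveBedrms": 1.023,
    "Population": 322, "AveOccup": 2.555, "Latitude": 37.88, "Longitude": -122.23,
}
features = extract_features(sample)
```

Inside `predict`, the returned list could be wrapped as `np.array([features])` to form a batch with one observation.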

Understanding your code

Where should you load your model from?

Your code reads files from folder /mnt/models. This folder path is hard-coded in SAP AI Core, and cannot be modified.
Later, you will dynamically place your model file in the path /mnt/models.
You may place multiple files inside /mnt/models as part of your model. These files may have multiple formats, such as .py or .pickle; however, you should not create sub-directories within it.
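
The flat-folder rule above can be checked at startup. The following is a minimal sketch, assuming you want the server to fail early with a readable message; `list_model_files` is a hypothetical helper, and the `model_dir` argument exists only so that the check can be tried on a local folder.

```python
from pathlib import Path


def list_model_files(model_dir="/mnt/models"):
    """Return the file names in model_dir, refusing any sub-directory.

    Hypothetical helper (an assumption, not tutorial code).
    """
    root = Path(model_dir)
    entries = sorted(root.iterdir())
    subdirs = [entry.name for entry in entries if entry.is_dir()]
    if subdirs:
        raise ValueError(f"Sub-directories are not allowed in {root}: {subdirs}")
    return [entry.name for entry in entries if entry.is_file()]
```

Calling `list_model_files()` without arguments inspects /mnt/models, which only exists inside the running deployment.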

Which serving engine to use?

Your code uses Flask to create a server; however, you may use another Python library if you prefer.
Your format for prediction REST calls will depend on the implementation of this deployment server.
You implement the endpoint /v2/predict to make predictions. You may modify the endpoint name and format, but each endpoint must have the prefix /v<NUMBER>. For example, if you want to create an endpoint to greet your server, the endpoint should be /v2/greet or /v1/greet.
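
The /v<NUMBER> prefix rule can be expressed as a small check. This is only a sketch: the regular expression is an assumption about what counts as a valid prefix, and `has_version_prefix` is a hypothetical name. The exact validation that SAP AI Core applies is not stated in this tutorial.

```python
import re

# Assumption: a valid path is /v<digits>/ followed by at least one more character.
_VERSIONED_PATH = re.compile(r"^/v\d+/\S+$")


def has_version_prefix(path):
    """Return True if the route starts with /v<NUMBER>/ (hypothetical helper)."""
    return bool(_VERSIONED_PATH.match(path))


routes = ["/v2/predict", "/v1/greet", "/predict"]
checked = {route: has_version_prefix(route) for route in routes}
```

A check like this could run over your route list in a unit test before you build the Docker image.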

Create a file named requirements.txt with the following contents.

TEXT

scikit-learn==0.24.2
joblib==1.0.1
Flask==2.0.1
gunicorn==20.1.0

Choose the location from which your serving code reads models.

Any custom location
/home
/models
/mnt/models

Step 2: Bundle and publish code to cloud
Create a file called Dockerfile in the folder hello-aicore-server with the following contents. The Dockerfile contains instructions to package your code files as a single Docker image.

This filename cannot be changed, and it has no file extension.

TEXT
# Base layer (default dependencies) to use
# You should find more base layers at https://hub.docker.com
FROM python:3.7.11
ENV LANG C.UTF-8

# Custom location to place code files
RUN mkdir -p /app/src
COPY main.py /app/src/
COPY requirements.txt /app/src/requirements.txt
RUN pip3 install -r /app/src/requirements.txt

# Required to execute script
RUN chgrp -R nogroup /app && \
    chmod -R 770 /app

Open your terminal and navigate to your hello-aicore-server folder, using the following command.

BASH
cd hello-aicore-server

Build your Docker image by adapting the following code.

BASH
docker build -t docker.io/<YOUR_DOCKER_USERNAME>/house-server:01 .

The period . at the end of the command tells Docker to use your terminal's current directory as the build context, which is where it looks for the Dockerfile.

Upload your Docker image to your Docker repository, by adapting the following code.

BASH
docker push docker.io/<YOUR_DOCKER_USERNAME>/house-server:01
Step 3: Set Compute Resources for Serving - Pre Read
SAP AI Core allows you to configure compute resources for serving workloads using either an instance type or a resource plan. You must specify at least one of these options.

If you specify an instance type, a resource plan is not required.

If you specify a resource plan, an instance type is not required.

YAML
labels:
  ai.sap.com/resourcePlan: <yourChoiceOfResourcePlan>
  # (or)
  ai.sap.com/instanceType: <yourChoiceOfInstanceType>
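
The rule that at least one of the two labels must be present can be checked before you commit a template. The following is a minimal sketch, assuming the labels have already been parsed into a Python dictionary; `check_compute_labels` is a hypothetical helper, not an SAP AI Core API.

```python
RESOURCE_PLAN_KEY = "ai.sap.com/resourcePlan"
INSTANCE_TYPE_KEY = "ai.sap.com/instanceType"


def check_compute_labels(labels):
    """Return the compute label keys that are set; raise if neither is.

    Hypothetical helper (an assumption, not tutorial code).
    """
    found = [key for key in (RESOURCE_PLAN_KEY, INSTANCE_TYPE_KEY) if labels.get(key)]
    if not found:
        raise ValueError(
            f"Specify {RESOURCE_PLAN_KEY} or {INSTANCE_TYPE_KEY} in the template labels"
        )
    return found


used = check_compute_labels({"ai.sap.com/resourcePlan": "starter"})
```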

Note:

Resource plans are suitable for most standard serving workloads.

Instance types are recommended for GPU-based or performance-critical serving scenarios.

Reference:

SAP Help Portal – Choose an Instance (SAP AI Core)

SAP Note 3660109 – Available Instance Types

If you are using the public Docker registry, which URL must you use in your Docker registry secret?
https://index.docker.io
docker.io (works with training but not with serving)
https://docker.io
hub.docker.com
hub.docker.io

Step 4: Create a serving executable

Create an executable (YAML file) named house-price-server.yaml in your GitHub repository. You may use the existing GitHub path that is already tracked and synced by your SAP AI Core application.

IMPORTANT The structure (schema) of workflows and executables differs between training and serving in SAP AI Core. For the available schema options, refer to the official SAP AI Core help guide.

YAML

apiVersion: ai.sap.com/v1alpha1
kind: ServingTemplate
metadata:
  name: server-pipeline # executable ID, must be unique across your SAP AI Core instance, for example use `server-pipeline-yourname-1234`
  annotations:
    scenarios.ai.sap.com/description: "Learning to predict house price"
    scenarios.ai.sap.com/name: "House Price (Tutorial)"
    executables.ai.sap.com/description: "Create online server to make live predictions"
    executables.ai.sap.com/name: "server"
    artifacts.ai.sap.com/housepricemodel.kind: "model" # Suggest the kind of artifact to input.
  labels:
    scenarios.ai.sap.com/id: "learning-datalines"
    ai.sap.com/version: "1.0"
spec:
  inputs:
    artifacts:
      - name: housepricemodel # placeholder name; do not use `-` in the value, use only alphanumeric characters
    parameters:
      - name: greetmessage # placeholder name
        type: string # required for every parameter
  template:
    apiVersion: "serving.kserve.io/v1beta1"
    metadata:
      annotations: |
        autoscaling.knative.dev/metric: concurrency   # condition when to scale
        autoscaling.knative.dev/target: 1
        autoscaling.knative.dev/targetBurstCapacity: 0
      labels: |
        ai.sap.com/resourcePlan: starter  # or ai.sap.com/instanceType: <yourChoiceOfInstanceType>
    spec: |
      predictor:
        imagePullSecrets:
          - name: credstutorialrepo   # your docker registry secret
        minReplicas: 1
        maxReplicas: 5    # how much to scale
        containers:
        - name: kserve-container
          image: "docker.io/<YOUR_DOCKER_USERNAME>/house-server:01"
          ports:
            - containerPort: 9001    # customizable port
              protocol: TCP
          command: ["/bin/sh", "-c"]
          args:
            - >
              set -e && echo "Starting" && gunicorn --chdir /app/src main:app -b 0.0.0.0:9001 # filename `main` flask variable `app`
          env:
            - name: STORAGE_URI # Required
              value: "{{inputs.artifacts.housepricemodel}}" # Required reference from artifact name, see above
            - name: greetingmessage # different name to avoid confusion
              value: "{{inputs.parameters.greetmessage}}"

Understanding your serving executable

You use an input artifacts placeholder housepricemodel for your model.
You use an input parameters placeholder greetmessage to pass any value in a string.
You configure compute resources using the ai.sap.com/resourcePlan label. In this tutorial, the starter resource plan is used for serving, as it is cost-effective for non-GPU workloads. Alternatively, you can use ai.sap.com/instanceType for advanced or GPU-enabled serving scenarios. Learn more in the SAP Help Portal – Choose an Instance.
You set the autoscaling of the server with the parameters minReplicas and maxReplicas.
You set the serving code to use through a Docker image, and the credentials to access it via imagePullSecrets. If you are using the public Docker registry at docker.io, make sure your secret points to the URL https://index.docker.io. You may delete and recreate the Docker registry secret. This will not affect training templates running in parallel.
You use the placeholder env to pass your input values as environment variables in your Docker image.
You use the model placeholder value (reference to cloud storage) STORAGE_URI through the environment variables. The model files stored in your cloud storage (referenced by the value of your input artifacts placeholder) will be copied to the path /mnt/models inside your Docker image.
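
To make the binding between input placeholders and `env` values concrete, the following sketch substitutes `{{inputs.artifacts.<name>}}` and `{{inputs.parameters.<name>}}` in a string. It is only an illustration: SAP AI Core performs the real substitution internally, and both the regular expression and the function name `render_placeholders` are assumptions.

```python
import re

# Matches {{inputs.artifacts.<name>}} and {{inputs.parameters.<name>}}.
_PLACEHOLDER = re.compile(r"\{\{\s*inputs\.(artifacts|parameters)\.(\w+)\s*\}\}")


def render_placeholders(text, artifacts, parameters):
    """Replace input placeholders in text with the bound values.

    Illustration only (an assumption, not how SAP AI Core is implemented).
    """
    values = {"artifacts": artifacts, "parameters": parameters}

    def substitute(match):
        kind, name = match.group(1), match.group(2)
        if name not in values[kind]:
            raise KeyError(f"No value bound for inputs.{kind}.{name}")
        return values[kind][name]

    return _PLACEHOLDER.sub(substitute, text)


# Example bindings; the storage path is the illustrative one used in this tutorial.
env_value = render_placeholders(
    "{{inputs.parameters.greetmessage}}",
    artifacts={"housepricemodel": "s3://my/storage/of/house/modelv2"},
    parameters={"greetmessage": "Hi from the server"},
)
```

The values themselves are supplied later through a configuration, which is why the executable only contains placeholder names.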

Step 5: Select a model to deploy using a configuration
IMPORTANT An artifact is a reference to files stored in your cloud storage. A single artifact can refer to a location containing multiple files. For model artifacts, your artifact must not point to a directory which contains a subdirectory. For example, if your artifact points to s3://my/storage/of/house/modelv2, modelv2 must not contain sub-directories.
Step 6: Start a deployment
Step 7: Make a prediction
The prediction value is expressed in hundreds of thousands of dollars ($100,000) for this specific use case.
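
As a minimal sketch of a prediction call, the following code builds (but does not send) a POST request to the /v2/predict endpoint and converts the response text to dollars. The deployment URL and token are placeholders that you must supply. The `Authorization` and `AI-Resource-Group` headers are assumptions to verify against your SAP AI Core setup, and both function names are hypothetical.

```python
import json
import urllib.request


def build_predict_request(deployment_url, token, features, resource_group="default"):
    """Build a POST request for the /v2/predict endpoint without sending it."""
    return urllib.request.Request(
        url=deployment_url.rstrip("/") + "/v2/predict",
        data=json.dumps(features).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",      # assumption: bearer token auth
            "AI-Resource-Group": resource_group,      # assumption: resource group header
            "Content-Type": "application/json",       # required by call_request.json in main.py
        },
        method="POST",
    )


def prediction_to_dollars(response_text):
    """Convert a response such as "[2.5]" (units of $100,000) to dollars."""
    value = float(response_text.strip().strip("[]"))
    return value * 100_000


# Placeholder URL and token; the feature values are illustrative only.
request = build_predict_request(
    "https://example.invalid/deployment",
    "my-token",
    {"MedInc": 8.3252, "HouseAge": 41, "AveRooms": 6.984, "AveBedrms": 1.023,
     "Population": 322, "AveOccup": 2.555, "Latitude": 37.88, "Longitude": -122.23},
)
```

To send the request, you could pass it to `urllib.request.urlopen` and hand the decoded response body to `prediction_to_dollars`.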
Step 8: Switch the deployed model
Switching between deployed models means that you can update the model used in your deployment server, without affecting the deployment URL.

The Current Status of your deployment changes to Unknown while your new model is copied to the serving engine. After the model has been copied successfully, the status changes to Running and the deployment is ready to make new predictions.
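
Because the status passes through Unknown before it returns to Running, a client should wait before sending new prediction calls. The following is a minimal polling sketch, assuming you provide a `get_status` callable that returns the current status string. How that callable queries SAP AI Core is left open, and `wait_until_running` is a hypothetical helper.

```python
import time


def wait_until_running(get_status, timeout_s=300.0, interval_s=5.0, sleep=time.sleep):
    """Poll get_status() until it reports Running, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status.upper() == "RUNNING":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Deployment still {status!r} after {timeout_s} s")
        sleep(interval_s)


# Simulated status sequence standing in for real status queries.
_simulated = iter(["Unknown", "Unknown", "Running"])
final_status = wait_until_running(
    lambda: next(_simulated), interval_s=0, sleep=lambda seconds: None
)
```

The `sleep` parameter is injectable so that the loop can be tested without real waiting.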
Step 9: Stop a deployment
A running deployment incurs cost because it is allocated cloud resources. Stopping the deployment frees up these resources and therefore there is no charge for a deployment of status Stopped.

Note: You cannot restart a deployment. You must create a new deployment, reusing the configuration. Each deployment will have a different URL.
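
If you prefer to stop a deployment from code, the request could look like the sketch below. It assumes that the AI API accepts a PATCH on `/v2/lm/deployments/<deployment id>` with the body `{"targetStatus": "STOPPED"}`, and that the same bearer token and resource group headers apply. Verify these details against the AI API reference before relying on them. The function name and all argument values are hypothetical.

```python
import json
import urllib.request


def build_stop_request(api_url, deployment_id, token, resource_group="default"):
    """Build (but do not send) a request that asks for a deployment to stop.

    Path, body, and headers are assumptions to verify in the AI API reference.
    """
    return urllib.request.Request(
        url=f"{api_url.rstrip('/')}/v2/lm/deployments/{deployment_id}",
        data=json.dumps({"targetStatus": "STOPPED"}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "AI-Resource-Group": resource_group,
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


# Placeholder API URL, deployment ID, and token.
stop_request = build_stop_request("https://example.invalid", "d123", "my-token")
```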
Step 10: Check Running Resources (optional)
You can check the currently running pods in AI Launchpad by choosing the deployment and clicking the Scaling tab.

Similarly, to check the resource plan, open the Resources tab.
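
To get a feeling for the pod counts you may see in the Scaling tab, the following sketch estimates the replica count for the settings used in this tutorial (concurrency target 1, minReplicas 1, maxReplicas 5). It is a deliberately simplified model: the real autoscaler averages load over a time window and applies further rules, so treat the result as a rough expectation. `desired_replicas` is a hypothetical helper.

```python
import math


def desired_replicas(in_flight_requests, target=1, min_replicas=1, max_replicas=5):
    """Estimate replicas as ceil(load / target), clamped to the configured bounds."""
    if target <= 0:
        raise ValueError("target must be positive")
    wanted = math.ceil(in_flight_requests / target)
    return max(min_replicas, min(max_replicas, wanted))


# Estimated replicas for a few concurrent request counts.
estimates = {load: desired_replicas(load) for load in (0, 3, 12)}
```

With these settings, an idle server keeps one pod, and any load above five concurrent requests is capped at five pods.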

Write code for serving engine
Bundle and publish code to cloud
Set Compute Resources for Serving - Pre Read
Create a serving executable
Select a model to deploy using a configuration
Start a deployment
Make a prediction
Switch the deployed model
Stop a deployment
Check Running Resources (optional)
