Use the Optical Character Recognition (OCR) API from a REST Client

Discover how to call the Optical Character Recognition (OCR) API from a REST Client like Postman

You will learn

  • How to call an API from a REST client such as Postman
  • The basics of the Machine Learning Foundation service for Optical Character Recognition (OCR)

Step 1: The Optical Character Recognition Service

The Optical Character Recognition (OCR) service recognizes typewritten text from scanned or digital documents.

Differences from the Scene Text Recognition service
Compared with the Optical Character Recognition service, the Scene Text Recognition service:

  • Works with real-life color images
  • Can work with font-less text
  • Extracts word-art and picturized text
  • Handles text in different orientations
  • Handles text occurring in natural images, such as low-contrast, embossed, or engraved text

Use the OCR service when the text has to be read from documents or scans of print media. Use the Scene Text Recognition service for natural images (for example, reading the counter of a utility meter or the number plate of an automobile).

This is the list of accepted file extensions:

File type       Extensions
Archive file    zip, tar, gz, tgz
Image file      jpg, png, pdf

Images should be RGB or 8-bit grayscale.

If an archive file is provided, no additional files can be provided.
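
The two rules above (accepted extensions, and an archive excluding any other file) can be checked before a request is sent. The following is a minimal sketch, not part of the service: the helper name `check_upload` is hypothetical, and treating extensions case-insensitively is an assumption.

```python
from pathlib import Path

# Extensions copied from the table of accepted file extensions.
ARCHIVE_EXTENSIONS = {"zip", "tar", "gz", "tgz"}
IMAGE_EXTENSIONS = {"jpg", "png", "pdf"}


def check_upload(filenames):
    """Return a list of problems with the proposed upload (empty if none)."""
    problems = []
    archives = []
    for name in filenames:
        # Assumption: extensions are compared case-insensitively.
        ext = Path(name).suffix.lower().lstrip(".")
        if ext in ARCHIVE_EXTENSIONS:
            archives.append(name)
        elif ext not in IMAGE_EXTENSIONS:
            problems.append(f"unsupported extension: {name}")
    # If an archive is provided, it must be the only file.
    if archives and len(filenames) > 1:
        problems.append("an archive cannot be combined with other files")
    return problems


print(check_upload(["scan.png"]))
print(check_upload(["scans.zip", "extra.jpg"]))
print(check_upload(["notes.docx"]))
```

An empty list means the selection satisfies both rules; the service itself remains the authority on what it accepts.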

The input file (or the archive file) is provided using form data (as an element named files in the form data).

A series of settings can also be provided as part of the form data (named options in the form data) using a JSON string format.

Name          Description                                  Allowed values
lang          Languages of the submitted text (up to 3),   en: English (default)
              separated by commas                          de: German
                                                           fr: French
                                                           es: Spanish
                                                           ru: Russian
outputType    Output type of the result                    txt: plain text (default)
                                                           xml: text with markup and additional attributes
pageSegMode   Page segmentation mode                       0: Orientation and script detection (OSD) only
                                                           1: Automatic page segmentation with OSD (default)
                                                           3: Fully automatic page segmentation, but no OSD
                                                           4: Assume a single column of text of variable sizes
                                                           5: Assume a single uniform block of vertically aligned text
                                                           6: Assume a single uniform block of text
                                                           7: Treat the image as a single text line
                                                           8: Treat the image as a single word
                                                           9: Treat the image as a single word in a circle
                                                           10: Treat the image as a single character
                                                           11: Sparse text. Find as much text as possible in no particular order
                                                           12: Sparse text with OSD
                                                           13: Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific
modelType     Type of machine learning model for OCR       lstmPrecise: precise model with LSTM cells
                                                           lstmFast: fast model with LSTM cells
                                                           lstmStandard: standard model with LSTM cells (default)
                                                           noLstm: model without LSTM cells
                                                           all: model with LSTM cells and standard processing algorithms
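
As an illustration of the options element, the sketch below validates a set of settings against the allowed values listed above and serializes them to a JSON string. The helper `build_options` is hypothetical, and sending pageSegMode as a JSON number rather than a string is an assumption that should be checked against the service documentation.

```python
import json

# Allowed values copied from the options table; pageSegMode 2 is not listed there.
ALLOWED_LANGS = {"en", "de", "fr", "es", "ru"}
ALLOWED_OUTPUT_TYPES = {"txt", "xml"}
ALLOWED_PAGE_SEG_MODES = {0, 1} | set(range(3, 14))
ALLOWED_MODEL_TYPES = {"lstmPrecise", "lstmFast", "lstmStandard", "noLstm", "all"}


def build_options(langs=("en",), output_type="txt", page_seg_mode=1,
                  model_type="lstmStandard"):
    """Validate the settings and return them as a JSON string."""
    if not 1 <= len(langs) <= 3:
        raise ValueError("between 1 and 3 languages are allowed")
    unknown = set(langs) - ALLOWED_LANGS
    if unknown:
        raise ValueError(f"unsupported languages: {sorted(unknown)}")
    if output_type not in ALLOWED_OUTPUT_TYPES:
        raise ValueError(f"unsupported outputType: {output_type}")
    if page_seg_mode not in ALLOWED_PAGE_SEG_MODES:
        raise ValueError(f"unsupported pageSegMode: {page_seg_mode}")
    if model_type not in ALLOWED_MODEL_TYPES:
        raise ValueError(f"unsupported modelType: {model_type}")
    return json.dumps({
        "lang": ",".join(langs),
        "outputType": output_type,
        # Assumption: sent as a JSON number; the service may expect a string.
        "pageSegMode": page_seg_mode,
        "modelType": model_type,
    })


print(build_options(langs=("en", "de"), output_type="xml"))
```

The resulting string is what would be entered as the value of the options key in the form data.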

The service returns a JSON response that contains the text detected in the file, in either plain text or hOCR format.

For more details, you can check the following link:

Step 2: Call the API

Open a new tab in Postman.

Make sure that the my-ml-foundation environment is selected.

On the Authorization tab, select Bearer Token, then enter {{OAuthToken}} as the value.


Note: The OAuthToken environment variable can be retrieved by following the "Get your OAuth Access Token using a REST Client" tutorial.

Fill in the following additional information:

Field Name    Value
URL           The value of IMAGE_OCR_URL in your service key

Note: As a reminder, the URL depends on your Cloud Platform landscape region, but on the trial landscape only Europe (Frankfurt) provides access to the Machine Learning Foundation services.

On the Body tab, keep form-data selected. Add a new key named files and switch it to File instead of Text (default).

Select your image file.


If you are missing some content to test, you can use the following image:


Click on Send.
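
Outside Postman, the same request could be assembled with the Python standard library. This is a sketch under stated assumptions: the helper `build_ocr_request` is hypothetical, the HTTP method is assumed to be POST, and the URL and token in the example are placeholders for the IMAGE_OCR_URL value and the OAuth access token. The request is only built here, not sent.

```python
import json
import mimetypes
import urllib.request
import uuid


def build_ocr_request(url, token, filename, file_bytes, options=None):
    """Build (but do not send) a multipart/form-data request for the OCR API."""
    boundary = uuid.uuid4().hex
    parts = []

    def add_part(headers, payload):
        # Each part: boundary line, part headers, blank line, payload.
        parts.append(f"--{boundary}\r\n".encode())
        parts.append((headers + "\r\n\r\n").encode())
        parts.append(payload)
        parts.append(b"\r\n")

    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    # The file goes into the form element named "files".
    add_part(
        f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}",
        file_bytes,
    )
    if options is not None:
        # Settings go into the form element named "options" as a JSON string.
        add_part('Content-Disposition: form-data; name="options"',
                 json.dumps(options).encode())
    parts.append(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        url,
        data=b"".join(parts),
        method="POST",  # assumption: the API expects POST
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Placeholder values; replace them with IMAGE_OCR_URL and a real access token.
request = build_ocr_request("https://example.invalid/ocr", "my-token",
                            "scan.png", b"fake image bytes", {"lang": "en"})
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then json.load() on the response.
```

Building the body by hand keeps the sketch dependency-free; a library such as requests would handle the multipart encoding itself.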

You should receive a response that includes the text found:

"predictions": [
    "This is a simple image with some text\nYou can try it with the SAP Leonardo Machine Learning Foundation OCR API.\n\nAs you see I'm using different colors on a dark background.\n\f"
]
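
To work with such a response in code, the recognized text can be pulled out of the predictions array. The sketch below uses a shortened, made-up response string with the same shape as the excerpt above; a real response may carry additional fields, and the helper `extract_lines` is hypothetical.

```python
import json

# Made-up sample with the same shape as the excerpt: a "predictions" array of strings.
raw_response = r'{"predictions": ["First line\nSecond line\n\nThird line\n\f"]}'


def extract_lines(response_text):
    """Return the non-empty text lines of every prediction in the response."""
    payload = json.loads(response_text)
    lines = []
    for prediction in payload.get("predictions", []):
        # Drop form feed characters, then split the text on line breaks.
        for line in prediction.replace("\f", "").splitlines():
            if line.strip():
                lines.append(line.strip())
    return lines


print(extract_lines(raw_response))
```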

You can also try with the following PDF file link.

Step 3: Validate your results

Provide an answer to the question below then click on Validate.

Paste the full response returned by the request.
