Create Schema for Purchase Order Documents

Intermediate

20 min.

Machine Learning, SAP AI Services, Cloud, Document Information Extraction, Intermediate, Tutorial, SAP Business Technology Platform, Free Tier, Artificial Intelligence

Create a schema for your purchase order documents to extract information from similar documents using the Document Information Extraction service.

You will learn

How to create a schema for purchase order documents
How to add standard and custom data fields for the header and line item information of purchase order documents

Juliana Morais, July 16, 2024

Created by

October 18, 2021

Contributors

The core functionality of Document Information Extraction is to automatically extract structured information from documents using machine learning. The service supports extraction from the following standard document types out of the box: invoices, payment advices, and purchase orders. You can customize the information extracted from these document types by creating a schema and adding the specific information that you have in your documents. Additionally, you can add completely new document types.

If you are new to the Document Information Extraction UI, first try out the tutorial: Use Machine Learning to Extract Information from Documents with Document Information Extraction UI.

Step 1
Open the Document Information Extraction UI, as described in the tutorial: Use Trial to Set Up Account for Document Information Extraction and Go to Application or Use Free Tier to Set Up Account for Document Information Extraction and Go to Application.

If you HAVE NOT just used the Set up account for Document Information Extraction booster to create a service instance for Document Information Extraction and subscribe to the Document Information Extraction UI, observe the following:

To access the Schema Configuration and Template features, ensure that you use the blocks_of_100 plan to create the service instance for Document Information Extraction Trial.

Also make sure you’re assigned to the role collection Document_Information_Extraction_UI_Templates_Admin_trial (or Document_Information_Extraction_UI_Templates_Admin if you’re using the free tier option). For more details about how to assign role collections, see step 2 in the tutorial: Use Trial to Subscribe to Document Information Extraction Trial UI, or step 3 in the tutorial: Use Free Tier to Subscribe to Document Information Extraction UI.

After assigning new role collections, Log Off from the UI application to see all features you’re now entitled to try out.

In the left navigation pane, click Schema Configuration.

Here, you find the SAP schemas. The Document Information Extraction UI provides preconfigured SAP schemas for the following standard document types:

Purchase order

Payment advice

Invoice

In addition, there’s an SAP schema for custom documents (SAP_OCROnly_schema). You can use SAP schemas unchanged to upload documents.

NOTE: You can’t edit or delete original SAP schemas. Always create a copy and then edit the default fields, as required.

CAUTION: When using the free tier option for Document Information Extraction or a trial account, be aware of the technical limits listed in Free Tier Option and Trial Account Technical Constraints.
Step 2
To create your own schema, click Create.

In the dialog that appears, enter a name for your schema, for example, purchase_order_schema. Note that the name cannot include blanks. Then select Purchase Order as your Document Type.

Click Create to create the schema.

Now, your schema shows up in the list. Access the schema by clicking on the row.
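The naming rule from this step (no blanks in the schema name) can be checked before you type the name into the dialog. The following is a minimal sketch only: the function name is hypothetical, and the UI may enforce further rules that this check does not cover.

```python
import re


def is_valid_schema_name(name: str) -> bool:
    """Return True if the name is non-empty and contains no whitespace.

    Hypothetical helper: it only mirrors the "no blanks" rule stated in
    the tutorial, not any other constraint the UI might apply.
    """
    return bool(name) and re.search(r"\s", name) is None


print(is_valid_schema_name("purchase_order_schema"))  # True
print(is_valid_schema_name("purchase order schema"))  # False
```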
Step 3
A schema contains a list of header fields and line item fields representing the target information that you want to extract from a particular type of document. You must select a schema when you add documents to the Document Information Extraction UI.

You can either create your own schema from scratch or use a preconfigured SAP schema. If you don’t want to configure your own schema, you can select the appropriate SAP schema unedited when you add a document on the Document Information Extraction UI. No configuration is needed when you use SAP schemas in this way. Alternatively, you can copy a suitable SAP schema and edit the default fields in line with your needs.

Document Information Extraction already includes a number of fields that it can extract. The service documentation lists the supported header fields and the supported line item fields. Additionally, you can define custom fields. In the next step, you’ll learn about both.

The image below shows an example of a purchase order. All the fields that you define in your schema in this tutorial are highlighted. The header fields represent all information outside the table that occurs once. The line item fields represent all information within the table that occurs for each product. You can, of course, extend or reduce the information that you want to extract.

Step 4

To define your first header field, click Add to the right of the heading Header Fields.

For each field, you have to enter a name, a data type, a setup type, and optionally a default extractor and a description. The available data types are string, number, date, discount, currency, and country/region.

The available setup types are auto and manual. The setup type auto supports extraction using the service’s machine learning models. You must specify a default extractor (standard fields supported by Document Information Extraction) for this setup type. It can only be used in schemas created for standard document types. The setup type manual supports extraction using a template. It’s available in schemas created for standard or custom document types.

If you’d like to find out more about setup types and how they relate to document types, extraction methods, and default extractors, see Setup Types.
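The setup type rules described above can be written down as a small check. This is a sketch under stated assumptions, not the service's own validation: the function name, the document type labels, and the wording of the messages are hypothetical and only restate the rules from this step.

```python
from typing import List, Optional

# Labels chosen for this sketch; the service may use other identifiers.
STANDARD_DOCUMENT_TYPES = {"purchase order", "payment advice", "invoice"}
SETUP_TYPES = {"auto", "manual"}


def check_field_setup(
    setup_type: str,
    default_extractor: Optional[str],
    document_type: str,
) -> List[str]:
    """Return a list of rule violations (empty if the setup looks fine)."""
    problems = []
    if setup_type not in SETUP_TYPES:
        problems.append("setup type must be auto or manual")
    if setup_type == "auto":
        # auto relies on the machine learning models, so it needs a
        # default extractor and a standard document type.
        if not default_extractor:
            problems.append("auto requires a default extractor")
        if document_type not in STANDARD_DOCUMENT_TYPES:
            problems.append("auto is only available for standard document types")
    return problems


print(check_field_setup("auto", "documentNumber", "purchase order"))
print(check_field_setup("auto", None, "custom"))
print(check_field_setup("manual", None, "custom"))
```

The first and third calls return an empty list. The second returns two messages, because an auto field without a default extractor in a custom document type breaks both rules.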

As your first header field, add the purchase order number, which identifies your document.

Enter an appropriate name for your field, purchaseOrderNumber, for example.
Select string for the Data Type. Note that a document number is a string, even though it consists of digits, because it is an identifier with no numeric meaning rather than a quantity you calculate with. In contrast, price is an example of the data type number.
As all business documents have a unique identification, Document Information Extraction already includes a standard field. Select auto for the Setup Type and then select documentNumber for the Default Extractor.
Click Save to add the header field.

The field now displays in your list of header fields, where you again find all the information that you’ve just entered. You can edit or delete the field by clicking the respective icons on the right.
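The reason for choosing string over number for a document number can be seen in a short sketch. All values here are made up for illustration and do not come from a real purchase order.

```python
# A made-up document number with leading zeros.
document_number = "0004501"

# Treating it as a number silently drops the leading zeros,
# so the identifier no longer matches the printed document.
print(int(document_number))  # 4501
print(document_number)       # 0004501

# A price, in contrast, is a quantity you calculate with.
unit_price = 12.5
quantity = 3
print(unit_price * quantity)  # 37.5
```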

You’ve now added your first header field that uses a default extractor from Document Information Extraction. Next, you’ll add your first custom header field.

Click Add again to open the dialog.

Enter an appropriate name for your field, purchaseOrderStatus, for example.
Select string for the Data Type.
As Document Information Extraction offers no equivalent field, select manual for the Setup Type. Click Save to add the field.

You’ve now added your first custom field. Go ahead and add the header fields shown in the table and image below. Pay attention to which fields have a default extractor and which don’t. Feel free to extend or reduce the list of header fields.

Field Name	Data Type	Setup Type	Default Extractor
`purchaseOrderNumber`	string	auto	`documentNumber`
`purchaseOrderStatus`	string	manual	none
`vendor`	string	auto	`senderName`
`vendorSite`	string	auto	`senderAddress`
`shipTo`	string	auto	`shipToAddress`
`orderType`	string	manual	none
`terms`	string	auto	`paymentTerms`
`orderCurrency`	string	auto	`currencyCode`
`entryDate`	date	auto	`documentDate`
`shipDate`	date	auto	`deliveryDate`
`cancelDate`	date	manual	none
`totalCostNet`	number	auto	`netAmount`
`totalCostGross`	number	auto	`grossAmount`
`totalVatAmount`	number	manual	none
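If you want to keep the header field list under version control or review it before typing it into the UI, the table above can be captured as plain data. This is a minimal sketch: the SchemaField class and the helper function are hypothetical and are not part of the Document Information Extraction API. Only the field values are taken from the table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SchemaField:
    """Hypothetical record mirroring one row of the header field table."""
    name: str
    data_type: str
    setup_type: str
    default_extractor: Optional[str] = None


HEADER_FIELDS = [
    SchemaField("purchaseOrderNumber", "string", "auto", "documentNumber"),
    SchemaField("purchaseOrderStatus", "string", "manual"),
    SchemaField("vendor", "string", "auto", "senderName"),
    SchemaField("vendorSite", "string", "auto", "senderAddress"),
    SchemaField("shipTo", "string", "auto", "shipToAddress"),
    SchemaField("orderType", "string", "manual"),
    SchemaField("terms", "string", "auto", "paymentTerms"),
    SchemaField("orderCurrency", "string", "auto", "currencyCode"),
    SchemaField("entryDate", "date", "auto", "documentDate"),
    SchemaField("shipDate", "date", "auto", "deliveryDate"),
    SchemaField("cancelDate", "date", "manual"),
    SchemaField("totalCostNet", "number", "auto", "netAmount"),
    SchemaField("totalCostGross", "number", "auto", "grossAmount"),
    SchemaField("totalVatAmount", "number", "manual"),
]


def fields_missing_extractor(fields: List[SchemaField]) -> List[str]:
    """Names of auto fields that lack the required default extractor."""
    return [
        f.name
        for f in fields
        if f.setup_type == "auto" and f.default_extractor is None
    ]


custom_header_fields = [f.name for f in HEADER_FIELDS if f.setup_type == "manual"]

print(fields_missing_extractor(HEADER_FIELDS))  # []
print(custom_header_fields)
```

The second print lists the four custom fields: purchaseOrderStatus, orderType, cancelDate, and totalVatAmount.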

NOTE: The Document Information Extraction UI also includes a feature that allows you to group schema fields by category. To use this feature, you must first activate it under UI Settings. For simplicity’s sake, we haven’t included the feature in this tutorial. If you’d like to find out more about it, see Schema Field Categories.

Step 5

Next, you need to define the line item fields. As your first line item field, add the SKU (stock keeping unit) that uniquely identifies an article.

Click Add to the right of the heading Line Item Fields.

In the dialog proceed as follows:

Enter an appropriate name for your field, skuNumber, for example.
Select string for the Data Type.
Select manual for the Setup Type and click Save to add the field.

The field now displays in your list of line item fields, where you again find all the information that you’ve just entered.

You’ve now added your first line item field. Go ahead and add the line item fields shown in the table and image below. Pay attention to which fields have a default extractor and which don’t. Feel free to extend or reduce the list of line item fields.

Field Name	Data Type	Setup Type	Default Extractor
`skuNumber`	string	manual	none
`description`	string	auto	`description`
`upcNumber`	string	manual	none
`quantity`	number	auto	`quantity`
`unitPriceNet`	number	manual	none
`unitPriceGross`	number	manual	none
`vatRate`	number	manual	none
`totalCost`	number	manual	none
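Because the setup type manual supports extraction using a template, it can help to see at a glance which line item fields depend on one. The sketch below groups the rows of the table above by setup type. The tuple layout and variable names are choices made for this illustration only.

```python
from collections import defaultdict

# (field name, data type, setup type, default extractor), copied from the table.
LINE_ITEM_FIELDS = [
    ("skuNumber", "string", "manual", None),
    ("description", "string", "auto", "description"),
    ("upcNumber", "string", "manual", None),
    ("quantity", "number", "auto", "quantity"),
    ("unitPriceNet", "number", "manual", None),
    ("unitPriceGross", "number", "manual", None),
    ("vatRate", "number", "manual", None),
    ("totalCost", "number", "manual", None),
]

# Group field names by setup type, keeping the table order.
by_setup_type = defaultdict(list)
for name, _data_type, setup_type, _extractor in LINE_ITEM_FIELDS:
    by_setup_type[setup_type].append(name)

print(by_setup_type["auto"])    # ['description', 'quantity']
print(by_setup_type["manual"])  # the six fields a template has to cover
```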

Step 6
Once you’ve added all header and line item fields, you need to activate the schema so that you can use it to extract information from documents. Right now, the schema has the status DRAFT, indicating that it cannot be used yet.

To activate the schema, click Activate.

Now, the status of your schema changes to ACTIVE. To make changes to your schema, you have to Deactivate it first.
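The status behavior described in this step can be pictured as a tiny state model. This is an illustrative sketch, not the service's implementation: the Schema class is hypothetical, and the assumption that a deactivated schema returns to the status DRAFT is not confirmed by this tutorial.

```python
class Schema:
    """Hypothetical model of the schema status described in this step."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.status = "DRAFT"  # a new schema cannot be used for extraction yet

    def can_edit(self) -> bool:
        # Changes are only possible while the schema is not active.
        return self.status != "ACTIVE"

    def can_extract(self) -> bool:
        # Only an active schema can be used to extract information.
        return self.status == "ACTIVE"

    def activate(self) -> None:
        self.status = "ACTIVE"

    def deactivate(self) -> None:
        # Assumption: deactivating puts the schema back into DRAFT.
        self.status = "DRAFT"


schema = Schema("purchase_order_schema")
print(schema.status, schema.can_edit(), schema.can_extract())

schema.activate()
print(schema.status, schema.can_edit(), schema.can_extract())
```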

Congratulations, you’ve created and activated your own schema for purchase order documents.

In the next tutorial: Create Template for Purchase Order Documents, you’ll create a template that uses your schema, and associate documents with your template to show the Document Information Extraction service where each field is located in the document.

