Use Machine Learning to Extract Information from Documents

October 21, 2020
Created by
October 14, 2020
Get machine learning model predictions for the documents you upload using the Document Information Extraction Trial UI.

You will learn

  • How to use the Document Information Extraction Trial UI to upload new documents
  • How to see and edit the extraction results
  • How to delete documents

Prerequisites

The core functionality of Document Information Extraction is to automatically extract structured information from documents using machine learning. By the end of this tutorial, you will have field value predictions for the documents you upload to the Document Information Extraction Trial UI.


Step 1: Upload documents

Document Information Extraction uses a globally pre-trained machine learning model. The model currently achieves its best accuracy with invoices and payment advices in the languages listed in Supported Languages and Countries. The team is working to support additional document types and languages in the near future.

You can upload any document file in PDF format, or in single-page PNG or JPEG format, that has content in headers and tables, such as an invoice.
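
If you prepare a batch of files before uploading, a local check of the file extensions can catch unsupported formats early. The following is a minimal sketch, not part of the service: the helper names and the extension spellings are assumptions, and the check looks only at file names, not at file content or page count.

```python
from pathlib import Path

# Formats named in this tutorial; the exact extension spellings are an assumption.
ACCEPTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg"}


def is_accepted_format(filename: str) -> bool:
    """Return True if the file extension matches a format the tutorial lists."""
    return Path(filename).suffix.lower() in ACCEPTED_EXTENSIONS


def split_by_format(filenames):
    """Split file names into (accepted, rejected) lists, preserving order."""
    accepted = [name for name in filenames if is_accepted_format(name)]
    rejected = [name for name in filenames if not is_accepted_format(name)]
    return accepted, rejected
```

For example, `split_by_format(["a.pdf", "b.tiff", "c.jpeg"])` separates the two supported files from the TIFF file.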

As an alternative to uploading your own documents to the service, you can use the following sample invoice files (right click on the link, then click Save link as to download the files locally):

  1. Open the Document Information Extraction Trial UI, as described in the previous tutorial: Subscribe to Document Information Extraction Trial UI.

  2. In the top right, click + (Upload a new document).

  3. In the Select Document screen, drop files directly or click + to upload one or more document files.

  4. Select the Document Type. Click Step 2.

  5. In Step 2, select the header fields you want to extract from the documents you uploaded in the previous step. Click Step 3.

  6. In Step 3, select the line items you want to extract from the documents you uploaded in the previous step. Click Review.

  7. Review your selection. Click Edit if you want to change anything. Click Confirm.

    You see the Document Name, Upload Date and Status of the documents you have just uploaded.

    The status changes from PENDING to READY once the selected header fields and line items have been extracted. The extraction results can then be validated and changed if necessary. If the status changes from PENDING to FAILED, the extraction results could not be obtained, and you need to upload the document again.

CAUTION:

Be aware of the following limitation of Document Information Extraction Trial UI trial accounts:

  • Maximum 40 uploaded document pages per week (a document can have more than 1 page)
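
Because the limit counts pages rather than documents, it can help to add up page counts before uploading a batch. The sketch below shows only the arithmetic. The function names are hypothetical, and how the service defines the start and end of a week is not stated here, so you have to track the number of pages already uploaded yourself.

```python
WEEKLY_PAGE_LIMIT = 40  # trial account limit stated in this tutorial


def remaining_pages(pages_uploaded_this_week: int) -> int:
    """Pages still available this week, never below zero."""
    return max(0, WEEKLY_PAGE_LIMIT - pages_uploaded_this_week)


def fits_in_budget(pages_uploaded_this_week: int, document_page_counts) -> bool:
    """True if all listed documents together fit into the remaining pages."""
    return sum(document_page_counts) <= remaining_pages(pages_uploaded_this_week)
```

For example, after 12 uploaded pages, two documents of 10 and 18 pages use exactly the 28 pages that remain, while 10 and 19 pages would exceed the limit.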
Step 2: See and edit extraction results
  1. In the Documents screen, click the document row where you see Document Name, Upload Date and Status.

    You see the page preview of the document file you uploaded.

  2. Click Extraction Results to see the Header Fields and Line Items extraction results.

  3. If corrections are needed and the document status is READY, click Edit to change the Header Fields and Line Items extraction results.

    See an example where the Currency Code is edited:

  4. You can also Add Line Item and Delete Last Line Item.

  5. To assign a value from the document to a field, select the value in the document page preview (one value at a time), then under Assign Field choose the Field name from the dropdown list. Add or change the extraction Value if necessary. Click Apply to add the selected field to the Header Fields or Line Items extraction results.

    See an example where the Tax Amount value is selected in the document page preview and added to the Header Fields extraction results:

  6. Save your changes.

  7. You can also Edit and Confirm the document.

    Status changes from READY to CONFIRMED. This means the extraction results have been confirmed and can no longer be changed.
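
The status values described in this tutorial (PENDING, READY, FAILED, CONFIRMED) can be read as a small state machine in which editing is possible only while a document is READY. The following sketch models that behavior locally to make the rules explicit. It does not call the service: the class and method names are hypothetical, and the field label is taken from the Currency Code example above.

```python
from dataclasses import dataclass, field

# Status transitions described in this tutorial. A FAILED document is not
# retried in place; the tutorial says to upload the document again.
ALLOWED_TRANSITIONS = {
    "PENDING": {"READY", "FAILED"},
    "READY": {"CONFIRMED"},
    "FAILED": set(),
    "CONFIRMED": set(),
}


@dataclass
class TrackedDocument:
    """Hypothetical local model of one uploaded document."""
    name: str
    status: str = "PENDING"
    header_fields: dict = field(default_factory=dict)

    def move_to(self, new_status: str) -> None:
        """Change status, rejecting transitions the tutorial does not describe."""
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(f"cannot go from {self.status} to {new_status}")
        self.status = new_status

    def edit_header_field(self, field_name: str, value: str) -> None:
        """Edit a header field; allowed only while the status is READY."""
        if self.status != "READY":
            raise ValueError(f"cannot edit a document in status {self.status}")
        self.header_fields[field_name] = value
```

In this model, a document that has reached READY accepts `edit_header_field("Currency Code", "EUR")`, and the same call raises a `ValueError` once the document has moved to CONFIRMED.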

Step 3: Delete documents
  1. In the Documents screen, click the document row where you see Document Name, Upload Date and Status.

    You see the page preview of the document file you uploaded.

  2. Click Delete and then click OK to delete the document you selected.

    The document is then removed from the Documents list.

Congratulations, you have completed this tutorial.
