How to Recognize Receipts with ABBYY Cloud OCR SDK

Receipt recognition is a specific kind of document processing. The location of data fields is not fixed; it depends on the country where the receipt was printed and on the issuing organization. Because receipts are often printed in small fonts on low-quality paper and are photographed with a mobile phone rather than scanned, special preprocessing is required before OCR. All these conditions make data capture and recognition more complicated.

Using Cloud OCR SDK, you can recognize an image of a receipt and then extract data from the necessary fields, e.g. the total amount, the type of purchase, the payment type, the name of the organization that issued the receipt, etc. With this API, you do not need to know the exact location of the fields: Cloud OCR SDK finds them for you and returns the values in XML format.

See Photographing and Scanning Receipts for tips on how you can improve the quality of input images, ensuring more accurate data extraction.


Before you start working with ABBYY Cloud OCR SDK, you need to register on the site. Follow the link for registration.

During registration, the login and password for the Cloud OCR SDK site are sent to your email. You also create an Application ID and an Application Password. This information is required to access the processing server; see Authentication.
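As an illustration of how the Application ID and Application Password might be used, here is a minimal Python sketch that builds an HTTP Basic authorization header from the two values. It assumes the processing server accepts HTTP Basic authentication. The function name basic_auth_header is hypothetical; consult Authentication for the scheme the service actually requires.

```python
import base64


def basic_auth_header(application_id: str, application_password: str) -> dict:
    """Build an HTTP Basic Authorization header from the two credentials.

    Assumption: the server expects "ApplicationID:ApplicationPassword",
    Base64-encoded, in a standard Basic Authorization header.
    """
    credentials = f"{application_id}:{application_password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

The returned dictionary can be passed as request headers to whichever HTTP client you use.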

After registration, you can use Cloud OCR SDK for receipt recognition.

Recognizing Receipts

To recognize receipts, use the processReceipt method with recognition parameters suitable for your image:

  1. Specify one or more countries where the receipt was printed via the country parameter of the method. Separate multiple country names with commas, for example "taiwan,china".
  2. Specify whether the image is a photograph or a scanned image via the imageSource parameter. This affects the preprocessing operations that can be applied to the image, such as automatic correction of distorted text lines, poor focus, and poor lighting in photos. You can also set the parameter to "auto", in which case the image source is detected automatically.
  3. Specify whether the skew or the orientation of the image should be automatically detected and corrected.
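The three settings above can be combined into a single request URL. The Python sketch below is illustrative only. The country and imageSource parameter names come from this page, while the base URL, the "photo" value, and the correctOrientation and correctSkew parameter names are assumptions that should be checked against the processReceipt method reference.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the real server address is an assumption here.
BASE_URL = "https://cloud.ocrsdk.com/processReceipt"


def build_process_receipt_url(countries, image_source="auto",
                              correct_orientation=True, correct_skew=True):
    """Assemble a processReceipt URL from the recognition parameters."""
    params = {
        # Several countries are joined with commas, e.g. "taiwan,china".
        "country": ",".join(countries),
        # "auto" asks the service to detect the image source itself.
        "imageSource": image_source,
        # Assumed parameter names for orientation and skew correction.
        "correctOrientation": str(correct_orientation).lower(),
        "correctSkew": str(correct_skew).lower(),
    }
    # urlencode percent-encodes the values, including the comma.
    return f"{BASE_URL}?{urlencode(params)}"
```

For example, build_process_receipt_url(["taiwan", "china"], image_source="photo") yields a URL whose query string carries both countries in one country parameter.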

Call the processReceipt method with the specified parameters. A new processing task is created on the server. Monitor the task status in a loop using the getTaskStatus method until the task is processed. For details on the main processing steps, see How to Work with Cloud OCR SDK.
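The status-monitoring loop can be sketched in Python as follows. To keep the example independent of any HTTP client, the call to getTaskStatus is passed in as a function (get_status, a hypothetical helper you would implement yourself). The status names used here ("Completed", "ProcessingFailed", "Deleted", "NotEnoughCredits") are assumptions; check them against the getTaskStatus reference.

```python
import time
from typing import Callable

# Assumed status names: verify against the getTaskStatus documentation.
FINISHED = {"Completed"}
FAILED = {"ProcessingFailed", "Deleted", "NotEnoughCredits"}


def wait_for_task(get_status: Callable[[str], str], task_id: str,
                  interval_seconds: float = 2.0, max_polls: int = 60) -> str:
    """Poll the task status until it finishes, fails, or the poll limit is hit.

    get_status is a caller-supplied function that wraps getTaskStatus and
    returns the current status string for the given task ID.
    """
    for _ in range(max_polls):
        status = get_status(task_id)
        if status in FINISHED:
            return status
        if status in FAILED:
            raise RuntimeError(f"Task {task_id} ended with status {status}")
        # Any other status means the task is still pending: wait and retry.
        time.sleep(interval_seconds)
    raise TimeoutError(f"Task {task_id} did not finish after {max_polls} polls")
```

Bounding the number of polls and pausing between requests avoids both an endless loop and needless load on the server.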

The output XML file has the following format:

<?xml version="1.0" encoding="UTF-8" ?>
<receipts count="1" xmlns="">
 <receipt currency="USD">
  <vendor confidence="73.71695592" isSuspicious="false">
    <text><![CDATA[175 RANCH DR]]></text>
   <phone confidence="100" isSuspicious="false">
   <purchaseType>General Retail</purchaseType>
   <city confidence="20" isSuspicious="true">
   <zip confidence="63" isSuspicious="true">
     <text>CA 95035</text>
   <administrativeRegion confidence="100" isSuspicious="false">
  <total confidence="67" isSuspicious="true">
    <text>PA 93</text>
  <tax total="false" rate="8.75">
    <text>8.750% 2 01</text>
  <payment type="Undefined"  confidence="0" isSuspicious="true">
     <text>PA 93</text>
  <recognizedItems count="3" >
   <item index="1" >
    <name confidence="0"  isSuspicious="true" >
     <text>TOY BRD 4LB</text>
    <total confidence="43"  isSuspicious="true" >
    <recognizedText><![CDATA[0073052151457 TOY BRD 4LB 11.89
F&F Savings 2.10-
(RETURN PRICE 11.89 EA)]]></recognizedText>
    <sku confidence="51"  isSuspicious="true" >
    <amountUnits confidence="0"  isSuspicious="true" >Unknown</amountUnits>

The elements and attributes are described in detail in Output XML with Receipt Data. See also the XSD schema of an XML file.
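To show how the returned values might be read, here is a minimal Python sketch that uses the standard library's xml.etree.ElementTree. The SAMPLE string is a hypothetical, well-formed fragment built from element and attribute names that appear in the listing above; the closing tags and nesting are assumptions, and real output may declare an XML namespace, in which case the lookups would need namespace-qualified names.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed fragment; the nesting is an assumption.
SAMPLE = """<receipts count="1">
  <receipt currency="USD">
    <total confidence="67" isSuspicious="true">
      <text>PA 93</text>
    </total>
    <recognizedItems count="1">
      <item index="1">
        <name confidence="0" isSuspicious="true">
          <text>TOY BRD 4LB</text>
        </name>
      </item>
    </recognizedItems>
  </receipt>
</receipts>"""


def read_field(receipt, tag):
    """Return text, confidence and suspicious flag for one field, or None."""
    node = receipt.find(tag)
    if node is None:
        return None
    return {
        "text": (node.findtext("text") or "").strip(),
        "confidence": float(node.get("confidence", "0")),
        "suspicious": node.get("isSuspicious") == "true",
    }


def summarize_receipt(xml_text):
    """Extract the currency, the total field and the item names."""
    receipt = ET.fromstring(xml_text).find("receipt")
    items = [
        (item.findtext("name/text") or "").strip()
        for item in receipt.findall("recognizedItems/item")
    ]
    return {
        "currency": receipt.get("currency"),
        "total": read_field(receipt, "total"),
        "items": items,
    }
```

Fields flagged with isSuspicious="true" or carrying a low confidence value, such as the total in this fragment, are good candidates for manual review before the data is used.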

See a sample implementation of this procedure in C#.