How to recognize barcodes
You may need to read barcodes in order to separate documents scanned with high-speed production scanners, to classify and index documents, or to extract barcode values from individual fields of a document. Whatever your goal, you can use Cloud OCR SDK to build an application that reads barcodes. There are two ways to do it.
The first way is usually suitable for document classification, indexing, or separation, that is, for situations when you need to extract barcode values but do not know exactly where in the document the barcodes are located. In this case, you can read barcodes using the processImage or processDocument method with the barcodeRecognition profile as a parameter.
The second way is designed for extracting barcode values from certain fields of a document, or for recognizing images that contain only barcodes. In this case, you should use the processBarcodeField method. This method allows you to specify the exact barcode region in the document and the type of barcode.
Let's consider each of these ways in detail.
You should be a registered user of ABBYY Cloud OCR SDK. If you are not registered yet, follow the link to register.
During registration you create a login and password for the Cloud OCR SDK site, and an Application ID and Application Password for each application that will use Cloud OCR SDK. The Application ID and Application Password must be passed to the server with each request. See details in Authentication.
After you have registered, you can use the Cloud OCR SDK Web API to read barcodes.
Extracting barcodes from documents
Use the following parameters of the processImage or processDocument method:
- Set the value of the profile parameter to barcodeRecognition. In this case only barcodes are extracted during document processing. No other information is read, so there is no need to specify the language parameter.
- Select the output file format by setting the exportFormat parameter. The TXT (txtUnstructured) and XML (xml, xmlForCorrectedImage) formats work best here. For example, an XML output file can look like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<document xmlns="http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml" version="1.0" producer="ABBYY FineReader Engine 11" languages="">
  <page width="2496" height="3488" resolution="300" originalCoords="1">
    <block blockType="Barcode">
      <text>
        <par>
          <line baseline="0" l="0" t="0" r="0" b="0">
            <formatting lang="">4566577677</formatting>
          </line>
        </par>
      </text>
    </block>
    <block blockType="Barcode">
      <text>
        <par>
          <line baseline="0" l="0" t="0" r="0" b="0">
            <formatting lang="">457345390028</formatting>
          </line>
        </par>
      </text>
    </block>
  </page>
</document>
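Once an XML result like the one above has been downloaded, the barcode values can be collected with a few lines of code. The following is a minimal Python sketch, not an official client: the function name extract_barcode_values and the shortened SAMPLE document are illustrative assumptions, and the namespace URI is copied from the example output.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the example output; the "fr" prefix is arbitrary.
NS = {"fr": "http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml"}


def extract_barcode_values(xml_data):
    """Return the text of every block with blockType="Barcode" (str or bytes input)."""
    root = ET.fromstring(xml_data)
    values = []
    for block in root.iterfind(".//fr:block[@blockType='Barcode']", NS):
        # A block may hold several formatting elements; join their text.
        parts = [f.text or "" for f in block.iterfind(".//fr:formatting", NS)]
        values.append("".join(parts).strip())
    return values


# Shortened version of the example output, for illustration only.
SAMPLE = (
    '<document xmlns="http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml">'
    '<page><block blockType="Barcode"><text><par><line>'
    '<formatting lang="">4566577677</formatting>'
    '</line></par></text></block>'
    '<block blockType="Barcode"><text><par><line>'
    '<formatting lang="">457345390028</formatting>'
    '</line></par></text></block></page></document>'
)

print(extract_barcode_values(SAMPLE))  # ['4566577677', '457345390028']
```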
Call the selected method with the specified parameters and the necessary authentication information. Monitor the task status using the getTaskStatus method until the task has been processed. The result can then be downloaded using the link provided in the XML response. You can find details on the main processing steps in How to Work with Cloud OCR SDK.
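The request and polling steps can be sketched as follows. This is only an outline under several assumptions: the base URL, the helper names build_request_url and wait_for_task, and the set of status names treated as final are not taken from this article and should be checked against the API reference. No network call is made here; the caller supplies a function that invokes getTaskStatus with the proper authentication and returns the status string.

```python
import time
from urllib.parse import urlencode

# Assumption: service base URL; use the address given for your account.
BASE_URL = "https://cloud.ocrsdk.com"


def build_request_url(method, **params):
    """Return the URL of a Web API method with its query parameters."""
    return f"{BASE_URL}/{method}?{urlencode(params)}"


def wait_for_task(get_status, delay_seconds=2.0, max_attempts=30):
    """Call get_status() repeatedly until it returns a final status.

    get_status is a caller-supplied function that queries getTaskStatus
    and returns the current task status as a string.
    """
    # Assumption: status names that mean the task will not change any more.
    final_statuses = {"Completed", "ProcessingFailed", "Deleted", "NotEnoughCredits"}
    for _ in range(max_attempts):
        status = get_status()
        if status in final_statuses:
            return status
        time.sleep(delay_seconds)
    raise TimeoutError("task did not reach a final status in time")


# Parameters described above: barcode-only profile and XML output.
url = build_request_url("processImage", profile="barcodeRecognition", exportFormat="xml")
print(url)
```

Passing the status getter as a function keeps the polling logic independent of any particular HTTP library.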
Reading barcode fields
To recognize a barcode field, use the processBarcodeField method with recognition parameters suitable for your image:
- Specify the region of the barcode field via the region parameter of the method. The region should contain only the barcode you want to recognize.
Alternatively, you can pass the image of a single barcode to the method. In this case there is no need to specify the region of the field; by default the whole image is recognized. If you crop the image of a barcode from a bigger image, do not decrease the image resolution while cropping.
By default, the type of barcode is detected automatically, but you can specify the barcode type before recognition. If you specify several barcode types, the Cloud OCR system will try to recognize only barcodes of the specified types and ignore all others. See the full list of barcodes that can be recognized by the Cloud OCR system.
For PDF417 and Aztec barcodes only: these barcodes can encode both text and binary data. If the barcode can contain binary data, set the ContainsBinaryData parameter of the method to true. In this case, the binary data encoded in the barcode is saved as a sequence of hexadecimal values for the corresponding bytes. Like any other barcode value, the result is returned in Base64 encoding.
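The parameters described above can be assembled into a query string as in the sketch below. Several details are assumptions rather than facts from this article: the helper name barcode_field_query, the "left,top,right,bottom" format of the region value, the barcodeType parameter name with comma-separated types, and the camelCase spelling containsBinaryData (the text above writes it as ContainsBinaryData). Verify each against the processBarcodeField reference before relying on it.

```python
from urllib.parse import urlencode


def barcode_field_query(region=None, barcode_types=None, contains_binary_data=False):
    """Build a query string for processBarcodeField (hypothetical helper)."""
    params = {}
    if region is not None:
        # Assumption: region is given as "left,top,right,bottom" in pixels.
        left, top, right, bottom = region
        params["region"] = f"{left},{top},{right},{bottom}"
    if barcode_types:
        # Assumption: several types are passed as one comma-separated value.
        params["barcodeType"] = ",".join(barcode_types)
    if contains_binary_data:
        # Only meaningful for PDF417 and Aztec barcodes.
        params["containsBinaryData"] = "true"
    return urlencode(params)


# Omitting region means the whole image is treated as the barcode field.
print(barcode_field_query(region=(0, 0, 557, 272), barcode_types=["pdf417"],
                          contains_binary_data=True))
```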
Call the processBarcodeField method with the specified parameters. Monitor the task status using the getTaskStatus method until the task has been processed. The result is returned in XML format and can be downloaded using the link provided in the XML response. The output XML file has the following format:
<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<document xmlns="@link" xmlns:xsi="@link" xsi:schemaLocation="@link" version="1.0">
  <field left="0" top="0" right="557" bottom="272" type="barcode">
    <value encoding="Base64">
      NgA1ADIAOQAxADQAMQA2ADcANwAwADcAOQA=
    </value>
  </field>
</document>
The value element contains the recognized value of the barcode in Base64 encoding. To read the text, decode the value from Base64 and interpret the resulting bytes as a UTF-16 string.
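The decoding step can be illustrated with the value from the example output. This is a minimal Python sketch; the function name decode_barcode_value is illustrative, and little-endian byte order without a byte order mark is an assumption based on the bytes of the sample value.

```python
import base64


def decode_barcode_value(encoded):
    """Decode a Base64 barcode value into text."""
    # Surrounding whitespace from the XML layout is not part of the value.
    raw = base64.b64decode(encoded.strip())
    # Assumption: UTF-16 little-endian with no byte order mark, as in the sample.
    return raw.decode("utf-16-le")


print(decode_barcode_value(" NgA1ADIAOQAxADQAMQA2ADcANwAwADcAOQA= "))  # 6529141677079
```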
You can find details on the main processing steps in How to Work with Cloud OCR SDK.