Developer's Documentation for free mobile OCR SDK

How to Capture Text from Camera: iOS

The purpose of the Real-Time Recognition SDK is to enable your application to capture information directly from the smartphone camera preview frames, without actually snapping a picture.

Important! With common licenses, your application needs an Internet connection to gather information about the library's current state. If the application has not been able to connect to the Internet for 90 days, the library will not be available until the connection is restored. To remove this limitation, please contact sales.

Adding the library to your Xcode project

  1. Add AbbyyRtrSDK.framework to your Xcode project.
  2. Add the license file to your Xcode project (simply drag and drop it into your project window).
  3. Select your Xcode project in the Target group and go to the Build Phases tab. In the Link Binary With Libraries section, click "+" and also add the libc++.tbd library.
  4. Now you need to add the resource files and set up the copying rules. There are three types of resources used by the library: dictionaries, patterns, and translation dictionaries. (See Minimize Your Memory Footprint for a short description of the necessary resources.) For each type of resource:
    1. Go to Build Phases and add a new Copy Files phase.
    2. In the Destination field, specify Resources.
    3. In the Subpath field, specify Dictionaries (or Patterns, or Translation, respectively).
    4. Add the dictionary files (or patterns, or translation dictionaries) for the languages you need.
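
If recognition fails at run time, a quick check that the Copy Files phases actually placed the resources in the bundle can save some guesswork. The following is a minimal sketch, not part of the SDK: the helper name missingResourceFolders is hypothetical, and it only assumes the three subpaths named in the steps above.

```swift
import Foundation

/// Subfolder names taken from the Copy Files steps above.
let expectedResourceFolders = ["Dictionaries", "Patterns", "Translation"]

/// Hypothetical helper (not an SDK call): returns the expected subfolders
/// that are missing or empty under `root`.
/// In an app, `root` would typically be `Bundle.main.resourceURL`.
func missingResourceFolders(in root: URL,
                            expected: [String] = expectedResourceFolders) -> [String] {
    let fileManager = FileManager.default
    return expected.filter { name in
        let folder = root.appendingPathComponent(name, isDirectory: true)
        // contentsOfDirectory throws if the folder does not exist;
        // treat that the same way as an empty folder.
        let contents = (try? fileManager.contentsOfDirectory(atPath: folder.path)) ?? []
        return contents.isEmpty
    }
}
```

If your application does not use translation, pass only the folders you rely on, for example `missingResourceFolders(in: root, expected: ["Dictionaries", "Patterns"])`.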

Steps to capture text

This section walks you through a simple text capture scenario, in which the user points the camera at the text which should be recognized.

To implement this scenario, perform the following steps:

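  1. Implement a delegate conforming to the RTRRecognitionServiceDelegate protocol which will be used to pass the data to and from the recognition service. What its methods should do is covered in step 6 and in the description of classes and methods further in this section.
  2. Create the RTREngine object with the help of the sharedEngineWithLicenseData: method. The method takes the contents of the license file as a data object rather than a file path, so read the license file you added to the project into an NSData object first.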
  3. Use the createTextCaptureServiceWithDelegate: method of the RTREngine object to create a background recognition service. Only one instance of the service per application is necessary: multiple threads will be started internally.
  4. Set up the processing parameters, according to the kind of text you expect to capture.
    The default text language is English; if you need other languages, specify them by calling the setRecognitionLanguages: method.
    We also recommend calling the setAreaOfInterest: method to specify the rectangular area on the frame where the text is likely to be found. For example, your application may show a highlighted rectangle in the UI into which the end user will try to fit the text they are capturing. The best result is achieved when the area of interest does not touch the boundaries of the frame but has a margin of at least half the size of a typical printed character.
  5. Initialize the camera and start receiving notifications from an AVCaptureVideoDataOutputSampleBufferDelegate object; when it provides a video frame via the captureOutput:didOutputSampleBuffer:fromConnection: method, pass it on to the recognition service by calling the addSampleBuffer: method of the RTRRecognitionService protocol.
    We recommend using the AVCaptureSessionPreset1280x720 preset for the camera settings.
  6. Process the messages sent by the service to the RTRRecognitionServiceDelegate delegate object.
    The result will be delivered via the onBufferProcessedWithTextLines:resultStatus: method. It also reports the result stability status, which indicates whether the result is available and whether it is likely to be improved by adding further frames (see the resultStatus parameter). Use it to determine whether processing should be stopped and the result displayed to the user. Do not use the result until the stability level reaches at least RTRResultStabilityAvailable.
    The result consists of one or more text lines represented by objects of the RTRTextLine class. Each RTRTextLine contains information about the enclosing quadrangle of a single line of text and the recognized text as a string.
    Process the results as your application requires.
  7. When pausing or quitting the application, call the stopTasks method to clean up the resources and return the service to its default state. The processing threads will be started again on the next call to the addSampleBuffer: method.
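
The margin recommendation in step 4 comes down to a little rectangle arithmetic. The sketch below is one possible way to do it, not the SDK's own logic. It assumes that the rectangle is expressed in frame pixels with the origin at the top-left corner, and that you can estimate the height of a typical character in the frame. The function name clampedAreaOfInterest and its parameters are hypothetical.

```swift
import Foundation

/// Hypothetical helper (not an SDK call): shrinks `proposed` so that it keeps
/// a margin of half a typical character height from every edge of the frame.
/// Returns `.zero` if nothing is left after applying the margin.
func clampedAreaOfInterest(_ proposed: CGRect,
                           frameSize: CGSize,
                           characterHeight: CGFloat) -> CGRect {
    // "At least half the size of a typical printed character".
    let margin = characterHeight / 2

    // Pull each edge inside the frame by at least `margin`.
    let minX = max(proposed.minX, margin)
    let minY = max(proposed.minY, margin)
    let maxX = min(proposed.maxX, frameSize.width - margin)
    let maxY = min(proposed.maxY, frameSize.height - margin)

    guard maxX > minX, maxY > minY else { return .zero }
    return CGRect(x: minX, y: minY, width: maxX - minX, height: maxY - minY)
}
```

For a 1280x720 frame and characters about 40 pixels tall, a full-width band starting at y = 200 with height 300 is narrowed to start at x = 20 with width 1240; the vertical extent is unchanged because it is already well inside the frame. The resulting rectangle is what you would then pass to setAreaOfInterest:, after checking the coordinate convention the SDK expects.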

See the description of classes and methods further in this section.
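
The decision described in step 6 (ignore the result, show it provisionally, or stop and show it) can be kept in one small function. The sketch below models it with stand-in types so that it compiles without the SDK. StabilityLevel, ResultAction and the notReady and stable cases are hypothetical; only the "available" threshold comes from the text above. In a real delegate you would map the resultStatus parameter onto something similar, using the values declared in the SDK headers.

```swift
/// Hypothetical stand-in for the SDK's result stability status.
/// Only the "available" threshold is taken from the documentation;
/// the other cases are placeholders for illustration.
enum StabilityLevel: Int, Comparable {
    case notReady = 0
    case available = 1
    case stable = 2

    static func < (lhs: StabilityLevel, rhs: StabilityLevel) -> Bool {
        lhs.rawValue < rhs.rawValue
    }
}

/// What the application does with a result at a given stability level.
enum ResultAction {
    case ignore            // below "available": do not use the text
    case showPreview       // usable, but further frames may improve it
    case stopAndShowFinal  // unlikely to improve: stop feeding frames
}

/// Maps a stability level to an action, following the rule in step 6.
func action(for level: StabilityLevel) -> ResultAction {
    if level < .available { return .ignore }
    if level < .stable { return .showPreview }
    return .stopAndShowFinal
}
```

When the action is stopAndShowFinal, the application would call stopTasks as described in step 7 and present the text lines to the user.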