Developer's Documentation for free mobile OCR SDK

Specifications

Supported Operating Systems

Android

Android 4.4 and later for ARMv7 processors

iOS

iOS 8.x and later

Hardware Requirements

Processor capabilities

  • A multi-core processor (augmented reality scenarios require 4 cores)
  • Advanced SIMD (NEON)

Memory requirements

In the text capture scenario, the library requires:

  • for texts in alphabetic languages — 40 MB RAM
  • for texts in Chinese, Japanese, or Korean languages — 70 MB RAM
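The two figures above can be folded into a simple pre-flight estimate. The following is a minimal sketch: the helper name `estimated_ram_mb` is hypothetical and not part of the SDK, and it assumes that the larger figure applies as soon as any Chinese, Japanese, or Korean language is enabled.

```python
# Hypothetical helper, not part of the SDK: look up the documented RAM figure.
CJK_LANGUAGES = {"Chinese Simplified", "Chinese Traditional", "Japanese", "Korean"}

ALPHABETIC_RAM_MB = 40  # figure quoted above for alphabetic languages
CJK_RAM_MB = 70         # figure quoted above for Chinese, Japanese, or Korean


def estimated_ram_mb(languages):
    """Return the documented RAM figure (in MB) for the given languages.

    Assumption: enabling any CJK language implies the larger figure.
    """
    if any(lang in CJK_LANGUAGES for lang in languages):
        return CJK_RAM_MB
    return ALPHABETIC_RAM_MB
```

An application could compare this estimate with the memory it expects to have available before starting a capture session.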

Camera requirements

  • Autofocus lens
  • HD preview: the generally recommended frame size is 720×1280, but this can vary depending on the scenario and processing speed

Available OCR Languages

dict — has dictionary support

extended — available only in extended version

Afrikaans

Albanian

Basque

Belarusian extended

Breton

Bulgarian dict extended

Catalan

Chechen extended

Chinese Simplified extended

Chinese Traditional extended

Crimean Tatar extended

Croatian

Czech dict

Danish dict

Dutch (Belgium) dict

Dutch (Netherlands) dict

English dict

Estonian dict

Fijian

Finnish dict

French dict

German (old spelling) dict

German (new spelling) dict

Greek dict

Hawaiian

Hungarian

Icelandic

Indonesian dict

Irish

Italian dict

Japanese extended

Kabardian extended

Korean extended

Latin

Latvian

Lithuanian

Macedonian extended

Malay

Maori

Moldavian

Mongol extended

Norwegian (Bokmal) dict

Norwegian (Nynorsk) dict

Ossetic extended

Polish dict

Portuguese (Brazil) dict

Portuguese (Portugal) dict

Provencal

Rhaeto-Romanic

Romanian

Russian dict extended

Samoan

Serbian extended

Slovak

Slovenian

Spanish dict

Swahili

Swedish dict

Tagalog

Tatar extended

Turkish dict

Ukrainian dict extended

Welsh
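Each entry in the list above is a language name optionally followed by the markers "dict" and "extended". As an illustration of how to read the list, here is a minimal sketch that splits an entry into its name and markers. The function name `parse_language_entry` is hypothetical and not part of the SDK.

```python
# Markers defined in the legend above the language list.
KNOWN_FLAGS = {"dict", "extended"}


def parse_language_entry(entry):
    """Split an entry such as 'Bulgarian dict extended' into (name, flags)."""
    words = entry.split()
    flags = set()
    # Peel markers off the end, always leaving at least one word for the name.
    while len(words) > 1 and words[-1] in KNOWN_FLAGS:
        flags.add(words.pop())
    return " ".join(words), flags
```

For example, "German (old spelling) dict" yields the name "German (old spelling)" with the single marker "dict", and "Welsh" yields an empty marker set.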

Translation Dictionaries

The following language pairs are available for translation in the extended version of the distribution:

English - Chinese

English - French

English - German

English - Indonesian

English - Japanese

English - Polish

English - Portuguese (Brazil)

English - Russian

English - Spanish

You can also create your own dictionary and use it for translation. Contact our technical support for advice on the required format.

Regular Expressions

This section describes the regular expression syntax supported by the ABBYY Real-Time Recognition SDK engine for capturing custom data fields (see Capture a Custom Data Field: iOS and Capture a Custom Data Field: Android).

Note: All matches are always greedy (match as much as possible).

Supported syntax

Pattern

Syntax

Examples and comments

Literal

any character or text, except metacharacters \^$.|?*+()[{

pill matches "pill" in "caterpillar"
a matches the first "a" in "caterpillar" but not the second (see the above note)

Metacharacters are part of regular expression syntax; to match these literally, you have to escape them with a backslash. If you want to match 1+1, the correct expression is 1\+1 — otherwise "+" has a special meaning.
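When the text to be matched comes from user input or a template, escaping can be automated. The sketch below backslash-escapes exactly the metacharacters listed above. The helper `escape_literal` is hypothetical, and Python's `re` module is used only as a stand-in to check the result; the SDK's engine is a separate implementation and may behave differently.

```python
import re

# Metacharacters listed in the "Literal" entry above.
METACHARACTERS = set("\\^$.|?*+()[{")


def escape_literal(text):
    """Backslash-escape every listed metacharacter in text (hypothetical helper)."""
    return "".join("\\" + ch if ch in METACHARACTERS else ch for ch in text)


pattern = escape_literal("1+1")  # the escaped form of the example above

# Stand-in check with Python's engine: the escaped pattern matches "1+1",
# while the unescaped one treats "+" as a repetition and does not.
escaped_matches = re.fullmatch(pattern, "1+1") is not None
unescaped_matches = re.fullmatch("1+1", "1+1") is not None
```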

Any symbol

. (dot)

s.t matches "sat" and "sit" but not "seat"

Character set

[]

matches a single character which may be any character from the set: gr[ae]y matches both "gray" and "grey" but not "graey"

Character range in a set

- (minus)

[0-9] matches a single digit
concatenation is allowed: [a-zA-Z0-9] matches an alphanumeric character

Negated character set

[^]

[^0-9] matches anything that is not a digit
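The set, range, and negation examples above can be tried out in any engine that shares this syntax. The following sketch uses Python's `re` module purely as a stand-in, on the assumption that these basic constructs behave the same way there; it is not a test of the SDK itself.

```python
import re


def matches_whole(pattern, text):
    """True if the pattern matches the entire text (Python stand-in)."""
    return re.fullmatch(pattern, text) is not None


# gr[ae]y accepts exactly one character from the set between "gr" and "y".
set_results = {w: matches_whole(r"gr[ae]y", w) for w in ("gray", "grey", "graey")}

sample = "A1B22C"
digits_removed = re.sub(r"[0-9]", "", sample)   # drop every digit
digits_only = re.sub(r"[^0-9]", "", sample)     # drop everything that is not a digit
```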

Shorthand character sets

\s — any whitespace
\S — anything that is not a whitespace
\d — any digit
\D — anything that is not a digit
\w — a word character, which includes alphanumerics and punctuation marks
\W — a non-word character
\R — a new line character or the CR LF sequence
\v — a new line character but not the CR LF sequence
\V — a non-new line character
\h — a horizontal white space character
\H — anything except horizontal white space

Non-printable characters

\n — line feed LF
\r — carriage return CR
\t — tab character
\f — form feed
\a — bell character \u0007
\e — escape character

Unicode character

\uFFFF
\x{FFFF}

\u20AC or \x{20AC} matches the euro currency sign.

Character by its hexadecimal index

\xFF

\xA9 matches the copyright symbol in the Latin-1 character set
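Not every regular expression engine accepts both Unicode notations, which matters when a pattern is prototyped outside the SDK. For instance, Python's `re` module understands `\uFFFF` and `\xFF` but not the braced `\x{FFFF}` form. The sketch below rewrites the braced form so that a pattern can be checked in Python. The helper `braces_to_u_escape` is hypothetical and handles only four-digit code points.

```python
import re


def braces_to_u_escape(pattern):
    r"""Rewrite \x{HHHH} as \uHHHH (four hex digits only; hypothetical helper)."""
    return re.sub(r"\\x\{([0-9A-Fa-f]{4})\}", r"\\u\1", pattern)


converted = braces_to_u_escape(r"\x{20AC}")

# Stand-in checks with Python's engine.
euro_matches = re.fullmatch(converted, "\u20ac") is not None      # euro sign
copyright_matches = re.fullmatch(r"\xA9", "\u00a9") is not None   # copyright sign
```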

Alternation

|

abc|123 matches either "abc" or "123"
|word matches either an empty string "" or "word"

Repetitions

+ — matches once or more times
* — matches zero or more times
? — matches zero times or once (optional match)
{n} — matches exactly n times
{n,m} — matches n to m times
{n,} — matches n or more times
{,m} — matches from zero to m times

colou?r matches "color" and "colour"
[a-zA-Z0-9]{2,4} matches an alphanumeric code that is 2 to 4 characters long

Note that all repetitions are greedy (they prefer to match as much as possible): c.*r matches the whole of "caterpillar" rather than stopping at "cater". In such cases a negated character set works better: c[^p]* matches "cater".
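The greedy behaviour described above is easy to reproduce. This sketch uses Python's `re` module as a stand-in, since its default quantifiers are also greedy; the SDK's engine may differ in other respects.

```python
import re

text = "caterpillar"

# Greedy: .* runs to the last "r" it can reach, so the whole word is matched.
greedy = re.search(r"c.*r", text).group()

# The negated set stops just before the first "p".
negated = re.search(r"c[^p]*", text).group()

# The optional "u" allows both spellings, but only one "u".
spellings = [w for w in ("color", "colour", "colouur") if re.fullmatch(r"colou?r", w)]

# {2,4} accepts 2 to 4 alphanumeric characters.
code_lengths = {s: re.fullmatch(r"[a-zA-Z0-9]{2,4}", s) is not None
                for s in ("A", "A1", "A1b2", "A1b2c")}
```

Here `greedy` is "caterpillar" and `negated` is "cater", matching the note above.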

Grouping

()

(word)+ matches "word", "wordword" and so on
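Putting the supported constructs together, a custom data field pattern typically combines grouping, alternation, literals, and repetitions. The sketch below assumes a made-up field format (the prefix "INV" or "ORD", a hyphen, then four digits) and uses Python's `re` module as a stand-in for prototyping. The names `FIELD_PATTERN` and `find_fields` are hypothetical and not part of the SDK.

```python
import re

# Hypothetical field format: "INV" or "ORD", a hyphen, then exactly four digits.
# Only constructs from the supported syntax list are used.
FIELD_PATTERN = r"(INV|ORD)-[0-9]{4}"


def find_fields(text):
    """Return every substring of text that matches FIELD_PATTERN."""
    return [m.group() for m in re.finditer(FIELD_PATTERN, text)]


found = find_fields("Ref INV-2041, see ORD-0007 and ORD-12")

# A repeated group, as in the (word)+ example above.
repeated_word = re.fullmatch(r"(word)+", "wordword") is not None
```

In this run "ORD-12" is skipped because it has fewer than four digits.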

Unsupported syntax

The following regular expression syntax features are not yet supported in ABBYY Real-Time Recognition SDK:

  • Anchors: ^ (beginning of a line), $ (end of a line), \b (word boundary) and its negation \B, and others.
  • Lazy quantifiers such as +? or {n,m}? that prefer to match as few times as possible.
  • Concatenation with nested character sets such as [[a-z][0-9]].
  • Advanced features such as lookarounds, backreferences, possessive matches, named groups, non-capturing and atomic match groups, evaluation flag settings, and others.
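Patterns written for other engines often rely on these features, so it can help to scan a pattern before passing it to the SDK. The following is a rough heuristic sketch, not an SDK facility: the function `find_unsupported` is hypothetical, it covers only the constructs named in the list above, and it may misjudge unusual patterns (for example, an unescaped literal "}" followed by "?").

```python
def find_unsupported(pattern):
    """Return (index, description) pairs for constructs listed as unsupported."""
    issues = []
    in_set = False            # currently inside [...]
    after_quantifier = False  # previous token was an unescaped quantifier
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if not in_set and nxt in "bB":
                issues.append((i, "word boundary anchor"))
            elif not in_set and nxt in "123456789":
                issues.append((i, "backreference"))
            after_quantifier = False
            i += 2            # skip the escaped character
            continue
        if in_set:
            if ch == "[":
                issues.append((i, "nested character set"))
            elif ch == "]":
                in_set = False
        elif ch == "[":
            in_set = True     # a following ^ is negation, not an anchor
        elif ch in "^$":
            issues.append((i, "anchor"))
        elif pattern.startswith("(?", i):
            issues.append((i, "special group or lookaround"))
            after_quantifier = False
            i += 2            # skip "(?"
            continue
        elif ch in "?+" and after_quantifier:
            issues.append((i, "lazy or possessive quantifier"))
            after_quantifier = False
            i += 1
            continue
        after_quantifier = (not in_set) and ch in "*+?}"
        i += 1
    return issues
```

An empty result means that none of the listed constructs were detected; it does not guarantee that the SDK accepts the pattern.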

Supported ID documents

Document

Supported in

All documents with Machine Readable Zone (MRZ)

All Countries

Bank cards: embossed and indent

All Countries

Driver’s license

Austria, Belarus, Czech Republic, Finland, Germany, Israel, Italy, Japan, Portugal, Russian Federation, Sweden, Switzerland, Turkey, UK, USA (Alabama, Arizona, California, Florida, Washington DC, Massachusetts, Michigan, New Mexico, Texas)

International Passport

Austria, China, Germany, Italy, Japan, Philippines, Russian Federation, Syria, UK, USA

National ID card

Austria, Bahrain, Belgium, Bulgaria, Czech Republic, Estonia, Germany, Kazakhstan, Kuwait, Kyrgyzstan, Latvia, Malaysia, Poland, Portugal, Romania, Singapore, Spain, Switzerland, Turkey, UAE

National passport

Belarus, Russian Federation

Personal insurance policy number

Russian Federation

Vehicle Registration Certificate (STS)

Russian Federation

Visa

Russian Federation

Health insurance card

Japan

Work permit

Singapore

Residence Permit

Spain

The countries and documents mentioned are supported in technical preview mode. We are working to improve our technologies.