Document Reader API


Fields

Field name        Comment
----------        -------
firstName
lastName
documentNumber
issueDate         Always returned in format yyyy-MM-dd
expireDate        Always returned in format yyyy-MM-dd
birthDate         Always returned in format yyyy-MM-dd
personalCode
address           Usually occurs in IdentityCard type documents
nationality       Always an iso3 letter code
country           Always an iso3 letter code
state             Always an iso2 letter code
sex               Value is returned as either F, M or X
category          Only occurs in DriverLicense type documents
street            Only occurs in ProofOfAddress type documents
city              Only occurs in ProofOfAddress type documents
parish            Only occurs in ProofOfAddress type documents

  • Each of the fields above is either a string type or a date type.

  • birthDate, issueDate, and expireDate are date types. All the rest are string types.

  • Additionally, each field carries information about the field value and its validity.


"firstName": { "value": "Jessica", "isValid": true }
  • value - actual value of the extracted field.

  • isValid - always true or false, based on internal AI model validation.
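
As a minimal sketch of how a client might consume this structure, the Python snippet below parses one field object into a small dataclass and converts the three date fields using the documented yyyy-MM-dd format. The names ExtractedField and parse_field are illustrative assumptions and not part of the API, and the birthDate value is made-up sample data.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional, Union

# Field names documented as date types; everything else is a string.
DATE_FIELDS = {"issueDate", "expireDate", "birthDate"}


@dataclass
class ExtractedField:
    """Hypothetical client-side view of one field object."""
    name: str
    value: Optional[Union[str, date]]
    is_valid: bool


def parse_field(name: str, raw: dict) -> ExtractedField:
    """Convert one {"value": ..., "isValid": ...} object into an ExtractedField."""
    value = raw.get("value")  # may be missing when extraction failed
    if value is not None and name in DATE_FIELDS:
        # Dates are documented as always using yyyy-MM-dd.
        value = datetime.strptime(value, "%Y-%m-%d").date()
    return ExtractedField(name=name, value=value, is_valid=bool(raw["isValid"]))


first_name = parse_field("firstName", {"value": "Jessica", "isValid": True})
birth_date = parse_field("birthDate", {"value": "1990-05-17", "isValid": True})
```

Keeping the date conversion in one place means the rest of the client code can work with date objects and never needs to know the wire format.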


Validation aspects of the isValid value

When the same field (for example firstName) is present both in the OCR zone and in the MRZ zone, the two values must match. Otherwise, the isValid property will be false.

  • MRZ checksum validation also applies here. If the MRZ and OCR values match but the field's checksum validation fails, the field's isValid property will be false.
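
The service's internal validation is not documented here, but the MRZ check digit itself is standardized in ICAO Doc 9303, so the combined rule can be sketched in Python as follows. The function names are illustrative assumptions. The sample values are the specimen document number and birth date from the ICAO specification.

```python
def mrz_check_digit(data: str) -> int:
    """Compute the ICAO 9303 check digit for an MRZ field.

    Digits keep their value, A-Z map to 10-35, the filler '<' counts as 0.
    Characters are weighted 7, 3, 1 repeating, and the sum is taken modulo 10.
    """
    weights = (7, 3, 1)
    total = 0
    for index, ch in enumerate(data):
        if "0" <= ch <= "9":
            value = ord(ch) - ord("0")
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10
        elif ch == "<":
            value = 0
        else:
            raise ValueError(f"character not allowed in MRZ: {ch!r}")
        total += value * weights[index % 3]
    return total % 10


def mrz_field_consistent(ocr_value: str, mrz_value: str, check_digit: str) -> bool:
    """Hypothetical rule: OCR and MRZ must match AND the check digit must hold."""
    return ocr_value == mrz_value and mrz_check_digit(mrz_value) == int(check_digit)


print(mrz_check_digit("L898902C3"))  # 6
print(mrz_check_digit("740812"))     # 2
```

Note that a direct string comparison only works for fields written identically in both zones, such as a document number. Dates, for instance, appear as YYMMDD in the MRZ and would need normalizing first.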

Some fields, such as documentNumber, have a predefined pattern for each country. For example, if the AI identifies the country as LTU and the document type as DriverLicense, the documentNumber is expected to consist of 8 digits. If this criterion is not met, isValid will be false.
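
A pattern check of this kind can be sketched as a lookup table of regular expressions keyed by country and document type. The table and function names below are hypothetical. Only the LTU DriverLicense rule comes from the text above, and the patterns the service actually uses are not published here.

```python
import re
from typing import Optional

# Hypothetical pattern table keyed by (country, document type).
# Only the LTU DriverLicense entry is taken from the documentation.
DOCUMENT_NUMBER_PATTERNS = {
    ("LTU", "DriverLicense"): re.compile(r"[0-9]{8}"),
}


def document_number_matches(country: str, document_type: str,
                            number: str) -> Optional[bool]:
    """Return True/False when a pattern is known, None when no rule exists."""
    pattern = DOCUMENT_NUMBER_PATTERNS.get((country, document_type))
    if pattern is None:
        return None
    # fullmatch ensures the whole value fits the pattern, not just a prefix.
    return pattern.fullmatch(number) is not None


print(document_number_matches("LTU", "DriverLicense", "12345678"))  # True
print(document_number_matches("LTU", "DriverLicense", "1234567"))   # False
```

Returning None for unknown combinations keeps "no rule defined" distinct from "rule violated".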

If any field from the front side is duplicated on the back side, the two values must match as well. Otherwise, isValid will be false.

If only part of the document was uploaded, the fields that the AI expected but did not find are invalidated (marked as isValid = false).

  • For example, some IdentityCards have the MRZ and issueDate on the back side. If a user uploads only the front side, every field expected on the back side will have isValid = false.

If a field is expected but its information could not be extracted (because of bad image quality, a covered field, or the algorithm failing to find it), the field will have isValid = false and no value will be returned (that is, value is null).


"firstName": {
  "isValid": false // note that the `value` property is not returned
}

Fields that are not supposed to be in the given document are not returned at all. For example, Passport documents do not have a category, so this field will not be present in the response (that is, it is null).
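
The cases described above (field absent, field expected but not extracted, field extracted but invalid, field valid) can be told apart on the client side with a small helper. This is a sketch only. It assumes the extracted fields arrive as a dictionary keyed by field name, which this page does not state explicitly, and the status labels and sample data are invented for illustration.

```python
def field_status(fields: dict, name: str) -> str:
    """Classify one field of a (hypothetical) parsed response dictionary."""
    raw = fields.get(name)
    if raw is None:
        # Field does not apply to this document type and was not returned.
        return "absent"
    if raw.get("value") is None:
        # Field was expected, but no value could be extracted.
        return "not_extracted"
    return "valid" if raw.get("isValid") else "invalid"


# Made-up sample data in the documented field shape.
sample = {
    "firstName": {"isValid": False},
    "lastName": {"value": "Doe", "isValid": True},
    "documentNumber": {"value": "1234567", "isValid": False},
}

print(field_status(sample, "firstName"))       # not_extracted
print(field_status(sample, "lastName"))        # valid
print(field_status(sample, "documentNumber"))  # invalid
print(field_status(sample, "category"))        # absent
```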


Available document types

  • IdentityCard

  • Passport

  • ResidencePermit

  • DriverLicense

  • InternalPassport

  • SocialId

  • ProofOfAddress

  • Unknown

  • Other


Supported image types

  • png

  • jp(e)g

  • bmp
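
Before sending a request, a client can check locally that a base64 string decodes to one of the supported formats by inspecting the file signature at the start of the decoded bytes. The sketch below is a client-side pre-check under that assumption, not the service's own validation. The function name is hypothetical, and data-URL prefixes such as "data:image/png;base64," are not handled.

```python
import base64
import binascii
from typing import Optional

# Leading bytes ("magic numbers") of the supported image formats.
_SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"BM", "bmp"),
)


def detect_image_type(image_b64: str) -> Optional[str]:
    """Return 'png', 'jpeg' or 'bmp', or None if the input is not usable."""
    try:
        # validate=True rejects characters outside the base64 alphabet.
        data = base64.b64decode(image_b64, validate=True)
    except (binascii.Error, ValueError):
        return None
    for signature, name in _SIGNATURES:
        if data.startswith(signature):
            return name
    return None
```

A matching signature only shows that the payload claims to be a supported format. It does not guarantee that the image decodes correctly.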


Possible issues during data extraction (returned as bad requests)

  • No document in given image(s) - returned if AI failed to find a document in at least one of the given images.

  • Invalid base64 image - returned if one or both given base64 image strings fail to resolve to a valid image type.

  • Document unknown - returned if the AI found a document but could not identify it. This is the same as the Unknown document type.

  • Same sides - returned only when 2 sides of the document were given and both were identified as the same side (both as front or both as back).

  • Passport multiple images - returned only when 2 images were given and at least one image was classified as a Passport. For passport parsing, users should only upload one side since the cover image contains no information.

  • Generic parsing failure - returned in all other cases when something goes wrong during processing. Additionally, an error ID is returned to the user.
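
A client may want to turn these issues into guidance for the end user. The sketch below models the six issues as a Python enum and attaches a suggested next step to each. The enum values reuse the issue titles from the list above. How the API actually encodes the issue in the bad-request response is not specified here, so the mapping from a response to an enum member is left out, and the hint texts are suggestions rather than official messages.

```python
from enum import Enum


class ExtractionIssue(Enum):
    """Issue titles as listed in this document (wire format not specified)."""
    NO_DOCUMENT = "No document in given image(s)"
    INVALID_BASE64_IMAGE = "Invalid base64 image"
    DOCUMENT_UNKNOWN = "Document unknown"
    SAME_SIDES = "Same sides"
    PASSPORT_MULTIPLE_IMAGES = "Passport multiple images"
    GENERIC_PARSING_FAILURE = "Generic parsing failure"


# Hypothetical user-facing hints derived from the descriptions above.
USER_HINTS = {
    ExtractionIssue.NO_DOCUMENT: "Make sure every image clearly shows the document.",
    ExtractionIssue.INVALID_BASE64_IMAGE: "Upload a valid png, jp(e)g or bmp image.",
    ExtractionIssue.DOCUMENT_UNKNOWN: "The document could not be identified; try another image.",
    ExtractionIssue.SAME_SIDES: "Upload one front image and one back image.",
    ExtractionIssue.PASSPORT_MULTIPLE_IMAGES: "Upload a single passport image; the cover contains no information.",
    ExtractionIssue.GENERIC_PARSING_FAILURE: "Retry, and quote the returned error ID if the problem persists.",
}


def hint_for(issue: ExtractionIssue) -> str:
    """Look up the suggested next step for a given issue."""
    return USER_HINTS[issue]
```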