This post is an introduction to optical character recognition.
OCR Methods
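The Mahalanobis distance measures how far a multidimensional vector lies from a distribution (or from another vector), scaled by the covariance of the data. In OCR it can be used to compare the feature vector of a glyph against the feature distribution of each candidate character class.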
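As a rough illustration of the idea, here is a minimal sketch in plain Python, restricted to two-dimensional feature vectors so the covariance matrix can be inverted by hand. The feature names (aspect ratio, ink density), the sample values, and the helper names `mean_and_covariance`, `mahalanobis` and `classify` are all made up for this example; a real OCR system would use many more features and a linear algebra library.

```python
import math

def mean_and_covariance(samples):
    """Return the mean and the 2x2 sample covariance matrix of 2-D points."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def mahalanobis(point, mean, cov):
    """Compute sqrt((p - m)^T * inverse(cov) * (p - m)) for 2-D inputs."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    squared = (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det
    return math.sqrt(squared)

# Hypothetical toy features per glyph: (aspect ratio, ink density).
TRAINING = {
    "I": [(0.20, 0.30), (0.25, 0.35), (0.22, 0.28), (0.18, 0.33)],
    "O": [(0.90, 0.40), (1.00, 0.45), (0.95, 0.38), (1.05, 0.42)],
}

def classify(features, training=TRAINING):
    """Pick the class whose distribution is nearest in Mahalanobis distance."""
    best_label, best_distance = None, math.inf
    for label, samples in training.items():
        mean, cov = mean_and_covariance(samples)
        distance = mahalanobis(features, mean, cov)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label

if __name__ == "__main__":
    print(classify((0.21, 0.31)))
```

With an identity covariance matrix the result reduces to the ordinary Euclidean distance, which is a quick sanity check for the function.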
OCR Software
This is a non-exhaustive list of OCR (Optical Character Recognition) software:
- tesseract
- pytesseract
- OCRmyPDF
- ABBYY FineReader
- OmniPage
- Readiris
One of the most popular proprietary OCR packages is ABBYY FineReader.
tesseract
tesseract-ocr is a FOSS library.
It is written in C++.
tesseract-ocr official website
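Besides the C++ library, Tesseract ships a command-line program, which is often the quickest way to try it. The following is only a sketch of how it might be driven from Python: it assumes the `tesseract` executable is installed and on the PATH, and the helper names `build_tesseract_command` and `run_tesseract` are invented for this example.

```python
import shutil
import subprocess

def build_tesseract_command(image_path, lang="eng", psm=None):
    """Build the argument list for a `tesseract IMAGE stdout -l LANG` call."""
    # Passing "stdout" as the output base makes tesseract print the text
    # instead of writing it to a file.
    command = ["tesseract", str(image_path), "stdout", "-l", lang]
    if psm is not None:
        # --psm selects the page segmentation mode.
        command += ["--psm", str(psm)]
    return command

def run_tesseract(image_path, lang="eng", psm=None):
    """Run the tesseract executable and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang, psm),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Keeping the command construction separate from the call makes it easy to inspect or log the exact command before running it.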
pytesseract
pytesseract is a FOSS Python library for OCR.
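A minimal usage sketch follows. It assumes that `pytesseract` and Pillow are installed and that the Tesseract executable is available on the system, since pytesseract is a wrapper around it. The function name `ocr_image` and the file name in the usage comment are placeholders for this example.

```python
from pathlib import Path

def ocr_image(image_path, lang="eng"):
    """Return the text that Tesseract recognizes in an image file."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"no such image: {path}")
    # Imported here so that the missing-file check above works even on a
    # machine where the optional dependencies are not installed.
    from PIL import Image
    import pytesseract

    with Image.open(path) as image:
        return pytesseract.image_to_string(image, lang=lang)

# Example call (placeholder file name):
# text = ocr_image("scanned_page.png", lang="eng")
```

The `lang` argument takes a Tesseract language code, and the matching language data has to be installed for Tesseract itself.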
ABBYY FineReader
Developed by ABBYY (Russia).
Proprietary and non-free.
OmniPage
Developed by Nuance (US).
Proprietary and non-free.
Readiris
Developed by I.R.I.S. (Belgium), a company of the Canon group (Japan).
Proprietary and non-free.
http://www.irislink.com/EN-ES/c1682/Readiris-16-for-Windows---OCR-Software.aspx