Mindee, the API-first platform designed for developers to eliminate manual data entry, announced the introduction of docTR, an accessible open-source library for OCR-related tasks powered by deep learning.
Mindee’s docTR makes optical character recognition accessible to the entire developer community. By combining text detection and recognition with object detection, this open-source repository supports a wider range of use cases, including more complex ones. Going beyond textual elements, it provides a holistic view of information encoded in visual form, including QR codes, barcodes, information in ID pictures, and even logos.
Powered by the machine learning framework of the user’s choice, TensorFlow 2 or PyTorch, docTR features training capabilities for text detection in documents and images, as well as text recognition, with pretrained parameters available. Loading a document and extracting its text with a predictor takes about five lines of code, and the library is optimized for end-to-end performance, including inference speed on both CPU and GPU.
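As a rough illustration of the load-and-predict workflow described above, the following minimal sketch uses docTR's `DocumentFile` and `ocr_predictor` entry points. The helper names `run_ocr` and `words_from_export` are hypothetical, introduced only for this example, and the sketch assumes a recent docTR release in which `DocumentFile.from_pdf` returns the page images directly and the exported result is a nested dictionary of pages, blocks, lines, and words.

```python
from typing import Any, Dict, List


def run_ocr(pdf_path: str) -> Dict[str, Any]:
    """Run a pretrained docTR predictor on a PDF and return the exported result.

    Hypothetical helper for illustration; requires docTR and one of its
    backends (TensorFlow 2 or PyTorch) to be installed.
    """
    # Imported lazily so the pure-Python helper below works without docTR.
    from doctr.io import DocumentFile
    from doctr.models import ocr_predictor

    model = ocr_predictor(pretrained=True)  # default detection + recognition models
    doc = DocumentFile.from_pdf(pdf_path)   # load the PDF pages as images
    result = model(doc)                     # run detection, then recognition
    return result.export()                  # nested dict (assumed layout, see above)


def words_from_export(export: Dict[str, Any]) -> List[str]:
    """Flatten an exported result into a list of recognized words, in order."""
    words: List[str] = []
    for page in export.get("pages", []):
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                for word in line.get("words", []):
                    words.append(word["value"])
    return words
```

Calling `words_from_export(run_ocr("invoice.pdf"))` on a local file (the file name is a placeholder) would return the recognized words as plain strings; the first call downloads the pretrained weights.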
“docTR offers open-source tools to develop and deploy Python OCR at scale with PyTorch or TensorFlow,” said Nicolas Schuhl, Head of Delivery at Monk.
With this offering, Mindee gives a wide audience, from entry-level developers to domain experts such as researchers who want to train their own models, the tools to move from intensive manual data entry (e.g., from physical documents, PDFs, or images) to a fully digital process.
“We made this code available with that in mind, to ensure developers can read it, understand it and be sure it’s safe. We are providing everyone with the possibility of making this OCR tool their own by allowing them to modify the code to fit their applications and infrastructure needs,” said Frédéric Harper, Director of Developer Relations at Mindee.