Aquaforest SDK

OCR and Data Extraction SDK

A robust SDK for C# and VB applications to extract data from PDFs. Leverage additional SDK capabilities like handwriting OCR, text extraction, document compression, barcode scanning, and more.

Overview

Aquaforest SDK is a powerful toolset for processing PDFs. It includes:

  • PDF content extraction
  • Searchable PDF Creation
  • OCR with Standard (Aquaforest) Engine
  • OCR with Extended (Canon IRIS) Engine
  • Handwriting OCR options via Google & Microsoft APIs
  • Advanced PDF and Barcode Toolkit
  • High Performance with Support for up to 64 Cores

Main Features

PDF Data Extraction

The SDK is able to analyse PDF documents and automatically extract name/value pairs.

PDF Tools

The SDK has a wide variety of PDF manipulation capabilities including PDF merging, PDF attachment processing, PDF content extraction, XMP metadata processing, PDF/A validation and more.

Standard OCR

The Standard OCR Engine supports 23 languages (see the full list) and is included in every edition of the SDK.

Extended OCR

The Extended OCR Engine supports 131 languages (see the full list) and is included in the Extended Edition licenses.

Cloud OCR

The SDK provides an interface to Google's and Microsoft's cloud OCR services, which can be especially useful for special cases such as handwriting recognition.

Barcodes

The SDK is able to read and recognize most standard barcode types.

Get a Quote

Please contact the sales team for pricing information: sales@aquaforest.com

Licensing

Development Licenses

Each developer building applications requires a Developer License. Please note that each Aquaforest SDK Bundle purchase includes one developer license and one server runtime license. Additional developer licenses can be purchased for multiple developers and significant discounts are available – please contact sales@aquaforest.com for details.

Additional Server Runtime Licenses

In most cases all that is required is the purchase of a single developer license bundle which includes a server runtime license. However, if additional runtime servers are required then each requires a runtime license. A Server is defined as a physical or virtual machine that is used for automated unattended processing or to provide services to multiple end users.

Desktop Runtime Licenses

For use within a single organization, the purchase of a development license bundle allows deployment to an unlimited number of desktops*. Please contact sales@aquaforest.com if you have additional requirements.

“Desktop” is defined as an “end-user” personal computer that may have Applications installed that utilize the Software. The Applications must NOT support unattended document processing. No more than 10 page images may be processed without requiring human intervention.

*For Aquaforest Standard OCR only. If you require Desktop Runtime Licensing for the Extended OCR, please contact sales@aquaforest.com.

Unlimited Runtime Licenses

In cases where the SDK is being used as a component within a business application that will be sold to multiple end customers then the Unlimited License Bundle may be the most suitable license. This allows an unlimited number of Developers & Runtimes for one business application.

License Comparison Table

Edition Comparison Standard Extended
PDF Toolkit
Data Extraction from PDF documents without the need for templates or prior training
Barcode Decoding
OCR from bitmap, TIFF and PDF
Microsoft Cloud OCR (requires additional Microsoft Subscription)
Google Cloud OCR (requires additional Google Subscription)
Image Pre-Processing and Auto-Rotation
.NET Programmatic and Zonal access to OCR results
RTF and TXT output
Blank Page Removal
PDF Merging
Searchable PDF Output
Stamps on PDF Output
Advanced MRC and JBIG2 Compressed PDF Output
Advanced Pre-processing (Optimized OCR)
Aquaforest OCR Support for 23 languages
Extended IRIS OCR with Support for 131 languages
Support for multiple languages within a single document from the same character set
Multiple document output formats: PDF, DOCX, WORDML, RTF, CSV, XLSX, EXCELML, TXT, HTML and XPS
Multiple PDF version output support
Confidence score support
Asian Language Support
Arabic Language Support
Hebrew Language Support
Intelligent High Quality Compression

FAQ

The Standard bundle includes support for 23 languages.

The Extended bundle includes support for 131 languages.

The Extended bundle language list includes Chinese (Traditional and Simplified), Japanese, Korean, Thai and Vietnamese.

If your requirements change while you are using the product and you need additional cores or another module, we can upgrade your license for the difference in price.
We can demonstrate the product for you and discuss how it can meet your needs.
Our team has gained extensive experience and expertise in searchable PDFs over many years, and we are members of the PDF Association. We are happy to share our knowledge and provide free advice in this area.

Email
We aim to respond to email support requests within half a business day, and we usually respond much more quickly than that. Email support@aquaforest.com with any support query.

Phone support
If you prefer to speak directly with our team, call us on +44 (0)1296 768 727 or request a call back via support@aquaforest.com.

Live chat
You can always contact us on live chat during office hours.

Tech Spec

Searchable PDFs

Aquaforest’s OCR engine, capable of processing thousands of pages per hour, is used to recognise text from source TIFF and Image-Only PDF files and to create Searchable PDF files.

PDF Data Extraction

The Aquaforest Data Extractor allows data extraction from PDF documents without the need for templates or prior training. The software is able to read the PDF text and extract important key-value pairs automatically, making processing of files with various layouts easy.
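To make the idea of template-free key-value extraction concrete, here is a minimal Python sketch. It is not the Aquaforest Data Extractor and does not use the SDK's API; the `extract_pairs` helper and the sample text are hypothetical, and the approach (matching "Label: value" lines with a regular expression) is only one simple way such pairs might be found in text already read from a PDF.

```python
import re

# Hypothetical helper for illustration only; it is not part of the Aquaforest SDK.
# Matches lines of the form "Label: value", a common layout in invoices and forms.
PAIR_PATTERN = re.compile(r"^\s*([A-Za-z][A-Za-z0-9 /#.-]{0,40}?)\s*:\s*(\S.*?)\s*$")


def extract_pairs(text):
    """Return a dict of label -> value for every 'Label: value' line in text."""
    pairs = {}
    for line in text.splitlines():
        match = PAIR_PATTERN.match(line)
        if match:
            label, value = match.groups()
            pairs[label] = value
    return pairs


# Assumed sample text, standing in for text extracted from a PDF page.
sample = """Invoice Number: INV-1042
Date: 2021-03-15
Thank you for your business
Total Due: 149.50 GBP
"""

result = extract_pairs(sample)
```

Lines without a label and a colon, such as the thank-you line in the sample, are skipped. A real extractor would also need to handle labels and values that are separated by layout rather than punctuation.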

Image Preprocessing

For optimal OCR recognition, options are available to control deskew, despeckle, graphics area treatment and auto-rotate.
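As a rough illustration of what a despeckle step does, the Python sketch below removes isolated pixels from a small black-and-white image. It is a simplified stand-in, not the SDK's preprocessing code: the `despeckle` function, the list-of-lists image representation (1 = ink, 0 = background) and the "no neighbouring ink" rule are all assumptions made for this example.

```python
def despeckle(image):
    """Clear every ink pixel (1) that has no ink pixel among its 8 neighbours."""
    height = len(image)
    width = len(image[0]) if height else 0
    # Work on a copy so neighbour checks always see the original pixels.
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            has_neighbour = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbour:
                cleaned[y][x] = 0
    return cleaned


# Assumed sample: one isolated speck at row 1, column 1, and a small connected shape.
page = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
]
cleaned = despeckle(page)
```

The isolated speck is cleared while the connected shape is left intact. Specks of this kind are a typical source of spurious characters in OCR output, which is why removing them before recognition can help.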

Simple .NET integration

The SDK has been designed to be simple to integrate with .NET applications and complete samples are provided in C#, VB.NET and ASP.NET.

Fully Searchable PDF Generation

The SDK can be used to generate fully text searchable PDFs with the original image and a transparent text layer.

System requirements

Supported Operating Systems:
  • Windows 10
  • Windows Server 2012 R2
  • Windows Server 2016
  • Windows Server 2019

Minimum Memory:
  • Single Core License - 4 GB RAM

Recommended Memory:
  • Single Core License - 8 GB RAM
  • 8 Core License - 16 GB RAM
  • Greater Than 8 Core License - Ask support@aquaforest.com

Recommended CPU:
  • Single Core License - i5 processor
  • 8 Core License - i7 processor
  • Greater Than 8 Core License - Ask support@aquaforest.com

Disk Space:
  • Clean Install: 1.31 GB
  • All Samples Compiled: 4.75 GB

.NET Framework: 4.7.2

Visual C++ Runtime: The Visual C++ Redistributable package is required for deployment as well as development. The Aquaforest engine requires Visual C++ 2017 Redistributable (x86 | x64).

Autobahn DX

Start using Autobahn DX today and convert your archives to fully text-searchable PDFs.