Get data from PDF
Input Parameters
Required Parameters
File Content:
Data Type = string (Base64-encoded bytes)
The content of the source file, encoded as a Base64 string
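As a minimal sketch of preparing File Content outside the connector, the raw bytes of a PDF can be Base64-encoded with Python's standard library. The function name and the sample bytes are hypothetical; in practice the bytes would come from reading the PDF in binary mode.

```python
import base64

def encode_file_content(pdf_bytes: bytes) -> str:
    """Return the Base64 text form of raw file bytes."""
    return base64.b64encode(pdf_bytes).decode("ascii")

# Hypothetical sample bytes; a real call would use open(path, "rb").read()
sample = b"%PDF-1.7 sample"
file_content = encode_file_content(sample)
```

Decoding the result with base64.b64decode returns the original bytes, which is a quick way to check the value before passing it to the action.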
Optional Parameters
Expected Keys:
Data Type = string (One key per line)
Provide one key name per line to make values available to later actions without parsing JSON.
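The one-key-per-line format can be produced from a list of key names as sketched below. The helper name is hypothetical, and using a newline character as the line separator is an assumption.

```python
def build_expected_keys(keys):
    """Join key names into a one-key-per-line string, skipping blank entries."""
    cleaned = [k.strip() for k in keys if k.strip()]
    # Assumption: a plain newline separates the keys
    return "\n".join(cleaned)

expected_keys = build_expected_keys(["Invoice No", " Total ", "", "Due Date"])
```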
Page Limit:
Data Type = integer
Maximum number of pages to be processed
Page Range:
Data Type = string
A string listing the page numbers you want to process. Separate individual pages with commas and use a hyphen for a range, e.g. 1,3-4 selects pages 1, 3 and 4
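To illustrate how a Page Range string such as 1,3-4 maps to page numbers, here is a rough sketch of a parser. It is not the connector's own implementation; the function name is hypothetical, and rejecting a reversed range such as 4-3 is an assumption.

```python
def expand_page_range(spec):
    """Expand a string like '1,3-4' into a list of page numbers."""
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                # Assumption: a reversed range is treated as an error
                raise ValueError("invalid range: " + part)
            pages.extend(range(start, end + 1))
        else:
            pages.append(int(part))
    return pages

print(expand_page_range("1,3-4"))  # prints [1, 3, 4]
```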
Dates as ISO:
Data Type = boolean
Set this to true if you want date values to be returned in ISO date format
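For readers unfamiliar with the format, the sketch below shows what an ISO date looks like compared with a date as it might be printed on a document. The day/month/year input format and the function name are assumptions for illustration; the formats the connector recognizes are not described here.

```python
from datetime import datetime

def to_iso_date(text, source_format="%d/%m/%Y"):
    """Convert a date string in a known format to ISO form (YYYY-MM-DD)."""
    return datetime.strptime(text, source_format).date().isoformat()

print(to_iso_date("31/01/2024"))  # prints 2024-01-31
```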
Confidence Score:
Data Type = number
Set a higher confidence score to filter out values with lower confidence. The value ranges from 0 to 1. A value of 0.5 or above is recommended for meaningful key/value pairs
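The effect of the Confidence Score threshold can be pictured with the sketch below. The dictionary field names and the sample values are hypothetical, and treating the threshold as inclusive is an assumption.

```python
def filter_by_confidence(values, threshold=0.5):
    """Keep only the values whose confidence meets the threshold."""
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    # Assumption: a value exactly at the threshold is kept
    return [v for v in values if v["confidence"] >= threshold]

# Hypothetical extracted values for one key
candidates = [
    {"text": "INV-001", "confidence": 0.9},
    {"text": "Page 1 of 2", "confidence": 0.25},
    {"text": "INV-OO1", "confidence": 0.5},
]
kept = filter_by_confidence(candidates, 0.5)
```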
Strip Currency Symbols:
Data Type = boolean
Set this to true if you want currency symbols and text to be removed from currency values before they are returned
Match Synonym:
Data Type = boolean
Set this to true if you want us to return all the keys that are synonyms of the expected key
Synonym Dictionary:
Data Type = string
You can provide a JSON array of "entry" objects, where each object contains a list of synonyms in an array. For instance, if you want "Invoice No" and "Invoice Number" (case-insensitive) to be interpreted as the same key, use the following JSON: [{"entry": ["Invoice No", "invoice number"]}]
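Because the Synonym Dictionary is passed as a JSON string, it can be easier to generate it than to type it by hand. The following is a minimal sketch using Python's json module; the helper name is hypothetical, while the "entry" structure follows the description above.

```python
import json

def build_synonym_dictionary(groups):
    """Serialize groups of synonymous key names as a JSON array of entry objects."""
    return json.dumps([{"entry": list(group)} for group in groups])

synonym_dictionary = build_synonym_dictionary([["Invoice No", "invoice number"]])
print(synonym_dictionary)  # prints [{"entry": ["Invoice No", "invoice number"]}]
```

Generating the string this way guarantees double-quoted keys and values, which valid JSON requires.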
Trim Symbols:
Data Type = boolean
Set this to true if you want us to remove all leading and trailing symbols from the keys found before we match them to an expected key.
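The idea behind Trim Symbols can be sketched as follows. This is an illustration, not the connector's actual logic: the function name is hypothetical, and "symbol" is assumed here to mean any character that is not a letter or digit, including whitespace.

```python
def trim_symbols(key):
    """Remove leading and trailing non-alphanumeric characters from a key."""
    start, end = 0, len(key)
    while start < end and not key[start].isalnum():
        start += 1
    while end > start and not key[end - 1].isalnum():
        end -= 1
    return key[start:end]

print(trim_symbols("**Invoice No:"))  # prints Invoice No
```

Symbols inside the key, such as the space in Invoice No, are left in place, so only the edges of the key change before it is compared with an expected key.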
Output Parameters
- LicenseType
- CallsRemaining
- CallsMade
- RenewalDate
Error message:
Data Type = string
Error message
Is Successful:
Data Type = boolean
This will be true if data was extracted from the file.
Pages:
Data Type = object[]
A list of pages containing Key/Value pairs.
Page Number:
Data Type = integer
The page number of the current page
Page Height:
Data Type = integer
The height of the current page
Page Width:
Data Type = integer
The width of the current page
Page Key/Value Pairs:
Data Type = object[]
A list of Key/Value pairs extracted from a page.
Key:
Data Type = object
Object representing a Key
Key Text:
Data Type = string
String showing the content of the key
Bounding Box:
Data Type = object
A rectangle representing the position of the key on a page.
Top:
Data Type = number
The top coordinate of the bounding box
Left:
Data Type = number
The left coordinate of the bounding box
Height:
Data Type = number
The height of the bounding box
Width:
Data Type = number
The width of the bounding box
Values:
Data Type = object[]
A list of objects, each representing a Value
Value Text:
Data Type = string
String showing the content of the value
Confidence:
Data Type = number
A score of how confident we are that this value matches with the key.
Bounding Box:
Data Type = object
A rectangle representing the position of the Value Text on the page.
Top:
Data Type = number
The top coordinate of the bounding box
Left:
Data Type = number
The left coordinate of the bounding box
Height:
Data Type = number
The height of the bounding box
Width:
Data Type = number
The width of the bounding box
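The nested output described above (Pages, then Key/Value Pairs, then a Key with its Values) can be walked as in the sketch below. The dictionary field names are hypothetical stand-ins for the output parameters, the sample data is made up, and the box helper assumes that coordinates start at the top-left corner of the page.

```python
def flatten_pages(pages):
    """Return one (page number, key text, value text, confidence) row per value."""
    rows = []
    for page in pages:
        for pair in page["key_value_pairs"]:
            key_text = pair["key"]["text"]
            for value in pair["values"]:
                rows.append(
                    (page["page_number"], key_text, value["text"], value["confidence"])
                )
    return rows

def box_edges(box):
    """Convert Top/Left/Height/Width into (left, top, right, bottom)."""
    # Assumption: the origin is the top-left corner of the page
    return (
        box["left"],
        box["top"],
        box["left"] + box["width"],
        box["top"] + box["height"],
    )

# Hypothetical output with a single page and a single key/value pair
pages = [{
    "page_number": 1, "height": 792, "width": 612,
    "key_value_pairs": [{
        "key": {
            "text": "Invoice No",
            "bounding_box": {"top": 10, "left": 20, "height": 5, "width": 40},
        },
        "values": [{
            "text": "INV-001", "confidence": 0.9,
            "bounding_box": {"top": 10, "left": 70, "height": 5, "width": 30},
        }],
    }],
}]

rows = flatten_pages(pages)
key_box = box_edges(pages[0]["key_value_pairs"][0]["key"]["bounding_box"])
```

Flattening to rows makes it simple to look up a value by key or to sort values by confidence, and the edge form of a bounding box is convenient when drawing a highlight over the page.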