Get text from PDF

Extracts text from a PDF file in a targeted way. The extracted text can be used to rename the file in Power Automate or as input to other processes. Properties such as the location of the text on the page and regular expressions can be used to fine-tune the result.

Input Parameters

Required Parameters

File Name:
Data Type = string
The name of the source file. This is used for the file name template.

File Content:
Data Type = string (byte – base 64 string)
The content of the source file. Convert it to a base64 string if you are passing it from code; otherwise, Power Automate handles the conversion.
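When calling the action from code, the base64 conversion can be done with Python's standard library. This is a minimal sketch: the helper `build_file_payload` and the key names `fileName` and `fileContent` are illustrative assumptions, not confirmed field names of the connector.

```python
import base64
import json


def build_file_payload(file_name: str, raw_bytes: bytes) -> dict:
    """Return a dict holding the file name and the base64-encoded content."""
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    # Assumption: these key names are placeholders, not the connector's real field names.
    return {"fileName": file_name, "fileContent": encoded}


# In practice raw_bytes would come from open("invoice.pdf", "rb").read().
payload = build_file_payload("invoice.pdf", b"%PDF-1.7 sample bytes")
print(json.dumps(payload, indent=2))
```

Decoding the encoded string with `base64.b64decode` should give back the original bytes, which is a quick way to check the conversion before sending the request.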

Text Result Template:
Data Type = string
Template for the output text returned when a text match is found. Any occurrence of the variables listed below is replaced by the appropriate value at runtime.

%VALUE1%: The text extracted from the first zone. If no zone was provided, all the text on the page is returned.

%VALUE2%, …, %VALUEn%: The text extracted from the nth zone.
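The placeholder substitution described above can be pictured with a short local sketch. The function `render_template` is hypothetical and only imitates the documented behavior. How the service treats a placeholder that has no matching zone is not stated here, so this sketch assumes it is left unchanged.

```python
import re


def render_template(template: str, zone_values: list) -> str:
    """Replace %VALUE1%..%VALUEn% with the text of the matching zone (1-based)."""

    def substitute(match: re.Match) -> str:
        index = int(match.group(1)) - 1
        # Assumption: a placeholder without a matching zone is left as-is.
        if 0 <= index < len(zone_values):
            return zone_values[index]
        return match.group(0)

    return re.sub(r"%VALUE(\d+)%", substitute, template)


result = render_template("Invoice %VALUE1% - %VALUE2%.pdf", ["INV-1001", "Contoso"])
print(result)  # Invoice INV-1001 - Contoso.pdf
```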

No Text Match Template:
Data Type = string
Template for the text to be returned if a text match is not found.

Optional Parameters

Text Location:
Data Type = string
The coordinates of a rectangle that covers the text you want to extract. You can use this page to get the coordinates in relation to your input files.

Page (Deprecated):
Data Type = integer
This property is deprecated. We advise you to use the Pages property instead. The Pages property applies to all zones and allows you to select the pages you want to process.

Text Pattern:
Data Type = string
If a regular expression is provided here, the extracted text is matched against it and the match is returned.
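To try out a pattern before using it in a flow, the matching step can be approximated locally. This is only a sketch using Python's `re` module. The service may use a different regular expression engine, so syntax and edge cases can differ, and the helper `apply_text_pattern` and the sample text are made up for illustration.

```python
import re
from typing import Optional


def apply_text_pattern(extracted_text: str, pattern: str) -> Optional[str]:
    """Return the first match of pattern in extracted_text, or None if there is none."""
    match = re.search(pattern, extracted_text)
    return match.group(0) if match else None


# Hypothetical text extracted from a zone.
zone_text = "Invoice Number: INV-2024-0042 Date: 2024-03-15"
print(apply_text_pattern(zone_text, r"INV-\d{4}-\d{4}"))  # INV-2024-0042
print(apply_text_pattern(zone_text, r"PO-\d+"))  # None
```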

Text Select:
Data Type = string
Use this to further refine the extracted text. Select the option that matches your requirements:

  • text in zone: This option will select all the text that was extracted.
  • word after value: If this option is selected, this action will return the word that appears immediately after the expression supplied below.
  • word before value: If this option is selected, this action will return the word that appears immediately before the expression supplied below.
  • all text in line after value: If this option is selected, this action will return all the words that appear on the same line after the expression supplied below.
  • all text in line before value: If this option is selected, this action will return all the words that appear on the same line before the expression supplied below.
  • all text in zone after value: If this option is selected, this action will return all the words that appear in the selected zone after the expression supplied below.
  • all text in zone before value: If this option is selected, this action will return all the words that appear in the selected zone before the expression supplied below.
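The options above can be illustrated with a local sketch that imitates each rule on a plain string. The function `select_text` is hypothetical. It assumes words are separated by whitespace and that the value is matched literally at its first occurrence, which may not be exactly how the service tokenizes text.

```python
from typing import Optional


def select_text(zone_text: str, option: str, value: str = "") -> Optional[str]:
    """Imitate the Text Select options on text already extracted from a zone."""
    if option == "text in zone":
        return zone_text
    position = zone_text.find(value)
    if position == -1:
        return None  # the value does not appear in the zone
    before = zone_text[:position]
    after = zone_text[position + len(value):]
    if option == "word after value":
        words = after.split()
        return words[0] if words else None
    if option == "word before value":
        words = before.split()
        return words[-1] if words else None
    if option == "all text in line after value":
        return after.split("\n", 1)[0].strip()
    if option == "all text in line before value":
        return before.rsplit("\n", 1)[-1].strip()
    if option == "all text in zone after value":
        return after.strip()
    if option == "all text in zone before value":
        return before.strip()
    raise ValueError(f"Unknown option: {option}")


# Hypothetical zone text spanning two lines.
zone = "Invoice No: 12345 Due: 2024-04-01\nCustomer: Contoso Ltd"
print(select_text(zone, "word after value", "Invoice No:"))  # 12345
print(select_text(zone, "all text in line after value", "Customer:"))  # Contoso Ltd
```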

Text Value:
Data Type = string[]
Provide one or more values to be used with the Text Select property above. The first text value that matches the selected rule is returned.

Output Parameters

Text Result:
Data Type = string
A string generated by applying the extracted text to the Text Result Template provided.

Text Results:
Data Type = string
An array containing a list of pages and the text values extracted from each.

Page Number:
Data Type = string
The page where the text was found.

Page Text:
Data Type = string
A string generated by applying the extracted text to the Text Result Template provided.

Zone Values:
Data Type = string[]
An array containing the text extracted from each zone.

Is Successful:
Data Type = boolean
A boolean value specifying if the operation was successful or not.

License Info:
Data Type = string
Information about your API subscription key. It contains:

LicenseType
CallsRemaining
CallsMade
RenewalDate

Error:
Data Type = string
The error message returned by the operation, if any.
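When consuming the output from code, it is usually worth checking Is Successful before reading Text Result. The sketch below assumes the response has already been parsed into a dictionary. The key names `isSuccessful`, `textResult`, `error` and `licenseInfo` are guesses made for illustration, and the sample values are invented, so check an actual response for the real names.

```python
def summarize_result(response: dict) -> str:
    """Build a one-line summary from a parsed response dictionary."""
    # Assumption: the key names used here are placeholders for the real output names.
    if not response.get("isSuccessful", False):
        return f"Extraction failed: {response.get('error') or 'unknown error'}"
    license_info = response.get("licenseInfo") or {}
    remaining = license_info.get("CallsRemaining")
    return f"Text: {response.get('textResult')} (calls remaining: {remaining})"


# Invented sample responses, not real service output.
ok_response = {
    "isSuccessful": True,
    "textResult": "INV-1001",
    "licenseInfo": {"LicenseType": "Trial", "CallsRemaining": 98, "CallsMade": 2},
}
failed_response = {"isSuccessful": False, "error": "File is not a valid PDF"}

print(summarize_result(ok_response))
print(summarize_result(failed_response))
```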