Aquaforest PDF Connector: Get Data from PDF from Image Only & Text Searchable PDFS

In this article, we will outline how to use the Aquaforest PDF Connector for the Power Automate Platform to Get Name-Value Pairs from a mixture of image-only and text-searchable PDFs & populate them into Custom Metadata fields. The first step is to define the trigger for our flow, in this example, we are going to Trigger the flow when an item gets created in Sharepoint & then using the Aquaforest “Get Data from PDF” to retrieve the Name-Value pairs before we populate into Custom Metadata Fields 1. Create a new Automated Flow, give it a name “Get data from PDF & including OCR Check & Select your Trigger “When a file is created in a folder” 2. 2. Specify the Location, 3. We then need to add a step to get the contents of the file
  • Specify the Site Address & also “Identifier”
4. a. Add an “Aquaforest Get PDF Properties” Step b.Select “File Content” 5. Add a “Condition” Step, “Is Searchable Is equal to True” 6. On the “No” Branch, add “Aquaforest OCR PDF or Images” step 7. In the Aquaforest OCR PDF or Images, add the following parameter a. “Source File Content” as “File Content” b. “Source Filename with Extension” as “Filename with Extention” 8. Then add a “Get Data from PDF Step” with the following Expected Keys
  • We then specify the following parameters,
a. File Content: Aquaforest Processed file Contents (from the OCR Step) b. Expected Keys: Title, Name, Invoice Number & Grand Total 9. Then add a new step “Update File Properties” a. Enter the site Address & Library Name b. Add “ID” to identify the file you wish to update c. Fill in the 4 fields as per the screenshot below (Title, Full Name, Total & Invoice Number) from the Get Data from PDF Step. 10. On the “Yes”; Branch add a “Get Data from PDF Step” with the following Expected Keys – We then specify the following parameters, a. File Content: Sharepoint File Content Step b. Expected Keys: Title, Name, Invoice Number & Grand Total 11. Then add a new step “Update File Properties” a. Enter the site Address & Library Name b. Add “ID” to identify the file you wish to update c. Fill in the 4 fields as per the screenshot below (Title, Full Name, Total & Invoice Number) from the Get Data from PDF Step. 12. The whole flow should look something like this, 13. Once the flow runs you should see that all the named value pairs have been populated into custom Metadata fields as per the screenshot below.

Categories

Archive

Share Post

Related Posts

We recently announced that Aquaforest is now part of PSPDFKit. We are very excited about this news! Coming together with PSPDFKit will help us…
Power Automate is a very powerful tool, making tasks and problems that would often require propriety solutions possible for your average user. Aquaforest has…