Convert any file to PDF Configuration Options

There are three ways to convert MS Office documents to PDF:

  1. MS Office Native Conversion (recommended for most scenarios)
  2. MS Office Direct Print
  3. MS Office Extended Print

In most cases, we recommend MS Office Native Conversion: it is the most reliable method and has the features that most scenarios require. There are, however, some scenarios where one of the other two options is more appropriate. The sections below explain the history of each option and the reasons for using it.

Understanding the “Convert any file to PDF” Configuration file

Autobahn DX and PDF Junction provide a step named “Convert any file to PDF” (MS Office Direct Print) that converts most Windows documents to PDF. Over 50% of our customers use this step to convert Microsoft Office documents to PDF files.

This step uses a virtual printer to print files to PDF instead of to a physical printer. Because it relies on the printer and on Microsoft Office automation, it has a number of technical and security limitations. We have therefore introduced additional conversion methods so that users have a robust solution for their requirements.

We provide a configuration file that allows users to switch between these methods. This blog post explains in detail the configuration options provided by Aquaforest.

Overview of the Different Conversion Methods

Aquaforest uses three different methods to convert Office files to PDF files. These methods are explained in the sections below.

MS Office Native Conversion

This method uses the “Save as PDF” option in MS Office to save the PDF file directly. It requires Microsoft Office 2007 or newer; Office 2007 additionally requires the free “Save as PDF or XPS” add-in to be installed. We recommend using Office 2010 or a newer version.

If you do not need to change properties such as Image Compression, Image Downsizing or Font Embedding, and you are happy with the default PDF/A output from your version of Office, we recommend this approach.

MS Office Direct Print

This method uses the BCL ‘easyPDF SDK x’ printer to print PDF files directly from Microsoft Office. It is the recommended option when you need PDF/A files or PDF files with font embedding.

MS Office Extended Print

In this method, Office automation is used to create an intermediate XPS file, which is then printed with automatic hyperlink extraction. This approach differs technically from the first method in ways that allow the software to be used differently. It requires Microsoft Office 2007 or newer; Office 2007 additionally requires the free “Save as PDF or XPS” add-in to be installed.

This method does not require an interactive user to be logged on for conversions to succeed, so we recommend it for server environments, e.g. conversions run via IIS, Windows Services, Session 0 and ASP.

Comparison of the features provided by the different methods

The table below shows you the features available in the different methods.

Property Direct Print Extended Print Native Conversion
Bookmark depth
Convert Hyperlinks
Convert Bookmarks (Word)
IncludeDocumentMarkups (Track Changes, Word)
Paper Orientation
Paper Size
Margin
PDFA1b (depends on the version of Microsoft Office)
PDFX1a
PDFX3
Image Compression
Image Downsizing
Font Embedding
Print All Sheets (Excel)
Include Document Properties
Fit to Page (Excel)
MSG Files
Frame Slides (PowerPoint)
Print Color Type (PowerPoint)
Output Type (PowerPoint)
Handout Order (PowerPoint)
Print Graphics (Pub)

In-depth Look at the Configuration file

The configuration file can be found in the following locations:

  • Autobahn DX: “Autobahn DX Installation directory \pj\bin\topdf.exe.config”
  • PDF Junction: “PDF Junction Installation directory \bin\topdf.exe.config”

The contents of the configuration file are shown below. To use a conversion method for a file type, write the file extension in the value attribute of the key for that method.

Note: Do not list an extension under more than one method.


<!-- Extension Mapping -->
<add key="AutoExtension" value=""/>
<add key="AutoExtensionEx" value=".jpeg.jpg.txt"/>
<add key="AutoExtensionOpenNative" value=""/>
<add key="AutoExtensionNative" value=""/>
<add key="WordExtension" value=""/>
<add key="WordExtensionEx" value=".docx.doc.rtf"/>
<add key="ExcelExtension" value=".xls.xlsx"/>
<add key="ExcelExtensionEx" value=""/>
<add key="PowerPointExtension" value=""/>
<add key="PowerPointExtensionEx" value=".ppt.pptx"/>
<add key="VisioExtension" value=""/>
<add key="VisioExtensionEx" value=".vsd"/>
<add key="IEExtension" value=".xml"/>
<add key="PublisherExtension" value=""/>
<add key="PublisherExtensionEx" value=".pub.puz"/>
<add key="IEExtendedExtension" value=".mht"/>
<add key="HTMLExtension" value=".htm.html"/>
<add key="OutlookExtension" value=".msg"/>
<add key="OpenOfficeExtension" value=".odt.swx.wpd.ods.sxc.odp.sxi.odg.sxd"/>
<add key="ExcludedExtensions" value=".zip.exe.pps.ps.chm"/>
<add key="PDFExtension" value=".pdf"/>

<!-- Native -->
<add key="NativeOfficePDF" value="true"/>
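The rule that an extension must not be listed under more than one method can be checked mechanically. The following is a minimal Python sketch and not part of Autobahn DX or PDF Junction. It assumes that the `<add key="..." value="..."/>` elements sit inside a wrapper element (shown here as `appSettings`, which is an assumption), that only keys containing the word "Extension" hold extension lists, and that the extensions in a value are concatenated with a leading dot. The helper names `split_extensions` and `find_duplicates` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample with a deliberate mistake: .xlsx appears under two keys.
SAMPLE = """<appSettings>
  <add key="WordExtension" value=""/>
  <add key="WordExtensionEx" value=".docx.doc.rtf"/>
  <add key="ExcelExtension" value=".xls.xlsx"/>
  <add key="ExcelExtensionEx" value=".xlsx"/>
  <add key="NativeOfficePDF" value="true"/>
</appSettings>"""


def split_extensions(value):
    """Split a value such as '.docx.doc.rtf' into ['.docx', '.doc', '.rtf']."""
    return ["." + part.strip().lower() for part in value.split(".") if part.strip()]


def find_duplicates(xml_text):
    """Return {extension: [keys]} for extensions listed under more than one key."""
    root = ET.fromstring(xml_text)
    seen = defaultdict(list)
    for add in root.iter("add"):
        key = add.get("key", "")
        if "Extension" not in key:  # assumption: skips flags such as NativeOfficePDF
            continue
        for ext in split_extensions(add.get("value", "")):
            seen[ext].append(key)
    return {ext: keys for ext, keys in seen.items() if len(keys) > 1}


if __name__ == "__main__":
    for ext, keys in find_duplicates(SAMPLE).items():
        print(f"{ext} is listed under: {', '.join(keys)}")
```

To check a real installation, the text of topdf.exe.config could be read from disk and passed to `find_duplicates` in place of `SAMPLE`.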


Mappings Between Configuration File and the Conversion Methods

The table below maps each configuration key to the conversion methods discussed earlier.

Note: The Configuration options ending with “Ex” work with the “NativeOfficePDF” config option to select a suitable conversion method.

| Configuration | Conversion Method | File Types | Comment |
|---|---|---|---|
| AutoExtension | Direct Print | All file types | This option works with any file extension that has an application that is compatible with the BCL printer. |
| AutoExtensionEx | Extended Print | All file types | |
| AutoExtensionNative | Native Conversion | MS Office files | |
| OpenOfficeExtension | Direct Print | Open Office files | This option works with all files that can be opened in Open Office. |
| AutoExtensionOpenNative | Native Conversion | Open Office files | |
| WordExtension | Direct Print | .doc, .docx, .rtf… (all files that can be opened in MS Word) | |
| WordExtensionEx, NativeOfficePDF=true | Native Conversion | | |
| WordExtensionEx, NativeOfficePDF=false | Extended Print | | |
| ExcelExtension | Direct Print | .xls, .csv, .xlsx… (all files that open in Excel) | |
| ExcelExtensionEx, NativeOfficePDF=true | Native Conversion | | |
| ExcelExtensionEx, NativeOfficePDF=false | Extended Print | | |
| PowerPointExtension | Direct Print | .ppt, .pptx (all PowerPoint files) | |
| PowerPointExtensionEx, NativeOfficePDF=true | Native Conversion | | |
| PowerPointExtensionEx, NativeOfficePDF=false | Extended Print | | |
| VisioExtension | Direct Print | .vsd | |
| VisioExtensionEx, NativeOfficePDF=true | Native Conversion | | |
| VisioExtensionEx, NativeOfficePDF=false | Extended Print | | |
| IEExtension | Direct Print | .html, .xml, .mht (all IE files) | |
| IEExtendedExtension | Extended Print | | |
| HTMLExtension | Direct Print | .htm, .html | |
| PublisherExtension | Direct Print | .pub, .puz | |
| PublisherExtensionEx, NativeOfficePDF=true | Native Conversion | | |
| PublisherExtensionEx, NativeOfficePDF=false | Extended Print | | |
| OutlookExtension | Direct Print | .msg files | |
| PDFExtension | | PDF files | We usually convert PDF attachments if you select the option. |
| ExcludedExtensions | | Others | Skips all the extensions present here. |
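The mapping in the table above can be read as a simple lookup: find the key whose value lists the extension, then pick the method for that key, consulting NativeOfficePDF only for the Word, Excel, PowerPoint, Visio and Publisher “Ex” keys. The Python sketch below illustrates that reading only; it is not how the product is implemented. It assumes the settings have already been loaded into a dictionary of key to value strings and that each extension appears under a single key. The names `FIXED_METHOD`, `SWITCHED_KEYS` and `resolve_method` are hypothetical.

```python
from typing import Dict, Optional

# Keys that always select the same method, taken from the table.
FIXED_METHOD = {
    "AutoExtension": "Direct Print",
    "AutoExtensionEx": "Extended Print",
    "AutoExtensionNative": "Native Conversion",
    "AutoExtensionOpenNative": "Native Conversion",
    "OpenOfficeExtension": "Direct Print",
    "WordExtension": "Direct Print",
    "ExcelExtension": "Direct Print",
    "PowerPointExtension": "Direct Print",
    "VisioExtension": "Direct Print",
    "PublisherExtension": "Direct Print",
    "IEExtension": "Direct Print",
    "IEExtendedExtension": "Extended Print",
    "HTMLExtension": "Direct Print",
    "OutlookExtension": "Direct Print",
}

# Keys whose method depends on the NativeOfficePDF flag.
SWITCHED_KEYS = {
    "WordExtensionEx",
    "ExcelExtensionEx",
    "PowerPointExtensionEx",
    "VisioExtensionEx",
    "PublisherExtensionEx",
}


def split_extensions(value: str):
    """Split a value such as '.ppt.pptx' into ['.ppt', '.pptx']."""
    return ["." + part.strip().lower() for part in value.split(".") if part.strip()]


def resolve_method(settings: Dict[str, str], extension: str) -> Optional[str]:
    """Return the conversion method for an extension, or None if the table lists none."""
    native = settings.get("NativeOfficePDF", "false").strip().lower() == "true"
    wanted = extension.lower()
    for key, value in settings.items():
        if key == "NativeOfficePDF" or wanted not in split_extensions(value):
            continue
        if key in SWITCHED_KEYS:
            return "Native Conversion" if native else "Extended Print"
        # PDFExtension and ExcludedExtensions have no method in the table.
        return FIXED_METHOD.get(key)
    return None


if __name__ == "__main__":
    settings = {
        "WordExtensionEx": ".docx.doc.rtf",
        "ExcelExtension": ".xls.xlsx",
        "IEExtendedExtension": ".mht",
        "ExcludedExtensions": ".zip.exe.pps.ps.chm",
        "NativeOfficePDF": "true",
    }
    for ext in (".docx", ".xls", ".mht", ".zip"):
        print(ext, "->", resolve_method(settings, ext))
```

Under this reading, the values shown in the configuration listing send Word files through Native Conversion (because NativeOfficePDF is true) and Excel files through Direct Print.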

Neil Pitman founded Aquaforest Limited in 2001 and is the chief architect for the company’s PDF and OCR software products used by thousands of organizations ranging from NASA to the Dutch Ministerie van Justitie. Neil has 30 years’ experience in the software industry in the UK and USA in the areas of database systems, document management and software development tools and has served on the IDT committees of the British Standards Institute (BSI) and was a co-author of the BSI’s 2007 publication on the Long Term Preservation of Digital Documents.