PDFDocument

Summary

Allows for the management of PDF documents, including facilities for merging and deleting pages, setting document open behavior, and creating or changing document security settings.

Discussion

PDFDocumentOpen and PDFDocumentCreate are two functions that provide a reference to a PDFDocument object.

One common scenario for creating PDF files is building a map book. The steps typically involve creating a new PDFDocument object, appending content from existing PDF files, and saving the PDFDocument object to disk. Another common scenario is modifying the contents or properties of an existing PDF file. Once a PDFDocument is referenced, you can use the appendPages, insertPages, and deletePages methods to change its pages, and the updateDocProperties and updateDocSecurity methods to modify PDF file settings.

The deletePages method is useful for swapping out only the pages that have been modified. Processing dozens of pages can take a long time. If only a few pages have been modified, it is faster to delete just those pages and then insert the updated pages using the insertPages method.
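The swap described above can be sketched as a small helper. This is an illustration only and not part of arcpy: replace_pages and its replacements argument are hypothetical names, pdf_doc is assumed to be an open PDFDocument whose pageCount reflects edits immediately, and each replacement PDF is assumed to contain exactly as many pages as the range it replaces.

```python
def replace_pages(pdf_doc, replacements):
    """Swap page ranges in pdf_doc for updated PDFs.

    replacements is a list of (first_page, last_page, pdf_path) tuples that
    use the page numbers of the document before any pages are deleted.
    Assumes the ranges do not overlap and that each PDF holds
    (last_page - first_page + 1) pages.
    """
    ordered = sorted(replacements)

    # Delete every outdated range in one call, for example "3, 5-7".
    page_range = ", ".join(
        str(first) if first == last else f"{first}-{last}"
        for first, last, _ in ordered
    )
    pdf_doc.deletePages(page_range)

    # Reinsert in ascending order. Once the earlier ranges are restored, the
    # pages ahead of first_page are back at their original numbers, so
    # first_page is again the correct insertion point.
    for first, _, pdf_path in ordered:
        if first > pdf_doc.pageCount:
            # The deleted range was at the end of the document: append instead.
            pdf_doc.appendPages(pdf_path)
        else:
            pdf_doc.insertPages(pdf_path, first)
```

The caller is still responsible for calling saveAndClose afterward.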

Currently, PDF security that is set using Python is limited to RC4 encryption, whereas PDF security that is set in ArcGIS Pro is limited to AES 256-bit encryption. This means that when you manage PDF documents using Python, you can work only with documents that have no security or that use RC4 encryption.

Properties

Property | Explanation | Data Type
pageCount
(Read Only)

Returns an integer that represents the total number of pages in the PDF document.

Long

Method Overview

Method | Explanation
appendPages (pdf_path, {input_pdf_password})

Appends one PDF document to the end of another.

deletePages (page_range)

Provides the ability to delete one or multiple pages within an existing PDF document.

insertPages (pdf_path, {before_page_number}, {input_pdf_password})

Allows inserting the contents of one PDF document at the beginning or in between the pages of another PDFDocument.

saveAndClose ()

Saves any changes made to the currently referenced PDFDocument.

updateDocProperties ({pdf_title}, {pdf_author}, {pdf_subject}, {pdf_keywords}, {pdf_open_view}, {pdf_layout})

Updates the PDF document metadata and can also set certain behaviors that are triggered when the document is opened in Adobe Reader or Adobe Acrobat, such as the initial view mode and the page thumbnails view.

updateDocSecurity (new_master_password, {new_user_password}, {encryption})

Provides the mechanism that sets password, encryption, and security restrictions on PDF files.

Methods

appendPages (pdf_path, {input_pdf_password})
Parameter | Explanation | Data Type
pdf_path

A string that includes the location and name of the input PDF document to be appended.

String
input_pdf_password

A string that defines the master password to a protected file. It must be a master password; a user password will not work.

(The default value is None)

String

When appending secured PDF documents that have different security settings, the output settings will be based on the primary document to which pages are being appended. For example, if the document that is being appended to does not have password information saved, but the appended pages do, the resulting document will not have password information saved.

To add pages to the beginning of the current PDF document, use insertPages instead.
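A minimal sketch of appending a mix of unsecured and secured files in one pass. append_all is a hypothetical helper rather than an arcpy function, and pdf_doc is assumed to be an open PDFDocument.

```python
def append_all(pdf_doc, sources):
    """Append each PDF in sources to the end of pdf_doc, in order.

    sources is a list of (pdf_path, master_password) tuples. Use None as the
    password for files that are not secured.
    """
    for pdf_path, master_password in sources:
        if master_password is None:
            pdf_doc.appendPages(pdf_path)
        else:
            # This must be the master password; a user password will not work.
            pdf_doc.appendPages(pdf_path, master_password)
```

As described above, the security settings of the result follow pdf_doc, not the files listed in sources.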

deletePages (page_range)
Parameter | Explanation | Data Type
page_range

A string that defines the page or pages to be deleted. Delete a single page by passing in a single value as a string (for example, "3"). Multiple pages can be deleted using a comma between each value (for example, "3, 5, 7"). Ranges can also be applied (for example, "1, 3, 5-12").

String

It is important to keep track of the pages that are being deleted because the internal PDF page numbers are adjusted automatically each time pages are deleted. For example, page 3 becomes page 2 immediately after page 1 or page 2 is deleted. If both page 1 and page 2 are deleted, page 3 becomes page 1. Take this into account when you use deletePages and then immediately use insertPages with a before_page_number value.
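The renumbering can be worked out ahead of time in plain Python. The sketch below does not call arcpy; parse_page_range and page_number_after_delete are hypothetical helper names, and the input is assumed to follow the page_range syntax described above.

```python
def parse_page_range(page_range):
    """Expand a string such as "1, 3, 5-12" into a sorted list of page numbers."""
    pages = set()
    for part in page_range.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(value) for value in part.split("-"))
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


def page_number_after_delete(original_page, page_range):
    """Return the new number of original_page once page_range is deleted.

    Returns None if original_page is itself one of the deleted pages.
    """
    deleted = parse_page_range(page_range)
    if original_page in deleted:
        return None
    # Every deleted page that came earlier shifts this page up by one.
    return original_page - sum(1 for page in deleted if page < original_page)


print(page_number_after_delete(3, "1"))       # 2
print(page_number_after_delete(3, "1, 2"))    # 1
print(page_number_after_delete(4, "3, 5-7"))  # 3
```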

insertPages (pdf_path, {before_page_number}, {input_pdf_password})
Parameter | Explanation | Data Type
pdf_path

A string that includes the location and name of the input PDF document to be inserted.

String
before_page_number

An integer that defines a page number in the currently referenced PDFDocument before which the new pages will be inserted. For example, if the before_page_number value is 1, the new pages will be inserted before all existing pages.

(The default value is 1)

Integer
input_pdf_password

A string that defines the master password to a protected file. It must be a master password; a user password will not work.

(The default value is None)

String

When inserting secured PDF documents that have different security settings, the output settings will be based on the primary document into which pages are being inserted. For example, if the document into which pages are being inserted does not have password information saved, but the inserted pages do, the resulting document will not have password information saved.

To add pages to the end of the current PDF document, use appendPages instead.

saveAndClose ()

The saveAndClose method must be used for changes to be maintained. If a script exits before saveAndClose is executed, changes will not be saved. If you are creating a new file using PDFDocumentCreate, the file won't appear on disk until saveAndClose is executed.
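One way to avoid forgetting the final call is to wrap the document in a context manager. This is a sketch built on the standard library, not something arcpy provides: saving is a hypothetical name, and the choice to skip saveAndClose when the block raises is an assumption about the desired behavior.

```python
from contextlib import contextmanager


@contextmanager
def saving(pdf_doc):
    """Yield pdf_doc, then call saveAndClose only if the block finished cleanly."""
    yield pdf_doc
    # Not reached when the with block raises, so this helper never saves
    # a half-edited document.
    pdf_doc.saveAndClose()


# Usage, assuming pdf_doc is an open PDFDocument:
# with saving(pdf_doc) as doc:
#     doc.deletePages("1")
```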

updateDocProperties ({pdf_title}, {pdf_author}, {pdf_subject}, {pdf_keywords}, {pdf_open_view}, {pdf_layout})
Parameter | Explanation | Data Type
pdf_title

A string defining the document title, a PDF metadata property.

(The default value is None)

String
pdf_author

A string defining the document author, a PDF metadata property.

(The default value is None)

String
pdf_subject

A string defining the document subject, a PDF metadata property.

(The default value is None)

String
pdf_keywords

A string defining the document keywords, a PDF metadata property.

(The default value is None)

String
pdf_open_view

A string or number that will define the behavior to trigger when the PDF file is viewed. The default value is USE_THUMBS, which will show the Adobe Reader Pages panel automatically when the PDF is opened.

  • VIEWER_DEFAULT: Uses the application user preference when opening the file
  • USE_NONE: Displays the document only; does not show other panels
  • USE_THUMBS: Displays the document plus the Pages panel
  • USE_BOOKMARKS: Displays the document plus the Bookmarks panel
  • FULL_SCREEN: Displays the document in full-screen viewing mode
  • LAYERS: Displays the document plus the layers panel
  • ATTACHMENT: Displays the document plus the attachment panel

(The default value is USE_THUMBS)

String
pdf_layout

A string or number that will define the initial view mode to trigger when the PDF file is viewed.

  • DONT_CARE: Uses the application user preference when opening the file
  • SINGLE_PAGE: Uses single-page mode
  • ONE_COLUMN: Uses one-column continuous mode
  • TWO_COLUMN_LEFT: Uses two-column continuous mode with first page on left
  • TWO_COLUMN_RIGHT: Uses two-column continuous mode with first page on right
  • TWO_PAGE_LEFT: Uses two-page mode left
  • TWO_PAGE_RIGHT: Uses two-page mode right

(The default value is SINGLE_PAGE)

String

A pdf_open_view setting of FULL_SCREEN will prompt a warning about full-screen mode when the PDF is opened. Setting pdf_open_view to a different option will not clear this setting unless pdf_open_view is set to USE_NONE.
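Because both options are passed as keyword strings, a small check before the call can catch typos such as a missing underscore. The sketch below is plain Python and does not call arcpy; check_view_options is a hypothetical helper, the sets simply restate the keywords listed above, and numeric values are not covered.

```python
OPEN_VIEWS = {
    "VIEWER_DEFAULT", "USE_NONE", "USE_THUMBS", "USE_BOOKMARKS",
    "FULL_SCREEN", "LAYERS", "ATTACHMENT",
}
LAYOUTS = {
    "DONT_CARE", "SINGLE_PAGE", "ONE_COLUMN", "TWO_COLUMN_LEFT",
    "TWO_COLUMN_RIGHT", "TWO_PAGE_LEFT", "TWO_PAGE_RIGHT",
}


def check_view_options(pdf_open_view="USE_THUMBS", pdf_layout="SINGLE_PAGE"):
    """Return both keywords unchanged, or raise ValueError if one is not listed."""
    if pdf_open_view not in OPEN_VIEWS:
        raise ValueError(f"Unknown pdf_open_view keyword: {pdf_open_view!r}")
    if pdf_layout not in LAYOUTS:
        raise ValueError(f"Unknown pdf_layout keyword: {pdf_layout!r}")
    return pdf_open_view, pdf_layout
```

The returned pair can then be passed to updateDocProperties as the pdf_open_view and pdf_layout arguments.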

updateDocSecurity (new_master_password, {new_user_password}, {encryption})
Parameter | Explanation | Data Type
new_master_password

A string that defines the master document password. This password is required for appending and inserting pages into a secured PDF.

String
new_user_password

A string that defines the user password needed to open the PDF document for viewing.

(The default value is None)

String
encryption

A string that defines the encryption technique used on the PDF. RC4 is the only encryption type supported.

  • "RC4": Uses 128-bit RC4 encryption (Acrobat 5.0 compatible)

(The default value is RC4)

String

A password on a secured PDF document can be removed by setting the new_master_password and new_user_password parameters to empty strings.
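Removing the passwords could look like the following sketch. remove_passwords is a hypothetical helper, and pdf_doc is assumed to be a PDFDocument that was already opened with whatever password it requires.

```python
def remove_passwords(pdf_doc):
    """Clear the master and user passwords on pdf_doc, then save it."""
    # Empty strings for both passwords remove the existing protection.
    pdf_doc.updateDocSecurity("", "")
    pdf_doc.saveAndClose()
```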

Code sample

PDFDocument example 1

This script creates a new PDF document, appends the contents of three separate PDF documents, and saves the resulting PDF file.

import arcpy
import os

# Set file name and remove if it already exists
pdfPath = r"C:\Projects\YosemiteNP\AttractionsMapBook.pdf"
if os.path.exists(pdfPath):
    os.remove(pdfPath)

# Create the file and append pages
pdfDoc = arcpy.mp.PDFDocumentCreate(pdfPath)
pdfDoc.appendPages(r"C:\Projects\YosemiteNP\Title.pdf")
pdfDoc.appendPages(r"C:\Projects\YosemiteNP\MapPages.pdf")
pdfDoc.appendPages(r"C:\Projects\YosemiteNP\ContactInfo.pdf")

# Commit changes and delete variable reference
pdfDoc.saveAndClose()
del pdfDoc

PDFDocument example 2

The following script modifies the PDF document metadata properties and sets the style in which the document opens.

import arcpy
pdfDoc = arcpy.mp.PDFDocumentOpen(r"C:\Projects\YosemiteNP\AttractionsMapBook.pdf")
pdfDoc.updateDocProperties(pdf_title="Yosemite Main Attractions Map Book",
                           pdf_author="Esri",
                           pdf_subject="Main Attractions Map Book",
                           pdf_keywords="Yosemite; Map Books; Attractions",
                           pdf_open_view="USE_THUMBS",
                           pdf_layout="SINGLE_PAGE")
pdfDoc.saveAndClose()
del pdfDoc

PDFDocument example 3

The following script sets the user and master passwords, encrypts the PDF using RC4 encryption, and requires a password when the document opens. Be sure to read the secured PDF limitations in the class description above.

import arcpy
pdfDoc = arcpy.mp.PDFDocumentOpen(r"C:\Projects\YosemiteNP\AttractionsMapBook.pdf")
pdfDoc.updateDocSecurity("master", "user", "RC4")
pdfDoc.saveAndClose()
del pdfDoc

PDFDocument example 4

The following script replaces a total of four pages in an existing PDF using deletePages followed by insertPages. Note how the new page 3 was inserted before the current page 3, which was really page 4 before the original page 3 was removed. The same applies to the range of pages 5–7. Be sure to read the secured PDF limitations in the class description above.

import arcpy
pdfDoc = arcpy.mp.PDFDocumentOpen(r"C:\Projects\YosemiteNP\AttractionsMapBook.pdf", "master")
pdfDoc.deletePages("3, 5-7")
pdfDoc.insertPages(r"C:\Projects\Yosemite\NewPage3.pdf", 3, "master")
pdfDoc.insertPages(r"C:\Projects\Yosemite\NewPages5-7.pdf", 5, "master")
pdfDoc.saveAndClose()
del pdfDoc