PDFDocument

Summary

Allows for the management of PDF documents, including facilities for merging and deleting pages, setting document open behavior, and creating or changing document security settings.

Discussion

PDFDocumentOpen and PDFDocumentCreate are two functions that provide a reference to a PDFDocument object.

One common scenario for creating new PDF files is building a map book. The steps typically involve creating a new PDFDocument object, appending content from existing PDF files, and saving the PDFDocument object to disk. Another common scenario is modifying the contents or properties of an existing PDF file. Once a PDFDocument is referenced, you can use the appendPages, insertPages, and deletePages methods to change its pages, and the updateDocProperties and updateDocSecurity methods to modify PDF file settings.

The deletePages method is useful for swapping out only the pages that have been modified. Reprocessing every page of a document with dozens of pages can take a long time. If only a few pages have been modified, it is faster to delete only those pages and then insert the newly updated pages using the insertPages method.

Currently, when using Python to set PDF security on a document, it is limited to RC4 encryption. If you set PDF security in ArcGIS Pro, it is limited to AES 256-bit encryption. This means that if you are managing PDF documents using Python, you are limited to working with only PDF documents with no security or those documents that only use RC4 encryption.

Properties

Property | Explanation | Data Type
pageCount
(Read Only)

Returns an integer that represents the total number of pages in the PDF document.

Long

Method Overview

Method | Explanation
appendPages (pdf_path, {input_pdf_password})

Appends one PDF document to the end of another.

deletePages (page_range)

Deletes one or more pages from an existing PDF document.

insertPages (pdf_path, {before_page_number}, {input_pdf_password})

Inserts the contents of one PDF document at the beginning of, or between the pages of, another PDF document.

saveAndClose ()

Saves any changes made to the currently referenced PDFDocument.

updateDocProperties ({pdf_title}, {pdf_author}, {pdf_subject}, {pdf_keywords}, {pdf_open_view}, {pdf_layout})

Updates the PDF metadata. You can also use this method to set behaviors that will trigger when the document is opened in Adobe Reader or Adobe Acrobat, such as the initial view mode and the page thumbnails view.

updateDocSecurity (new_master_password, {new_user_password}, {encryption})

Sets password, encryption, and security restrictions on a PDF.

Methods

appendPages (pdf_path, {input_pdf_password})
Parameter | Explanation | Data Type
pdf_path

A string that includes the location and name of the input PDF document to be appended.

String
input_pdf_password

A string that defines the master password to a protected file. It must be a master password; a user password will not work.

(The default value is None)

String

When appending secured PDF documents that have different security settings, the output settings will be based on the primary document to which pages are being appended. For example, if the document that is being appended to does not have password information saved, but the appended pages do, the resulting document will not have password information saved.

To add pages to the beginning of the current PDF document, use insertPages instead.

deletePages (page_range)
Parameter | Explanation | Data Type
page_range

A string that defines the page or pages to be deleted. Delete a single page by passing in a single value as a string (for example, "3"). Multiple pages can be deleted using a comma between each value (for example, "3, 5, 7"). Ranges can also be applied (for example, "1, 3, 5-12").

String
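
The page_range string can be assembled from a list of page numbers rather than typed by hand. The following is a minimal sketch and not part of arcpy: the helper name format_page_range is hypothetical. It collapses consecutive numbers into ranges and joins the parts with commas, following the format described for page_range above.

```python
def format_page_range(pages):
    """Build a deletePages-style string such as "1, 3, 5-12" from page numbers.

    Hypothetical helper (not part of arcpy). Duplicates are ignored and the
    pages are sorted before consecutive runs are collapsed into ranges.
    """
    numbers = sorted(set(pages))
    if any(n < 1 for n in numbers):
        raise ValueError("Page numbers start at 1")

    parts = []
    i = 0
    while i < len(numbers):
        start = end = numbers[i]
        # Extend the run while the next number is consecutive
        while i + 1 < len(numbers) and numbers[i + 1] == end + 1:
            i += 1
            end = numbers[i]
        parts.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ", ".join(parts)
```

For example, format_page_range([3, 5, 6, 7]) returns "3, 5-7", which could then be passed to deletePages.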

It is important to keep track of the pages that are being deleted because, each time pages are deleted, the internal PDF page numbers are automatically adjusted. For example, page 3 becomes page 2 immediately after page 1 or page 2 is deleted. If both page 1 and page 2 are deleted, page 3 becomes page 1. You need to account for this if you use deletePages and then immediately use insertPages with a before_page_number value.
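
The renumbering can be traced with plain Python lists before touching a real file. This sketch only models page positions: the function names simulate_delete and simulate_insert are hypothetical, and nothing here calls arcpy. It assumes that all page numbers in a single deletePages call refer to the numbering as it was before that call.

```python
def simulate_delete(pages, to_delete):
    """Return the pages left after removing the 1-based positions in to_delete."""
    doomed = set(to_delete)
    return [page for position, page in enumerate(pages, start=1)
            if position not in doomed]


def simulate_insert(pages, new_pages, before_page_number=1):
    """Return pages with new_pages placed before the 1-based before_page_number."""
    index = before_page_number - 1
    return pages[:index] + list(new_pages) + pages[index:]


# An eight-page document; labels record each page's original number
document = [f"old-{n}" for n in range(1, 9)]

# Remove pages 3 and 5-7; the remaining pages close up the gaps
document = simulate_delete(document, [3, 5, 6, 7])
# document is now ['old-1', 'old-2', 'old-4', 'old-8']

# The original page 4 is now page 3, so inserting before 3 restores the order
document = simulate_insert(document, ["new-3"], before_page_number=3)
document = simulate_insert(document, ["new-5", "new-6", "new-7"],
                           before_page_number=5)
print(document)
```

Working through the positions this way shows which before_page_number values put the replacement pages back where the deleted ones were.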

insertPages (pdf_path, {before_page_number}, {input_pdf_password})
Parameter | Explanation | Data Type
pdf_path

A string that includes the location and name of the input PDF document to be inserted.

String
before_page_number

An integer that defines a page number in the currently referenced PDFDocument before which the new pages will be inserted. For example, if the before_page_number value is 1, the new pages will be inserted before all existing pages.

(The default value is 1)

Integer
input_pdf_password

A string that defines the master password to a protected file. It must be a master password; a user password will not work.

(The default value is None)

String

When inserting secured PDF documents that have different security settings, the output settings will be based on the primary document into which pages are being inserted. For example, if the document into which pages are being inserted does not have password information saved, but the inserted pages do, the resulting document will not have password information saved.

To add pages to the end of the current PDF document, use appendPages instead.

saveAndClose ()

The saveAndClose method must be used for changes to be maintained. If a script exits before saveAndClose is executed, changes will not be saved. If you are creating a new file using PDFDocumentCreate, the file won't appear on disk until saveAndClose is executed.
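
Because unsaved changes are lost when a script exits early, one option is to wrap the document in a context manager. This is a sketch, not an arcpy feature: the name saved_pdf is hypothetical, and it assumes only that the object passed in has a saveAndClose method. It saves when the block finishes normally and deliberately skips saving when an exception is raised, so that a half-edited document is not written.

```python
from contextlib import contextmanager


@contextmanager
def saved_pdf(pdf_doc):
    """Yield pdf_doc and call saveAndClose() only if the block succeeds.

    Hypothetical helper: pdf_doc is any object with a saveAndClose method,
    such as the result of arcpy.mp.PDFDocumentOpen or PDFDocumentCreate.
    """
    yield pdf_doc
    # Reached only when the with-block raised no exception
    pdf_doc.saveAndClose()


# Possible usage with arcpy (path taken from the code samples in this topic):
# with saved_pdf(arcpy.mp.PDFDocumentOpen(
#         r"C:\Projects\YosemiteNP\AttractionsMapBook.pdf")) as pdfDoc:
#     pdfDoc.deletePages("3")
```

Whether to save after an error is a design choice; a try/finally variant would save in both cases instead.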

updateDocProperties ({pdf_title}, {pdf_author}, {pdf_subject}, {pdf_keywords}, {pdf_open_view}, {pdf_layout})
Parameter | Explanation | Data Type
pdf_title

The document title. This is a PDF metadata property.

(The default value is None)

String
pdf_author

The document author. This is a PDF metadata property.

(The default value is None)

String
pdf_subject

The document subject. This is a PDF metadata property.

(The default value is None)

String
pdf_keywords

The document keywords. This is a PDF metadata property.

(The default value is None)

String
pdf_open_view

Specifies the Adobe Reader view mode that will be used.

  • VIEWER_DEFAULT: The application user preference when opening the file will be used.
  • USE_NONE: Only the document will display. No other panels will display.
  • USE_THUMBS: The document and the Pages panel will display.
  • USE_BOOKMARKS: The document and the Bookmarks panel will display.
  • FULL_SCREEN: The document will display in full-screen viewing mode.
  • LAYERS: The document and the Layers panel will display.
  • ATTACHMENT: The document and the Attachments panel will display.

(The default value is USE_THUMBS)

String
pdf_layout

Specifies the Adobe Reader layout mode that will be used.

  • DONT_CARE: The application user preference when opening the file will be used.
  • SINGLE_PAGE: Single-page mode will be used.
  • ONE_COLUMN: One-column continuous mode will be used.
  • TWO_COLUMN_LEFT: Two-column continuous mode with the first page on the left will be used.
  • TWO_COLUMN_RIGHT: Two-column continuous mode with the first page on the right will be used.
  • TWO_PAGE_LEFT: Two-page mode left will be used.
  • TWO_PAGE_RIGHT: Two-page mode right will be used.

(The default value is SINGLE_PAGE)

String

A pdf_open_view setting of FULL_SCREEN causes a warning about full-screen mode to appear when the PDF is opened. Once FULL_SCREEN has been set, changing pdf_open_view to another option does not clear it; only setting pdf_open_view to USE_NONE does.
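
Both pdf_open_view and pdf_layout take keyword strings, so a misspelled value is easy to pass. The sketch below checks the values against the keywords listed for those parameters before they are handed to updateDocProperties. The function name check_doc_properties and the two constant names are hypothetical, and the sketch makes no claim about how arcpy itself reacts to an unknown keyword.

```python
# Keywords copied from the pdf_open_view and pdf_layout parameter descriptions
PDF_OPEN_VIEWS = {"VIEWER_DEFAULT", "USE_NONE", "USE_THUMBS", "USE_BOOKMARKS",
                  "FULL_SCREEN", "LAYERS", "ATTACHMENT"}
PDF_LAYOUTS = {"DONT_CARE", "SINGLE_PAGE", "ONE_COLUMN", "TWO_COLUMN_LEFT",
               "TWO_COLUMN_RIGHT", "TWO_PAGE_LEFT", "TWO_PAGE_RIGHT"}


def check_doc_properties(pdf_open_view="USE_THUMBS", pdf_layout="SINGLE_PAGE"):
    """Return both keywords as a dict, or raise ValueError if one is not listed.

    Hypothetical helper (not part of arcpy); the defaults mirror the
    documented default values of updateDocProperties.
    """
    if pdf_open_view not in PDF_OPEN_VIEWS:
        raise ValueError(f"Unknown pdf_open_view: {pdf_open_view!r}")
    if pdf_layout not in PDF_LAYOUTS:
        raise ValueError(f"Unknown pdf_layout: {pdf_layout!r}")
    return {"pdf_open_view": pdf_open_view, "pdf_layout": pdf_layout}
```

The returned dictionary can be unpacked into the call, for example pdfDoc.updateDocProperties(pdf_title="Map Book", **check_doc_properties("USE_BOOKMARKS", "ONE_COLUMN")).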

updateDocSecurity (new_master_password, {new_user_password}, {encryption})
Parameter | Explanation | Data Type
new_master_password

The master document password.

String
new_user_password

The user password needed to open the PDF for viewing.

(The default value is None)

String
encryption

The encryption technique that will be used for the PDF.

  • AES-128: 128-bit AES encryption (Acrobat 7.0 compatible) will be used.
  • AES-256-R5: 256-bit AES encryption with R5 encoding (Acrobat 9.0 compatible) will be used.
  • AES-256: 256-bit AES encryption (Acrobat X compatible) will be used.
  • RC4: 128-bit RC4 encryption (Acrobat 5.0 compatible) will be used.
    Legacy: RC4 is only included for compatibility. AES encryption is recommended. RC4 will be deprecated in a future release.

(The default value is AES-256)

String

Tip:

A password on a secured PDF can be removed by setting the new_master_password or new_user_password parameter to an empty string.

Code sample

PDFDocument example 1

This script creates a new PDF document, appends the contents of three separate PDF documents, and saves the resulting PDF file.

import arcpy, os

#Set file name and remove if it already exists
pdfPath = r"C:\Projects\YosemiteNP\AttractionsMapBook.pdf"
if os.path.exists(pdfPath):
    os.remove(pdfPath)

#Create the file and append pages
pdfDoc = arcpy.mp.PDFDocumentCreate(pdfPath)
pdfDoc.appendPages(r"C:\Projects\YosemiteNP\Title.pdf")
pdfDoc.appendPages(r"C:\Projects\YosemiteNP\MapPages.pdf")
pdfDoc.appendPages(r"C:\Projects\YosemiteNP\ContactInfo.pdf")

#Commit changes and delete variable reference
pdfDoc.saveAndClose()
del pdfDoc

PDFDocument example 2

The following script modifies the PDF document metadata properties and sets the style in which the document opens.

import arcpy
pdfDoc = arcpy.mp.PDFDocumentOpen(r"C:\Projects\YosemiteNP\AttractionsMapBook.pdf")
pdfDoc.updateDocProperties(pdf_title="Yosemite Main Attractions Map Book",
                           pdf_author="Esri",
                           pdf_subject="Main Attractions Map Book",
                           pdf_keywords="Yosemite; Map Books; Attractions",
                           pdf_open_view="USE_THUMBS",
                           pdf_layout="SINGLE_PAGE")
pdfDoc.saveAndClose()
del pdfDoc

PDFDocument example 3

The following script sets the new_master_password and new_user_password values, encrypts the PDF using RC4 encryption, and requires a password when the document opens. Be sure to read the secured PDF limitations in the class description above.

import arcpy
pdfDoc = arcpy.mp.PDFDocumentOpen(r"C:\Projects\YosemiteNP\AttractionsMapBook.pdf")
pdfDoc.updateDocSecurity("master", "user", "RC4")
pdfDoc.saveAndClose()
del pdfDoc

PDFDocument example 4

The following script replaces a total of four pages in an existing PDF using deletePages followed by insertPages. Note how the new page 3 was inserted before the current page 3, which was really page 4 before the original page 3 was removed. The same applies to the range of pages 5–7. Be sure to read the secured PDF limitations in the class description above.

import arcpy
pdfDoc = arcpy.mp.PDFDocumentOpen(r"C:\Projects\YosemiteNP\AttractionsMapBook.pdf", "master")
pdfDoc.deletePages("3, 5-7")
pdfDoc.insertPages(r"C:\Projects\Yosemite\NewPage3.pdf", 3, "master")
pdfDoc.insertPages(r"C:\Projects\Yosemite\NewPages5-7.pdf", 5, "master")
pdfDoc.saveAndClose()
del pdfDoc