| pdftk 1.44 a Handy Tool for Manipulating PDF Documents |
| Copyright (C) 2003-10, Sid Steward - Please Visit: www.pdftk.com |
| This is free software; see the source code for copying conditions. There is |
| NO warranty, not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. |
| |
| SYNOPSIS |
| pdftk <input PDF files | - | PROMPT> |
| [ input_pw <input PDF owner passwords | PROMPT> ] |
| [ <operation> <operation arguments> ] |
| [ output <output filename | - | PROMPT> ] |
| [ encrypt_40bit | encrypt_128bit ] |
| [ allow <permissions> ] |
| [ owner_pw <owner password | PROMPT> ] |
| [ user_pw <user password | PROMPT> ] |
| [ flatten ] [ compress | uncompress ] |
| [ keep_first_id | keep_final_id ] [ drop_xfa ] |
| [ verbose ] [ dont_ask | do_ask ] |
| Where: |
| <operation> may be empty, or: |
| [ cat | shuffle | burst | |
| generate_fdf | fill_form | |
| background | multibackground | |
| stamp | multistamp | |
| dump_data | dump_data_utf8 | |
| dump_data_fields | dump_data_fields_utf8 | |
| update_info | update_info_utf8 | |
| attach_files | unpack_files ] |
| |
| For Complete Help: pdftk --help |
| |
| DESCRIPTION |
| If PDF is electronic paper, then pdftk is an electronic staple-remover, |
| hole-punch, binder, secret-decoder-ring, and X-Ray-glasses. Pdftk is a |
| simple tool for doing everyday things with PDF documents. Use it to: |
| |
| * Merge PDF Documents or Collate PDF Page Scans |
| * Split PDF Pages into a New Document |
| * Rotate PDF Documents or Pages |
| * Decrypt Input as Necessary (Password Required) |
| * Encrypt Output as Desired |
| * Fill PDF Forms with X/FDF Data and/or Flatten Forms |
| * Generate FDF Data Stencils from PDF Forms |
| * Apply a Background Watermark or a Foreground Stamp |
| * Report PDF Metrics such as Metadata and Bookmarks |
| * Update PDF Metadata |
| * Attach Files to PDF Pages or the PDF Document |
| * Unpack PDF Attachments |
| * Burst a PDF Document into Single Pages |
| * Uncompress and Re-Compress Page Streams |
| * Repair Corrupted PDF (Where Possible) |
| |
| OPTIONS |
| A summary of options is included below. |
| |
| --help, -h |
| Show summary of options. |
| |
| <input PDF files | - | PROMPT> |
| A list of the input PDF files. If you plan to combine these PDFs |
| (without using handles) then list files in the order you want |
| them combined. Use - to pass a single PDF into pdftk via stdin. |
| Input files can be associated with handles, where a handle is a |
| single, upper-case letter: |
| |
| <input PDF handle>=<input PDF filename> |
| |
| Handles are often omitted. They are useful when specifying PDF |
| passwords or page ranges, later. |
| |
| For example: A=input1.pdf B=input2.pdf |
| |
| [input_pw <input PDF owner passwords | PROMPT>] |
| Input PDF owner passwords, if necessary, are associated with |
| files by using their handles: |
| |
| <input PDF handle>=<input PDF file owner password> |
| |
| If handles are not given, then passwords are associated with |
| input files by order. |
| |
| Most pdftk features require that encrypted input PDF are accom- |
| panied by the ~owner~ password. If the input PDF has no owner |
| password, then the user password must be given, instead. If the |
| input PDF has no passwords, then no password should be given. |
| |
| When running in do_ask mode, pdftk will prompt you for a pass- |
| word if the supplied password is incorrect or none was given. |
| |
| [<operation> <operation arguments>] |
| If this optional argument is omitted, then pdftk runs in 'fil- |
| ter' mode. Filter mode takes only one PDF input and creates a |
| new PDF after applying all of the output options, like encryp- |
| tion and compression. |
| |
| Available operations are: cat, shuffle, burst, generate_fdf, |
| fill_form, background, multibackground, stamp, multistamp, |
| dump_data, dump_data_utf8, dump_data_fields, |
| dump_data_fields_utf8, update_info, update_info_utf8, |
| attach_files, unpack_files. Some operations takes additional |
| arguments, described below. |
| |
| cat [<page ranges>] |
| Catenates pages from input PDFs to create a new PDF. Page |
| order in the new PDF is specified by the order of the given |
| page ranges. Page ranges are described like this: |
| |
| <input PDF handle>[<begin page number>[-<end page num- |
| ber>[<qualifier>]]][<page rotation>] |
| |
| Where the handle identifies one of the input PDF files, and |
| the beginning and ending page numbers are one-based refer- |
| ences to pages in the PDF file, and the qualifier can be even |
| or odd, and the page rotation can be N, S, E, W, L, R, or D. |
| |
| If the handle is omitted from the page range, then the pages |
| are taken from the first input PDF. |
| |
| The even qualifier causes pdftk to use only the even-numbered |
| PDF pages, so 1-6even yields pages 2, 4 and 6 in that order. |
| 6-1even yields pages 6, 4 and 2 in that order. |
| |
| The odd qualifier works similarly to the even. |
| |
| The page rotation setting can cause pdftk to rotate pages and |
| documents. Each option sets the page rotation as follows (in |
| degrees): N: 0, E: 90, S: 180, W: 270, L: -90, R: +90, D: |
| +180. L, R, and D make relative adjustments to a page's rota- |
| tion. |
| |
| If no arguments are passed to cat, then pdftk combines all |
| input PDFs in the order they were given to create the output. |
| |
| NOTES: |
| * <end page number> may be less than <begin page number>. |
| * The keyword end may be used to reference the final page of |
| a document instead of a page number. |
| * Reference a single page by omitting the ending page number. |
| * The handle may be used alone to represent the entire PDF |
| document, e.g., B1-end is the same as B. |
| |
| Page Range Examples w/o Handles: |
| 1-endE - rotate entire document 90 degrees |
| 5 11 20 - take single pages from input PDF |
| 5-25oddW - take odd pages in range, rotate 90 degrees |
| 6-1 - reverse pages in range from input PDF |
| |
| Page Range Examples Using Handles: |
| Say A=in1.pdf B=in2.pdf, then: |
| A1-21 - take range from in1.pdf |
| Bend-1odd - take all odd pages from in2.pdf in reverse order |
| A72 - take a single page from in1.pdf |
| A1-21 Beven A72 - assemble pages from both in1.pdf and |
| in2.pdf |
| AW - rotate entire in1.pdf document 90 degrees |
| B - use all of in2.pdf |
| A2-30evenL - take the even pages from the range, remove 90 |
| degrees from each page's rotation |
| A A - catenate in1.pdf with in1.pdf |
| AevenW AoddE - apply rotations to even pages, odd pages from |
| in1.pdf |
| AW BW BD - catenate rotated documents |
| |
| shuffle [<page ranges>] |
| Collates pages from input PDFs to create a new PDF. Works |
| like the cat operation except that it takes one page at a |
| time from each page range to assemble the output PDF. If one |
| range runs out of pages, it continues with the remaining |
| ranges. Ranges can use all of the features described above |
| for cat, like reverse page ranges, multiple ranges from a |
| single PDF, and page rotation. This feature was designed to |
| help collate PDF pages after scanning paper documents. |
| |
| burst Splits a single, input PDF document into individual pages. |
| Also creates a report named doc_data.txt which is the same as |
| the output from dump_data. If the output section is omitted, |
| then PDF pages are named: pg_%04d.pdf, e.g.: pg_0001.pdf, |
| pg_0002.pdf, etc. To name these pages yourself, supply a |
| printf-styled format string via the output section. For |
| example, if you want pages named: page_01.pdf, page_02.pdf, |
| etc., pass output page_%02d.pdf to pdftk. Encryption can be |
| applied to the output by appending output options such as |
| owner_pw, e.g.: |
| |
| pdftk in.pdf burst owner_pw foopass |
| |
| generate_fdf |
| Reads a single, input PDF file and generates an FDF file |
| suitable for fill_form out of it to the given output filename |
| or (if no output is given) to stdout. Does not create a new |
| PDF. |
| |
| fill_form <FDF data filename | XFDF data filename | - | PROMPT> |
| Fills the single input PDF's form fields with the data from |
| an FDF file, XFDF file or stdin. Enter the data filename |
| after fill_form, or use - to pass the data via stdin, like |
| so: |
| |
| pdftk form.pdf fill_form data.fdf output form.filled.pdf |
| |
| After filling a form, the form fields remain interactive |
| unless you also use the flatten output option. flatten merges |
| the form fields with the PDF pages. You can use flatten |
| alone, too, but only on a single PDF: |
| |
| pdftk form.pdf fill_form data.fdf output out.pdf flatten |
| |
| or: |
| |
| pdftk form.filled.pdf output out.pdf flatten |
| |
| If the input FDF file includes Rich Text formatted data in |
| addition to plain text, then the Rich Text data is packed |
| into the form fields as well as the plain text. Pdftk also |
| sets a flag that cues Acrobat/Reader to generate new field |
| appearances based on the Rich Text data. That way, when the |
| user opens the PDF, the viewer will create the Rich Text |
| fields on the spot. If the user's PDF viewer does not sup- |
| port Rich Text, then the user will see the plain text data |
| instead. If you flatten this form before Acrobat has a |
| chance to create (and save) new field appearances, then the |
| plain text field data is what you'll see. |
| |
| background <background PDF filename | - | PROMPT> |
| Applies a PDF watermark to the background of a single input |
| PDF. Pass the background PDF's filename after background |
| like so: |
| |
| pdftk in.pdf background back.pdf output out.pdf |
| |
| Pdftk uses only the first page from the background PDF and |
| applies it to every page of the input PDF. This page is |
| scaled and rotated as needed to fit the input page. You can |
| use - to pass a background PDF into pdftk via stdin. |
| |
| If the input PDF does not have a transparent background (such |
| as a PDF created from page scans) then the resulting back- |
| ground won't be visible -- use the stamp operation instead. |
| |
| multibackground <background PDF filename | - | PROMPT> |
| Same as the background operation, but applies each page of |
| the background PDF to the corresponding page of the input |
| PDF. If the input PDF has more pages than the stamp PDF, |
| then the final stamp page is repeated across these remaining |
| pages in the input PDF. |
| |
| stamp <stamp PDF filename | - | PROMPT> |
| This behaves just like the background operation except it |
| overlays the stamp PDF page on top of the input PDF docu- |
| ment's pages. This works best if the stamp PDF page has a |
| transparent background. |
| |
| multistamp <stamp PDF filename | - | PROMPT> |
| Same as the stamp operation, but applies each page of the |
| background PDF to the corresponding page of the input PDF. |
| If the input PDF has more pages than the stamp PDF, then the |
| final stamp page is repeated across these remaining pages in |
| the input PDF. |
| |
| dump_data |
| Reads a single, input PDF file and reports various statis- |
| tics, metadata, bookmarks (a/k/a outlines), and page labels |
| to the given output filename or (if no output is given) to |
| stdout. Non-ASCII characters are encoded as XML numerical |
| entities. Does not create a new PDF. |
| |
| dump_data_utf8 |
| Same as dump_data excepct that the output is encoded as |
| UTF-8. |
| |
| dump_data_fields |
| Reads a single, input PDF file and reports form field statis- |
| tics to the given output filename or (if no output is given) |
| to stdout. Non-ASCII characters are encoded as XML numerical |
| entities. Does not create a new PDF. |
| |
| dump_data_fields_utf8 |
| Same as dump_data_fields excepct that the output is encoded |
| as UTF-8. |
| |
| update_info <info data filename | - | PROMPT> |
| Changes the metadata stored in a single PDF's Info dictionary |
| to match the input data file. The input data file uses the |
| same syntax as the output from dump_data. Non-ASCII charac- |
| ters should be encoded as XML numerical entities. This does |
| not change the metadata stored in the PDF's XMP stream, if it |
| has one. For example: |
| |
| pdftk in.pdf update_info in.info output out.pdf |
| |
| update_info_utf8 <info data filename | - | PROMPT> |
| Same as update_info except that the input is encoded as |
| UTF-8. |
| |
| attach_files <attachment filenames | PROMPT> [to_page <page number | |
| PROMPT>] |
| Packs arbitrary files into a PDF using PDF's file attachment |
| features. More than one attachment may be listed after |
| attach_files. Attachments are added at the document level |
| unless the optional to_page option is given, in which case |
| the files are attached to the given page number (the first |
| page is 1, the final page is end). For example: |
| |
| pdftk in.pdf attach_files table1.html table2.html to_page 6 |
| output out.pdf |
| |
| unpack_files |
| Copies all of the attachments from the input PDF into the |
| current folder or to an output directory given after output. |
| For example: |
| |
| pdftk report.pdf unpack_files output ~/atts/ |
| |
| or, interactively: |
| |
| pdftk report.pdf unpack_files output PROMPT |
| |
| [output <output filename | - | PROMPT>] |
| The output PDF filename may not be set to the name of an input |
| filename. Use - to output to stdout. When using the dump_data |
| operation, use output to set the name of the output data file. |
| When using the unpack_files operation, use output to set the |
| name of an output directory. When using the burst operation, |
| you can use output to control the resulting PDF page filenames |
| (described above). |
| |
| [encrypt_40bit | encrypt_128bit] |
| If an output PDF user or owner password is given, output PDF |
| encryption strength defaults to 128 bits. This can be overrid- |
| den by specifying encrypt_40bit. |
| |
| [allow <permissions>] |
| Permissions are applied to the output PDF only if an encryption |
| strength is specified or an owner or user password is given. If |
| permissions are not specified, they default to 'none,' which |
| means all of the following features are disabled. |
| |
| The permissions section may include one or more of the following |
| features: |
| |
| Printing |
| Top Quality Printing |
| |
| DegradedPrinting |
| Lower Quality Printing |
| |
| ModifyContents |
| Also allows Assembly |
| |
| Assembly |
| |
| CopyContents |
| Also allows ScreenReaders |
| |
| ScreenReaders |
| |
| ModifyAnnotations |
| Also allows FillIn |
| |
| FillIn |
| |
| AllFeatures |
| Allows the user to perform all of the above, and top |
| quality printing. |
| |
| [owner_pw <owner password | PROMPT>] |
| |
| [user_pw <user password | PROMPT>] |
| If an encryption strength is given but no passwords are sup- |
| plied, then the owner and user passwords remain empty, which |
| means that the resulting PDF may be opened and its security |
| parameters altered by anybody. |
| |
| [compress | uncompress] |
| These are only useful when you want to edit PDF code in a text |
| editor like vim or emacs. Remove PDF page stream compression by |
| applying the uncompress filter. Use the compress filter to |
| restore compression. |
| |
| [flatten] |
| Use this option to merge an input PDF's interactive form fields |
| (and their data) with the PDF's pages. Only one input PDF may be |
| given. Sometimes used with the fill_form operation. |
| |
| [keep_first_id | keep_final_id] |
| When combining pages from multiple PDFs, use one of these |
| options to copy the document ID from either the first or final |
| input document into the new output PDF. Otherwise pdftk creates |
| a new document ID for the output PDF. When no operation is |
| given, pdftk always uses the ID from the (single) input PDF. |
| |
| [drop_xfa] |
| If your input PDF is a form created using Acrobat 7 or Adobe |
| Designer, then it probably has XFA data. Filling such a form |
| using pdftk yields a PDF with data that fails to display in |
| Acrobat 7 (and 6?). The workaround solution is to remove the |
| form's XFA data, either before you fill the form using pdftk or |
| at the time you fill the form. Using this option causes pdftk to |
| omit the XFA data from the output PDF form. |
| |
| This option is only useful when running pdftk on a single input |
| PDF. When assembling a PDF from multiple inputs using pdftk, |
| any XFA data in the input is automatically omitted. |
| |
| [verbose] |
| By default, pdftk runs quietly. Append verbose to the end and it |
| will speak up. |
| |
| [dont_ask | do_ask] |
| Depending on the compile-time settings (see ASK_ABOUT_WARNINGS), |
| pdftk might prompt you for further input when it encounters a |
| problem, such as a bad password. Override this default behavior |
| by adding dont_ask (so pdftk won't ask you what to do) or do_ask |
| (so pdftk will ask you what to do). |
| |
| When running in dont_ask mode, pdftk will over-write files with |
| its output without notice. |
| |
| EXAMPLES |
| Collate scanned pages |
| pdftk A=even.pdf B=odd.pdf shuffle A B output collated.pdf |
| or if odd.pdf is in reverse order: |
| pdftk A=even.pdf B=odd.pdf shuffle A Bend-1 output collated.pdf |
| |
| Decrypt a PDF |
| pdftk secured.pdf input_pw foopass output unsecured.pdf |
| |
| Encrypt a PDF using 128-bit strength (the default), withhold all per- |
| missions (the default) |
| pdftk 1.pdf output 1.128.pdf owner_pw foopass |
| |
| Same as above, except password 'baz' must also be used to open output |
| PDF |
| pdftk 1.pdf output 1.128.pdf owner_pw foo user_pw baz |
| |
| Same as above, except printing is allowed (once the PDF is open) |
| pdftk 1.pdf output 1.128.pdf owner_pw foo user_pw baz allow printing |
| |
| Join in1.pdf and in2.pdf into a new PDF, out1.pdf |
| pdftk in1.pdf in2.pdf cat output out1.pdf |
| or (using handles): |
| pdftk A=in1.pdf B=in2.pdf cat A B output out1.pdf |
| or (using wildcards): |
| pdftk *.pdf cat output combined.pdf |
| |
| Remove 'page 13' from in1.pdf to create out1.pdf |
| pdftk in.pdf cat 1-12 14-end output out1.pdf |
| or: |
| pdftk A=in1.pdf cat A1-12 A14-end output out1.pdf |
| |
| Apply 40-bit encryption to output, revoking all permissions (the |
| default). Set the owner PW to 'foopass'. |
| pdftk 1.pdf 2.pdf cat output 3.pdf encrypt_40bit owner_pw foopass |
| |
| Join two files, one of which requires the password 'foopass'. The out- |
| put is not encrypted. |
| pdftk A=secured.pdf 2.pdf input_pw A=foopass cat output 3.pdf |
| |
| Uncompress PDF page streams for editing the PDF in a text editor (e.g., |
| vim, emacs) |
| pdftk doc.pdf output doc.unc.pdf uncompress |
| |
| Repair a PDF's corrupted XREF table and stream lengths, if possible |
| pdftk broken.pdf output fixed.pdf |
| |
| Burst a single PDF document into pages and dump its data to |
| doc_data.txt |
| pdftk in.pdf burst |
| |
| Burst a single PDF document into encrypted pages. Allow low-quality |
| printing |
| pdftk in.pdf burst owner_pw foopass allow DegradedPrinting |
| |
| Write a report on PDF document metadata and bookmarks to report.txt |
| pdftk in.pdf dump_data output report.txt |
| |
| Rotate the first PDF page to 90 degrees clockwise |
| pdftk in.pdf cat 1E 2-end output out.pdf |
| |
| Rotate an entire PDF document to 180 degrees |
| pdftk in.pdf cat 1-endS output out.pdf |
| |
| NOTES |
| The pdftk home page permalink is: |
| http: |
| The easy-to-remember shortcut is: www.pdftk.com |
| |
| AUTHOR |
| Sid Steward (sid.steward at pdflabs dot com) maintains pdftk. Please |
| email him with questions or bug reports. Include pdftk in the subject |
| line to ensure successful delivery. Thank you.COPY |