During the last week I have had to use a PDF manipulation tool called PDFtk twice. The first time was to strip or replace PDF metadata (for work), and the second was to collate scanned pages from my printer at home. It even handles even pages that were scanned in reverse order!

My wife and I are trying to create a virtual filing system in Evernote.

(From their website) PDFtk Server can:

Merge PDF Documents or Collate PDF Page Scans

Split PDF Pages into a New Document

Rotate PDF Documents or Pages

Decrypt Input as Necessary (Password Required)

Encrypt Output as Desired

Fill PDF Forms with X/FDF Data and/or Flatten Forms

Generate FDF Data Stencils from PDF Forms

Apply a Background Watermark or a Foreground Stamp

Report PDF Metrics, Bookmarks and Metadata

Add/Update PDF Bookmarks or Metadata

Attach Files to PDF Pages or the PDF Document

Unpack PDF Attachments

Burst a PDF Document into Single Pages

Uncompress and Re-Compress Page Streams

Repair Corrupted PDF (Where Possible)

PDFtk Server does not require Adobe Acrobat or Reader, and it runs on Windows, Mac OS X and Linux.
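The two metadata entries in that list cover the work task mentioned at the top: PDFtk's dump_data operation writes a PDF's metadata to a text file, and update_info reads such a file back in. A minimal sketch follows; the helper names dump_info and apply_info, the file names in the usage note, and the PDFTK override variable are hypothetical and only there for illustration.

```shell
#!/bin/sh
# dump_info and apply_info are hypothetical helper names.
# dump_data and update_info are the actual pdftk operations.
# PDFTK is an assumed override variable; it defaults to "pdftk".

# Write the metadata of a PDF to a text file.
# Usage: dump_info INPUT.pdf INFO.txt
dump_info() {
    "${PDFTK:-pdftk}" "$1" dump_data output "$2"
}

# Apply an edited metadata text file and write a new PDF.
# Usage: apply_info INPUT.pdf INFO.txt OUTPUT.pdf
apply_info() {
    "${PDFTK:-pdftk}" "$1" update_info "$2" output "$3"
}
```

A typical round trip would be dump_info report.pdf info.txt, editing the InfoKey/InfoValue pairs in info.txt, then apply_info report.pdf info.txt report_clean.pdf. Whether this clears everything you care about is worth checking by running dump_data on the result.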

I have a pretty standard scanner (an EPSON Artisan 837). It has a document feeder but only scans one side at a time, so the first pass gives you the odd pages. Turn the bundle over, scan it again, and you have the even pages, but in reverse order. This simple command collates them:

pdftk A=odd_pages.pdf B=even_pages.pdf shuffle A Bend-1 output collated_pages.pdf

If the even pages are in normal order, it's even easier:

pdftk A=odd_pages.pdf B=even_pages.pdf shuffle A B output collated_pages.pdf
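The two commands above differ only in the page range given for the even pages, so they can be folded into one small helper. This is a minimal sketch: the function name collate and the PDFTK override variable are made up for illustration and are not part of PDFtk.

```shell
#!/bin/sh
# collate is a hypothetical helper name, not a pdftk feature.
# PDFTK is an assumed override variable; it defaults to "pdftk".
# Usage: collate ODD.pdf EVEN.pdf OUT.pdf [reversed|normal]
collate() {
    odd=$1
    even=$2
    out=$3
    order=${4:-reversed}   # default matches the flip-the-bundle workflow

    case $order in
        reversed) even_range='Bend-1' ;;  # even pages scanned last to first
        normal)   even_range='B' ;;       # even pages already in order
        *)
            echo "collate: unknown order '$order'" >&2
            return 1
            ;;
    esac

    # shuffle takes one page at a time from each range in turn.
    "${PDFTK:-pdftk}" "A=$odd" "B=$even" shuffle A "$even_range" output "$out"
}
```

For example, collate odd_pages.pdf even_pages.pdf collated_pages.pdf runs the first command above, and adding normal as a fourth argument runs the second. Setting PDFTK=echo prints the arguments instead of running pdftk, which is handy for a dry run.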

Question: Does anyone out there know of a Python library for PDFtk (or an equivalent)?

References:
https://www.pdflabs.com/blog/how-to-collate-even-odd-scanned-pages/
https://www.pdflabs.com/tools/pdftk-server/