During the last week I have had to use a PDF manipulation tool called PDFtk twice. The first time was to strip or replace PDF metadata (for work), and the second was to collate scanned pages from my printer at home - it even handles even pages that were scanned in reverse order!
My wife and I are trying to create a virtual filing system in Evernote.
(From their website) PDFtk Server can:
Merge PDF Documents or Collate PDF Page Scans
Split PDF Pages into a New Document
Rotate PDF Documents or Pages
Decrypt Input as Necessary (Password Required)
Encrypt Output as Desired
Fill PDF Forms with X/FDF Data and/or Flatten Forms
Generate FDF Data Stencils from PDF Forms
Apply a Background Watermark or a Foreground Stamp
Report PDF Metrics, Bookmarks and Metadata
Add/Update PDF Bookmarks or Metadata
Attach Files to PDF Pages or the PDF Document
Unpack PDF Attachments
Burst a PDF Document into Single Pages
Uncompress and Re-Compress Page Streams
Repair Corrupted PDF (Where Possible)
PDFtk Server does not require Adobe Acrobat or Reader, and it runs on Windows, Mac OS X and Linux.
I have a pretty standard scanner (an EPSON Artisan 837). It has a document feeder but only scans one side at a time, so the first pass through the feeder gives you the odd pages. Turn the bundle over, scan it again, and you have the even pages in reverse order. This simple command collates them:
pdftk A=odd_pages.pdf B=even_pages.pdf shuffle A Bend-1 output collated_pages.pdf
If the even pages are in normal order, it's even easier:
pdftk A=odd_pages.pdf B=even_pages.pdf shuffle A B output collated_pages.pdf
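Since I would like to drive this from Python, here is a minimal sketch that only wraps the two commands above with the standard library's subprocess module. It is not a PDFtk library: the function names build_collate_command and collate and the even_reversed flag are made up for illustration, and it assumes the pdftk executable is installed and on your PATH.

```python
import shutil
import subprocess


def build_collate_command(odd_pdf, even_pdf, output_pdf, even_reversed=True):
    """Return the pdftk argument list that interleaves two scans.

    Hypothetical helper: it only assembles the same arguments as the
    shell commands shown above.
    """
    # "Bend-1" reads input B from its last page back to its first,
    # which undoes the reversed order of the flipped bundle.
    even_range = "Bend-1" if even_reversed else "B"
    return [
        "pdftk",
        f"A={odd_pdf}",
        f"B={even_pdf}",
        "shuffle", "A", even_range,
        "output", output_pdf,
    ]


def collate(odd_pdf, even_pdf, output_pdf, even_reversed=True):
    """Run pdftk to collate the scans; raises if pdftk is missing or fails."""
    if shutil.which("pdftk") is None:
        raise FileNotFoundError("pdftk was not found on PATH")
    cmd = build_collate_command(odd_pdf, even_pdf, output_pdf, even_reversed)
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(cmd, check=True)
```

Calling collate("odd_pages.pdf", "even_pages.pdf", "collated_pages.pdf") should run the first command above, and passing even_reversed=False should run the second. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.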
Question: Does anyone out there know if there is a Python library for PDFtk (or an equivalent)?
References:
https://www.pdflabs.com/blog/how-to-collate-even-odd-scanned-pages/
https://www.pdflabs.com/tools/pdftk-server/
