On Tue 22 Jul 2025 at 10:14:37 (-0500), Richard Owlett wrote:
On 7/20/25 5:52 AM, Richard Owlett wrote:
I'm running Debian 12.8.
I have a 100+ page PDF document.
I wish to extract 2 of those pages, each to their own PDF file.
[ … ]
I should have put more "em-FAY-sis" on my goal for this thread being
learning how to extract specific pages of a large PDF document.[1] I
had not fully appreciated how graphically oriented the PDF format is.
The sub-goal is to perceive the byte-level structure of *that*
page in order to extract the semantic content perceived by a human. I
would then edit/reformat the content to be *useful* to a different
target audience.
It's very simple to burst a document into individual pages with pdftk:
$ pdftk document.pdf burst
$
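
If only a couple of pages are wanted rather than all of them, pdftk's
cat operation accepts page numbers, so something along these lines
should work. This is only a sketch: the function name extract_pages,
the page_N.pdf output naming, and the page numbers in the usage line
are invented for illustration and are not from the original post.

```shell
# Hypothetical helper: write each requested page to its own PDF file.
# Usage: extract_pages INPUT.pdf PAGE [PAGE ...]
extract_pages() {
    input=$1
    shift
    for page in "$@"; do
        # "cat N output FILE" copies only page N of the input into FILE.
        # The page_N.pdf naming is an arbitrary choice for this sketch.
        pdftk "$input" cat "$page" output "page_${page}.pdf" || return 1
    done
}
```

For example, "extract_pages document.pdf 17 42" would be expected to
leave page_17.pdf and page_42.pdf in the current directory, assuming
the document has at least 42 pages. By contrast, burst writes every
page out (as pg_0001.pdf, pg_0002.pdf, and so on by default), which is
more than is needed when only two pages matter.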