
Re: How to rotate then save a PDF document?



(I had to repost this with smaller attachments.)

On Tue 11 Jan 2022 at 10:00:06 (-0600), Richard Owlett wrote:
> On 01/11/2022 09:27 AM, Alexander V. Makartsev wrote:
> > On 11.01.2022 19:37, Richard Owlett wrote:
> > > I use MATE and thus use Atril as viewer.
> > > Typically I have no need to modify PDF documents.
> > > I received a long reading list which needs to be
> > > rotated left to be read. Atril rotates it but does not save it
> > > as rotated.
> > > 
> > > What's the simplest tool to permanently rotate that specific document?
> > > TIA
> > I'd go with GIMP.
> > Simply open any .pdf file and use Transform function ( Image >
> > Transform  > Rotate... ).
> > After that Export ( File > Export As ) the edited document as a
> > new file and check the results in .pdf viewer.
> > GIMP will also handle multi-paged .pdf documents just fine.
> 
> Worked like a charm. I tend to think of PDF as just another text format.
> I date from Teletype and Decwriter era ;}

I can't see any attraction in using Gimp, because that rotates the
image on the page, whereas what you want is to rotate the pages.

I took a random PDF downloaded last year: historyofmoderne0000sche.pdf
and ran pdftk on it to get these timings and files:

$ time pdftk /tmp/historyofmoderne0000sche.pdf cat 1-endeast output /tmp/pdftk-east.pdf

real    0m1.569s
user    0m2.285s
sys     0m0.634s

$ ls -Glg /tmp/historyofmoderne0000sche.pdf /tmp/pdftk-east.pdf
-rw-r----- 1 16779995 Feb 29  2020 /tmp/historyofmoderne0000sche.pdf
-rw-r----- 1 16242357 Jan 11 17:14 /tmp/pdftk-east.pdf

$ pdfinfo /tmp/historyofmoderne0000sche.pdf | grep -v ' no$'
Title:          History of modern Europe
Author:         Schevill, Ferdinand, 1868-1954
Creator:        Internet Archive
Producer:       Recoded by LuraDocument PDF v2.68
CreationDate:   Sat Feb 29 08:26:54 2020 CST
ModDate:        Sat Feb 29 09:36:18 2020 CST
Tagged:         yes
Form:           none
Pages:          512
Page size:      408 x 639 pts
Page rot:       0
File size:      16779995 bytes
Optimized:      yes
PDF version:    1.5

$ pdfinfo /tmp/pdftk-east.pdf | grep -v ' no$'
Creator:        pdftk 3.0.2 - www.pdftk.com
Producer:       itext-paulo-155 (itextpdf.sf.net-lowagie.com)
CreationDate:   Tue Jan 11 17:14:05 2022 CST
ModDate:        Tue Jan 11 17:14:05 2022 CST
Form:           none
Pages:          512
Page size:      408 x 639 pts
Page rot:       90                ← note the rotation here
File size:      16242357 bytes
PDF version:    1.5
$ 
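For the original question (rotating left rather than right), only the
direction keyword in the page range changes. A minimal sketch, assuming
pdftk's absolute rotation keywords north, east, south and west; the
function name rotate_pages and the file names in the usage comment are
made up for illustration:

```shell
# rotate_pages DIRECTION INPUT OUTPUT
# Rotate every page of INPUT to the given absolute orientation and
# write the result to OUTPUT. INPUT itself is left untouched.
rotate_pages() {
    case "$1" in
        north|east|south|west) ;;      # keywords pdftk accepts after a page range
        *) echo "unknown direction: $1" >&2; return 2 ;;
    esac
    pdftk "$2" cat "1-end$1" output "$3"
}

# Hypothetical usage, turning the pages 90 degrees anticlockwise:
#   rotate_pages west /tmp/reading-list.pdf /tmp/reading-list-left.pdf
```

Passing east reproduces the command timed above; west is the one to
try for a document that has to be turned left.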

Then I followed the instructions above: I opened the PDF with
File → Open, and a little window appeared in which tiles were
being added at the top, with an Import button below waiting to
be pressed. Under the window, it said "All 512 pages selected".

When I pressed Import, top showed a file-pdf-load process running
at ~97%, and it consumed three minutes of CPU. Eventually the
progress bar crawled across, and the first page was displayed in
the main window.

As instructed above, I selected Image → Transform → Rotate and
90° clockwise. A progress bar swept across under the image in
a few seconds. I then selected File → Export As, added rot- to
the beginning of the filename, and pressed Export. A file-pdf-save
process was displayed by top for about 30 seconds, whereupon the
output file appeared in /tmp. In top, gimp and file-pdf-save
were using ~180% of CPU in total.

However, although the file was huge, it only displayed a single page,
confirmed by pdfinfo. So I repeated the Export command, but selected
an option to export Layers as Pages. This gave me a multipage PDF,
but the pages were backwards, so I repeated Export again, this time
also adding Reverse the Pages Order.

$ ls -Glg /tmp/historyofmoderne0000sche.pdf /tmp/rot-historyofmoderne0000sche.pdf
-rw-r----- 1  16779995 Feb 29  2020 /tmp/historyofmoderne0000sche.pdf
-rw-r----- 1 142836917 Jan 11 17:31 /tmp/rot-historyofmoderne0000sche.pdf

$ pdfinfo /tmp/rot-historyofmoderne0000sche.pdf | grep -v ' no$'
Producer:       cairo 1.16.0 (https://cairographics.org)
CreationDate:   Tue Jan 11 17:31:06 2022 CST
Form:           none
Pages:          512
Page size:      638.64 x 407.52 pts
Page rot:       0
File size:      142836917 bytes
PDF version:    1.5
$ 
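The two outputs differ in exactly two pdfinfo fields: pdftk keeps the
408 x 639 page size and sets Page rot to 90, whereas the Gimp export
has the dimensions swapped and Page rot left at 0. A small sketch for
pulling out just those fields, assuming the pdfinfo layout shown in
the listings above; the function name rot_and_size is hypothetical:

```shell
# Summarise the rotation and page size from pdfinfo output on stdin.
# Relies on the "Page rot:" and "Page size:" lines looking like those
# in the listings above (size given as "W x H pts").
rot_and_size() {
    awk '/^Page rot:/  { rot = $3 }
         /^Page size:/ { size = $3 " x " $5 }
         END           { print "rot=" rot, "size=" size }'
}

# Hypothetical usage:
#   pdfinfo /tmp/pdftk-east.pdf | rot_and_size
```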

The corresponding timings for Gimp were:

real    n/a (with all the repeating)
user    6m13.760s
sys     0m7.833s

I got this timing down a bit by repeating the entire process above
with a single export instead of three:

user    4m8.129s
sys     0m5.077s

So, about 100 times slower than pdftk: over four minutes of CPU time
against under three seconds.

Then I inspected the results at 1200% magnification, tiny fragments
of which are attached. You can see the problem.

Finally, using a GUI doesn't scale well. One could rotate any number
of documents with a single line of commands using pdftk, and
presumably with Curt's similar qpdf method as well (untested as not
installed). Clicking one's way round a GUI can't compete.
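To illustrate the scaling point, here is a sketch of a loop that
rotates every PDF in a directory with the same pdftk invocation as
above. The function name rotate_all and the rot- output prefix are
choices made for this example, not anything pdftk requires:

```shell
# rotate_all DIR
# For each NAME.pdf in DIR, write DIR/rot-NAME.pdf with all pages
# rotated east (90 degrees clockwise). Originals are not modified.
rotate_all() {
    for f in "$1"/*.pdf; do
        [ -e "$f" ] || continue                      # glob matched nothing
        case "${f##*/}" in rot-*) continue ;; esac   # skip earlier outputs
        pdftk "$f" cat 1-endeast output "$1/rot-${f##*/}"
    done
}

# Hypothetical usage:
#   rotate_all /tmp/reading-lists
```

With qpdf the per-file command would presumably be along the lines of
qpdf --rotate=+90 in.pdf out.pdf, but that is just as untested here.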

Cheers,
David.

Attachment: original.png
Description: PNG image

Attachment: pdftk.png
Description: PNG image

Attachment: gimp.png
Description: PNG image

