Re: [OT?] Attempting to extract tabular data from PDF -- appropriate forum?
Hello Richard,
The PDF format is not well suited to structured data. What do you want
to do with it?
If you want to extract some pages from a big document, just print them
and choose to save to PDF.
If you want something that you can modify, use "pdftotext", which is
available in Debian in the "poppler-utils" package.
This should work for you (the text is written to /tmp/TFP2021.txt):
pdftotext -layout -f 116 -l 116 /tmp/TFP2021.pdf
Now let's talk radio!
I wanted to convert the band plan
https://www.iaru-r1.org/wp-content/uploads/2021/03/UHF-Bandplan.pdf
I tried different ways:
First I copied and pasted into LibreOffice Writer. I got all the
contents, but the columns were gone, as expected.
Then I copied and pasted into LibreOffice Calc, but there isn't an
easy way to get the columns back.
Finally I ran: pdftotext -layout -f 1 -l 2 UHF-Bandplan.pdf
Here pdftotext does a better job than a simple copy and paste, but the
output still can't be parsed easily by software, so I wonder whether a
machine-readable list of frequencies is already available somewhere.
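
In case it helps as a stopgap, here is a rough sketch of how the
-layout output could be turned into tab-separated values. It assumes
that columns are separated by at least two spaces and that no cell is
empty or wrapped over several lines, which may well not hold for the
whole band plan. The function name layout_to_tsv is made up for this
example; it is not part of poppler-utils.

```shell
#!/bin/sh
# layout_to_tsv: hypothetical helper (the name is an assumption).
# Reads "pdftotext -layout" style text on stdin and writes
# tab-separated values on stdout.
layout_to_tsv() {
    awk '
        {
            gsub(/\f/, "")           # drop form feeds emitted at page breaks
            gsub(/^ +| +$/, "")      # trim leading and trailing spaces
            if ($0 == "") next       # skip blank lines
            gsub(/  +/, "\t")        # a run of 2+ spaces marks a column boundary
            print
        }
    '
}
```

Possible usage (the "-" tells pdftotext to write to standard output):
pdftotext -layout -f 1 -l 2 UHF-Bandplan.pdf - | layout_to_tsv > UHF-Bandplan.tsv
The result would still need a manual check, since rows with missing
cells end up with fewer fields than the others.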
--
73 de IU5HKX Daniele