
Re: ePub or mobi from LaTeX or PDF? (was PDF editors)



On Thu, 2024-06-27 at 10:02 +0200, Richard wrote:
You could check whether Google's ML model "magika" does a better job (available via PyPI). Otherwise, what exactly does "file", or better "file -i", say? Worst case, you could open the files in a hex editor and google the first few bytes. Chances are the format uses a "magic number", so the first few bytes in hex are identical for all files of that format.

On Thu, 27 June 2024 at 06:33, Van Snyder <van.snyder@sbcglobal.net> wrote:
I downloaded everything with the same base name as I sent -- a file and a directory. LibreOffice can't read any of it. Calibre can't read any of it, either in the download or in the mounted Kindle. "file" has no idea what any of the files are.


Thanks to Richard for the suggestion to try magika:

# find ./ -type f -exec ~/.local/bin/magika {} \;
Whence-Energy-2_YI5JSRVWOVJ5MO6EWC57GJKTBZLXXI23.sdr/Whence-Energy-2_YI5JSRVWOVJ5MO6EWC57GJKTBZLXXI23bbd37f35b4b251f45c1bf42879a62d70.yjf: Unknown binary data (unknown) [Low-confidence model best-guess: JPEG image data (image), score=28]
Whence-Energy-2_YI5JSRVWOVJ5MO6EWC57GJKTBZLXXI23.sdr/data/.pagination.cache/887fcb7c: Unknown binary data (unknown) [Low-confidence model best-guess: PE executable (executable), score=71]
Whence-Energy-2_YI5JSRVWOVJ5MO6EWC57GJKTBZLXXI23.sdr/YI5JSRVWOVJ5MO6EWC57GJKTBZLXXI23.mf: Generic text document (text) [Low-confidence model best-guess: JSON document (code), score=59]
Whence-Energy-2_YI5JSRVWOVJ5MO6EWC57GJKTBZLXXI23.sdr/Whence-Energy-2_YI5JSRVWOVJ5MO6EWC57GJKTBZLXXI23bbd37f35b4b251f45c1bf42879a62d70.yjr: Unknown binary data (unknown) [Low-confidence model best-guess: Intel 80386 COFF (executable), score=76]
Whence-Energy-2_YI5JSRVWOVJ5MO6EWC57GJKTBZLXXI23.sdr/AssetDownloadMetadata.meta: JSON document (code)
Whence-Energy-2_YI5JSRVWOVJ5MO6EWC57GJKTBZLXXI23.kfx: Unknown binary data (unknown) [Low-confidence model best-guess: BMP image data (image), score=89]

So except for one file, it's still a mystery what e-mailing a PDF to a Kindle reader with "convert" (without quotes) in the subject actually produces. The Kindle reader can read it, but so far nothing else I've tried can. In the end, I want to edit it and re-publish it on Amazon.

okular thinks the .kfx file is a Mobipocket file, but can't open it. It doesn't know what the .yjf or .yjr files are.
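
Since neither "file" nor magika gives a confident answer, Richard's other suggestion (looking at the first few bytes of each file) may be worth a try. The following is only a minimal sketch of that idea in Python, not a tested tool for Kindle files: the function names (read_magic, format_magic, dump_tree) and the 16-byte default are my own choices for illustration. It prints the leading bytes of every file under a directory in hex and as printable ASCII, so they can be compared across files or searched for online.

```python
import sys
from pathlib import Path


def read_magic(path, length=16):
    """Return the first `length` bytes of the file at `path`."""
    with open(path, "rb") as fh:
        return fh.read(length)


def format_magic(data):
    """Render bytes as spaced hex plus a printable-ASCII column."""
    hex_part = data.hex(" ")  # bytes.hex(sep) needs Python 3.8 or later
    ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in data)
    return f"{hex_part}  |{ascii_part}|"


def dump_tree(root, length=16):
    """Return one 'path: leading bytes' line per regular file under `root`."""
    lines = []
    for p in sorted(Path(root).rglob("*")):
        if p.is_file():
            lines.append(f"{p}: {format_magic(read_magic(p, length))}")
    return lines


if __name__ == "__main__":
    # Directory to scan; defaults to the current directory.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for line in dump_tree(root):
        print(line)
```

Saved under a name of your choosing and run from the download directory, this prints one line per file. If the .kfx, .yjf and .yjr files each begin with a short recognizable byte sequence, that sequence is the thing to search for.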

So, back to trying to find a competent PDF-to-ePub or PDF-to-mobi converter (I haven't yet tried texmate to create a mobi or ePub from the LaTeX).

