
Bug#670040: works fine but conflicts with libpdfbox-java



Hi,

Please correct me if I'm wrong, but it seems to me that pdfannotextractor
does work fine provided that:

 1 - one runs pdfannotextractor --install once (which downloads pdfbox
     0.7.3 and saves it to ~/.texliveYYYY/..., so that it can be used
     later on).

 2 - one does not have the libpdfbox-java package installed, because when
     it is installed, its jar at /usr/share/java/pdfbox.jar takes
     priority. For example, we see:

    localhost ~ $ pdfannotextractor  --install
    PDFAnnotExtractor 0.1l, 2012/04/18 - Copyright (c) 2008, 2011, 2012 by Heiko Oberdiek.
    * Nothing to do, because PDFBox is already found:
      /usr/share/java/pdfbox.jar


IMO, the latter can be considered a problem, because it leads to the
error message encountered by the reporter. In a sense, the libpdfbox-java
package (which is pulled in by, e.g., jabref) conflicts with that
functionality of texlive-latex-extra.

I suggest amending the Perl script
/usr/share/texlive/texmf-dist/scripts/pax/pdfannotextractor.pl so that it
deliberately skips a jar file which happens to *not* contain the class
org/pdfbox/cos/ICOSVisitor.class, as a hint that we're probably talking
to a newer pdfbox version which pdfannotextractor doesn't grok yet.

This is essentially what the attached patch does.
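
The same condition can be checked by hand before touching the script. The
following is only a minimal sketch: the function names are made up for
illustration and are not part of pdfannotextractor, and it assumes that
unzip and grep are available.

```shell
# Marker class that the old org.pdfbox package layout provides.
MARKER='org/pdfbox/cos/ICOSVisitor.class'

# Reads an archive listing (e.g. the output of "unzip -l") on stdin and
# succeeds if the marker class appears in it. -F matches a fixed string,
# so the dot in ".class" is taken literally.
listing_has_marker() {
    grep -qF "$MARKER"
}

# Succeeds if the jar given as $1 contains the marker class.
# A missing or unreadable jar yields an empty listing, hence failure.
jar_is_old_pdfbox() {
    unzip -l "$1" 2>/dev/null | listing_has_marker
}
```

After defining these functions in the current shell, one could run:

    localhost ~ $ jar_is_old_pdfbox /usr/share/java/pdfbox.jar && echo usable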

Cheers,

E.
--- a/usr/share/texlive/texmf-dist/scripts/pax/pdfannotextractor.pl	2012-04-19 18:53:50.000000000 +0200
+++ b/usr/share/texlive/texmf-dist/scripts/pax/pdfannotextractor.pl	2016-08-30 13:39:43.315971678 +0200
@@ -176,14 +176,20 @@
 }
 
 sub find_jar_pdfbox () {
+    check_prg $prg_unzip, 1;
     return if $path_jar_pdfbox;
     foreach my $dir (@dir_jar) {
         foreach my $jar (@jar_pdfbox) {
             my $path = "$dir/$jar";
             if (-f $path) {
-                $path_jar_pdfbox = $path;
-                debug $jar_pdfbox, $path_jar_pdfbox;
-                return;
+                my $e = `$prg_unzip -l "$path"`;
+                if ($e =~ m{org/pdfbox/cos/ICOSVisitor\.class}) {
+                    $path_jar_pdfbox = $path;
+                    debug $jar_pdfbox, $path_jar_pdfbox;
+                    return;
+                } else {
+                    print "Ignoring $path, because it lacks org/pdfbox/cos/ICOSVisitor.class\n" if $debug;
+                }
             }
         }
     }
