
Re: Facilitating contributions by newcomers



On Mon, Nov 10, 2014 at 8:01 PM, Paul Wise <pabs@debian.org> wrote:
> On Tue, Nov 11, 2014 at 4:17 AM, Riley Baird wrote:
>
>> I'm thinking that I could just create a new file data/mime and put the
>> following in it:
>
> That isn't really what I had in mind. I should have explained more
> clearly. The match field for a test matches files based on their names
> and the program passes the matched files to the tests. The mime
> support should add a field mime-match that would cause the program to
> match files based on their mime type (using python-magic) and add
> those to the list of files matched by the match field.
>

I worked on this a bit, and I don't think that python-magic (the Debian
package) is all that useful here. It looks like it will only return
the text description, e.g. "Perl script, ASCII text executable", and
not the MIME type, e.g. "text/x-perl". The python-magic module on PyPI
does return this information, but doesn't appear to be packaged for
Debian. I am not sure how the package name and namespace conflict
would be handled if it were to be packaged and uploaded.

I have attached a diff of a working example using the built-in
mimetypes module. This isn't a very big improvement, since it's still
based on file extensions, but changing the check to use the
python-magic module from PyPI should be trivial. I do not consider this
diff to be merge-ready, as I would want to refactor some of the code
to make the additions cleaner.
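
To illustrate the extension-based limitation, here is a minimal sketch.
The helper name guess_mime is made up for this example and is not part
of the attached diff. mimetypes.guess_type only looks at the file name,
so a script with no extension gets no type at all:

```python
import mimetypes


def guess_mime(path):
    """Return the MIME type guessed from the file name alone, or None."""
    # guess_type() never opens the file; it only inspects the name.
    mime_type, _encoding = mimetypes.guess_type(path)
    return mime_type


if __name__ == "__main__":
    # A well-known extension resolves to a type.
    print(guess_mime("index.html"))
    # No extension means nothing to go on, even if the file is a Perl script.
    print(guess_mime("myscript"))
```

What a less common extension such as .pl maps to can depend on the
system's mime.types files, so I would not rely on it being the same
everywhere.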

When {file} is used, I changed it to use the file(1) command to match
on the MIME type. This is probably the best option, but it is definitely
slower than simple filename matching. Ideally, {file} should use os.walk
and not find, but that would mean rewriting a pretty good chunk of the
code.
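
For reference, the os.walk variant could look roughly like the sketch
below. It is not in the attached diff; the names detect_mime_with_file
and iter_files_by_mime are hypothetical, and it assumes file(1) is
installed and accepts --brief and --mime-type. The detector is passed in
as a parameter so that the python-magic module from PyPI could be
swapped in later.

```python
import os
import subprocess


def detect_mime_with_file(path):
    """Ask file(1) for the MIME type of path; assumes file(1) is on PATH."""
    # --brief drops the "path: " prefix, --mime-type prints only type/subtype.
    output = subprocess.check_output(
        ["file", "--brief", "--mime-type", "--", path])
    return output.decode("utf-8", "replace").strip()


def iter_files_by_mime(root, mime_type, detect=detect_mime_with_file):
    """Yield paths of regular files under root whose MIME type matches."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Skip symlinks and other non-regular files, like find -type f.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if detect(path) == mime_type:
                yield path
```

Note that this still spawns file(1) once per file, so on its own it
would not fix the speed problem.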

Regards,
Jordan Metzmeier
diff --git a/check-all-the-things b/check-all-the-things
index 1d47a53..adb5421 100755
--- a/check-all-the-things
+++ b/check-all-the-things
@@ -32,6 +32,7 @@ import shlex
 import stat
 import subprocess as ipc
 import sys
+import mimetypes
 from textwrap import TextWrapper
 from shutil import get_terminal_size
 
@@ -55,6 +56,7 @@ class UnmetPrereq(Exception):
 class Check(object):
     def __init__(self):
         self.match = None
+        self.mime_match = None
         self._match_fn = id
         self.cmd = None
         self.cmd_nargs = None
@@ -86,6 +88,9 @@ class Check(object):
     def set_restrictions(self, value):
         self.restrictions = value.split()
 
+    def set_mime_match(self, value):
+        self.mime_match = value
+
     def get_sh_cmd(self, njobs=1):
         kwargs = {
             'files': '{} +',
@@ -96,18 +101,32 @@ class Check(object):
         if self.cmd_nargs > 0:
             fcmd = ['find -type f']
             if self.match is not None:
-                if len(self.match) == 1:
-                    [wildcard] = self.match
-                    fcmd += ['-iname', shlex.quote(wildcard)]
-                else:
-                    for wildcard in self.match:
-                        fcmd += ['-o', '-iname', shlex.quote(wildcard)]
-                    fcmd[1] = '\\('
-                    fcmd += ['\\)']
+                fcmd = self.add_find_match(fcmd)
+            elif self.mime_match is not None:
+                fcmd = self.add_find_mime_match(fcmd)
             fcmd += ['-exec', cmd]
             cmd = ' '.join(fcmd)
         return cmd
 
+    def add_find_match(self, fcmd):
+        if len(self.match) == 1:
+            [wildcard] = self.match
+            fcmd += ['-iname', shlex.quote(wildcard)]
+        else:
+            for wildcard in self.match:
+                fcmd += ['-o', '-iname', shlex.quote(wildcard)]
+            fcmd[1] = '\\('
+            fcmd += ['\\)']
+
+        return fcmd
+
+    def add_find_mime_match(self, fcmd):
+        fcmd += ['-exec', 'bash', '-c',
+                 """'[[ "$(file --mime "$1")" == *%s* ]]'""" % self.mime_match,
+                 '--', '{}', '\\;']
+        return fcmd
+
+
     def meet_prereq(self):
         if self.prereq is None:
             cmd = shlex.split(self.cmd)[0]
@@ -124,8 +143,14 @@ class Check(object):
             except ipc.CalledProcessError:
                 raise UnmetPrereq('command failed: ' + self.prereq)
 
+    def _match_mime(self, path):
+        return mimetypes.guess_type(path)[0] == self.mime_match
+
     def is_file_matching(self, path):
-        return self._match_fn(path)
+        if self.mime_match:
+            return self._match_mime(path)
+        else:
+            return self._match_fn(path)
 
 def parse_section(section):
     check = Check()
@@ -169,6 +194,7 @@ def main():
                     continue
                 if check.is_file_matching(path):
                     matching_checks.add(name)
+
     for name, check in sorted(checks.items()):
         if not name in matching_checks:
             skip(skipped, name, 'no matching files')
diff --git a/data/perl b/data/perl
index 74731da..26243ba 100644
--- a/data/perl
+++ b/data/perl
@@ -1,10 +1,10 @@
 [perl-syntax-check]
-match = *.pl *.pm
+mime_match = text/x-perl
 command = perl -wc {file}
 # TODO: grep -v ' syntax OK$'
 
 [perl-b-lint]
-match = *.pl *.pm
+mime_match = text/x-perl
 prereq = perl -MO=Lint /dev/null
 command = perl -MO=Lint {file}
 # TODO: grep -v ' syntax OK$'