Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)

(apologies for the previous empty mail)

On Thu, Aug 30, 2012 at 11:44:34PM +0200, Andreas Tille wrote:
> On Thu, Aug 30, 2012 at 02:32:56AM +0200, Nicolas Boulenguez wrote:

> > Assume that "a" and "b" are directories, if I understand well, the
> > current behaviour is to recursively remove "a/b/", "a/b" and "a/" but
> > to ignore "a". I do not find that intuitive, and would suggest to
> > document it clearly.

> Could you suggest a wording which fits the requirement of "clearly
> documented".

Here is the meaning of Files-Excluded patterns, as I understand it from
the Perl code at

 If $pattern contains no slash, then
  (A) execute `find "$main_source_dir" -type f -name "$pattern" -delete`
   That is, remove all *files* in all subdirectories whose *base name* matches.
 Otherwise,
  (B) remove the trailing slash from $pattern, if present
  (C) execute `find "$main_source_dir" -path "$main_source_dir/$pattern"
               -print0 | xargs -0 rm -rf`
   That is, remove all *files or subdirectories* whose *full path* matches.
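
To make the two branches concrete, here is a minimal shell sketch run
against a scratch tree. The helper name apply_excluded and the sample
tree are hypothetical, chosen only for illustration; the two find
invocations are the ones quoted above.

```shell
#!/bin/sh
# Sketch of the behaviour described above. apply_excluded and the sample
# tree are illustrative names, not part of uscan.
apply_excluded() {
    main_source_dir=$1
    pattern=$2
    case $pattern in
        */*)
            # (B) strip one trailing slash, then (C) match the full path
            pattern=${pattern%/}
            find "$main_source_dir" -path "$main_source_dir/$pattern" -print0 \
                | xargs -0 rm -rf
            ;;
        *)
            # (A) match regular files by base name, at any depth
            find "$main_source_dir" -type f -name "$pattern" -delete
            ;;
    esac
}

tree=$(mktemp -d)
mkdir -p "$tree/a/b" "$tree/c"
touch "$tree/a/b/x" "$tree/c/a"

apply_excluded "$tree" "a"      # (A): removes the file c/a, keeps directory a
apply_excluded "$tree" "a/b/"   # (B)+(C): removes directory a/b recursively
```

So the pattern "a" leaves the directory "a" alone but deletes the file
"c/a", which is the part I find counter-intuitive.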

Jonas says that this imitates Files patterns, but
http://dep.debian.net/deps/dep5/#files-field mentions

 Patterns match pathnames that start at the root of the source tree.
 Thus, "Makefile.in" matches only the file at the root of the tree,
 but "*/Makefile.in" matches at any depth.

I understand that matching depends only on the existence of the path,
not on it being a file or a directory. So
- "foo" should match "foo", even if a directory.
- "foo" should not match "bar/foo", even if a file.
- "foo/" should never match, even if "foo" is a directory.
In short, I would only expect (C) as the implementation.

> It is required that there is distinction between files and
> directories (see the discussion on debian-devel, for instance [1] and
> [2]).
> [1] http://lists.debian.org/debian-devel/2012/08/msg00512.html
> [2] http://lists.debian.org/debian-devel/2012/08/msg00449.html

In [2], Jonas says "I believe it is better to...", but does not
explain why.

>  If I would feed 'a' to the removal algorithm it could simply
> happen, that a file c/a would be removed as well but it should not.

With (C), the "a" pattern will not remove the "c/a" file.

> In this case you can not specify something like
>     *.jar
> for jar files hanging around in lib/*.jar or so.  The code which tries
> to distinguish between files and pathes enables removing files in all
> directories.

With (C), the "*.jar" pattern will remove "lib/foo.jar" and "usr/lib/bar.jar".
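
As a rough check of the two claims above, here is a sketch of what a
(C)-only implementation would do on a scratch tree. The function name
remove_by_path and the file names are hypothetical; the find command is
the one from (C), without the trailing-slash stripping of (B).

```shell
#!/bin/sh
# Sketch of a (C)-only implementation. remove_by_path and the sample tree
# are hypothetical names used for illustration.
remove_by_path() {
    main_source_dir=$1
    pattern=$2
    find "$main_source_dir" -path "$main_source_dir/$pattern" -print0 \
        | xargs -0 rm -rf
}

src=$(mktemp -d)
mkdir -p "$src/a" "$src/c" "$src/lib" "$src/usr/lib"
touch "$src/a/file" "$src/c/a" "$src/lib/foo.jar" "$src/usr/lib/bar.jar"

remove_by_path "$src" "a"       # removes directory a, leaves c/a alone
remove_by_path "$src" "*.jar"   # "*" also spans "/", so both jars go
remove_by_path "$src" "lib/"    # trailing slash: matches nothing
```

This relies on find's -path test not treating "/" specially in
wildcards, so "*.jar" reaches any depth. GNU find may print a warning
for the last call, since a pattern ending in "/" cannot match.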


This suggestion is unrelated, but I repeat it because it was hidden in
the huge quoted mail. If we keep the current meaning, I suggest that
the implementation at
avoid splitting and grepping twice.

    foreach (split /\s+/, $data->{"files-excluded"}) {
        # $_ is the current pattern; test it directly instead of grepping
        if (m{/}) {
            # delete trailing '/' because otherwise find -path will fail
            s{/+$}{};
            # use rm -rf to enable deleting non-empty directories
            `find "$main_source_dir" -path "$main_source_dir/$_" -print0 | xargs -0 rm -rf`;
        } else {
            `find "$main_source_dir" -type f -name "$_" -delete`;
        }
    }
