Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)
(apologies for the previous empty mail)
On Thu, Aug 30, 2012 at 11:44:34PM +0200, Andreas Tille wrote:
> On Thu, Aug 30, 2012 at 02:32:56AM +0200, Nicolas Boulenguez wrote:
> > Assume that "a" and "b" are directories, if I understand well, the
> > current behaviour is to recursively remove "a/b/", "a/b" and "a/" but
> > to ignore "a". I do not find that intuitive, and would suggest to
> > document it clearly.
> Could you suggest a wording which fits the requirement of "clearly
> documented".
Here is the meaning of Files-Excluded patterns, as I understand it from
the Perl code at
http://anonscm.debian.org/gitweb/?p=users/tille/devscripts.git;a=blob;f=scripts/uscan.pl;hb=bb686d032543d8ba8c5cd9c36ff8c2d9c3310761#l1493
If $pattern contains no slash, then
  (A) execute `find "$main_source_dir" -type f -name "$pattern" -delete`
      That is, remove all *files*, in any subdirectory, whose *base name* matches.
Else
  (B) remove the trailing slash from $pattern, if present
  (C) execute `find "$main_source_dir" -path "$main_source_dir/$pattern" -print0 | xargs -0 rm -rf`
      That is, remove all *files or subdirectories* whose *full path* matches.
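
To make the two branches concrete, here is a minimal shell sketch of the behaviour as I read it. The function name exclude_current is hypothetical and not part of uscan; -print0 and -delete are find extensions (GNU and BSD), not POSIX.

```shell
#!/bin/sh
# Sketch of the current Files-Excluded behaviour as described in this mail.
# exclude_current is a hypothetical helper, not a uscan function.
exclude_current() {
    main_source_dir=$1
    pattern=$2
    case $pattern in
        */*)
            # (B) strip any trailing slashes so that find -path can match
            pattern=$(printf '%s' "$pattern" | sed 's:/*$::')
            # (C) remove files or directories whose full path matches
            find "$main_source_dir" -path "$main_source_dir/$pattern" -print0 |
                xargs -0 rm -rf
            ;;
        *)
            # (A) remove regular files whose base name matches, at any depth
            find "$main_source_dir" -type f -name "$pattern" -delete
            ;;
    esac
}
```

On a tree containing the directory "a/b" and the file "c/a", calling exclude_current with "a" removes the file "c/a" but leaves the directory "a" alone, while "a/b/" and "a/" remove the corresponding directories.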
Jonas says that this imitates Files patterns, but
http://dep.debian.net/deps/dep5/#files-field mentions:
  Patterns match pathnames that start at the root of the source tree.
  Thus, "Makefile.in" matches only the file at the root of the tree,
  but "*/Makefile.in" matches at any depth.
I understand that matching depends only on the existence of the path,
not on it being a file or a directory. So
- "foo" should match "foo", even if a directory.
- "foo" should not match "bar/foo", even if a file.
- "foo/" should never match, even if "foo" is a directory.
In short, I would only expect (C) as the implementation.
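
A minimal sketch of what "only (C)" would look like, assuming step (B) is dropped as well so that a trailing slash is kept as written. The function name exclude_path_only is hypothetical.

```shell
#!/bin/sh
# Sketch of the proposed behaviour: every pattern is matched against the
# full path below the source root, whatever the type of the entry.
# exclude_path_only is a hypothetical helper, not a uscan function.
exclude_path_only() {
    main_source_dir=$1
    pattern=$2
    # No special case for slashes and no trailing-slash stripping:
    # "foo/" therefore never matches, because find never prints a path
    # ending with a slash below the starting point.
    find "$main_source_dir" -path "$main_source_dir/$pattern" -print0 |
        xargs -0 rm -rf
}
```

With a directory "foo" and a file "bar/foo" in the tree, the pattern "foo" removes the directory and leaves "bar/foo" in place, and the pattern "foo/" removes nothing. GNU find may print a warning for a pattern ending with a slash.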
> It is required that there is distinction between files and
> directories (see the discussion on debian-devel, for instance [1] and
> [2]).
> [1] http://lists.debian.org/debian-devel/2012/08/msg00512.html
> [2] http://lists.debian.org/debian-devel/2012/08/msg00449.html
In [2], Jonas says "I believe it is better to...", but does not
explain why.
> If I would feed 'a' to the removal algorithm it could simply
> happen, that a file c/a would be removed as well but it should not.
With (C), the "a" pattern will not remove the "c/a" file.
> In this case you can not specify something like
> *.jar
> for jar files hanging around in lib/*.jar or so. The code which tries
> to distinguish between files and pathes enables removing files in all
> directories.
With (C), the "*.jar" pattern will remove "lib/foo.jar" and "usr/lib/bar.jar".
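
This relies on the wildcard of find -path matching across slashes, which can be checked with a dry run on a scratch tree. The file names below are only illustrative.

```shell
#!/bin/sh
# Dry run: list what the "*.jar" pattern would select under (C),
# without deleting anything. The tree below is made up for illustration.
main_source_dir=$(mktemp -d)
mkdir -p "$main_source_dir/lib" "$main_source_dir/usr/lib"
touch "$main_source_dir/lib/foo.jar" \
      "$main_source_dir/usr/lib/bar.jar" \
      "$main_source_dir/README"

# In find -path, "*" also matches "/", so the pattern reaches any depth.
matched=$(find "$main_source_dir" -path "$main_source_dir/*.jar" | sort)

# Both .jar paths are listed; README is not.
printf '%s\n' "$matched"
```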
----------------------------------------------------------------------
This suggestion is unrelated, but I repeat it because it was hidden in
the huge quoted mail. If we keep the current meaning, I suggest that
the implementation at
http://anonscm.debian.org/gitweb/?p=users/tille/devscripts.git;a=blob;f=scripts/uscan.pl;hb=bb686d032543d8ba8c5cd9c36ff8c2d9c3310761#l1493
avoid splitting and grepping twice.
    foreach (split /\s+/, $data->{"files-excluded"}) {
        # a pattern containing a slash is matched against the full path
        if (m{/}) {
            # delete trailing '/' because otherwise find -path will fail
            s?/+$??;
            # use rm -rf to enable deleting non-empty directories
            `find "$main_source_dir" -path "$main_source_dir/$_" -print0 | xargs -0 rm -rf`;
        } else {
            # otherwise match the base name of regular files only
            `find "$main_source_dir" -type f -name "$_" -delete`;
        }
    }