[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [PATCH] fix regexp for blacklist devices



Hi,

anonym wrote (03 May 2012 12:44:47 GMT) :
> I think the Right Thing(TM) to do is to implement a robust
> is_in_list() function instead of using ad-hoc grepping, both here
> and and at other places in the code.

Full ACK.

> See the attached patch. The is_in_list() function therein:

> * interprets white-spaces and newlines as separators; all other
>   chacaters form "words", so there should be no more surprises.

I find this design decision sane to make this function useful and as
robust as possible in the context of live-boot.

However, the function semantics is different from how argument lists
work in shell usually: the shell has a slightly more elaborate vision
of what a list is, e.g. it supports list elements that contain whitespace.

Therefore, I think the function's name should reflect the "list" it's
about is *not* a "normal" shell list. Perhaps something like
is_in_whitespace_separated_list. Yes, that's ugly. But with the
current function name, while I am in a shell programming mindset, the
way it is implemented would probably surprise me more than once.

> * doesn't depend on quoting input to form "lists".

What do you mean exactly?

> * doesn't use any fancy grep functionality (no -w, no \< or \>),
>   just basic regex stuff that ought to be correctly implemented
>   in busybox.

Just to be sure, have you verified it works properly with
busybox grep?

> * makes the code much more readable.

Sure.


Also, I was surprised to discover that `is_in_list a b a` is wrong in
dash, while it's true in bash and in busybox sh. After `local l=${@}'
is evaluated, $l contains "b" in dash, while it contains "b a" in bash
and busybox sh. The rest of the code (where you migrate a few `grep
"$list"' to `is_in_list elt $list', removing quotes along the way)
assumes bash and busybox behaviour. For what it's worth, replacing
that line with `local l="${*}"' makes all these three shells think, in
perfect agreement, that `is_in_list a b a' is true, which I would find
much saner, given the idea is to avoid a certain class of subtle, hard
to understand future bugs :)

> +	echo ${l} | grep -qe "^\(.*[[:space:]]\)\?${e}\([[:space:]].*\)\?$"

If the regexp language in use knows about them, these should really be
non-capturing, grouping parenthesis -- e.g. in Perl regexp syntax, one
would write the first group like this: (?:.*[[:space:]])?

(It looks like the busybox package only installed busybox(1) in terms
of busybox documentation on my Debian sid, and alas, this manpage does
not document what kind of pattern is supported by their grep, so
I can't check myself right now.)

Cheers,
-- 
  intrigeri
  | GnuPG key @ https://gaffer.ptitcanardnoir.org/intrigeri/intrigeri.asc
  | OTR fingerprint @ https://gaffer.ptitcanardnoir.org/intrigeri/otr.asc


Reply to: