
Re: #518696 ITP: parallel -- build and execute command lines from standard input in parallel

On Wed, Mar 11, 2009 at 5:34 PM, Samuel Thibault
<samuel.thibault@ens-lyon.org> wrote:
> Ole Tange, le Wed 11 Mar 2009 17:05:34 +0100, a écrit :
>> One of my friends alerted me to your discussion of 'parallel' and whether
>> other tools can replace it.
> The question could also be rephrased: can't we just extend xargs into
> supporting what parallel does?  Having two separate tools will always
> make arguments about "A does this, B doesn't" and vice-versa, while
> xargs could just do everything.

When I started coding 'parallel' I thought of extending 'xargs' for
exactly the same reason. I decided against it for two reasons:

* My C-skills are rusty and I never really liked C, so I would have to
convince someone else to write it.

* I would not be able to change xargs so it became incompatible with
the current version of xargs. This would make it impossible to change
the default behaviour of xargs to something that would make sense 99%
of the time.

The xargs default of treating

  printf "foo bar" | xargs echo

the same as:

  printf "foo\nbar" | xargs echo

is not what I want in 99% of the cases. In 90% of the cases I do not
care, and in the last 9% I get burned by the behaviour.
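
A minimal sketch of that difference, assuming an xargs that supports -0
(GNU and BSD versions do). The helper count_lines and the variable names
are only illustrative. With -n 1 the command runs once per argument, so
counting output lines shows how many arguments xargs saw:

```shell
# Count the arguments xargs produces: with -n 1 the command runs once per
# argument, so every argument ends up on its own output line.
count_lines() { wc -l | tr -d ' '; }

# Default parsing: a space and a newline are both just argument separators.
space_count=$(printf 'foo bar\n' | xargs -n 1 echo | count_lines)
newline_count=$(printf 'foo\nbar\n' | xargs -n 1 echo | count_lines)

# Line-oriented parsing (assumes an xargs with -0, e.g. GNU or BSD):
# turn each newline into NUL so "foo bar" stays one argument.
line_count=$(printf 'foo bar\n' | tr '\n' '\0' | xargs -0 -n 1 echo | count_lines)

echo "space: $space_count, newline: $newline_count, per-line: $line_count"
```

The first two counts come out the same because the default parser does not
distinguish a space from a newline; only the NUL-delimited variant keeps
"foo bar" together as one argument.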

I have been burned quite a few times by xargs for not dealing nicely
with input that is \n separated but which contains interesting
characters such as space or '. parallel's primary purpose was to be
run interactively for things that are only to be run once; if running
the same input with xargs requires a lot of pre- and postprocessing and
special options, then it is easier to extend parallel to include what
xargs does (BTW, the next version of parallel will support -x, which will
insert as many arguments as the command line length permits).

I believe the man page of xargs shows a good example of the problem of
xargs not doing what the user expects. The first example says:

  find /tmp -name core -type f -print | xargs /bin/rm -f

       Find files named core in or below the directory /tmp and delete
       them.  Note that this will work incorrectly if there are any
       filenames containing newlines or spaces.

But even the man page writer forgets that if any of the dirs contain a
' or a ` or a " you still have to remember -0. If a dir is called
"\\'" then xargs will not even complain but silently fail to remove
the file.
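
The silent failure can be reproduced safely in a scratch directory. This
is only a sketch: it assumes mktemp -d is available and that find and
xargs support -print0 and -0 (GNU and BSD versions do); the variable
names are illustrative.

```shell
# Work in a throwaway directory so nothing real gets deleted.
tmp=$(mktemp -d)
name="\\'"                      # two characters: a backslash and a single quote
touch "$tmp/$name"

# Default parsing: xargs treats the backslash as an escape character,
# so the command receives a different name than the one on disk.
mangled=$(printf '%s\n' "$name" | xargs printf '%s\n')

# rm -f is handed a path that does not exist and stays silent about it.
find "$tmp" -type f -print | xargs rm -f
if [ -e "$tmp/$name" ]; then survived=yes; else survived=no; fi

# NUL-delimited names pass through untouched, so the file is removed.
find "$tmp" -type f -print0 | xargs -0 rm -f
if [ -e "$tmp/$name" ]; then removed=no; else removed=yes; fi

rm -rf "$tmp"
echo "mangled=$mangled survived=$survived removed=$removed"
```

The first pipeline exits without an error message yet leaves the file in
place; only the -print0 / -0 pair deletes it.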

To me the default should work in most cases and not cause the user to
rethink the strategy. With my alpha testers I tried out different
defaults to get to a default setting that would do what they expected
in most cases.

