
Re: request for review: lbzip2

On Mon, 5 Oct 2009, Ben Finney wrote:

This forum is certainly the right place to ask for a review of English
language usage in a Debian package. Thanks for presenting your package
for review.

I thank you all for looking at it.

What does it mean for the reader that this is a "parallel bzip2 filter"?
Is it a filter of "parallel bzip2", or is it a "parallel filter" of
bzip2 data?

It is a Unix filter (a program reading from stdin and writing to stdout) that can compress and decompress using the bzip2 algorithm and the bz2 file format, harnessing multiple processor cores in parallel. (*)
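To illustrate what "filter" means here, the following is a minimal single-threaded sketch in Python, not lbzip2's actual code. The function name `bzip2_filter` and its parameters are hypothetical; it relies only on the standard `bz2` module and assumes the input holds a single bz2 stream when decompressing.

```python
import bz2
import io
import sys


def bzip2_filter(src, dst, decompress=False, bufsize=64 * 1024):
    """Copy the binary stream src to dst, (de)compressing on the fly.

    Hypothetical helper for illustration only. Works on any readable and
    writable binary stream, so pipes are fine, not just regular files.
    """
    codec = bz2.BZ2Decompressor() if decompress else bz2.BZ2Compressor()
    step = codec.decompress if decompress else codec.compress
    while True:
        chunk = src.read(bufsize)
        if not chunk:
            break
        dst.write(step(chunk))
    if not decompress:
        # The compressor buffers data internally; flush the remainder.
        dst.write(codec.flush())
```

Wired to `sys.stdin.buffer` and `sys.stdout.buffer`, such a function behaves like a (slow, serial) filter: bytes in on stdin, bytes out on stdout.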

You can assume that the synopsis will be further explained by the long
description, but I would want "parallel" and "filter" to be less
ambiguous here. Perhaps "multi-threaded implementation of bzip2 codec",
if that actually is an accurate synopsis.

"multi-threaded" is a good idea, but I'd like to use both the "parallel" and "thread" words in the full (= short + long) description, so that people searching for either word can find the package.

 Lbzip2 is a Pthreads-based parallel bzip2/bunzip2 filter, passable to GNU tar
 with the --use-compress-program option.

Since this is case-sensitive Unix, presumably "lbzip2" names a different
entity from "Lbzip2". On that assumption, I will recommend that you
don't change its capitalisation for the beginning of a sentence.

Fixed, thanks.

The latter clause might be simpler as a separate sentence:

   It can be used by GNU "tar" as an external compression program.

To tell the truth, I wanted to squeeze as much immediately usable technical information into the long description as possible. lbzip2 is meant for "power users"; it's deliberately not a drop-in replacement for bzip2. When I'm looking for a "power tool", I like to see as many specifics as possible in the output of 'apt-cache show'. That makes the description better suited for searching, and perhaps I can save time by deciding *against* a package without downloading it, installing it, reading its manual, and finally purging it.

 It isn't restricted to regular files
 on input, nor output. Lbzip2 utilizes multiple threads and an input-bound
 splitter even when decompressing bz2 files created by standard bzip2.
 Successful splitting for decompression isn't guaranteed, just very likely
 (failure is detected). Splitting in both modes and compression itself occur
 with an approximate 900k block size.

This seems like far too much attention to implementation details. Could
you try re-phrasing this to address an audience who wants to know
whether or not they want this package on their system, and who may not
yet know much of anything about the specific details of what the package
is trying to do?

This paragraph tells the user exactly what lbzip2 does, what its limitations and "unique" features are. I'll write more on why I consider this a good idea in my reply to Justin B Rye's mail.
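As a rough illustration of the splitting idea in the quoted paragraph, here is a minimal Python sketch. It is not how lbzip2 is implemented: it simply cuts the input into pieces of about 900k, compresses each piece as an independent bz2 stream in a thread pool, and concatenates the results. The names `CHUNK`, `split`, and `parallel_compress` are hypothetical, and the sketch assumes that the standard `bz2` module releases the interpreter lock while compressing, so the threads can actually use several cores.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

# Assumption: mirrors the "approximate 900k block size" from the description.
CHUNK = 900_000


def split(data, size=CHUNK):
    """Cut data into consecutive pieces of at most size bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_compress(data, workers=4):
    """Compress each piece as its own bz2 stream and concatenate them.

    pool.map preserves input order, so the output streams line up with
    the original pieces.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(bz2.compress, split(data)))
```

The concatenated output is still readable by standard tools, because decompressors such as Python's `bz2.decompress` accept several bz2 streams back to back.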

 On an Athlon-64 X2 6000+, lbzip2 was 92% faster than standard bzip2 when
 compressing, and 45% faster when decompressing (based on wall clock time).
 On a 2x Quad-Core Opteron 2352, lbzip2 was 588% faster than standard bzip2
 when compressing, and 394% faster when decompressing (based on wall clock

These seem quite redundant, and very likely to be out of date whenever a
new release is made. Why are they in the description at all?

This is the (truthful) marketing blurb to quickly lure users into trying out lbzip2. "Wow, so much speedup on an old machine!", they exclaim, furiously typing "apt-get install". :)

On a more serious note, this section doesn't need to be up-to-date at all. It should just give an example that lbzip2 is for real (and was for real, "ages ago"), but be correct at the same time.

Still, I'd happily trade these last two paragraphs for a more user-friendly introduction (= first paragraph) that you approve of, shifting the middle paragraph down. Something like what I marked with (*) above.

