Re: Stanford IDG internal Perl style
On 2013-03-29 18:24, Russ Allbery wrote:
> This is the evolution of my original style document, much modified by
> contact with Perl Best Practices. There are some things here that I'd
> personally do differently (I prefer omitting the parens around arguments
> to Perl built-ins, for example), but this was the compromise reached
> across the group among people with very different styles and different
> "first" languages.
>
Great :) I was just considering whether to codify a Lintian style guide.
> [[!meta title="Perl Coding Style"]]
>
> In general, follow the rules in the perlstyle man page and Perl Best
> Practices, except we use cuddled elses (else on the same line after }).
> And, of course, follow rules below rather than perlstyle if they
> conflict.
>
I don't think I have read the Perl Best Practises (assuming its the
O'Reilly one[1]), so I have to come back on that.
[1] http://refcards.com/docs/vromansj/perl-best-practices/refguide.pdf
> Note that these guidelines are for internal projects. If we release
> something as open source that needs to be compatible with Perl 5.8
> rather than 5.10 (which the document guidelines assump), there are
> the following exceptions:
>
s/assump/assume/ ?
> * Do not use autodie.
> * 'use base' instead of 'use parent'
> * Do not include Stanford::Infrared::*.
> * Other things to be added.
>
>
> # General Guidelines
>
> [...]
>
> * Always check the results of operations. For many functions this can
> be done with 'use autodie' at the start of a script or module. For
> anything not covered by autodie, you should add 'or die "some error:
> $!\n"' to the end of function calls that mail fail. For print and say,
> death checking is is provided by the Stanford::Infrared::Wrappers
> module with the functions print_fh, say_fh, print_stdout, and
> say_stdout.
>
s/is is/is/;
I have been considering to use autodie before (possible exception being
system/exec where it is always not compatible). I tend to prefer
exception based I/O instead of checking return values (which gets old
really soon), so I could be convinced to use autodie by default. Too
bad it does not cover all cases (print et al).
> [...]
>
> * Factor out long expressions in the middle of statements. It's better
> to read three clear statements than to have them all combined into a
> more complicated all-in-one longer line.
>
Is this:
my $result = part1
+ part2
+ part3;
vs:
my $result = part1 + part2 + part3;
(where partX is a "complex" computation) Or is it about splitting it
into temporary variables?
> * Use perltidy and perlcritic. Default [perltidyrc](perltidyrc) and
> [perlcriticrc](perlcriticrc) files are available.
>
If we can codify our style in a fashion these can check for that would
definitely help in maintaining the style. Also, we have a "perlcritic"
check already but it is skipped by default (and I never remember to set
the proper variables to enable it... >.>)
> [...]
>
> * Dereference variables with arrows for readability. (ie:
> "$record->{name}".
>
Also in "chaining"?
$record->{$last_name}->{$first_name}
$matrix->[$i]->[$j]
vs
$record->{$last_name}{$first_name}
$matrix->[$i][$j]
> [...]
> * For clarity in some fonts, try to make single-character strings more
> obvious as to their intent:
>
> '' q{}
> ' ' q{ }
> ',' q{,}
>
I do not think I understand this one. Is it to use the second column
variants instead of the first? I don't see how any of those expresses
their "intent" more than the other (but I might be a missing a reference
to the Perl Best Practises here?).
> [...]
>
> # Naming
>
> [...]
>
> * Name modules using Noun::Adjective::Adjective.
> (Disk::DVD::Rewritable)
>
Mmm... Is that strictly "one Noun followed by adjective(s)". Also the
example appear to be "Noun::Noun::Adjective" (or is a DVD an
adjective?). Anyway I assume "namespacing" is allowed (e.g.
Lintian::$module)
> [...]
>
> # Formatting and Indentation
>
> Don't use tabs in perl scripts, since they can expand differently in
> different environments. In particular, please try not to use the mix of
> tabs and spaces that is the default in Emacs.
>
> Please follow these guidelines for spacing and formatting:
>
> * Each block is indented four spaces. Continuations of commands should
> be indented two more spaces unless parenthesized, or indented to line
> up one space after the opening parentheses if parenthesized.
>
So if parentheses are used it should be lined up as... ?
really_long_name(argument1,
argument2,
argument3)
if (really_long_condition1
&& really_long_condition2
&& really_long_condition3) {
Or does this rule not apply to sub arguments?
> * Lines should be 79 characters or less. Continue long lines on the
> next line. Continue long strings by breaking the string before a space
> and using string concatenation (.) to combine shorter strings.
>
> * Use a space between keywords (if, elsif, while) and functions and
> parenthesized arguments. Do not put a space between the opening or
> closing parentheses and the contents. For example:
>
> if ($foo) {
> bar ($baz);
> }
>
In your comment to #704197, you suggested you had dropped the space
between the function/sub name and its parentheses? i.e.
if ($foo) {
bar($baz);
}
Or did I misunderstand you here?
> [...]
>
> * Parentheses are optional around print, die, and warn. Other
> built-ins should include the parentheses, as should any user-defined
> functions or methods. The empty parenthesis should be admitted from
> any method call taking no arguments, but should exist for regular
> functions. Any parentheses should have no space between the keyword
> and the open parenthesis.
>
s/admitted/omitted/ ?
> * map, grep, and sort are a special case; their first argument is a
> code block and should be set off by { } with spaces between the
> brackets and the code. This code block should never be within
> parentheses.
>
Is that still map(EXPR, LIST) for the alternative call variant?
> [...]
>
>
> # Control Structures
>
> * Only use postfix if or unless for flow control statements (next,
> last, return, etc) where there is nothing or almost nothing between the
> flow control statement and the if/unless. Code between the flow
> control statement and the if/unless can make the fact that it's a
> conditional less obvious and confuse some users.
>
I might be inclined to use postfix if/unless for "tag" and die as well?
> [...]
>
>
> # Subroutines
>
> [...]
> * Always return with an explicit return, to make certain that you are
> not returning an implicit value that may not be what you expect. Use a
> bare return to return failure, as 'return undef' will not do what you
> expect if the function was called in a list context.
>
exception being one-line anonymous subs/blocks?
> * Each function should have a leading comment that briefly describes
> what the function does and what arguments it takes. The format should
> be the following:
>
> [...]
>
This one sounds like double work with the POD documentation requirement
as well (except for internal functions)?
>
> [...]
> # Regular Expressions
>
> [...]
> * Use m{}xsm for matches by default. x is very useful for readability
> in longer regular expressions. m will match multiline strings better,
> and s will improve the handling of "." to match newlines in any
> multiline string. In many cases they aren't needed, but we want to
> encourage them as a default.
>
Given how rarely we seem to be maching multiline strings, I am not sure
it makes sense to use this rule in Lintian?
> [...]
>
> # Documentation
>
> [...]
>
> * Do not use interspersed POD to document functions. All POD should
> start after the code is done, with an __END__ statement before it.
>
> [...]
>
I have never quite understood this; especially not with the requirement
of having a second description at the function itself.
> # Writing Modules
>
> [...]
>
> * Use Exporter to export functions, and only do so by request (via
> @EXPORT_OK). Exporting all functions by default can clutter namespaces
> and lead to conflicts if a module and program add new functions. The
> recommended way to do module inheritence with Perl 5.10 and later would
> be to use Exporter by:
>
> use parent qw(Exporter);
>
Why this over "use Exporter qw(import);", which has been possible since
perl 5.8.3?
> [...]
>
> # Using Modules
>
> [...]
>
> * Use Getopt::Long::Descriptive for option parsing in any script that
> takes options. Give each option a corresponding single-character option
> unless it's rarely used. When listing the options, give the short
> option first, then |, and then the long option, as in:
>
> my ($opt, $usage = describe_options(
^
Missing a ")" ?
> '$0 $o <args>',
> [ 'd|delete', 'delete the given object' ],
> [ 'e|exclude=s', 'exclude any matching names' ],
> [ 'h|help', 'display help' ],
> ) or exit 1;
>
Perhaps we should convert Lintian to use this? If only so I don't have
to remember to update the usage with changes to cmd options. :P
> [...]
>
> * If you need to cache, use the Memoize module. It allows you to cache
> entire functions easily without changing the actual function, and with
> Memoize::Expire, create your own expiration policies for the cache.
>
Got anything on lazy loading stuff? L::Collect is basically one big
lazy-loader and it could do without requiring us to implement the lazy
loading itself. Maybe some smart use of Moose?
> * Look at the Stanford::Infrared::* modules to see if they cover any
> generalized cases you need, or to see if there are any cases you think
> they should cover. Stanford::Infrared::General covers general needs
> used in several places, with functions specific to Stanford ITS
> infrastructure. Stanford::Infrared::MySQL covers interfacing with the
> remctl commands in stanford-server-mysql, and other MySQL setup
> specific to that infrastructure. Stanford::Infrared::Wrappers covers
> things that are not as specific to Stanford, but are useful for our
> needs (such as an IPC::Open3 wrapper).
>
A URL for these ... Yeah, too lazy to google for it. :)
>
> # Recommendations
>
> [...]
>
> * Include the name of the script in error messages. This is
> particularly important for scripts run from cron or other scripts,
> although it's useful to have in general and never hurts. The easiest
> way to do this is to put something like the following. You should then
> prefix any warn, die, carp, or croak messages with '$0: ', and use $0
> in any messages that warn of incorrect option usage. If you use this
> format, the first character of the rest of the message should not be
> capitalized.
>
> my $fullpath = $0;
> $0 =~ s{ ^ .* / }{}xms;
>
> [...]
>
Alternatively, basename $0. What do you do about croak/exceptions from
module functions ?
eval { function(....); }
if ($@) {
die "$0: $@";
}
Or do you use "$0" inside function where it calls croak?
~Niels
Reply to: