A Perl yardstick

chromatic identified some reasons why it is difficult to hire great Perl programmers, and then followed it up with some advice for hiring managers. They suggest a series of questions every good Perl programmer should be able to answer. I wanted to know how I stacked up.

Here we go...

What do variable sigils indicate in Perl 5?

They indicate how the variable is being used: $ is for a scalar (reference, filehandle, string, number); @ is an array; % is a hash (an even-sized list); & is a subroutine; and * is a typeglob.

my @array = qw( one two three);
my $scalar = $array[0]; # We use a $ sigil because element 0 is a scalar

What's the difference between accessing an array element with $items[$index] and @items[$index]?

Using a $ retrieves the identified element, which is a scalar. Using @ takes a one-element array slice, which is a list, so an assignment to it is evaluated in list context. While both of these will get your value, the difference can be important, for example:

@arr[0] = <>; # reads the whole file, "disappearing" everything after the first line
$arr[0] = <>; # reads the first line, as intended

What's the difference between == and eq?

Perl uses different comparison operators for strings (eq, ne, lt, gt, cmp) and numbers (==, !=, <, >, <=>).
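
Here is a small sketch of why the choice of operator matters. The values are made up for illustration; the point is only that the same operands can compare differently as strings and as numbers.

```perl
use strict;
use warnings;

# The same two strings compare differently depending on the operator.
my ($x, $y) = ('1.0', '1');

my $numeric_equal = ($x == $y) ? 1 : 0;   # compared as numbers
my $string_equal  = ($x eq $y) ? 1 : 0;   # compared as strings

# String comparison goes character by character, so '11' sorts before '2'.
my $string_gt  = ('11' gt '2') ? 1 : 0;
my $numeric_gt = (11 > 2)      ? 1 : 0;

print "numeric_equal=$numeric_equal string_equal=$string_equal\n";
print "string_gt=$string_gt numeric_gt=$numeric_gt\n";
```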

What do you get if you evaluate a hash in list context?

You'll get an even-sized array of alternating keys and values. Note that the order of your array is deliberately unpredictable. While no particular order of key-value pairs is guaranteed, you are guaranteed to get your keys next to your values.
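
A minimal sketch of the flattening, using a made-up %color hash. It avoids depending on pair order and only relies on each key staying next to its value.

```perl
use strict;
use warnings;

my %color = (apple => 'red', banana => 'yellow', lime => 'green');

# In list context the hash flattens to (key, value, key, value, ...).
my @flat  = %color;
my $count = scalar @flat;   # three keys plus three values

# Pair order is not guaranteed, but each key stays next to its value,
# so assigning the list back to a hash rebuilds the same mapping.
my %copy = @flat;

print "count=$count banana=$copy{banana}\n";
```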

How do you look up keywords in the Perl documentation?

perldoc -f $keyword (or maybe you meant perldoc -q $regex to search perlfaq?)

What is the difference between a function and a method in Perl 5?

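A method is a function that is found through its class (or object) and receives that invocant as its first argument. A class method call such as Class->method(@args) behaves roughly like

Class::method('Class', @args)

and an instance/object method call such as $obj->method(@args) behaves roughly like

Class::method($obj, @args)

except that the arrow form also searches parent classes for the method.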
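
To make that concrete, here is a sketch with a hypothetical Greeter class (the class and its methods are invented for this example). The same sub is called once as a method and once as a plain function.

```perl
use strict;
use warnings;

package Greeter;   # hypothetical class, for illustration only

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name} }, $class;
}

sub greet {
    my ($self) = @_;   # the invocant arrives as the first argument
    return "Hello, $self->{name}";
}

package main;

my $obj = Greeter->new(name => 'Perl');

my $as_method   = $obj->greet;            # method call, resolved through the class
my $as_function = Greeter::greet($obj);   # plain function call to the same sub

print "$as_method\n$as_function\n";
```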

When does Perl 5 reclaim the memory used by a variable?

Perl's garbage collector uses reference-counting. An internal count of references to any data structure is kept; when the count reaches zero, the memory is reclaimed. Perl does not use mark-and-sweep like Java, so circular memory structures cannot be reclaimed without programmer assistance (ie. a DESTROY method), except at the end of execution.
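
One common form of programmer assistance, not mentioned above, is to weaken one side of the cycle with weaken from the core Scalar::Util module. The sketch below uses a hypothetical Node class and counts DESTROY calls to show that both halves of the cycle are reclaimed at scope exit.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;

package Node;   # hypothetical class, for illustration only
sub new     { my ($class) = @_; return bless {}, $class }
sub DESTROY { $destroyed++ }

package main;

{
    my $parent = Node->new;
    my $child  = Node->new;
    $parent->{child} = $child;
    $child->{parent} = $parent;   # cycle: parent <-> child
    weaken($child->{parent});     # the back-reference no longer counts
}
# Both lexicals are out of scope here, so both nodes have been reclaimed.

print "destroyed=$destroyed\n";
```

Without the weaken call, neither reference count would reach zero until the program exits.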

How do you ensure that the default scope of a variable is lexical?

I think what this is getting at is that use strict will require you to either declare every variable or fully qualify it with its package name, and my makes them lexical.

How do you load and import symbols from a Perl 5 module?

use -- or require + import
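
A short sketch of both spellings, using two core modules chosen only for illustration. The BEGIN block is roughly what use does on your behalf.

```perl
use strict;
use warnings;

# Compile-time load and import in one step.
use List::Util qw(sum);

# Roughly the same thing written out by hand.
BEGIN {
    require Scalar::Util;
    Scalar::Util->import('looks_like_number');
}

my $total      = sum(1, 2, 3);
my $is_numeric = looks_like_number('42') ? 1 : 0;

print "total=$total is_numeric=$is_numeric\n";
```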

How can you influence the list of directories from which perl attempts to load modules?

Perl's -I flag, or use lib '../my/lib'; or use local::lib;.
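
As a quick check of the use lib form, this sketch adds a hypothetical directory (it does not need to exist) and confirms that it shows up in @INC, the array perl searches when loading modules.

```perl
use strict;
use warnings;

# Hypothetical directory, for illustration only.
use lib '/tmp/hypothetical/lib';

# use lib runs at compile time and puts the directory at the front of @INC.
my $found = grep { $_ eq '/tmp/hypothetical/lib' } @INC;

print "found=$found\n";
```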

How do you look up error messages in the Perl 5 documentation?

(Award bonus points for knowing how to enable explanations of all error messages encountered.)

perldoc perldiag has the Perl diagnostic messages, and use diagnostics will give you the explanations as they're encountered.

What happens when you pass an array into a function?

The array loses its identity, and the elements appear in @_:

use feature 'say';

my @args = qw( one two three );
mysub(@args);

sub mysub {
    say "@_"; # "one two three\n"
    say $_[0]; # "one\n", not "ARRAY(0xa5dfb8)\n"
    say @{ $_[0] }; # "Can't use string ("one") as an ARRAY ref while "strict refs" in use at ..."
}

How do you pass one or more distinct arrays into a function?

You pass arrayrefs (or use a prototype, shudder).
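
A minimal sketch of the arrayref approach. The pair_up helper and its data are hypothetical, invented only to show two arrays arriving intact.

```perl
use strict;
use warnings;

# Hypothetical helper: takes two array references and pairs up elements.
sub pair_up {
    my ($names, $ages) = @_;
    return map { "$names->[$_]=$ages->[$_]" } 0 .. $#{$names};
}

my @names = qw(alice bob);
my @ages  = (30, 25);

# The backslash passes a reference, so each array keeps its identity.
my @pairs = pair_up(\@names, \@ages);

print "@pairs\n";
```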

What is the difference, on the caller side, between return; and return undef;?

The intent here is to return false so the caller can write do_this() or die. When called in scalar context, return undef; will do what you expect. In list context, however, this returns a one-element list, which evaluates to true when it is assigned to an array and tested - the opposite of what you intended. You should instead raise an exception via die (or better yet Carp::croak), or at the very least use a bare return;.
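
The difference is easiest to see by counting what comes back. This is a sketch with two hypothetical subs that do nothing but fail in the two different ways.

```perl
use strict;
use warnings;

sub fail_bare  { return }
sub fail_undef { return undef }

# In list context the bare return gives an empty list,
# while return undef gives a one-element list containing undef.
my @bare  = fail_bare();
my @undef = fail_undef();

my $bare_count  = scalar @bare;
my $undef_count = scalar @undef;

# An array is true when it has elements, so the "failure" looks like success.
my $undef_looks_true = @undef ? 1 : 0;

# In scalar context the bare return yields undef, which is false.
my $bare_scalar_defined = defined(scalar fail_bare()) ? 1 : 0;

print "bare=$bare_count undef=$undef_count true=$undef_looks_true\n";
```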

Where do tests go in a standard CPAN distribution?

In ./t/*.t - though ./test.pl is not unheard of.

How do you run tests in a standard CPAN distribution?

prove -lrw t
# or
perl Makefile.PL && make && make test

What command do you use to install new distributions from the CPAN?

Although there are newer clients, I like vanilla cpan: cpan Task::Toolchain::Test.

Why would you use the three-argument form of the open builtin?

By separating the open mode from the filename, you needn't worry about the filename containing characters that might alter the open mode you intended, or about files which happen to have names Perl considers magical, like -.
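
Here is a sketch with a deliberately awkward filename. It assumes a Unix-like filesystem where | is a legal filename character, and it uses the core File::Temp and File::Spec modules to keep everything in a throwaway directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Assumes a Unix-like filesystem where "|" may appear in a filename.
my $dir  = tempdir(CLEANUP => 1);
my $path = File::Spec->catfile($dir, 'notes|');   # odd but legal name

open(my $out, '>', $path) or die "Cannot write '$path': $!";
print {$out} "hello\n";
close($out) or die "Cannot close '$path': $!";

# With the mode given separately, the trailing "|" is just part of the name.
# A two-argument open($in, $path) would instead try to run a command.
open(my $in, '<', $path) or die "Cannot read '$path': $!";
my $line = <$in>;
close($in);
chomp $line;

print "$line\n";
```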

How do you detect (and report) errors from system calls such as open?

(Award bonus points for knowing how to enable automatic detection and reporting of errors)

The autodie pragma replaces builtins which return false on failure (like open) with ones that die instead.
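
For comparison, here is a sketch of both styles side by side: the manual check of the return value with $! for the reason, and autodie turning the same failure into an exception. The path is hypothetical and is assumed not to exist.

```perl
use strict;
use warnings;

my $missing = '/no/such/dir/hypothetical.txt';   # hypothetical path

# Manual style: test the return value and report $!.
my $manual_error = '';
open(my $fh, '<', $missing) or $manual_error = "Cannot open '$missing': $!";

# autodie style: the failure becomes an exception without an explicit check.
my $autodie_error = '';
{
    use autodie;   # lexically scoped to this block
    eval { open(my $fh2, '<', $missing); 1 } or $autodie_error = "$@";
}

print "manual: $manual_error\n";
print "autodie raised an exception\n" if $autodie_error;
```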

How do you throw an exception in Perl 5?

die, but it ain't pretty.

How do you catch an exception in Perl 5?

eval, but it ain't pretty.
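
A minimal sketch of the two together. The parse_positive function is hypothetical; the trailing 1 inside the eval block is a common idiom for telling a clean run apart from one that died.

```perl
use strict;
use warnings;

# Hypothetical function that fails on bad input.
sub parse_positive {
    my ($n) = @_;
    die "not a positive number: $n\n" unless $n > 0;
    return $n;
}

# eval BLOCK traps the exception; the block returns 1 only if nothing died.
my $error = '';
my $ok = eval { parse_positive(-5); 1 };
$error = $@ unless $ok;

chomp $error;
print "caught: $error\n";
```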

What is the difference between reading a file with for and with while?

Reading a file with for evaluates the readline in list context, so the whole file is read into memory at once before the loop starts (called slurping; Perl 6 has a slurp builtin, and File::Slurp on CPAN offers similar functionality for Perl 5), which is usually not what you want. Instead, you can read line-by-line with while:

while(my $line = <>) {
    ...;
}
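
To see the difference without a real file, this sketch reads from an in-memory filehandle (an assumption made for convenience) and leaves each loop after one line. The for loop has already consumed everything; the while loop has not.

```perl
use strict;
use warnings;

my $data = "one\ntwo\nthree\n";

# An in-memory filehandle stands in for a real file here.
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

# for puts <$fh> in list context: every line is read before the loop starts.
my $first_in_for;
for my $line (<$fh>) {
    $first_in_for = $line;
    last;   # leave after one line
}
my $at_eof_after_for = eof($fh) ? 1 : 0;

# Rewind, then read one line per iteration with while.
seek($fh, 0, 0);
my $first_in_while;
while (my $line = <$fh>) {
    $first_in_while = $line;
    last;   # leave after one line; the rest was never read
}
my $at_eof_after_while = eof($fh) ? 1 : 0;

print "eof_after_for=$at_eof_after_for eof_after_while=$at_eof_after_while\n";
```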

How do you handle parameters within a method or function in Perl 5?

I'm not sure what the question is asking, actually.

What do the parentheses around the variable name in my ($value) = @_; mean, and what would happen if you removed them?

They create a list context, making this equivalent to 

my $value = $_[0];

or close to

my $value = shift;

(the difference being that shift also removes the element from @_). If you removed them, the assignment would be evaluated in scalar context, which means $value would be the number of elements in @_.
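
A quick sketch of both forms, using two hypothetical subs that differ only in the parentheses.

```perl
use strict;
use warnings;

sub with_parens {
    my ($value) = @_;   # list assignment: first element of @_
    return $value;
}

sub without_parens {
    my $value = @_;     # scalar assignment: number of elements in @_
    return $value;
}

my $first = with_parens('a', 'b', 'c');
my $count = without_parens('a', 'b', 'c');

print "first=$first count=$count\n";
```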

Is new a builtin function/keyword?

No.

How do you access only the values of a Perl 5 hash?

With values, of course.

Conclusion

chromatic says a good Perl 5 programmer should be able to answer at least 80% of these with no trouble - I think I've achieved that.

Comments
Comment from Peter Rabbitson - January 31, 2011 at 7:22 am

Since nobody replied yet, I figured I'd correct some nitpicks here and there. Please note - this is not an attempt to belittle your knowledge of perl, which is more than adequate. View it as a service to you and your readers who may come across this text. Also note that one of the questions chromatic posted makes no sense, and hence does not have a correct answer ("difference between function and method"). So here we go:

What do variable sigils indicate in Perl 5?
They indicate how the variable is being used: $ is for a scalar (reference, FILEHANDLE,...)

This might be a simple thinko, or it may not be: but a scalar variable cannot contain a "filehandle". It can contain only scalars (strings, numbers and references). In the case of a filehandle we simply store a glob-reference; the actual filehandle is a glob.
Furthermore if you really want to be precise $foo, @foo and %foo are related even more closely than one would think, but unfortunately the documentation of globs generally sucks: there are only some relevant bits in `perldoc perlref`

What’s the difference between == and eq?
...different comparison operators for strings ... and numbers

To be precise it has to do with context. 11 gt 2 *is* valid, it will just give you back an odd result if you expect numerical comparison

What do you get if you evaluate a hash in list context?... the order of your array is deliberately unpredictable...

This is a mistake I see often. Consider: http://pastebin.com/7CvnaEZJ. Note that the only version that in fact permutes hash keys is 5.8.1. You will get the same result with your system perl as well. The reason for this is that the protection introduced in 5.8.1 was deemed too expensive (the randomization of *every* hash) so starting with 5.8.2 the randomizer will kick in *only* if a particular hash bucket starts overflowing (i.e. when an attack is in progress). The downside of this is that folks often write tests that implicitly rely on a specific ordering of the keys of a specific hash without ever realizing it.

When does Perl 5 reclaim the memory used by a variable?
... so circular memory structures cannot be reclaimed without programmer assistance (ie. a DESTROY method) ...

This is mildly incorrect (or to be precise it is correct only in a very specific context). DESTROY is the method that fires during destruction of an object, which in turn takes place when it has lost all its references. If an object itself is part of a circular structure, then its DESTROY will never fire, because the underlying blessed variable is itself leaked and hence sort of immortal. DESTROY is only useful for breaking circular references to which a specific "handler" object has access.

What is the difference, on the caller side, between return; and return undef;?
...In listcontext, however, this returns a one-element list, which evaluates to true – the opposite of what you intended...

I am not quite sure what you mean by this. A list evaluates to its size in scalar context, yet we are explicitly in list context, hence there is no mixup of intentions. There is no "better" or "worse" approach; whether an explicit undef should be returned depends on what the function in question is intended to do.

Comment from hdp - February 1, 2011 at 1:19 pm

The answer to "what is the difference between a method and a function in Perl 5" is "nothing beyond documentation".

It makes perfect sense; contrast with other languages where e.g. functions and methods exist in different namespaces.

Comment from DLS - February 2, 2011 at 2:05 pm

@Peter: Excellent comments for added depth, except the last point.

What is the difference, on the caller side, between return; and return undef;?
…In listcontext, however, this returns a one-element list, which evaluates to true – the opposite of what you intended…

It may be that you really do want to return the value 'undef', but this is rare. Usually you either return an object (which may be an array) or you can't.

Mike is exactly right - you should throw an exception, or use a bare return to indicate failure.

The key is to always say what you mean, and 'return undef' is almost never what you mean.

See PBP, pages 199-200