[Qs] re "Abigails Coding Guidelines"

J Krugman

[ This post appears to have gone to never-never land, so I'm
reposting; sorry for any repeats. ]

I recently came across "Abigails Coding Guidelines", at

http://perl.abigail.nl/Musings/coding_guidelines.html

which I found on the whole very interesting and instructive. But
some points (especially guideline 16) went over my head, or (like
guideline 3) prompted new questions, and I was hoping someone would
clue me in.

* Guideline 3 says "All system calls should be checked,
including, but not limited to, close, seek, flock, fork and exec."

Is there a comprehensive list of *all* of Perl's system calls?

* Guideline 5 says "Signals can be sent to the program. There are
default actions--but they are not always appropriate. If not,
signal handlers need to be installed. Care should be taken since
not everything is reentrant."

What does "reentrant" mean?

* In (15), "Subroutines in standalone modules SHOULD perform argument
checking and MUST NOT assume valid arguments are passed."

I'm not clear on why the Guidelines specify "standalone" modules.
What are examples of standalone and non-standalone modules?

* Guideline (16) is this:

"Objects SHOULD NOT use data inheritance unless it is appropriate.

This means that 'normal' objects, where attributes are stored
inside anonymous hashes or arrays should not be used. Non-OO
programs benefit from namespaces and strictness, why shouldn't
objects? Use objects based on keying scalars, like fly-weight
objects, or inside-out objects. You wouldn't use public
attributes in Java all over the place either, would you?"

This one is 100% over my head! *Every word* of it. If someone
could spell it out for me I'd appreciate it. More specifically,
what is/are "keying scalars" (I can't tell whether "keying" here
is meant as a gerund or as an adjective)? What are fly-weight
objects and inside-out objects? And what do they have to do with
overusing public attributes in Java? Also, it is not clear to
me what's wrong with storing object attributes in anonymous hashes
or arrays, and in particular, it is not clear to me why using
them amounts to not benefiting from "namespaces and strictness."


Many thanks in advance,

jill
 
gnari

[about Abigails Coding Guidelines]
* Guideline 5 says "Signals can be sent to the program. There are
default actions--but they are not always appropriate. If not,
signal handlers need to be installed. Care should be taken since
not everything is reentrant."

What does "reentrant" mean?

a bit of code is said to be reentrant when it can tolerate to be
entered again before finishing. if a new signal is sent to a program
while it is still processing a previous one, the signal handler needs
to be reentrant. example of non-reentrant behaviour is to store
critical temporary information in globals.

* In (15), "Subroutines in standalone modules SHOULD perform argument
checking and MUST NOT assume valid arguments are passed."

I'm not clear on why the Guidelines specify "standalone" modules.
What are examples of standalone and non-standalone modules?

my guess is that non-"standalone" modules are modules that are only
written to support a particular program (or set of programs), and
are typically distributed with them.

"standalone" modules are modules intended to be used with any
program. CPAN modules are typically "standalone". example of
non-standalone modules *might* be the DBD modules, that only expect
to be called by DBI and may perform less strict tests on some arguments.

* Guideline (16) is this:

"Objects SHOULD NOT use data inheritance unless it is appropriate.
[...]
Also, it is not clear to
me what's wrong with storing object attributes in anonymous hashes
or arrays, and in particular, it is not clear to me why using
them amounts to not benefiting from "namespaces and strictness."

one part of OO programming is encapsulation. in perl, if a class inherits
data from a parent class, its author needs to make sure any new attributes
do not clash with "private" attributes of any ancestors. thus they are not
really private, and no namespace benefits exist.

gnari
 
Ben Morrow

Quoth J Krugman:
[ This post appears to have gone to never-never land, so I'm
reposting; sorry for any repeats. ]

I recently came across "Abigails Coding Guidelines", at

http://perl.abigail.nl/Musings/coding_guidelines.html

which I found on the whole very interesting and instructive. But
some points (especially guideline 16) went over my head, or (like
guideline 3) prompted new questions, and I was hoping someone would
clue me in.

* Guideline 3 says "All system calls should be checked,
including, but not limited to, close, seek, flock, fork and exec."

Is there a comprehensive list of *all* of Perl's system calls?

System calls are a function of your operating system, not Perl. On a
unix-like system, you can check the contents of section 2 of the
manpages.

Having said that, a syscall is how you communicate with the world
outside your process; so any function in perlfunc that does more than
manipulate data makes a syscall.
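To make guideline 3 concrete, here is a minimal sketch of checking each call's return value. The scratch file name is made up for the illustration; the builtins are the ones the guideline lists (minus fork and exec), each of which returns false on failure and sets $!.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);
use File::Spec;

# Hypothetical scratch file, used only for this illustration.
my $path = File::Spec->catfile(File::Spec->tmpdir, "guideline3_demo.$$");

# Every call below talks to the operating system, so every one is checked.
open(my $fh, '+>', $path) or die "open $path: $!";
flock($fh, LOCK_EX)       or die "flock $path: $!";
print {$fh} "hello\n"     or die "print $path: $!";
seek($fh, 0, SEEK_SET)    or die "seek $path: $!";

my $line = <$fh>;
defined $line             or die "read $path: $!";

# close can fail too, e.g. when buffered output cannot be flushed.
close($fh)                or die "close $path: $!";
unlink($path)             or die "unlink $path: $!";
```

If writing "or die" everywhere gets tedious, the core autodie pragma makes many of these builtins throw on failure instead; check its documentation for exactly which ones it covers.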
* Guideline 5 says "Signals can be sent to the program. There are
default actions--but they are not always appropriate. If not,
signal handlers need to be installed. Care should be taken since
not everything is reentrant."

What does "reentrant" mean?

Reentrant means that a piece of code can stand being executed twice at
the same time. Basically code is non-reentrant if it uses global
resources without locking them, so this:

    our ($subref, $string);

    # call with a subref to call and a string to print
    sub call_and_print {
        ($subref, $string) = @_;

        $subref->();
        print $string;
    }

is not reentrant, because if you write
    call_and_print \&call_and_print, 'hello world'
the second copy of the sub will stomp on the global, and the data to
print will no longer be there. This is obviously contrived: it is hard
to make code non-reentrant in Perl.
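For contrast, a sketch of a reentrant variant: the only change that matters is that the arguments are copied into lexicals, so each invocation has its own pair. The name call_and_print_safe and the return value are additions for the illustration, not part of the example above.

```perl
use strict;
use warnings;

# Same job as call_and_print, but $subref and $string are lexicals,
# so a nested call cannot overwrite the outer call's data.
sub call_and_print_safe {
    my ($subref, $string) = @_;

    $subref->();
    print "$string\n";
    return $string;     # returned only so the result can be inspected
}

# The inner call runs while the outer one is still in progress.
my $printed = call_and_print_safe(
    sub { call_and_print_safe(sub { }, 'inner') },
    'hello world',
);
```

The outer call still prints 'hello world' after the inner one has printed 'inner', because nothing is shared between the two invocations.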

C is a quite different matter, though, and perl-the-C-program makes
extensive use of global variables. It used to be the case that signal
handlers you installed in Perl would run as soon as the signal arrived,
when perl might well be halfway through running a particular opcode. As
the guts of perl are not reentrant, this could mess things up; the
chances of this were much much higher if you did anything more
complicated in a signal handler than setting a variable which already
had a value.

Since 5.8 this has been fixed, as the C signal handler simply makes a
note that perl should call the Perl signal handler as soon as everything
is in a consistent state and it is safe to do so.

* In (15), "Subroutines in standalone modules SHOULD perform argument
checking and MUST NOT assume valid arguments are passed."

I'm not clear on why the Guidelines specify "standalone" modules.
What are examples of standalone and non-standalone modules?

An example of a non-standalone module is Carp::Heavy: it has notices on
it saying it is only to be used through the interface of Carp, so it can
safely assume that Carp has set everything up correctly where this is
helpful.

* Guideline (16) is this:

"Objects SHOULD NOT use data inheritance unless it is appropriate.

This means that 'normal' objects, where attributes are stored
inside anonymous hashes or arrays should not be used. Non-OO
programs benefit from namespaces and strictness, why shouldn't
objects? Use objects based on keying scalars, like fly-weight
objects, or inside-out objects. You wouldn't use public
attributes in Java all over the place either, would you?"

This one is 100% over my head! *Every word* of it. If someone
could spell it out for me I'd appreciate it. More specifically,
what is/are "keying scalars" (I can't tell whether "keying" here
is meant as a gerund or as an adjective)? What are fly-weight
objects and inside-out objects? And what do they have to do with
overusing public attributes in Java? Also, it is not clear to
me what's wrong with storing object attributes in anonymous hashes
or arrays, and in particular, it is not clear to me why using
them amounts to not benefiting from "namespaces and strictness."

As to the problem, consider your typical Perl object: it is a blessed
hashref, with data stored in the hash. The trouble with this is that
subclasses need to know both that it is a hashref and what keys in the
hash the parent object has used, so that they don't stomp on the parents
data. The first violates encapsulation (it should be possible to change
any aspect of the implementation without anything breaking); the second
is exactly the same problem as making all your variables global, which
Perl deals with with packages (namespaces) and 'use strict'.
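A small sketch of the key clash, with made-up class names (My::Logger and My::GameLogger are hypothetical, as is the 'level' attribute). Both classes store something under the same hash key, and neither 'use strict' nor the package boundary notices:

```perl
use strict;
use warnings;

{
    package My::Logger;                  # hypothetical parent class
    sub new {
        my ($class) = @_;
        return bless { level => 'debug' }, $class;   # 'level' = log level
    }
    sub log_level { $_[0]{level} }
}

{
    package My::GameLogger;              # hypothetical subclass
    our @ISA = ('My::Logger');
    sub new {
        my ($class, %args) = @_;
        my $self = $class->SUPER::new;
        # The subclass wants a 'level' of its own (a game level) and
        # silently overwrites the parent's attribute of the same name.
        $self->{level} = $args{level};
        return $self;
    }
}

my $log = My::GameLogger->new(level => 3);
print $log->log_level, "\n";   # the parent's accessor now sees the game level
```

No warning is issued anywhere: hash keys are just strings, so they get none of the checking that declared variables get.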

As to the solutions, I have to say that the more advanced forms of Perl
OO are way over my head too...

Ben
 
Anno Siegel

gnari said:
[about Abigails Coding Guidelines]
* Guideline 5 says "Signals can be sent to the program. There are
default actions--but they are not always appropriate. If not,
signal handlers need to be installed. Care should be taken since
not everything is reentrant."

What does "reentrant" mean?

a bit of code is said to be reentrant when it can tolerate to be
entered again before finishing. if a new signal is sent to a program
while it is still processing a previous one, the signal handler needs
to be reentrant.

Unfortunately, it's not only the signal handler that must be re-entrant
in the presence of signal processing, but every public subroutine. A
call can be interrupted, and the signal handler could call the same
routine. Since you don't know at coding time which routines may be
called in a handler, they must all be safe.

Anno
 
gnari

Anno Siegel said:
gnari said:
[about Abigails Coding Guidelines]
* Guideline 5 says "Signals can be sent to the program. There are
default actions--but they are not always appropriate. If not,
signal handlers need to be installed. Care should be taken since
not everything is reentrant."

What does "reentrant" mean?

a bit of code is said to be reentrant when it can tolerate to be
entered again before finishing. if a new signal is sent to a program
while it is still processing a previous one, the signal handler needs
to be reentrant.

Unfortunately, it's not only the signal handler that must be re-entrant
in the presence of signal processing, but every public subroutine. A
call can be interrupted, and the signal handler could call the same
routine. Since you don't know at coding time which routines may be
called in a handler, they must all be safe.

that is why it is best to have the signal handler do as little as
possible, and try to defer most work to the program proper via
flags. (when i say signal handler here i mean not only the %SIG entry,
but also all subs called by it)

gnari
 
Lukas Mai

J Krugman wrote:
[...]
* Guideline (16) is this:
"Objects SHOULD NOT use data inheritance unless it is appropriate.
This means that 'normal' objects, where attributes are stored
inside anonymous hashes or arrays should not be used. Non-OO
programs benefit from namespaces and strictness, why shouldn't
objects? Use objects based on keying scalars, like fly-weight
objects, or inside-out objects. You wouldn't use public
attributes in Java all over the place either, would you?"
This one is 100% over my head! *Every word* of it. If someone
could spell it out for me I'd appreciate it. More specifically,
what is/are "keying scalars" (I can't tell whether "keying" here
is meant as a gerund or as an adjective)? What are fly-weight
objects and inside-out objects? And what do they have to do with
overusing public attributes in Java? Also, it is not clear to
me what's wrong with storing object attributes in anonymous hashes
or arrays, and in particular, it is not clear to me why using
them amounts to not benefiting from "namespaces and strictness."

Others have explained what's wrong with the usual hash representation of
objects. I'll try to write a short flyweight objects demonstration.

(Warning: untested code)
    {
        package Some::Module;

        use warnings;
        use strict;

        my %attr;

        # constructor
        sub new {
            my $class = shift;
            my $obj = bless [], $class;
            $attr{$obj} = {
                foo => undef,
            };
            return $obj;
        }

        # destructor
        sub DESTROY {
            my $self = shift;
            delete $attr{$self};
        }

        # accessor
        sub foo {
            my $self = shift;
            return $attr{$self}{foo} = $_[0] if @_;
            return $attr{$self}{foo};
        }
    }

As you can see, the object itself is just an empty reference. All object
attributes are stored in the (lexically scoped) hash %attr, which can't
be accessed from outside the module. The (address of the) object serves
as a key into this table. This makes all attributes private; not even
subclasses can see them. It's a bit unusual for a Perl module that a
destructor is needed here to clean up dead objects, but it makes sense,
as all attribute data is stored outside of the object.
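The guideline also mentions inside-out objects. As I understand the term, the idea is the same except that there is one lexical hash per attribute rather than one hash per object, keyed by the object's address. A minimal sketch, assuming Scalar::Util's refaddr for the key; the class name Some::InsideOut is made up:

```perl
use strict;
use warnings;

{
    package Some::InsideOut;             # hypothetical class name
    use Scalar::Util qw(refaddr);

    my %foo;                             # one lexical hash per attribute

    # constructor: the object is a reference to an empty scalar
    sub new {
        my ($class, %args) = @_;
        my $anon;
        my $self = bless \$anon, $class;
        $foo{ refaddr($self) } = $args{foo};
        return $self;
    }

    # accessor: sets the attribute if given a value, returns it either way
    sub foo {
        my $self = shift;
        $foo{ refaddr($self) } = shift if @_;
        return $foo{ refaddr($self) };
    }

    # destructor: attribute data lives outside the object, so clean it up
    sub DESTROY {
        my $self = shift;
        delete $foo{ refaddr($self) };
    }
}

my $obj = Some::InsideOut->new(foo => 42);
$obj->foo(43);
print $obj->foo, "\n";
```

This is where "strictness" comes in: each attribute is a declared variable, so mistyping %foo as %fooo is a compile-time error under 'use strict', whereas a mistyped hash key in a blessed hashref just quietly creates a new entry.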

HTH, Lukas
 
kj

gnari said:
that is why it is best to have the signal handler do as little as
possible, and try to defer most work to the program proper via
flags.

Hmmm. But to be at all useful, those flags would have to be global,
which, if I've followed the discussion, would make the handler
non-reentrant, no?

kj
 
Anno Siegel

kj said:
Hmmm. But to be at all useful, those flags would have to be global,
which, if I've followed the discussion, would make the handler
non-reentrant, no?

Yes, technically it does, and consequently there is a (tiny)
probability that a program of this type misses a signal. Look
at an example. This code tries to catch two SIGINTs and only
die on the third one.

    my $signal;
    $SIG{INT} = sub { $signal = shift };

    my $killcount = 0;
    while ( 1 ) {
        sleep 1; # ... or do something useful
        if ( $signal ) {
            $killcount ++;
            # a signal that arrives before this point will be missed
            undef $signal;
        }
        die "3 kills" if $killcount >= 3;
    }

A signal that arrives after the signal handler has returned to the
main loop, but before "undef $signal" is executed will be ignored.

In the example this means that one out of so many (probably millions)
of ^Cs from the keyboard will be missed, which will usually be tolerable.

If missing a signal can't be afforded, the sig handler must block
signals, and the "if ( $signal ) {..." code must unblock them again,
*after* undefing $signal. The details how (and perhaps if) that can
be done depend much on the system.
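A possible way around that particular window without blocking anything, offered as a sketch rather than a cure: let the handler only ever increment a counter and let the main program keep its own count of what it has dealt with, so neither side resets a variable the other one writes. The names on_int and take_new_signals are made up for the illustration.

```perl
use strict;
use warnings;

my $received = 0;   # written only by the signal handler
my $handled  = 0;   # written only by the main program

sub on_int { $received++ }
$SIG{INT} = \&on_int;

# Returns how many signals have arrived since the last call.
# A signal that arrives between the two statements is not lost:
# it bumps $received and shows up in the next call's difference.
sub take_new_signals {
    my $new = $received - $handled;
    $handled += $new;
    return $new;
}
```

In the loop above, "$killcount += take_new_signals();" would replace the if block. This does nothing about signals that the operating system or perl itself merges before the handler runs; it only removes the read-then-reset gap in the Perl code.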

Anno
 
Ilya Zakharevich

[A complimentary Cc of this posting was sent to gnari], who wrote:
a bit of code is said to be reentrant when it can tolerate to be
entered again before finishing. if a new signal is sent to a program
while it is still processing a previous one, the signal handler needs
to be reentrant. example of non-reentrant behaviour is to store
critical temporary information in globals.

Correct, but moot in case of Perl signal handling. As my testing a
decade ago shows, *any* Perl signal handler is unsafe; not just for
signal-inside-signal-handler, but actually for any signal handling.
Since what is buggy is the C wrapper which invokes the Perl signal
handler, this is not fixable from Perl code.

I do not remember this problem to be ever fixed. (Google for my
messages on voodoo patch on this newsgroup for gory details.) The
only situation when the handler has low probability of crashing the
program is that the signal arrives when Perl is blocked in a system
call; if signal arrives when the interpreter is "active", the
probability of failure is about 3%.

Hope this helps,
Ilya
 
Ben Morrow

Quoth Ilya Zakharevich:
[A complimentary Cc of this posting was sent to gnari], who wrote:
a bit of code is said to be reentrant when it can tolerate to be
entered again before finishing. if a new signal is sent to a program
while it is still processing a previous one, the signal handler needs
to be reentrant. example of non-reentrant behaviour is to store
critical temporary information in globals.

Correct, but moot in case of Perl signal handling. As my testing a
decade ago shows, *any* Perl signal handler is unsafe; not just for
signal-inside-signal-handler, but actually for any signal handling.
Since what is buggy is the C wrapper which invokes the Perl signal
handler, this is not fixable from Perl code.

I do not remember this problem to be ever fixed.

Do 5.8's safe signals not fix this? The relevant bits of code, from
mg.c:

| Signal_t
| Perl_csighandler(int sig)
| {
| /* get hold of context, reinstall handler if necessary,
| fake default action of exit if necessary */
|
| if (PL_signals & PERL_SIGNALS_UNSAFE_FLAG)
| /* Call the perl level handler now--
| * with risk we may be in malloc() etc. */
| (*PL_sighandlerp)(sig);
| else
| Perl_raise_signal(aTHX_ sig);
| }
|
| void
| Perl_raise_signal(pTHX_ int sig)
| {
| /* Set a flag to say this signal is pending */
| PL_psig_pend[sig]++;
| /* And one to say _a_ signal is pending */
| PL_sig_pending = 1;
| }

Ben
 
Ilya Zakharevich

[A complimentary Cc of this posting was sent to Ben Morrow], who wrote:
Do 5.8's safe signals not fix this?

Yes, I forgot that 5.8 introduced "unreliable signals". In this mode
(default?) there is no guarantee that a signal will be handled.
However, if it *is* handled, the above mentioned bug should not be
triggered.

Since this semantic is not generally acceptable, I do not consider
this as a solution.

Hope this helps,
Ilya
 
Ted Zlatanov

* Guideline 3 says "All system calls should be checked,
including, but not limited to, close, seek, flock, fork and exec."

Is there a comprehensive list of *all* of Perl's system calls?

No, but I highly recommend the Stevens "Advanced Programming in the
UNIX Environment" book to learn about the Unix system calls and much
more. You'll become a better Perl programmer after reading that
book, although it uses C for the examples.

Ted
 
