Perl values interchangeability?

Ivan Shmakov

Do I understand it correctly that no Perl program could discern
$a from $b, as long as ($a eq $b && ref ($a) eq ref ($b)) holds?

(In particular, isn't there a certain module which allows one to
peek into the Perl value's internal representation, such as
whether the value has the "number" and "string" parts?)

The context is that I'm implementing a span to value mapping,
and wonder if I could allow the code to optimize, say:

$map->set ($a, $b, $value); # $a < $b < $c < $d
$map->set ($c, $d, $value);
$map->set ($b, $c, $value);

so that $map will internally hold the same information as if a
single ->set ($a, $d, $value) was performed.

TIA.
 
Peter Makholm

Ivan Shmakov said:
Do I understand it correctly that no Perl program could discern
$a from $b, as long as ($a eq $b && ref ($a) eq ref ($b)) holds?

Some operators behave differently based on whether the value of a scalar
has been interpreted as a string. Mainly the bit-wise logical operators
and the smart-match operator.


Some values have specifically different content of the string part and
the number part of the value. For example $!. The following code would
make your assumption false:

close(42);
$a = $!;
$b = "Bad file descriptor";

If $a is later used in an integer context it would have the value 9,
while $b would have the value 0. You can make these special values
yourself with Scalar::Util::dualvar().
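
The dualvar case can be reproduced without relying on $! or on a platform-specific errno. A minimal sketch, assuming only the core Scalar::Util module; the variable names are illustrative:

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

# A value whose numeric part is 9 and whose string part is the message.
my $x = dualvar(9, "Bad file descriptor");
# A plain string with the same text and no numeric part.
my $y = "Bad file descriptor";

print "eq: ", ($x eq $y ? "same" : "different"), "\n";   # eq: same
{
    # A non-numeric string used as a number warns; silence that here.
    no warnings 'numeric';
    print "numeric: ", $x + 0, " vs ", $y + 0, "\n";     # numeric: 9 vs 0
}
```

The two values pass the `eq` test from the original question yet differ under `==`, so a predicate based on `eq` alone would treat them as interchangeable.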


Tied values and objects with their own stringification will of course
also break your test.

(In particular, isn't there a certain module which allows one to
peek into the Perl value's internal representation, such as
whether the value has the "number" and "string" parts?)

Devel::Peek will let you dump the internal representation of a value.
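
Devel::Peek's Dump prints to STDERR for a human reader. To test the same flags from code, the core B module can be used instead. A minimal sketch; `flags_of` is a hypothetical helper name, and which flags appear after a conversion can vary between Perl versions, so no output is shown for that case:

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: list the public "OK" flags set on a scalar.
# It inspects $_[0] directly (an alias to the caller's variable),
# so the value is not copied before being examined.
sub flags_of {
    my $flags = B::svref_2object(\$_[0])->FLAGS;
    my @set;
    push @set, 'IOK' if $flags & B::SVf_IOK();
    push @set, 'NOK' if $flags & B::SVf_NOK();
    push @set, 'POK' if $flags & B::SVf_POK();
    return join '|', @set;
}

my $n = 42;
my $s = "42";
print "number: ", flags_of($n), "\n";   # IOK
print "string: ", flags_of($s), "\n";   # POK

my $sum = $s + 0;   # numeric use may cache a number inside $s
print "string after numeric use: ", flags_of($s), "\n";
```

Because merely using a value can change its flags, they are a poor basis for deciding whether two values are "the same".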
The context is that I'm implementing a span to value mapping,
and wonder if I could allow the code to optimize, say:

$map->set ($a, $b, $value); # $a < $b < $c < $d
$map->set ($c, $d, $value);
$map->set ($b, $c, $value);

A sane API would only depend on the numerical value or the string value
for non-reference scalars. So I would ignore most of the above issues
unless the documentation for an API specifically mentions it.

//Makholm
 
Ivan Shmakov

(... Assuming both are defined ()...)

Some operators behave differently based on whether the value of a
scalar has been interpreted as a string. Mainly the bit-wise logical
operators and the smart-match operator.

AIUI, these behave differently depending on whether a value is
the /result/ of a string or numeric operation, or is a constant.
In particular:

my $a = 42; # $a is numeric
my $b = $a . ""; # $a was /interpreted/ as a string, yet still a number
my $c = $b + 0; # $b was /interpreted/ as a number, yet still a string

Some values have specifically different content of the string part
and the number part of the value. For example $!. The following
code would make your assumption false:

close(42); $a = $!; $b = "Bad file descriptor";

If $a is later used in an integer context it would have the value 9,
while $b would have the value 0. You can make these special values
yourself with Scalar::Util::dualvar ().

Do I understand it correctly that both of the values could be
checked explicitly, as in:

sub same_val {
    my ($a, $b) = @_;
    ## .
    return (defined ($a) && defined ($b)
            && ref ($a) eq "" && ref ($b) eq ""
            && $a eq $b
            && $a == $b);
}

Tied values and objects with their own stringification will of course
also break your test.

And how do I check for these (and especially the tied values)?

For instance, will the following test suffice for objects
implemented as references?

use Scalar::Util qw (refaddr reftype);

sub same_ref {
    my ($a, $b) = @_;
    my ($x, $y) = (reftype ($a), reftype ($b));
    ## .
    return (defined ($x) && defined ($y)
            && $x eq $y
            && refaddr ($a) eq refaddr ($b));
}
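
A short usage sketch of the predicate above, to show what it does and does not accept. The sub is restated with different variable names and a numeric comparison of the addresses; the test data is made up for illustration:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

# Identity test: true only when both arguments are references
# of the same underlying type pointing at the same referent.
sub same_ref {
    my ($p, $q) = @_;
    my ($tp, $tq) = (reftype($p), reftype($q));
    return (defined($tp) && defined($tq)
            && $tp eq $tq
            && refaddr($p) == refaddr($q)) ? 1 : 0;
}

my @list = (1, 2, 3);
my ($r1, $r2) = (\@list, \@list);   # two references, one referent
my $copy = [1, 2, 3];               # equal contents, different referent
my $text = "$r1";                   # the reference stringified

print same_ref($r1, $r2),   "\n";   # 1
print same_ref($r1, $copy), "\n";   # 0
print same_ref($r1, $text), "\n";   # 0
print(($r1 eq $text) ? "eq matches\n" : "eq differs\n");   # eq matches
```

The last two lines show a reference and its stringified form comparing equal under `eq` while failing the identity test.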

[...]
A sane API would only depend on the numerical value or the string
value for non-reference scalars. So I would ignore most of the above
issues unless the documentation for an API specifically mentions it.

ACK, thanks.

Alas, the intent is to make the resulting object sufficiently
generic, and "hash-like," and it'd hardly be acceptable for a
hash to preserve only the string or numerical value of, well,
the value. And, for instance, $a{"x"} = $!; preserves both:

my %a = ();
close (42);
$a{"x"} = $!;
print (join (", ", $a{"x"} + 0, $a{"x"} . ""), "\n");
## => 9, Bad file descriptor

Thus, unless it'd be possible to prove that the values cannot be
distinguished at all (if they satisfy the same_ref () predicate,
for instance; or?), I'd have to keep all of them.
 
Ivan Shmakov

Ben Morrow said:
[...]
(... Assuming both are defined ()...)
[...]
my $c = $b + 0; # $b was /interpreted/ as a number, yet still a string

Nope. $b is now also POK and IOK, so the bitwise ops will now treat
it as a number. (This is why it was a bug for them to make the
string/number distinction in the first place: Perl just doesn't
preserve that information.)
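
The effect described here can be observed directly. A minimal sketch, assuming the `bitwise` feature is not enabled (it changes `|` to always work numerically); the variable names are illustrative:

```perl
use strict;
use warnings;

my ($s, $t) = ("12", "3");

# Both operands are strings never used as numbers:
# the OR is applied character by character.
my $as_strings = $s | $t;

# Both operands are numbers: ordinary integer OR.
my $as_numbers = 12 | 3;

# Using $s in numeric context caches a number inside $s itself.
my $unused = $s + 0;

# Same expression as before, but $s now also carries a number.
my $after_use = $s | $t;

print "$as_strings $as_numbers $after_use\n";   # 32 15 15
```

The first and third results come from the same source expression; only the intervening numeric use of `$s` differs.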

! That was, for the lack of a better word, unexpected.

[...]
All undefined values are equal,

That's a guard against otherwise possible ("" eq undef) below,
used instead of a proper conditional branch for the sake of
simplicity. (Please note that my original predicate was
assuming defined () values, too.)

at the Perl level at least.

... Moreover, an application may choose to use undef as an
"unknown" value. Thus: same ($a, undef) == same (undef, $b) ==
same (undef, undef) == undef. (Although it's not the behavior
I'm interested in in this case.)

It's usual in Perl to treat the string value as 'canonical'. Objects
or values which have the same string value but are in some important
way 'different' are basically breaking the rules. (The string value
is what Test::More::is tests, for instance.)

ACK, thanks.

For tied values, tied (). For overloaded objects, see the overload
documentation; however you've already rejected refs above, so you
won't see objects of any kind.
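
A minimal sketch of the tied () check, assuming the core Tie::Scalar module for a ready-made tie class. `is_tied_scalar` is a hypothetical helper; it takes a reference because copying a tied scalar yields an ordinary value and the tie is not carried over:

```perl
use strict;
use warnings;
use Tie::Scalar;   # core module; provides Tie::StdScalar

# Hypothetical helper: true if the scalar behind the reference is tied.
sub is_tied_scalar {
    my ($ref) = @_;
    return defined(tied($$ref)) ? 1 : 0;
}

tie my $t, 'Tie::StdScalar';
$t = 5;
my $plain = 5;
my $copy  = $t;    # fetches the value; the copy itself is not tied

print is_tied_scalar(\$t),     "\n";   # 1
print is_tied_scalar(\$plain), "\n";   # 0
print is_tied_scalar(\$copy),  "\n";   # 0
```

A predicate that begins with `my ($a, $b) = @_;` therefore only ever sees copies and cannot detect a tie on the caller's variables.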

To clarify it, the predicate above was intended to be used for
the "simple" values only, and be combined with same_ref () below
if necessary.

There are other kinds of magic, but they're rare and probably not
worth worrying about.

Curiously, is $! one of them? FWIW, while Scalar::Util(3)
states that it's a tied scalar, tied ($!) is undef.

[...]
Objects are always 'implemented as references' in Perl. (More
specifically, an object only behaves as an object when manipulated
via a reference. A 'bare' blessed value, obtained by dereferencing
an object ref, is not distinguishable from a plain value.)
[...]

Are you testing for 'object equivalence' or 'object identity'?

The latter.

If you're testing for identity (are these the same object?) then
refaddr is necessary and sufficient.

... I wonder if it's codified somewhere in the documentation?
I understand that on a J. von Neumann architecture there's
hardly any other sensible way to implement Perl's objects and
references, yet...

If you want some other form of equivalence, use the relevant
equivalence operator (eq or ==).

ACK, thanks.

[...]
(I assume here that the values you are comparing are the '$value's in
each case? That is, given
$map->set(0, 2, $a); $map->set(1, 3, $b);
you want to know if $a and $b are sufficiently 'equal' that you can
merge this into
$map->set(0, 3, $a);

Yes.

[...]
Thus, unless it'd be possible to prove that the values cannot be
distinguished at all (if they satisfy the same_ref () predicate, for
instance; or?), I'd have to keep all of them.

You can definitely merge the entries if

ref($a) && ref($b) && refaddr($a) == refaddr($b)

since in that case they are both refs to the same referent. You
obviously can't merge them if one is a ref and the other isn't.

For the case of two non-refs, I would probably just compare them with
'eq' and document that fact, but if you are worried about special
cases probably the simplest solution is to pass both values through
Storable and compare the results. If either value is tied you can't
assume anything.

It would probably also be worth providing a user-specified 'merge'
function, so that the user can choose to allow merges more often
(say, if they have some class of objects with a non-obvious
equivalence relation).

I guess that allowing for a user-specified equality / merge
predicate (to be used unless the values are refaddr-equal) will
be sufficient for this one. Thanks!
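
The conclusion above could be sketched roughly as follows. This is not the actual implementation of the span map: `can_merge` and `merge_spans` are hypothetical names, spans are assumed to be `[from, to, value]` triples already sorted by start, and the `eq`-based predicate passed in at the end is just one possible user choice:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical helper: may two stored values share a single span?
sub can_merge {
    my ($x, $y, $equal) = @_;
    # Two references to the same referent are always interchangeable.
    return 1 if ref($x) && ref($y) && refaddr($x) == refaddr($y);
    # Otherwise defer to the user-supplied predicate, if there is one.
    return ($equal && $equal->($x, $y)) ? 1 : 0;
}

# Join adjacent spans whose values can be merged.
sub merge_spans {
    my ($spans, $equal) = @_;
    my @out;
    for my $span (@$spans) {
        my $last = $out[-1];
        if ($last && $last->[1] == $span->[0]
            && can_merge($last->[2], $span->[2], $equal)) {
            $last->[1] = $span->[1];    # extend the previous span
        }
        else {
            push @out, [@$span];        # start a new span
        }
    }
    return \@out;
}

my $obj    = { name => 'shared' };
my $merged = merge_spans(
    [ [0, 1, $obj], [1, 2, $obj], [2, 3, 'x'], [3, 4, 'x'] ],
    sub { !ref($_[0]) && !ref($_[1]) && $_[0] eq $_[1] },
);
print scalar(@$merged), " spans\n";     # 2 spans
```

With no predicate supplied, only refaddr-equal values are merged, which is the conservative default discussed in this thread.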
 
Peter Makholm

Ben Morrow said:
It's usual in Perl to treat the string value as 'canonical'. Objects or
values which have the same string value but are in some important way
'different' are basically breaking the rules.

I am not sure where you got that rule from. I would never assume that
the stringification of objects follows any other rule than "Whatever
makes most sense for presentation to a User".

This might define sensible equivalence classes, but I would never
assume that it would give me referential equality (use refaddr) or a
strict value equality (use Test::Deep::NoTest).

//Makholm
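
The distinction drawn here can be illustrated with a small class whose stringification is chosen for display. A minimal sketch; the `Label` class and its fields are made up for the example, and `fallback => 1` is set so that `eq` falls back to comparing the string forms:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

package Label;
# Stringify to the display text only; the id is deliberately left out.
use overload '""' => sub { $_[0]{text} }, fallback => 1;
sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

package main;

my $p = Label->new(text => 'origin', id => 1);
my $q = Label->new(text => 'origin', id => 2);

my $string_equal = ($p eq $q) ? 1 : 0;
my $same_object  = (refaddr($p) == refaddr($q)) ? 1 : 0;

print "string equal: $string_equal\n";   # string equal: 1
print "same object:  $same_object\n";    # same object:  0
```

The two objects look identical to `eq` but are distinct objects carrying different data, so merging them on the basis of their string value would lose information.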
 
