Why was the "Symbol is a String"-idea dropped?

Trans

Subclassing alone would not be enough to solve the problem above.
String#== and Symbol#== would also have to be defined such that "a" == :a,
and #hash would have to be defined accordingly.

Then you would still have the two different kinds of objects ("a" and :a)
but they would behave quite the same except for modifying methods.
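
For reference, a minimal sketch of how the two kinds of objects compare as things stand, assuming a current Ruby interpreter. The variable names are only illustrative:

```ruby
str = "a"
sym = :a

# == is false in both directions: neither class treats the other as equal.
str_eq_sym = (str == sym)
sym_eq_str = (sym == str)

# Hash lookup relies on #hash and #eql?, so the two are distinct keys.
table = { "a" => 1 }
by_symbol = table[sym]        # no entry stored under the symbol key
by_string = table[sym.to_s]   # explicit conversion finds the entry
```

Making "a" == :a hold would therefore also require the Hash lookup above to start finding the entry, which is why #hash comes into it.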

While I think Symbol could probably use at least a few of String's
manipulation methods, putting that aside, I wonder how it would affect
things just to make :a == "a"?

Now, as I am writing this, I doubt that the advantages
of the unification are really worth doing it...

It depends on factors not known to me.

But now, I think I can understand the core-team's decision better.

Thanks for this excellent summary.

T.
 
Logan Capaldo

While I think Symbol could probably use at least a few of String's
manipulation methods, putting that aside, I wonder how it would affect
things just to make :a == "a"?

Well, there is precedent: 2 == 2.0 and so on.
On the other hand, what should happen in case statements? Maybe it
would actually be better to make :a === 'a' but not :a == 'a'.
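
A rough sketch of the distinction, assuming a current Ruby interpreter. `NameMatcher` and `classify` are hypothetical helpers written for this illustration, not part of Ruby; they only show that === can be looser than == without touching String or Symbol:

```ruby
# Numeric precedent: == compares by value across Integer and Float,
# while eql? (used for Hash keys) also requires the same class.
int_float_equal = (2 == 2.0)
int_float_eql   = 2.eql?(2.0)

# case/when calls pattern === subject, so a pattern object can accept
# both a symbol and a string while its == stays strict.
class NameMatcher
  def initialize(name)
    @name = name.to_s
  end

  def ===(other)
    (other.is_a?(String) || other.is_a?(Symbol)) && other.to_s == @name
  end
end

def classify(value)
  case value
  when NameMatcher.new(:a) then :matched
  else :unmatched
  end
end
```

With this, `classify(:a)` and `classify("a")` take the same branch, while `NameMatcher.new(:a) == "a"` is still false.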
 
Robert Dober

Well, there is precedent: 2 == 2.0 and so on.
On the other hand, what should happen in case statements? Maybe it
would actually be better to make :a === 'a' but not :a == 'a'.

Honestly I prefer to write

case s.to_s
when 'a'

instead of
case s
when 'a'

but the most explicit way to do this is maybe the most readable

case s
when :a, 'a'
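
The three variants above, filled out into a runnable sketch. The method names are made up for this illustration, and the behaviour shown is that of a current Ruby interpreter:

```ruby
# Variant 1: normalise to a String first, then match string patterns.
def match_via_to_s(s)
  case s.to_s
  when 'a' then true
  else false
  end
end

# Variant 2: match the value as-is; a Symbol never matches a String pattern.
def match_plain(s)
  case s
  when 'a' then true
  else false
  end
end

# Variant 3: list both forms explicitly.
def match_explicit(s)
  case s
  when :a, 'a' then true
  else false
  end
end
```

Only the second variant treats `:a` and `'a'` differently; the first and third accept both.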

Cheers
Robert

P.S.
Tom is right, that was an excellent résumé.
R


--
You see things; and you say Why?
But I dream things that never were; and I say Why not?
-- George Bernard Shaw
 
dblack

Hi --

Well, there is precedent: 2 == 2.0 and so on.

With symbols being as integer-like as they are string-like, though,
it's really equally similar to:

2 == :"2"

On the other hand, what should happen in case statements? Maybe it
would actually be better to make :a === 'a' but not :a == 'a'.

I guess as long as :a === :a was still true, that might be a good way
to express the fact that "this is the string of which this symbol is a
case", or something like that.


David

--
Q. What is THE Ruby book for Rails developers?
A. RUBY FOR RAILS by David A. Black (http://www.manning.com/black)
(See what readers are saying! http://www.rubypal.com/r4rrevs.pdf)
Q. Where can I get Ruby/Rails on-site training, consulting, coaching?
A. Ruby Power and Light, LLC (http://www.rubypal.com)
 
Logan Capaldo

Hi --



With symbols being as integer-like as they are string-like, though,
it's really equally similar to:

2 == :"2"

I don't think symbols are integer-like (I don't know that they are
especially string-like either), but I'd be willing to bet a lot more
code in the wild would be broken if you removed Symbol#to_s than if
you removed Symbol#to_i.

Your example really ought to be

2 == :whatever_symbol_whose_to_i_results_in_2
 
Gary Wright

I don't think symbols are integer-like.

This is the 'equivalence is defined by identity' idea again. I think
this is what David means by 'integer-like'. It is this property that
both fixnums and symbols share but that is *not* shared by strings.

Making '==' work with mixed operands of symbols and strings breaks that
idea and leads to the strange example that David gave (2 == :"2").
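
A small sketch of that shared property, assuming CRuby, where small integers and symbols are single shared values. `String.new` is used so that the two strings are guaranteed to be separate objects regardless of literal handling:

```ruby
a = String.new("abc")
b = String.new("abc")

# Strings: equal contents, but two distinct objects.
strings_equal     = (a == b)
strings_identical = a.equal?(b)

# Symbols and small integers: equality and identity coincide.
symbols_identical    = :abc.equal?("abc".to_sym)
small_ints_identical = 2.equal?(1 + 1)
```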

Gary Wright
 
dblack

Hi --

This is the 'equivalence is defined by identity' idea again. I think
this is what David means by 'integer-like'. It is this property that
both fixnums and symbols share but that is *not* shared by strings.

Yes, it's the immutable/immediate thing that symbols have in common
with fixnums and that neither has in common with strings.


David

 
Paul Brannan

Yes, it's the immutable/immediate thing that symbols have in common
with fixnums and that neither has in common with strings.

Frozen strings are immutable.

Paul
 
Rick DeNatale

What about

%f{This is sooo cooooold} << "!"

TypeError: can't modify frozen string

Just an idea.

That's the immutable part, but

a = "abc".freeze
b = "abc".freeze
c = :abc
d = :abc
a.object_id => -606341628
b.object_id => -606347008
c.object_id => 343218
d.object_id => 343218

The key difference is that there's only one instance of a symbol with
a given string representation.

The shorthand way of saying this is that symbols, like fixnums, are
immediate. That is a sufficient but not necessary condition; it
crosses the line a bit by describing both the identity relationship
requirement AND the implementation.

Most normal objects are referenced at the C level by an internal value
which is a pointer to the object's state representation in memory.
Since objects are aligned at least on a word boundary, all normal
object pointers will have the 2 least significant bits as zero. They
will also be non-zero.

A few objects are immediate which means that they are referenced at
the C level by a representation whose value is not a pointer. Fixnums
are represented by shifting the C representation left one bit and
turning on the low-order bit. False is represented by 0, True by 2,
and Nil by 4.
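
The tagging arithmetic described above can be sketched in plain Ruby. This only models the bit layout as stated in this post for Ruby 1.8-era interpreters; the constant and method names are hypothetical and are not interpreter internals:

```ruby
# Special values as described above (false, true, nil).
SPECIAL_VALUES = { 0 => :false, 2 => :true, 4 => :nil }.freeze

# Fixnum n is stored as n shifted left one bit with the low bit set.
def tag_fixnum(n)
  (n << 1) | 1
end

# Recover the integer from its tagged form.
def untag_fixnum(value)
  value >> 1
end

# Decide what a raw value stands for, checking the special values first
# because 4 (nil) also has its two low bits clear.
def classify_value(value)
  return SPECIAL_VALUES[value] if SPECIAL_VALUES.key?(value)
  return :fixnum if (value & 1) == 1
  return :pointer if (value & 3) == 0
  :other_immediate
end
```

For example, `tag_fixnum(5)` is 11, which `classify_value` reports as a fixnum, while a word-aligned value such as 0x1000 is reported as a pointer.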

Ruby symbols are represented by a value computed by shifting the
symbol's integer representation left 8 bits and setting the low-order
byte to 0xFF.

As I said, it's not essential that symbols be immediate, for example
interning a string could create a Symbol instance which was frozen and
registered in a global symbol table, i.e. the multiton pattern, but
the current implementation no doubt has some advantages in either
low-level mechanism performance, supporting some niche in ruby legacy,
or both.
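
A minimal sketch of that multiton alternative, written in plain Ruby. `InternedName` is a hypothetical class invented for this illustration; it is not how Symbol is implemented, and it ignores thread safety and garbage collection of unused entries:

```ruby
class InternedName
  # Global table: one entry per distinct character sequence.
  @registry = {}

  class << self
    # Return the single shared instance for this character sequence,
    # creating and freezing it on first use.
    def intern(text)
      key = text.to_s
      @registry[key] ||= new(key).freeze
    end
  end

  # Instances may only be obtained through InternedName.intern.
  private_class_method :new

  attr_reader :name

  def initialize(name)
    @name = name.dup.freeze
  end

  def to_s
    @name
  end
end
```

Here `InternedName.intern("abc")` always returns the same frozen object, so equality implies identity even though the instances are ordinary heap objects rather than immediates.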

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

IPMS/USA Region 12 Coordinator
http://ipmsr12.denhaven2.com/

Visit the Project Mercury Wiki Site
http://www.mercuryspacecraft.com/
 
Robert Dober

That's the immutable part, but

a = "abc".freeze
b = "abc".freeze
c = :abc
d = :abc
a.object_id => -606341628
b.object_id => -606347008
c.object_id => 343218
d.object_id => 343218

The key difference is that there's only one instance of a symbol with
a given string representation.

Ah I see, I got confused, I did not understand the meaning of
immediate immediately ;).
Although theoretically the interpreter could create an immediate value for
%f{...} we would probably run out of address space :(

Cheers
Robert
 
Brian Candler

Most normal objects are referenced at the C level by an internal value
which is a pointer to the object's state representation in memory.
Since objects are aligned at least on a word boundary, all normal
object pointers will have the 2 least significant bits as zero. They
will also be non-zero.

A few objects are immediate which means that they are referenced at
the C level by a representation whose value is not a pointer. Fixnums
are represented by shifting the C representation left one bit and
turning on the low-order bit. False is represented by 0, True by 2,
and Nil by 4.

Ruby symbols are represented by a value computed by shifting the
symbol's integer representation left 8 bits and setting the low-order
byte to 0xFF.

Perhaps it varies based on the Ruby version you're running; it's not like
that for me.

irb(main):006:0> :foo.object_id.to_s(16)
=> "39490e"
irb(main):007:0> RUBY_VERSION
=> "1.8.4"

I think a weaker requirement than 'immediate' is needed. A symbol can quite
happily be a regular object; we just need to ensure that there is always
only one symbol for a particular symbol character sequence.

Regards,

Brian.
 
Rick DeNatale

Perhaps it varies based on the Ruby version you're running; it's not like
that for me.

irb(main):006:0> :foo.object_id.to_s(16)
=> "39490e"
irb(main):007:0> RUBY_VERSION
=> "1.8.4"

You can't really see the internal bit representations from ruby, since
they get manipulated before you see them. Much like the class of an
object reported by ruby isn't the same as the object pointed to by its
klass pointer at the C level.

And even if you could, I was talking about the integer representation
of the symbol, not the object_id.

Not to say that this doesn't change between versions of ruby. Which
is why it's carefully hidden from ruby code.

I think a weaker requirement than 'immediate' is needed. A symbol can quite
happily be a regular object; we just need to ensure that there is always
only one symbol for a particular symbol character sequence.

Yes, I said that, but the key issue for the subject of the current
thread is that Symbols aren't strings. They might have both a string
representation and an integer representation, but then so do integers.
And unlike Strings, they have an essential requirement that equality
implies identity, which is only an accidental property of integers in
the range of Fixnum.
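
A short sketch of that last point, assuming CRuby on a typical 32-bit or 64-bit build, where 10**40 is far outside the immediate integer range. The values are built at run time with `Integer()` so that each call produces its own object:

```ruby
small_a = Integer("42")
small_b = Integer("42")
big_a   = Integer("1" + "0" * 40)
big_b   = Integer("1" + "0" * 40)

# Small integers: equal and, as an implementation detail, identical.
small_equal     = (small_a == small_b)
small_identical = small_a.equal?(small_b)

# Large integers: still equal, but two separate objects.
big_equal     = (big_a == big_b)
big_identical = big_a.equal?(big_b)

# Symbols: identity is guaranteed for equal names.
symbols_identical = "abc".to_sym.equal?("abc".to_sym)
```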
 
Brian Candler

You can't really see the internal bit representations from ruby, since
they get manipulated before you see them. Much like the class of an
object reported by ruby isn't the same as the object pointed to by its
klass pointer at the C level.

And even if you could, I was talking about the integer representation
of the symbol, not the object_id.

AFAIK, the object_id is the in-memory pointer to the structure of the object
(if it's a material object), or is one of the special values:

- 0, 2 or 4 for false, true or nil

- (n<<1) | 1 for Fixnums

None of these is valid as a pointer to a memory location, so they can be
recognised immediately as special.

So in the above, :foo's object ID looks like a memory pointer to me. It
might not be, but then you'd need to guarantee that 39490e could not
possibly be a valid memory pointer for some regular object (and also be able
to recognise this by inspection, i.e. by looking at the bit pattern)

Regards,

Brian.
 
Rick DeNatale

AFAIK, the object_id is the in-memory pointer to the structure of the object
(if it's a material object), or is one of the special values:

- 0, 2 or 4 for false, true or nil

- (n<<1) | 1 for Fixnums

Not starting with 1.8.5:

VALUE
rb_obj_id(VALUE obj)
{
    /*
     *                32-bit VALUE space
     *          MSB ------------------------ LSB
     *  false   00000000000000000000000000000000
     *  true    00000000000000000000000000000010
     *  nil     00000000000000000000000000000100
     *  undef   00000000000000000000000000000110
     *  symbol  ssssssssssssssssssssssss00001110
     *  object  oooooooooooooooooooooooooooooo00        = 0 (mod sizeof(RVALUE))
     *  fixnum  fffffffffffffffffffffffffffffff1
     *
     *                    object_id space
     *                                       LSB
     *  false   00000000000000000000000000000000
     *  true    00000000000000000000000000000010
     *  nil     00000000000000000000000000000100
     *  undef   00000000000000000000000000000110
     *  symbol   000SSSSSSSSSSSSSSSSSSSSSSSSSSS0        S...S % A = 4 (S...S = s...s * A + 4)
     *  object   oooooooooooooooooooooooooooooo0        o...o % A = 0
     *  fixnum  fffffffffffffffffffffffffffffff1        bignum if required
     *
     *  where A = sizeof(RVALUE)/4
     *
     *  sizeof(RVALUE) is
     *  20 if 32-bit, double is 4-byte aligned
     *  24 if 32-bit, double is 8-byte aligned
     *  40 if 64-bit
     */
    if (TYPE(obj) == T_SYMBOL) {
        return (SYM2ID(obj) * sizeof(RVALUE) + (4 << 2)) | FIXNUM_FLAG;
    }
    if (SPECIAL_CONST_P(obj)) {
        return LONG2NUM((long)obj);
    }
    return (VALUE)((long)obj|FIXNUM_FLAG);
}

1.8.6 and 1.9 have the same code.
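
The symbol branch of that function can be mirrored in Ruby to see what object_id comes out. This is only an arithmetic sketch of the C expression quoted above; the method names are hypothetical, and the default of 20 for sizeof(RVALUE) is the "32-bit, double is 4-byte aligned" case from the comment:

```ruby
FIXNUM_FLAG = 0x01

# Mirrors: (SYM2ID(obj) * sizeof(RVALUE) + (4 << 2)) | FIXNUM_FLAG
# The result is a tagged Fixnum VALUE, not yet the number Ruby code sees.
def symbol_object_id_value(sym_id, sizeof_rvalue)
  (sym_id * sizeof_rvalue + (4 << 2)) | FIXNUM_FLAG
end

# A tagged Fixnum VALUE v stands for the integer v >> 1.
def fixnum_to_integer(value)
  value >> 1
end

# The object_id as Ruby code would see it for a given internal symbol ID.
def symbol_object_id(sym_id, sizeof_rvalue = 20)
  fixnum_to_integer(symbol_object_id_value(sym_id, sizeof_rvalue))
end
```

With a 20-byte RVALUE this works out to sym_id * 10 + 8. The value 343218 shown for :abc earlier in the thread has that form (34321 * 10 + 8); the ID 34321 is back-computed here for illustration and was not reported by anyone in the thread.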
 
