Why was the "Symbol is a String"-idea dropped?

Robert Klemme · May 15, 2007

I see.
And I am quite surprised. Because judging from your online activity
you seem to have some experience.
Perhaps it is also my programming style: I may use symbols where one
normally would use strings.

Yeah, maybe. So where are you using symbols where one normally would
use strings?

Not aware? I mean Rails mixes them, right?

I don't use Rails.

Readability-wise: precisely what advantage?

If I see a symbol being used as a Hash key I immediately know (or rather
guess) that there is only a limited amount of them and they are known
beforehand, like with options.

# silly example
opts = {
  :length => 12,
  :width => 30,
}
# other code
resize( opts[:length] )

Whereas when strings are used it's typically stuff that is read from
somewhere, like (another silly example):

ruby -aF: -ne 'BEGIN { $c=Hash.new(0) }; $c[$F[1]]+=1; END { $c.each {|k,v| print k, "=", v, "\n"}}' /etc/passwd

The only thing that comes to my mind just now, is
that a separated Symbol class easily provides
distinct special values for a parameter that would normally carry a String.

Don't forget the optical distinction between using 'string', "string"
and :symbol.

Yes, I agree.
I am actually interested in the implications for the programmer.
My original question just arose out of the notion
that this implementation decision could have been a move
in a (to my mind) favourable direction.

As we all have different habits what may be favorable for one may be
regrettable for the other.

1. The core structure must of course be large enough, and a large
structure may look impure.

This somehow reminds me of
http://en.wikipedia.org/wiki/Gödel's_incompleteness_theorem

2. But regarding this particular question: My original notion was that
keeping
Symbol and String too separate is not pragmatic.
(I may change my mind on that, if I read more posts like yours,
though.)

Just reread mine a few times - then you don't need the other postings
any more. That's more efficient - you'll save bandwidth and reading is
actually faster if you know the text already.

Well, yes, sometimes I'm glad someone tells me that.

No sweat - following visions is useful as well. As always it's
the mix...

create a class hierarchy similar to the Float/Integer hierarchy?
String < Stringlike
Symbol < Stringlike

Why not? StringLike could even be a module that relies solely on []
and length to do all the non mutating stuff.
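A minimal sketch of that StringLike idea, assuming a Ruby version (1.9 or later) in which Symbol itself responds to [] and length; it would not run as-is on the 1.8 interpreters discussed in this thread. The module and the *_like method names are hypothetical, chosen so that no built-in method gets overridden:

```ruby
# Hypothetical mixin: every method relies solely on #[] and #length,
# so it works for any includer that provides those two.
module StringLike
  # Enumerate characters one index at a time.
  def each_char_like
    return enum_for(:each_char_like) unless block_given?
    length.times { |i| yield self[i] }
    self
  end

  # True when the receiver begins with the given prefix.
  def start_with_like?(prefix)
    prefix.length <= length &&
      (0...prefix.length).all? { |i| self[i] == prefix[i] }
  end
end

class String; include StringLike; end
class Symbol; include StringLike; end

string_chars = "abc".each_char_like.to_a
symbol_chars = :abc.each_char_like.to_a
```

Only non-mutating behaviour lives in the module, so a Symbol never gains a way to change itself.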

Ah, interesting. Can't follow the implications right now.

For example regexp matching might be implemented similarly for both
(i.e. just in one place). But then again, since RX functionality is
highly integrated into the language that might not be a good idea - or
the C code needs to become more complex to react differently if it sees
a String or Symbol vs. some custom class that includes this module. Hm...

Yes that was the idea behind it: to benefit some and not to hurt the
others.

The next best thing to a win win situation.

Indeed, I'm experiencing it right now!
Thanks a lot!

You're welcome. Thank /you/!

Kind regards

robert

enduro (Sven Suska) · May 15, 2007

Ooops!

sorry if I came across rude in any way.

I don't want to "own" the thread.
But I am interested in my question,
so I was glad that someone repeated it,
at a time when all the answers up to that point had not yet answered it.

Robert said:
The fact that the original idea is a big paradigm shift does not
answer your question?

Sorry, no. If someone had told me that this fact was the basis for the
decision of the core team,
that would have answered my question.
(Because the fact alone is not compelling: If a paradigm shift is
possible and good then why not shift?)

And also, I thought that this was the right place for posting the question.
(Actually, until yesterday I didn't know that I could post on ruby-core,
I thought it was just for "cracks", because it's read-only on
ruby-forum.com)

Kind Regards
Sven

And here again, Robert Dober's full text:

I really have not taken offense. However if you are interested in that
only you might post to ruby-core only.
I am kind of surprised that the considerations of Rick and YHS are
considered as OT.
If you do not like them maybe it would be polite to ignore them. But
talking about the topic on *this* list and ignoring all background
information about what symbols are and have been is kind of weird.
Please remember that Ruby has its inheritance in other languages
owning symbols as I believe to have pointed out.
The fact that the original idea is a big paradigm shift does not
answer your question?

I honestly do not understand that.

Threads just evolve; I do not feel that they belong to the OP.
They do not belong to me either, of course.
Cheers
Robert

Another question:
Who is YHS?

Regards, Sven

enduro (Sven Suska) · May 15, 2007

Hello again,

Robert said:
Yeah, maybe. So where are you using symbols where one normally would
use strings?

Let me guess, because I don't know if I am really the only one:
1. Multipurpose-names:
Like option-names, used as hash keys but also as names and labels for
the corresponding graphics control etc.
2. Logging:
Giving a brief hint in the form of a symbol (not the log level), well
just because it is easier to type and looks nice

I don't use Rails.

Oops, offending again, am I?

Readability-wise: precisely what advantage?

If I see a symbol being used as a Hash key I immediately know (or
rather guess)
that there is only a limited amount of them and they are known
beforehand,
like with options.

# silly example
opts = {
  :length => 12,
  :width => 30,
}
# other code
resize( opts[:length] )

Sorry, don't get me wrong:

I DID NOT MEAN TO REMOVE the Symbol class.
Nor Symbol literals.

Thus, your examples would be valid and semantically equivalent code
after a "unification" of the classes (regardless if Symbol < String or not).
Or I'd better not call it "unification", I don't have a good word,
perhaps "joining" would be better.

Don't forget the optical distinction between using 'string', "string"
and :symbol.

Also, this won't be affected, see above.

[...] on pragmatism and not purity

1. The core structure must of course be large enough, and a large
structure may look impure.

This somehow reminds me of
http://en.wikipedia.org/wiki/Gödel's_incompleteness_theorem

... mystery will always remain ...

Just reread mine a few times - then you don't need the other postings
any more.
That's more efficient - you'll save bandwidth and reading is actually
faster if you know the text already.

Well,
as Ruby-users,
we don't sacrifice our fun to the god of efficiency, do we...

Cheers,
Sven

Robert Dober · May 15, 2007

Ooops!

sorry if I came across rude in any way.

I don't want to "own" the thread.
But I am interested in my question,
so I was glad that someone repeated it,
at a time when all the answers up to that point had not yet answered it.

Sorry, no. If someone had told me that this fact was the basis for the
decision of the core team,
that would have answered my question.
(Because the fact alone is not compelling: If a paradigm shift is
possible and good then why not shift?)

Sure that was exactly the thing I wanted to discuss and suddenly
someone told me hey stay On Topic. That was strange but not rude at
all. I mean neither Xavier nor you, you are very civilized and polite
people- maybe much more than YHS

I just had the feeling that the answers you will get on this list will
never correspond to your exact question, and I was wrong as Matz
stepped by.

I admit that personally I have a big problem with "A symbol is a
string", but brighter people than me like Tom and Matz have not or did
not have, so maybe indeed I am making too much noise while thinking.

But please remember too that there are only complicated answers to
simple questions.

And also, I thought that this was the right place for posting the question.
(Actually, until yesterday I didn't know that I could post on ruby-core,
I thought it was just for "cracks", because it's read-only on
ruby-forum.com)

I definitely should have pointed that out first and then I could have
taken all the time to rant/argue/discuss the technical points, oh boy
how difficult communication can be sometimes!

Kind Regards
Sven

Cheers
Robert

Brian Candler · May 15, 2007

Yes, I agree.
(That's what I tried to address by the two lines after the quote above,
perhaps I should have put a smiley in there )

Yes, of course.
But my point is: Let the system take care of that.
I want a Ruby that just works - crystal-clear, transparently, reliably.

And it already does in most cases. And there is a lot that can be improved.
And one such improvements could be a garbage collection for symbols. (I
think.)

But then what you want are not symbols, but true immutable strings. By that
I mean: some object where I can write 10MB of binary dump. If I want to add
one character to the end of it, then I create another object containing
10MB+1byte of binary dump, and the old 10MB object is garbage-collected.
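A minimal sketch of the behaviour described here, using a frozen String as a stand-in for a "true immutable string" (an assumption of this example; a frozen String is not interned the way a Symbol is, and the blob is kept small for illustration):

```ruby
# A frozen String stands in for the "true immutable string" here.
blob = ("x" * 1024).freeze

# "Appending" by building a new object leaves the original untouched.
bigger = blob + "y"

# In-place mutation of the frozen object is refused.
mutation_error = nil
begin
  blob << "y"
rescue StandardError => e
  mutation_error = e
end
```

The exact exception class raised by the in-place append differs between Ruby versions, which is why the sketch rescues StandardError rather than naming one.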

Now, there have been arguments that *all* strings in Ruby should have been
immutable in the first place, and I can sympathise with them. After all,
numbers are immutable, and so are certain other classes. But pragmatically,
there are cases where it is just so *useful* to append to a string. Besides,
maintaining the singleton property is hard for large binary objects - i.e.
when I create another 10MB binary dump, I have to check whether it's the
same as any other object which already exists.

(And of course, very large numbers are Bignums, which are not singletons)

I'm not a core programmer, maybe I am asking too much,
but I think it should be possible without slowing anything down.
One very simple idea I can think of is the following:
Set a limit to the number of symbols, and if it is reached
the GC will be invoked in a special symbol-mode, marking all symbols that are
still in use and completely regenerating the symbol table from scratch.

Yes, but why??? In real life, real world programs, only a few hundred unique
method names are used. So let them be symbols.

If you are going to create a million different symbols, or symbols which are
millions of bytes long, then use a String. That's what they are there for!

"Doctor, it hurts when I do this" -- "Then don't do that!"

What you seem to be saying is "I don't want there to be two different types
of object, one for method names and one for holding blobs of data", but I
don't understand this. Symbols work, are fast, and personally I find them
aesthetically pleasing: one is a sort of tag for method names, and one is a
holder of blobs of data which may come from the outside world or from my own
computations.

Yes, I really must admit, I also like the cleanness of current Symbols.
But then, my experience is that this clearness is not worth a lot,
because the border towards "dirty" strings must be crossed often.
(That's why I called sticking to the clearness "tempting" in my last post.)

I don't think so. The examples I've seen so far are:

(1) Method names which are created algorithmically. That is, you know you
have a method called "foo" and you want to call another method called
"foo=". It works, where's the problem?

send("#{mname}=")

Yes, you've made a conversion to a string, and back again. Big deal. The
only way to improve this would be to have symbol algebra, e.g.

(:foo + :=) == :foo=

But internally it would almost certainly be implemented the same way,
because you'd have to look up the symbol ID to convert it into its character
representation, manipulate the characters, and then lookup back into a
symbol.
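A minimal sketch of that round trip, assuming a hypothetical Box class with an ordinary accessor pair; mname is the same variable name as in the one-liner above:

```ruby
# Hypothetical class with an ordinary accessor pair (#width, #width=).
class Box
  attr_accessor :width
end

box = Box.new
mname = :width

# Symbol -> String (via interpolation) -> method lookup by name.
box.send("#{mname}=", 30)

# The same setter name as an explicit Symbol, built via a String.
setter = "#{mname}=".to_sym
box.send(setter, 40)
```

Either form goes through a String on the way to the setter's name, which is the conversion being discussed.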

Or, you'd have to drop symbols entirely and make *every* method call use a
string of characters as the method name - which would be very expensive.

Or, you'd have to make all Strings immutable, so that the string ID
could be used as a method call tag. See above for reasons why that is
undesirable.

(2) Rails, which allows you to be inconsistent between :foo=>:bar and
:foo=>"bar" and "foo"=>:bar and "foo"=>"bar" (at least sometimes - not
always). IMO it would have been better if Rails had stuck to one or the
other, but that's too late to undo.

Rails has introduced its own bast^H^H^H^Hextensions to the language anyway.

Ruby is not yet good in many other aspects:
speed, threads, documentation.

There is really *excellent* documentation for Ruby. You have to pay for it,
but the books I am thinking of are well worth the money.

You may not like the idea that the language designer and contributors are
not getting any money directly for their work, whilst book publishers are. I
can live with that.

I find that speed is good enough, and threads are better than most (have you
tried writing threaded programs in Perl?)

The language is the crystal. It must be good in the beginning,
it becomes more solid with every project written in that language.

Many people don't seem to realise that Ruby is, what, 15 years old now?

Regards,

Brian.

Robert Dober · May 15, 2007

On 5/15/07 said:
But then what you want are not symbols, but true immutable strings. By that
I mean: some object where I can write 10MB of binary dump. If I want to add
one character to the end of it, then I create another object containing
10MB+1byte of binary dump, and the old 10MB object is garbage-collected.

But of course we have immutable strings already

class IString < String
  def initialize str
    super(str)
    freeze
  end
end

HTIOI (Hope this is of interest)

<snip>
Cheers
Robert
You see things; and you say Why?
But I dream things that never were; and I say Why not?
-- George Bernard Shaw

Robert Klemme · May 15, 2007

But of course we have immutable strings already

class IString < String
  def initialize str
    super(str)
    freeze
  end
end

What advantages does this have over using "freeze" directly?

str = "foo".freeze

It seems using a new class will increase the likelihood of things to break.

HTIOI (Hope this is of interest)
LOL

You see things; and you say Why?
But I dream things that never were; and I say Why not?
-- George Bernard Shaw

Greetings to George, btw.

robert

Rick DeNatale · May 15, 2007

Not responding to any particular posting.

One of the false memes that some folks on this thread seem to hold is
that Symbols are integers.

They aren't.

Any more than they are strings.

A given ruby symbol has both a string and an integer representation,
which can be obtained by using to_s and to_i. But one wouldn't say
that the object 1.2 is a string because it has a string
representation, or that the object "123" was an integer because it has
an integer representation.

The essential fact about symbols is that if two symbols have the same
string representation they are the same object, and that two different
symbols have two different integer representations. Or more formally

sym1.to_s == sym2.to_s iff sym1.object_id == sym2.object_id
sym1.to_i == sym2.to_i iff sym1.object_id == sym2.object_id

One way to implement this is to keep internal tables which map the
string and integer representations of symbols to each other, and to
have functional mappings between the object_ids and integer
representations of symbols. This is how ruby does it. Creating a
symbol from a string consists of looking for the string in the mapping
from strings to integer representations, and if it's not found
assigning the next integer rep and adding the string and integer rep
to the internal tables. This operation, called interning, happens
either at parse time when :foo is encountered, or later when an
expression like 'foo'.to_sym is executed.
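A minimal sketch of what interning guarantees, observed from the Ruby side; it uses only to_sym, equal? and ==, and the variable names are illustrative:

```ruby
# Two separately allocated Strings with the same characters.
a = "foo".dup
b = "foo".dup

# Interning either one yields the very same Symbol object.
sym_a = a.to_sym
sym_b = b.to_sym

same_symbol   = sym_a.equal?(sym_b)   # identity of the Symbols
same_string   = a.equal?(b)           # identity of the Strings
equal_content = (a == b)              # content comparison of the Strings
```

The two Strings are equal in content but are distinct objects, while both conversions land on one shared Symbol.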

The meme that "Symbols are Integers" probably lingers from an earlier
version of Ruby before there was an actual Symbol class. Back then,
symbols really were instances of Fixnum, but no more. This lives on
vestigially in that Symbol does have a to_int method as well as to_i,
but to_int is deprecated; using it produces a warning:

rick@frodo:~$ ruby -w -e"p :sym.to_int"
-e:1: warning: treating Symbol as an integer
10409

while to_i does not.
rick@frodo:~$ ruby -w -e"p :sym.to_i"
10409

Other languages, like Smalltalk, with similar concepts don't associate
integer representations with Symbols, in these languages the internal
mapping simply maps string representations to object id's, or to the
symbol objects themselves. I suspect that this feature of Ruby symbols
is simply due to the earlier implementation.

Now what are the useful properties of Symbols:

1. Detecting whether or not two symbols are equal is as fast as
comparing their object_ids. This is an O(1) operation.
Detecting whether or not two strings are equal requires a scan of
both strings until either an unequal character is found or the end of
both strings is reached. This is an O(n) operation.
2. Having 1000 'instances' of a symbol with a particular string
representation takes no more space than having 1
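Property 2 can be observed with a small sketch; the count of 1000 and the "tag" text are arbitrary, and the Strings are built with dup so that each one is a separate allocation:

```ruby
# 1000 Strings built at run time: each is its own object.
strings = Array.new(1000) { "tag".dup }

# 1000 Symbols interned from those Strings: all the same object.
symbols = strings.map(&:to_sym)

distinct_strings = strings.map(&:object_id).uniq.size
distinct_symbols = symbols.map(&:object_id).uniq.size
```

The array of symbols still holds 1000 references, but they all point at a single symbol, whereas the strings are 1000 separate objects.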

Property 1 means that things like hashes with symbol keys are somewhat
faster than hashes with string keys. This is why symbols are used as
method selectors, since dispatching a method call requires repeated
lookup in the method tables going up the inheritance chain. This is a
win if the key is looked up multiple times, there is an initial cost
of interning the symbol (which essentially consists of looking for the
string representation in an internal global symbol table) but this
cost is amortized over subsequent lookups.

It seems that the HashWithIndifferentAccess class added by Rails in
ActiveSupport, which allows symbols and strings to be used
interchangeably as keys, doesn't actually take advantage of this since
it uses symbols converted to strings as the actual keys rather than
the other way around. This provides a bit of syntactic sugar, without
getting either the performance or space advantages of using symbols.
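A minimal sketch of that key-normalisation strategy. This is not the ActiveSupport implementation; IndifferentHash is a hypothetical class that only shows what "symbols converted to strings as the actual keys" means:

```ruby
# Hypothetical stand-in: every key is stored as a String.
class IndifferentHash
  def initialize
    @store = {}
  end

  # Symbols and Strings are both normalised with #to_s before storing.
  def []=(key, value)
    @store[key.to_s] = value
  end

  def [](key)
    @store[key.to_s]
  end

  # Exposes what is really stored, to show the chosen representation.
  def keys
    @store.keys
  end
end

options = IndifferentHash.new
options[:length] = 12
from_string = options["length"]
stored_keys = options.keys
```

Every lookup with a Symbol pays for a to_s conversion and a String comparison, which is why the symbol advantages listed above are lost.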

As for incompatibilities caused by the experiment, I'm not sure exactly
what Matz and the core team ran into but certainly this would break
code like:

case arg
when String
  # do something
when Symbol
  # do something else
end

Code like this exhibits the fragility of doing discrimination based on
classes in the face of refactoring.
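A minimal sketch of how such a case statement would break, assuming a hypothetical StringSubclassSymbol class that plays the role Symbol would have had under "Symbol < String" (the real Symbol class is left untouched):

```ruby
# Hypothetical stand-in for a Symbol class that inherits from String.
class StringSubclassSymbol < String; end

def classify(arg)
  case arg
  when String               then :string_branch
  when StringSubclassSymbol then :symbol_branch  # never reached
  else                           :other
  end
end

branch = classify(StringSubclassSymbol.new("foo"))
```

Because `when` uses Class#===, an instance of the subclass already matches the String clause, so the second branch is silently skipped.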

Robert Dober · May 15, 2007

What advantages does this have over using "freeze" directly?

Dunno

x = IString.new("Hello World") # Not even tested yet
vs.
x="HelloWorld".freeze

Well the first one has the advantage that I thought about it

Now I reckon that the subclass stuff is baaad

def blah str
  raise ArgumentError unless IString === str
  ...
end

but now someone does
class MString < IString
  # get rid of the freeze (by calling superclass.superclass.new in
  # self.class.new e.g.)
end

and my code is broken, while in

def blah str
  raise ArgumentError unless str.respond_to?(:frozen?) && str.frozen?
  ...
end

frozen is frozen forever.

So do what Robert told you and beware of what Robert told you

str = "foo".freeze

It seems using a new class will increase the likelihood of things to break.

Greetings to George, btw.

Well last time I met him he was admiring your posts to the list

robert

idem

Robert Dober · May 15, 2007

On 5/15/07 said:
It seems that the HashWithIndifferentAccess class added by Rails in
ActiveSupport, which allows symbols and strings to be used
interchangeably as keys, doesn't actually take advantage of this since
it uses symbols converted to strings as the actual keys rather than
the other way around. This provides a bit of syntactic sugar, without
getting either the performance or space advantages of using symbols.

Is this whole String vs. Symbol idea motivated by Rails stuff?
I just do not know Rails but I would guess it is a dangerous thing if
paradigms that are useful in an application framework - even if it is
such a Great One as Rails - are to be applied to a General Purpose
Language.

I will rephrase OP's question now, why the h[ae]ck did the Core team
think about unifying Strings and Symbols in the first place ???
That is for sure something very interesting.
<more stuff snipped>

Robert

Brian Candler · May 15, 2007

But of course we have immutable strings already

class IString < String
  def initialize str
    super(str)
    freeze
  end
end

Yes, but it's not a singleton.

It would only be of interest as a Symbol replacement if IString.new("foo")
always returned the same object. You could implement this using the Multiton
pattern I think.

Then you could safely use IString#object_id as a method name key.
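A minimal sketch of such a multiton, assuming a hypothetical class method instance_for as the only sanctioned way to obtain instances. It is not thread-safe, and IString.new remains callable, so this shows the idea rather than a hardened implementation:

```ruby
class IString < String
  # One shared instance per distinct character sequence.
  @pool = {}

  class << self
    # Return the shared frozen instance for this character sequence,
    # creating it on first use. (Hypothetical name; not thread-safe.)
    def instance_for(str)
      key = str.to_s
      @pool[key] ||= new(key).freeze
    end
  end
end

first  = IString.instance_for("foo")
second = IString.instance_for("foo")
other  = IString.instance_for("bar")
```

With this in place the object_id is stable for a given character sequence, which is the property needed for use as a lookup key.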

Regards,

Brian.

Brian Candler · May 15, 2007

Yes, but it's not a singleton.

It would only be of interest as a Symbol replacement if IString.new("foo")
always returned the same object. You could implement this using the Multiton
pattern I think.

Then you could safely use IString#object_id as a method name key.

P.S. I'm aware of Symbol#to_i, but to_i and object_id appear to be
intimately related:

irb(main):001:0> :foo.to_i
=> 14817
irb(main):002:0> :foo.object_id
=> 148178
irb(main):003:0> :bar.to_i
=> 16081
irb(main):004:0> :bar.object_id
=> 160818
irb(main):005:0> :zzzzzzzzzzzzzzzz.to_i
=> 16089
irb(main):006:0> :zzzzzzzzzzzzzzzz.object_id
=> 160898
irb(main):007:0>

uts.to_i
=> 7345
irb(main):008:0>

uts.object_id
=> 73458
irb(main):009:0>

i.e. I don't think the symbol table maintains an explicit integer key for
each symbol.

Robert Klemme · May 15, 2007

Dunno

x = IString.new("Hello World") # Not even tested yet
vs.
x="HelloWorld".freeze

Well the first one has the advantage that I thought about it

Now I reckon that the subclass stuff is baaad

def blah str
  raise ArgumentError unless IString === str
  ...
end

but now someone does
class MString < IString
  # get rid of the freeze (by calling superclass.superclass.new in
  # self.class.new e.g.)
end

and my code is broken, while in

def blah str
  raise ArgumentError unless str.respond_to?(:frozen?) && str.frozen?
  ...
end

frozen is frozen forever.

Correct. And since #frozen? is defined in Kernel you can skip the first
test.

So do what Robert told you and beware of what Robert told you

Well last time I met him he was admiring your posts to the list

Wow! So he didn't die but just went home like this other guy who
invented a vi clone (or at least provided his name for the operation)...

idem

While we're at it: *if* you want to define something (and are a fan of
C++) you can do this:

irb(main):001:0> module Kernel
irb(main):002:1> private
irb(main):003:1> def const(*a) a.each {|x| x.freeze } end
irb(main):004:1> end
=> nil
irb(main):005:0> nil
=> nil
irb(main):006:0> foo, bar = const "foo", "bar"
=> ["foo", "bar"]
irb(main):007:0> ["foo", "bar"]
=> ["foo", "bar"]
irb(main):008:0> foo << bar
TypeError: can't modify frozen string
from (irb):8:in `<<'
from (irb):8
from :0
irb(main):009:0> bar << foo
TypeError: can't modify frozen string
from (irb):9:in `<<'
from (irb):9
from :0
irb(main):010:0>

Hihi...

Kind regards

robert

Gary Wright · May 15, 2007

Yes, but it's not a singleton.

You've stated or implied a couple of times in this discussion that
symbols are 'singletons', but I thought the conventional definition
of 'singleton' was of a class with only a single instance, where the
instance is called a singleton. That doesn't describe Ruby's symbols.

I think what you are getting at is the idea that identity and
equality are one and the same for symbols. Fixnum instances also
have this property but floats don't. Is there a standard term for
that characteristic? I think in mathematics it would be an equivalence
relation ~ such that If x ~ y then x = y for all x, y in the set.
In this case ~ represents Ruby's == and = represents Ruby's equal?.

Robert Dober · May 15, 2007

On 5/15/07 said:
Correct. And since #frozen? is defined in Kernel you can skip the first
test.

No, you are an optimist Robert

irb(main):003:0> Kernel.send :remove_method, :frozen?
=> Kernel
irb(main):004:0> "a".frozen?
NoMethodError: undefined method `frozen?' for "a":String
from (irb):4
from :0

But maybe we should not worry too much about that kind of meta-hackery
in our design, because one could trick us anyway, e.g.

class String; def frozen?; true end end

So you are right after all ;-)

Cheers
Robert

Robert Dober · May 15, 2007

On 5/15/07 said:
Wow! So he didn't die but just went home like this other guy who
invented a vi clone (or at least provided his name for the operation)...

Your conclusions are jumped

But sure would have liked to talk to this guy. As to Gödel or
Hemingway, well maybe I am OT *now*.

<snip>

While we're at it: *if* you want to define something (and are a fan of
C++) you can do this:

irb(main):001:0> module Kernel
irb(main):002:1> private
irb(main):003:1> def const(*a) a.each {|x| x.freeze } end
irb(main):004:1> end

hey that is quite nice!!!

Hihi...

Kind regards

robert

--
You see things; and you say Why?
But I dream things that never were; and I say Why not?
-- George Bernard Shaw

Rick DeNatale · May 15, 2007

Is this whole String vs. Symbol idea motivated by Rails stuff?

I will rephrase OP's question now, why the h[ae]ck did the Core team
think about unifying Strings and Symbols in the first place ???

I don't know. Probably not motivated, but on the other hand it no
doubt stimulated a reconsideration of the relationship between String
and Symbol.

Whether or not Strings and Symbols have an inheritance relationship is
a bit of an accidental design choice. Keeping in mind that in a
language like Ruby or Smalltalk, the class hierarchy is really about
implementation factoring and not type specification, as a first
approximation, it doesn't matter that much. In Smalltalk-80 Symbol is
a subclass of String, but I believe that Symbol overrode the methods
which mutate the instance to cause errors.

But once the decision was made, secondary effects ensue. If
programmers write code which depends on a particular inheritance
relationship like the case statement in my earlier post, then changes
to the decision will break things. It's like the story about how
Stuart Feldman decided to use tab as a lexical element in makefiles
and treat them differently from the equivalent whitespace. He
realized that this was a bad decision, but too late.
From: http://www.faqs.org/docs/artu/ch15s04.html

"No discussion of make(1) would be complete without an
acknowledgement that it includes one of the worst design botches
in the history of Unix. The use of tab characters as a required leader
for command lines associated with a production means that the
interpretation of a makefile can change drastically on the basis of
invisible differences in whitespace.

Why the tab in column 1? Yacc was new, Lex was brand new. I hadn't
tried either, so I figured this would be a good excuse to learn. After
getting myself snarled up with my first stab at Lex, I just did something
simple with the pattern newline-tab. It worked, it stayed. And then a
few weeks later I had a user population of about a dozen, most of them
friends, and I didn't want to screw up my embedded base. The rest,
sadly, is history.
-- Stuart Feldman

Not that I'm saying that Matz's decision on Symbol not being a
subclass of String was a bad one, I'm not, and it's certainly not in
the class of the tab/whitespace 'decision' in make. What I am saying
is that once made these decisions can quickly generate their own
requirements to exist once a user base has been established.

Rick DeNatale · May 15, 2007

P.S. I'm aware of Symbol#to_i, but to_i and object_id appear to be
intimately related:

irb(main):001:0> :foo.to_i
=> 14817
irb(main):002:0> :foo.object_id
=> 148178
irb(main):003:0> :bar.to_i
=> 16081
irb(main):004:0> :bar.object_id
=> 160818
irb(main):005:0> :zzzzzzzzzzzzzzzz.to_i
=> 16089
irb(main):006:0> :zzzzzzzzzzzzzzzz.object_id
=> 160898

Here's part of the ruby1.8.5 code which computes an object's object_id
from its reference value.

if (TYPE(obj) == T_SYMBOL) {
    return (SYM2ID(obj) * sizeof(RVALUE) + (4 << 2)) | FIXNUM_FLAG;
}

where SYM2ID is a C macro which shifts the value right 8 bits.

And here's the code for Symbol#to_i:
static VALUE
sym_to_i(sym)
    VALUE sym;
{
    ID id = SYM2ID(sym);

    return LONG2FIX(id);
}

i.e. I don't think the symbol table maintains an explicit integer key for
each symbol.

Actually it does, based on having recently read the ruby 1.8.5 code.

It keeps two internal hashes, one maps the string representation to
the integer representation, and the other maps the other way around.

The code for String#to_sym basically does this:

it calls rb_intern to get the integer representation called id, and returns
ID2SYM(id) which just returns id shifted left 8 bits, in other
words it's the inverse of SYM2ID.

rb_intern searches for the string in the symbol table and returns
the id found there if it finds it.

otherwise, it calculates the integer representation by shifting the
next available id left by 3 bits and oring in some flag bits which
depend on the contents of the string, for example if the string starts
with a single "@" it's flagged as an instance variable name,

It then makes a copy of the string and does the equivalent of
sym_table[stringcopy] = newly_computed_id
sym_rev_table[newly_computed_id] = stringcopy

Although these two aren't Ruby Hash objects but C hash tables.

FWIW, Ruby Hash objects use the same C hash code internally.
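The two-table bookkeeping described above can be modelled in plain Ruby. This is a toy sketch only: ToySymbolTable and its method names are hypothetical, ids are simple counters, and the bit shifting and flag bits of the real C code are left out:

```ruby
# Toy model of the two tables described above; names are hypothetical.
class ToySymbolTable
  def initialize
    @sym_table     = {}  # string -> id
    @sym_rev_table = {}  # id -> string
    @last_id       = 0
  end

  # Return the id for str, assigning the next free id on first sight.
  def intern(str)
    @sym_table.fetch(str) do
      id   = (@last_id += 1)
      copy = str.dup.freeze          # the table keeps its own copy
      @sym_table[copy]   = id
      @sym_rev_table[id] = copy
      id
    end
  end

  # Reverse lookup: id -> string representation (nil if unknown).
  def name_of(id)
    @sym_rev_table[id]
  end
end

table  = ToySymbolTable.new
foo_id = table.intern("foo")
bar_id = table.intern("bar")
```

Interning the same string a second time finds the existing entry and returns the id assigned earlier, so no new entry is created.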

What's interesting is that a reference to a symbol doesn't actually
point to an allocated object.

Brian Candler · May 15, 2007

You've stated or implied a couple of times in this discussion that
symbols are 'singletons', but I thought the conventional definition
of 'singleton' was of a class with only a single instance, where the
instance is called a singleton. That doesn't describe Ruby's symbols.

I think what you are getting at is the idea that identity and
equality are one and the same for symbols.

No, that's not exactly what I meant, but sorry for not being more precise.
What I meant was: there is only ever one symbol object in existence for a
particular sequence of characters. :foo.object_id in one part of the program
is always the same as :foo.object_id elsewhere.

If it were Symbol.new("foo") always returning the same object then I guess
it would probably be called the multiton pattern.

Regards,

Brian.

Sven Suska (enduro) · May 16, 2007

Hello everybody,

Although not a lot of it came from the Ruby-Core specialists,
I have still learned a lot from the discussion.
I am trying to build a conceptual picture now.

Some say Strings and Symbols are conceptually very different,
some say they are quite close.

I view it like this:
Symbols essentially are names, Strings essentially are data,
while they both appear as sequences of characters.

Names/Symbols are just atomic, constant, unrelated entities,
while Strings as data have a rich life: they can be related in
many ways, they can be analysed, even be modified.

That's a clean distinction and I think it is very well-represented
in the current Ruby implementation.

In this light, it seems nonsensical to make one the subclass of the other.
(A common superclass would be OK, though.)

Now, in practice, the situation gets more complex:
1. Names sometimes turn into data (option names, method names, table
names...),
especially when things get highly dynamic.
2. Sometimes, programmers use the conceptually "wrong" class, maybe
as a kind of optimization, for the sake of beauty or out of laziness
...

One could argue that it is good that Symbol and String are well-separated,
because it educates programmers to decide for the "correct" class to use.

On the other hand, the following situation occurs very very often:
You need to transfer a sequence of characters -- which format do you use:
always Symbols, always Strings, or should it allow both? (Or even a fancy
object?)

First, you could argue that when you use duck-typing, the interface can
be kept open.
But still, many situations remain where this question arises.

This choice can be a burden, especially if you think of
inter-operability or optimisation.

And that is an argument for some sort of unification of Symbol and String.

Subclassing alone would not be enough to solve the problem above;
String#== and Symbol#== would also have to be defined such that "a" == :a,
and #hash would have to be defined accordingly.
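A minimal sketch of why both == and #hash matter, assuming a hypothetical wrapper class UnifiedKey instead of patching String and Symbol themselves:

```ruby
# Hypothetical wrapper: compares and hashes by character content only,
# regardless of whether it was built from a String or a Symbol.
class UnifiedKey
  attr_reader :text

  def initialize(value)
    @text = value.to_s.dup.freeze
  end

  def ==(other)
    other.is_a?(UnifiedKey) && text == other.text
  end
  alias eql? ==

  # Hash lookups use #hash and #eql?, so both must agree with #==.
  def hash
    text.hash
  end
end

lookup = { UnifiedKey.new(:a) => 1 }
found  = lookup[UnifiedKey.new("a")]
```

If #hash were left at its default, the two keys would compare equal with == and still miss each other in a Hash lookup.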

Then you would still have the two different kinds of objects ("a" and :a)
but they would behave quite the same except for modifying methods.

Now, as I am writing this, I doubt that the advantages
of the unification are really worth doing it...

It depends on factors not known to me.

But now, I think I can understand the core-team's decision better.

Bye
Sven

Brian said:
No, that's not exactly what I meant, but sorry for not being more precise.
What I meant was: there is only ever one symbol object in existence for a
particular sequence of characters. :foo.object_id in one part of the program
is always the same as :foo.object_id elsewhere.

If it were Symbol.new("foo") always returning the same object then I guess
it would probably be called the multiton pattern.

Isn't the term "immediate value" used for that? Like:
:abc is an immediate value, and so is 12, so is nil
"abc" is a reference value and so is [1, 2] and also {} and even 12.0
