Symbols vs Strings

Discussion in 'Ruby' started by matt, Dec 21, 2006.

  1. matt

    matt Guest

    Two quick questions:

    1) Could someone expand on what a symbol is?
    The Programming Ruby book seems to outline that a symbol is a string
    with a colon in front of it.
    The Agile Rails book calls it "the thing named .. "

    2) If a Symbol is string substitution, why use symbols?

    I feel I'm missing something obvious.

    I suspect that getting #1 fully answered will indirectly answer #2.

    Thanks

    Matt
     
    matt, Dec 21, 2006
    #1

  2. matt wrote:
    > 1) Could someone expand on what a symbol is?

    Ridiculously long explanation follows. Composed of all the answers from
    some previous symbol vs string thread.

    h1. On Symbols

    *So, I've been blindly typing things like "attr_accessor :elephant" and
    "link_to :action => 'hemorrhage'" for a while, and I've started to get
    annoyed with that colon syntax. Somebody told me that :elephant and
    :hemorrhage are Symbols, but I've no clue what that is or means. WTF, mate?*

    Symbol. That's the name of the class. We can see that this way:

    a = :foo
    a.class #=> Symbol

    Yes, see, Symbols are objects, and can be treated like objects. You can
    point variables to them, you can pass them as parameters, you can return
    them from blocks and methods, and you can invoke methods on them.

    They're quite simple, really.
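
A minimal sketch of "Symbols are objects", assuming any reasonably modern Ruby. The `echo` helper is a made-up name for illustration, not part of Ruby:

```ruby
# Hypothetical helper: takes a Symbol and hands it straight back.
def echo(sym)
  sym
end

a = :foo                               # point a variable at a Symbol
b = echo(a)                            # pass it in, get it returned
kinds = [a, b].map { |s| s.class }     # invoke a method on each one

puts kinds.inspect          # [Symbol, Symbol]
puts b.to_s                 # foo
puts a.respond_to?(:to_s)   # true
```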

    *Uhhh.... I was, sort of, looking for more information than that.*

    Well, there's a veritable cornucopia of ways in which I can attempt to
    share the essence of Symbol with you. First, there is the source code to
    the Ruby interpreter, which is, of course, the authoritative source on
    this matter. And /as/ the One True Source is the Ruby source code, what
    follows is a list of descriptions, analogies, observations, and
    generalizations of symbols that attempt to communicate as much of their
    being as you desire to know, without communicating their being in whole.

    Note: Nobody can learn you but yourself. These words attempt to provide
    facts, explanations and perspectives that may help you on your journey
    to understanding Symbols -- but you /will/ have to do some work on your
    own, whether that be experimentation in irb, or deep, pensive
    introspection of the meaning of programming. To quote [a translation of]
    <a href="http://en.wikiquote.org/wiki/Plutarch">Plutarch</a>:

    <blockquote>We must encourage [each other] -- once we have grasped the
    basic points -- to interconnecting everything else on our own, to use
    memory to guide our original thinking, and to accept what someone else
    says as a starting point, a seed to be nourished and grow. For the
    correct analogy for the mind is not a vessel that needs filling but wood
    that needs igniting -- no more -- and then it motivates one towards
    originality and instills the desire for truth. Suppose someone were to
    go and ask his neighbors for fire and find a substantial blaze there,
    and just stay there continually warming himself: that is no different
    from someone who goes to someone else to get to some of his rationality,
    and fails to realize that he ought to ignite his own flame, his own
    intellect, but is happy to sit entranced by the lecture, and the words
    trigger only associative thinking and bring, as it were, only a flush to
    his cheeks and a glow to his limbs; but he has not dispelled or
    dispersed, in the warm light of philosophy, the internal dank gloom of
    his mind.</blockquote>

    *Gawsh, that sounds dangerous.*

    It's really not. I just put that disclaimer in there to weed out the
    slackers. I'll also add, that to properly glean information from this
    page, you should already understand:

    * Object-oriented programming
    * The "variables are references" way of programming that infuses Ruby
    * Many other aspects of Ruby, such as what attr_accessor does.

    *All right, then. So... what _are_ these "many ways" to teach me about
    Symbols?*

    Yeah, right. Thanks. They are:

    1. A list of the Symbol's basic properties.
    1. Example code for their common usages, a discussion of the
    similarities and differences between symbols and their substitutes, and
    why symbols exist.
    1. An analogy to concepts from other programming languages.
    1. A list of some important implementation details behind symbols.
    1. When you might *not* want to use Symbols (I know, blasphemy).
    1. The gory details of their implementation.
    1. Links to other explanations.

    You can pick and choose from this menu as you like. And away we go!

    *Wow, you know what I just realized? The scrollbox on the right is
    frikkin' tiny. This document is huge! I don't want to read all this.*

    Well, you should've thought of that before you decided not to understand
    Symbols. Also, you can stop reading as soon as you understand Symbols
    (but not a second earlier).

    h2. A list of the Symbol's basic properties.

    A Symbol literal, in code, is a colon followed by a bare word (/\w+/, in
    general, though the regex is in fact more complex than this -- see the
    gory implementation details if you really care).

    A Symbol's properties can be summed up thusly:

    :apple == :apple &&
    :apple.to_s == 'apple' &&
    :apple.to_i == 23417

    :apple is a literal reference to a Symbol object, just as 5 is a literal
    reference to the number 5, and "garden" is a literal reference to the
    eponymous String object. I wouldn't worry too much about asking what
    :apple is "wrapping" or whatnot -- :apple is :apple, as is evident from
    line 1.

    Line 1 says that we can compare Symbols using the == operator (aka
    Symbol#==). The == operator returns true whenever the literal references
    look the same (that is, in the source code, :apple is :apple is :apple,
    but not :donkey or even :APPLE).
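
The equality rules above can be checked directly. This is just a sketch using the same example symbols; the variable names are made up for illustration:

```ruby
same      = (:apple == :apple)    # identical spelling
different = (:apple == :donkey)   # different word
cased     = (:apple == :APPLE)    # case matters
vs_string = (:apple == 'apple')   # a Symbol never equals a String

puts same        # true
puts different   # false
puts cased       # false
puts vs_string   # false
```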

    *So a Symbol literal is no different from a String literal, eh?*

    False. Yeah, it looks like that, but there are a lot of ways in which
    it isn't. For one, Strings come with methods galore, like gsub and slice
    and capitalize. Symbol comes with:

    ===, id2name, inspect, to_i, to_int, to_s, to_sym

    You really can't do a whole lot with Symbols, and you're not supposed to.

    *So Symbols are just Strings with all the useful methods removed?*

    No. Take away Symbol#id2name, Symbol#to_s, and Symbol#to_i and you still
    have something useful to Rubyists -- the ability to test for equality.
    Here, Symbols look like strings only to the programmer. To the program,
    they look like boring, ineffectual objects with which you can do nothing
    but test for equality.

    *So is that _all_ you can do with a Symbol?*

    Well, equality is a big one, and covers many standard usages of Symbols
    in Ruby. See the Examples section for details.

    You can also convert a Symbol to a String. While Symbols aren't Strings,
    they have a close bond with the String class, partly due to Symbol#to_s.
    This returns a new String containing whatever you typed following the
    colon. (It, of course, does not affect the original object. Short of <a
    href="http://rubyforge.org/cgi-bin/viewcvs.cgi/evil/lib/evil.rb?root=evil&view=markup">evil.rb</a>,
    Ruby code cannot change the class of a given object.)
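
A small sketch of the Symbol-to-String conversion described above. The variable names are arbitrary; the point is that to_s hands back a String and leaves the Symbol alone:

```ruby
sym = :apple
str = sym.to_s           # a String holding the text after the colon
shouted = str.upcase     # String methods are now available on the result

puts str.class     # String
puts shouted       # APPLE
puts sym.inspect   # :apple  (the Symbol itself is untouched)
```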

    You can also convert a Symbol to an Integer. (To be honest, I'm not
    quite sure if it's useful to anybody. I've never used it, certainly.)
    According to the rdoc on Symbol#to_i, :apple.to_i will equal 23417 for
    the life of my Ruby program, no matter how many times I type it. Go
    ahead, pop open an irb window and try it out.

    As a matter of fact, try all of these things out in irb. Set variables
    equal to symbols, pass them to methods and blocks, invoke some methods,
    go crazy!

    So, you should see, by now, that you can't do a lot with Symbols. You
    can reference them through the funny :goatee syntax, you can compare
    them for equality, and you can convert them to Strings and Integers.

    (Truth be told, you can do a few other things, but they fall out of the
    99.9% of use cases for Symbols. After understanding Symbols, peruse the
    section on [non-gory] implementation details for more.)

    *Okay, that made no sense.*

    Well, give it a think some more, or try one of the other sections. I
    don't mind; I'm just an HTML document after all.

    *No, I mean, if symbols don't /do/ anything, then why the <expletive
    deleted> have them in the language in the first place?*

    Well, read the next section.

    h2. Example code for their common usages, a discussion of the
    similarities and differences between symbols and their substitutes, and
    why symbols exist.

    Symbols are typically used where identity is concerned. Yeah, I know
    that's vague. Here's some specific cases:

    * Referring to variable or method names
    * As keys to a Hash (often when doing that named parameter trick, as
    in Rails)
    * To refer to a specific set of things, such as :up, :right, :left, :down.

    Now let's get down and dirty with real raw code, for each of these in turn:

    # Referring to variable or method names
    class MyJob
      attr_writer :frustration_level # refers to the method names to be created
      def print_var(sym)
        # refers to the variable name to be accessed
        puts "#{sym} = #{instance_variable_get(sym)}"
      end
    end
    java_code = MyJob.new
    java_code.frustration_level = "bordering on suicide"
    java_code.print_var(:@frustration_level)

    # As keys to a Hash (often when doing that named parameter trick, as in Rails)
    connection = { :host => 'eat.mcdonalds.com', :port => 443 } # as keys to a hash
    link_to :action => 'free_willy' # that named parameter trick

    def link_to(hashy_thing) # implementing that named parameter trick
      do_something_with(hashy_thing[:action])
    end

    # To refer to a specific set of things, such as :up, :right, :left, :down.
    class Pos
      attr_accessor :x, :y # looks familiar...
      def initialize(x, y); @x, @y = x, y; end
      def move(dir)
        case dir
        when :up then @y += 1
        when :down then @y -= 1
        when :left then @x -= 1
        when :right then @x += 1
        end
        self # return self to make irb sessions friendlier, and to allow chaining, i suppose
      end
    end
    pos = Pos.new(0,0) # x = 0, y = 0
    pos.move :up # x = 0, y = 1
    pos.move :left # x = -1, y = 1

    Take some time. Read through the code slowly. Swish it around in your
    mouth. Change some things and see what happens. Employ irb. Now pause.

    Okay. Understand?

    *Wait, why do attr_accessor and instance_variable_get require colons in
    front of your identifiers, while alias and defined? do not?*

    alias and defined? are reserved Ruby keywords. The Ruby parser notices
    the keyword, and knows that the next token better be a method/variable
    name, or else.

    attr_accessor and instance_variable_get, however, are not reserved Ruby
    keywords. They are built-in methods (attr_accessor is defined on Module,
    instance_variable_get on Object). Because they are methods, the syntax
    for invoking them is the same as for invoking any other method. If you
    were to do:

    attr_writer frustration_level

    Ruby would first look for a local variable named 'frustration_level',
    and then, failing that, invoke the 'frustration_level' method on self
    and pass the *return value* to attr_writer (or, more likely, a
    NameError will whizz by). We don't want that. Instead, we're using
    Symbols as a pretty way to pass in the _name_ of the method we want to
    create. We pass [a reference to] the Symbol into attr_accessor, and then
    attr_accessor invokes #to_s to find out what you typed after the colon.
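
To make that concrete, here is a rough stand-in for an attr_writer-style method. It is a sketch only: `MyAttr` and `my_attr_writer` are invented names, and this is not how Ruby's own attr_writer is implemented. It shows the Symbol arriving as an ordinary argument whose name is read back out, and what happens when the colon is left off:

```ruby
# Hypothetical module; the real attr_writer lives inside the interpreter.
module MyAttr
  def my_attr_writer(name)
    # "#{name}" asks the Symbol for the text after the colon
    define_method("#{name}=") do |value|
      instance_variable_set("@#{name}", value)
    end
  end
end

class MyJob
  extend MyAttr
  my_attr_writer :frustration_level   # our home-made writer
  attr_reader :frustration_level      # built-in reader, for checking
end

job = MyJob.new
job.frustration_level = "mild"
puts job.frustration_level   # mild

# Without the colon, Ruby tries to evaluate frustration_level first.
error_class = nil
begin
  Class.new { attr_writer frustration_level }
rescue NameError => e
  error_class = e.class
end
puts error_class   # NameError
```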

    Okay, do you understand /now/?

    *Well, I think so, but -- couldn't you have just used Strings everywhere
    for the same purpose? And for the move up/down thing, you could have
    just created some UP = 1, DOWN = 2 style constants, or, heck, make four
    different methods -- move_up, move_down, etc.!*

    Yes, I could have.

    *Uh... ?*

    For all of the above cases (and all of the ways in which I've seen
    Symbols applied), you could use Strings in their place. This is because,
    well, what are we doing? We're comparing for equality (as with the case
    statement, or the Hash access), or we're calling #to_s to find out its
    name (as with the attr_accessor thing). These are both things we can do
    with Strings.
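
For the sceptical, a sketch of the same three usages done with Strings instead of Symbols. The `Pet` class and `step` helper are made up for this example:

```ruby
class Pet
  attr_accessor 'name'   # attr_accessor accepts a String as well
end

pet = Pet.new
pet.name = 'Rex'

options = { 'action' => 'show' }   # String keys instead of Symbol keys
action  = options['action']

# Hypothetical helper: a case statement comparing Strings for equality.
def step(dir)
  case dir
  when 'up'   then 1
  when 'down' then -1
  else 0
  end
end

puts pet.name     # Rex
puts action       # show
puts step('up')   # 1
```

It all works, which is why the reasons for preferring Symbols are about intent and readability rather than capability.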

    So why use Symbols instead?

    1. To signal intent. By sticking colons in front of these bare words,
    you're saying, "These are the absolutes in my code. These are the things
    that do not change. In my application, these are not messages to the
    user, tokens to parse for, or anything else that's String-like. These
    are *concepts*."
    2. (On a related note...) For readability's sake. If you're using a
    text editor with syntax highlighting, the advantage of saying <%=
    link_to :controller => 'dingleberry', :action => 'pick' %> over <%=
    link_to 'controller' => 'dingleberry', 'action' => 'pick' %>, amidst a
    sea of RHTML, is IMMEDIATELY obvious (no pun intended, Ruby veterans).
    Even if you're not, using symbols in the right places can still aid your
    eyes in knowing where to look, and reduce line noise.
    3. Slight performance improvement. In cases like the above, where a
    fixed number of symbols are used over and over, you can get a slight
    performance improvement using Symbols rather than Strings. In a typical
    application, it's more likely to be negligible. Also, there seem to be
    cases where using Symbols is the slightly *less* performant thing to do,
    so I wouldn't dwell on this bullet much.
    4. Because they're cool, and all the cool Ruby cats are doing it. Be
    careful about this one, too, as there are many cases where Symbols don't
    make sense. No Golden Hammer For You.

    As for the UP = 1 / DOWN = 2, move_up / move_down suggestions: well,
    that's just icky.

    h2. An analogy to concepts from other programming languages.

    So your brain doesn't think in pure Ruby, yet? That's a shame. It's
    really a fun experience.

    Ruby's symbols are most analogous to Lisp's symbols, I'm told. They are
    also comparable to Java's interned Strings (available through the
    String.intern() method), but chances are you haven't heard of
    String.intern(), so I doubt that'll gain you much insight.

    What might be more useful, rather, is to compare their usages to similar
    things you'd be doing in other languages, like, oh, C or Java.

    *Java:*

    Whereas the attr_accessor method could take a String or a Symbol, the
    Reflection API in Java uses Strings as parameters for method names,
    so... yeah, that's that one.

    The equivalent of the above Hash example would likely be Strings as keys
    to Map (or Map-like) objects. This is the case for Properties and
    ServletRequest.getParameters(), for example, and using the Properties
    class is often a trick you might employ to pass freeform configuration
    lists into your *own* methods.

    The up/down/left/right thing generally has quite an odious parallel in
    Java 1.4:
    class Pos {
      static public final int UP = 1;
      static public final int DOWN = 2;
      static public final int LEFT = 3;
      static public final int RIGHT = 4;
      ...

    Ouch. Java 5 added enums, so, you know, less pain.

    *C:*

    The Reflection API in C -- ha! Just kidding. Had you for a second, though.

    As for the Hash thing... It's been so long since I've coded C, I just
    don't know... what would the replacement for a Hash be? (C++, I'd
    imagine, has some STL map class.) Named parameters just don't get used
    -- ever -- leading to potentially cryptic code.

    The up/down/left/right thing would most likely be accomplished using an
    enum. This isn't too bad, but it's anti-Ruby. Why? Because it's
    contractual. It requires a static and unchanging list of enum values to
    be declared before an enum can be used. Ruby's free-flowing -- the above
    #move method would not blink an eye if passed :sideways or
    :out_of_the_way or :to_the_beat as a parameter.
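
A runnable sketch of that "would not blink an eye" claim, reusing the Pos class from the examples section (restated here so the snippet stands alone, with attr_reader swapped in since only reads are needed):

```ruby
class Pos
  attr_reader :x, :y

  def initialize(x, y)
    @x, @y = x, y
  end

  def move(dir)
    case dir
    when :up    then @y += 1
    when :down  then @y -= 1
    when :left  then @x -= 1
    when :right then @x += 1
    end            # no else branch: unknown symbols fall through silently
    self
  end
end

pos = Pos.new(0, 0)
pos.move(:up).move(:sideways)   # :sideways is accepted and ignored
puts [pos.x, pos.y].inspect     # [0, 1]
```

If you do want the enum-style contract, an else branch that raises ArgumentError would get you most of the way there.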

    h2. A list of some important implementation details behind symbols.

    Yeah, so there are some things you might want to know about Ruby's
    symbols before you go applying them willy-nilly. In no particular order:

    * Symbols are immutable. Ha! Actually, you should already know that,
    by virtue of the fact that Symbol's method list doesn't have any
    mutating methods on it. (Compare String#replace and String#gsub!, for
    example.)
    * Symbols are immediate. They share this property with Fixnums, true,
    false, and nil. In Ruby terms:
      3.times { puts :streisand.object_id } #=> 6625550, 6625550, 6625550
      3.times { puts "yogi bera".object_id } #=> 23531092, 23531068, 23531044
      puts :streisand.object_id #=> 6625550
    See? Each time you reference a String literal, you're creating a new
    one, while each time you reference a Symbol (or any other immediate
    object), you're referring to the same one that was created the *first*
    time you referenced it.
    * You can reference symbols in a couple of other ways. If you want
    more than just variable name syntax for your symbols, you can reference
    a symbol using :'single quotes' or :"double quotes" as such.
    * You can also get access to Symbols _dynamically_, too. As an
    extension of the last bullet, you can actually /interpolate/ the
    double-quoted symbols in the :"normal #{fashion}". You can also get a
    reference to a Symbol from its given String representation using
    String#intern or String#to_sym. These should both be used with strong
    caution because...
    * Symbols are never garbage collected. For most cases, this isn't a
    problem. You'll have maybe a hundred or so tiny little symbols floating
    around in memory (thanks to their immediacy), and getting touched quite
    often. However, if you're pulling Symbols out of your hat dynamically,
    then you're juggling gas-torched batons. This, for example, leaks a
    thousand symbols:
    1000.times {|i| :"number #{i}" }
    * At runtime, you can see a list of all the Symbols that have been
    sprung into existence, by typing Symbol.all_symbols (returns an Array of
    Symbols).
    * :bananorama.to_yaml produces a different result from
    'bananorama'.to_yaml.
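
A short sketch pulling several of the bullets above together: the quoted and interpolated literal forms, String#to_sym and String#intern, Symbol.all_symbols, and the to_yaml difference. Variable names are arbitrary, and the YAML lines assume the standard yaml library is available:

```ruby
require 'yaml'

quoted   = :'single quotes'          # quoted literal, spaces allowed
fashion  = 'fashion'
dynamic  = :"normal #{fashion}"      # interpolated at runtime
from_s   = 'normal fashion'.to_sym   # String -> Symbol
interned = 'single quotes'.intern    # same idea, older name

puts quoted.inspect       # :"single quotes"
puts dynamic == from_s    # true
puts interned == quoted   # true

known = Symbol.all_symbols           # an Array of Symbols
puts known.include?(quoted)          # true

yaml_differs = (:bananorama.to_yaml != 'bananorama'.to_yaml)
puts yaml_differs                    # true
```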

    h2. When you might *not* want to use Symbols.

    As pointed out earlier, the principal benefit of using Symbols over
    Strings is to give your mind and eyes a little less work to do in
    figuring out the intent of a given piece of code. Likewise, if what
    you're really doing is preparing a message for the user, or doing
    something else String-like, maybe you want to stick with Strings.

    Bad use of symbols:
    num = [:one, :two][rand(2)]
    puts "Your number is: #{num}"

    Better to use Strings, instead:
    num = ['one', 'two'][rand(2)] #or %w{one two} if you'd like
    puts "Your number is: #{num}"

    h2. The gory details of their implementation.

    For now, don your flame-retardant suit, and visit <a
    href="http://ruby-talk.org/cgi-bin/vframe.rb/ruby/ruby-talk/172818?172638-173519+split-mode-vertical">this
    thread</a>. I'm too lazy/incompetent to type up a summary of this wizardry.

    h2. Links to other explanations.

    If none of my descriptions helped, well, then, too bad. Or, you can
    click some links.

    The following explanations are not necessarily universally condoned by
    the Ruby community, but may fit your fancy (for what it's worth):

    "Symbols as light-weight
    Strings":http://moonbase.rydia.net/mental/blog/programming/ruby-symbols-explained.html
    "Symbols as Integers with human faces":[ruby-talk:173442]
    "Symbols as ever-present bubbles floating in an imperceptible
    ether":[ruby-talk:173076]

    Devin
     
    Devin Mullins, Dec 21, 2006
    #2

  3. matt

    matt Guest

    Excellent work.

    There was a reference to this being an HTML document. Is there an online
    version of this that I can reference?

    Two questions came out of this:

    1) For rails apps that use link_to :blah

    where is :blah being made a symbol? Is it in the base controller?

    How do I know that I need to use :blah, and not some other symbol ?

    (There are other methods that do this, I'm arbitrarily choosing link_to,
    it could be paginate, or link_to_remote, or many others, the concept I
    hope still remains)

    2) Is there any relation of a Ruby Symbol and a C++ pointer or
    reference ? It sounded like that to me as I was reading through, but I
    could be wrong.


    Thanks

    Matt



    On Thu, 2006-12-21 at 14:24 +0900, Devin Mullins wrote:
     
    matt, Dec 21, 2006
    #3
  4. matt wrote:
    > 1) For rails apps that use link_to :blah
    >
    > where is :blah being made a symbol? Is it in the base controller?

    :blah is made a symbol the first time that the interpreter comes across
    the :blah token, either in your source code or in the Rails source code.
    Each subsequent time that the interpreter finds :blah, it points it to
    the preexisting symbol.

    > How do I know that I need to use :blah, and not some other symbol ?

    Convention. You're just passing a Hash, and Rails cares about certain
    key/value pairs in that Hash.
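
A sketch of what "convention" means here. `fake_link_to` is a made-up stand-in, not the real Rails helper, and the `'home'` default is invented for the example. The method only looks up the keys it knows about, so the caller has to spell them the same way:

```ruby
# Hypothetical helper that reads two agreed-upon keys from an options Hash.
def fake_link_to(options)
  action     = options[:action]
  controller = options[:controller] || 'home'   # assumed default
  "/#{controller}/#{action}"
end

url   = fake_link_to(:action => 'show', :controller => 'pets')
extra = fake_link_to(:action => 'show', :colour => 'red')   # unknown key is ignored
wrong = fake_link_to(:acton => 'show')                      # typo: :action is never found

puts url     # /pets/show
puts extra   # /home/show
puts wrong   # /home/
```

Nothing checks the keys for you, so in practice the documentation for each method is where you find out which symbols it cares about.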

    > 2) Is there any relation of a Ruby Symbol and a C++ pointer or
    > reference ? It sounded like that to me as I was reading through, but I
    > could be wrong.

    Not really. All variables in Ruby are references.
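
A tiny sketch of "all variables are references", with arbitrary variable names. The Symbol is just the object being referred to, the same as the String is:

```ruby
a = :blah
b = a
same_symbol = a.equal?(b)   # both variables refer to one Symbol object

s1 = 'text'.dup   # dup guarantees a mutable String
s2 = s1           # second reference to the same String
s2 << ' more'     # mutate through one reference...

puts same_symbol   # true
puts s1            # text more   (...and see it through the other)
```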

    HTH,
    Devin
     
    Devin Mullins, Dec 21, 2006
    #4
  5. Jeremy Wells

    Jeremy Wells Guest

    Devin Mullins wrote:
    > matt wrote:
    >
    >> How do I know that I need to use :blah, and not some other symbol ?

    > Convention. You're just passing a Hash, and Rails cares about certain
    > key/value pairs in that Hash.

    I think that Rails basically converts all hash keys into symbols
    automatically using Hash#symbolize_keys!, so you could pass a string or
    a symbol, but passing a symbol would be faster.
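
For illustration, the same idea in plain Ruby, without Rails. This sketch uses Hash#transform_keys (Ruby 2.5 and later) rather than the Rails method, and says nothing about where Rails actually performs the conversion:

```ruby
string_keys = { 'action' => 'show', 'id' => 7 }

# Build a new Hash whose keys are Symbols; the original is left alone.
symbol_keys = string_keys.transform_keys(&:to_sym)

miss = string_keys[:action]   # a plain Hash treats 'action' and :action as different keys
hit  = symbol_keys[:action]

puts symbol_keys.keys.inspect   # [:action, :id]
puts miss.inspect               # nil
puts hit                        # show
```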
     
    Jeremy Wells, Dec 21, 2006
    #5