Too Cool for Internet Explorer

Saturday, February 13, 2010

Ruby Symbols Ultimate Guide

In Ruby, Symbol is a special class whose instances act as constant, named labels.

A symbol is written with a leading colon (":").

Example:
:my_test_symbol

A symbol is not a string, but it has a string representation and an object identifier.
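A quick way to see this for yourself is to ask the symbol about itself. This is only a small sketch, and the object identifier you get will differ from run to run:

```ruby
label = :my_test_symbol

puts label.class           # Symbol
puts label.is_a?(String)   # false
puts label.to_s            # my_test_symbol (its String representation)
puts label.inspect         # :my_test_symbol
puts label.object_id       # an Integer; the exact value varies between runs
```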

Ruby newbies very often ask about the advantages of using constants over variables, or symbols over both.

First of all, you should know that Ruby has no truly constant values. A constant is just a name that starts with an uppercase letter, and Ruby does not stop you from reassigning it.

You can, in fact, change the value of a "constant" while your Ruby program runs; Ruby only prints a warning.

So it is your responsibility to keep your Ruby "constants" at the same value.

puts '==========='
FIXED = 'MROS'
puts FIXED
puts FIXED.object_id #=> 25986580
puts '==========='
# after a while you may have a more flexible constant
FIXED = 'MarcRic'
# Ruby only shows you a warning: already initialized constant FIXED
puts FIXED
puts FIXED.object_id #=> 25986550
puts '==========='
FIXED = 'MROS'
puts FIXED
puts FIXED.object_id #=> 25986510
puts '==========='


Strings, even with the same content, are different objects, so they get different object identifiers.

puts '==========='
# Same string content - different objects
puts "ZigZag".object_id #=> 25986960
puts '==========='
puts "ZigZag".object_id #=> 25986940
puts '==========='
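The difference between "same content" and "same object" can also be checked directly. A minimal sketch, using dup so that the two Strings are separate objects even in a file where string literals are frozen:

```ruby
first  = 'ZigZag'.dup
second = 'ZigZag'.dup

puts first == second                       # true: the content is equal
puts first.equal?(second)                  # false: two distinct objects
puts first.object_id == second.object_id   # false
```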


Strictly speaking, a Symbol is a reference to a single object in memory that holds the symbol's name, so the same name always refers to the same object.

Unlike Ruby variables or even Ruby "constants", a Symbol's value cannot be changed.

:mysymbol = 'ZigZag'
# symbols03.rb:1: syntax error, unexpected '=', expecting $end
# :mysymbol = 'ZigZag'
#           ^


In fact, you cannot even read a Symbol's content as text except by converting it, for example with Symbol#to_s.

puts '==========='
puts :ZigZag.object_id #=> 199098
puts :ZigZag.object_id #=> 199098
puts '==========='
puts "ZigZag".to_sym.object_id #=> 199098
puts "ZigZag".to_sym.object_id #=> 199098
puts '==========='
puts :ZigZag == :ZigZag
# true after just one comparison (both sides are the same object)
puts '==========='
puts "ZigZag" == "ZigZag"
# true, but only after the six characters have been compared
puts '==========='
puts :ZigZag.to_s
puts '==========='


The key questions here are:
  • When are Symbols better than Strings?
  • What are the advantages and drawbacks of one over the other?

If you have a few unique string values and will use them exactly as they are (no concatenation, no upper or lower casing, nothing), Symbols are the right tool for the job.

In any other circumstance, stick with Strings. They are not that much slower than Symbols, and they are much more flexible.
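When a label does need to be manipulated now and then, one option is to convert it to a String, do the work there, and convert back. A minimal sketch, where :pending is just an illustrative label:

```ruby
status = :pending

# Do the text manipulation on a String copy of the name
shouted = status.to_s.upcase + '!'
puts shouted                              # PENDING!

# Converting the text back gives the very same Symbol object again
restored = shouted.chomp('!').downcase.to_sym
puts restored.equal?(status)              # true
```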

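Symbol scope is a bit different and needs to be considered: a Symbol does not belong to any scope, so the same name gives you the same object everywhere in your program.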

:tynysymbol
$global_var = 'Zero Level'
puts :tynysymbol.object_id
class TestFirstClass
  @@class_var1 = 'First Level Class'
  def testFirstMethod
    @method_var1 = 'First Level Method'
    puts $global_var
    puts @@class_var1
    puts @method_var1
    puts :firstlocalsymbol.to_s
    puts :tynysymbol.to_s
    # same object_id as outside the class
    puts :tynysymbol.object_id
  end
end
puts '==========='
f = TestFirstClass.new
f.testFirstMethod
puts '==========='
puts $global_var
puts :tynysymbol.to_s
puts :tynysymbol.object_id
puts '==========='
puts :firstlocalsymbol.to_s
puts :secondlocalsymbol.to_s
puts '==========='
puts Symbol.all_symbols.inspect
# prints an empty line: @method_var1 is nil outside the TestFirstClass instance
puts @method_var1
puts '==========='
TestFirstClass.new.testFirstMethod
puts '==========='
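The same point can be made in a more compact way by comparing object identifiers taken in different scopes. This is only a sketch; :shared_label and id_inside_method are names made up for the illustration:

```ruby
top_level_id = :shared_label.object_id

# Hypothetical helper: reads the same symbol from inside a method body
def id_inside_method
  :shared_label.object_id
end

# And once more from inside a block
id_inside_block = [1].map { :shared_label.object_id }.first

puts top_level_id == id_inside_method   # true
puts top_level_id == id_inside_block    # true
```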


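On performance, keep a few points in mind when dealing with large numbers of Strings and Symbols.

About memory:

Strings are garbage collected. Symbols, in the Ruby versions current at the time of writing, are not. (Since Ruby 2.2, symbols created at run time, for example with String#to_sym, can be collected once nothing references them.)

So a Symbol stays in memory for as long as your program runs, and creating many distinct Symbols can consume a lot of memory.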
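To get a feel for how the symbol table grows, here is a rough sketch. The create_labels helper and the label names are made up for the illustration, and the exact growth you see depends on your Ruby version and on what else is loaded:

```ruby
# Hypothetical helper: builds `count` symbols from names created at run time
def create_labels(prefix, count)
  (1..count).map { |i| "#{prefix}_#{i}".to_sym }
end

before = Symbol.all_symbols.size
labels = create_labels('guide_demo_label', 1_000)
after  = Symbol.all_symbols.size

puts "labels created: #{labels.size}"
puts "symbol table grew by about #{after - before} entries"
```

As long as something still references the new symbols (here, the labels array), they stay in the table.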

Have you ever considered how many objects you will have when you set a variable inside a loop?

s = ['eeny','meeny','miny','moe']
puts '==========='
s.each do |v|
  myvar = v
  puts myvar
  puts myvar.object_id
end
puts '==========='
s.each do |v|
  myvar = 'Constant'
  puts myvar
  puts myvar.object_id
end
puts '==========='
s.each do |v|
  myvar = :Symbol
  puts myvar.to_s
  puts myvar.object_id
end
puts '==========='


When creating a large number of Strings and Symbols, how do the two operations compare in performance?

require 'benchmark'
string_test = Benchmark.measure do
  6_000_000.times do
    "test"
  end
end.total
symbol_test = Benchmark.measure do
  6_000_000.times do
    :test
  end
end.total
puts "String: " + string_test.to_s
puts "Symbol: " + symbol_test.to_s
puts '==========='


About CPU time:

Comparing Strings rather than Symbols can consume a lot of CPU time, since the comparison must be done for each character in the string.

require 'benchmark'
string_test = Benchmark.measure do
  3_000_000.times do
    "ZigZag" == "ZigZag"
  end
end.total
sym = Benchmark.measure do
  3_000_000.times do
    :ZigZag == :ZigZag
  end
end.total
puts "String: " + string_test.to_s
puts "Symbol: " + sym.to_s
puts '==========='


Symbol "values" can be retrieved in many different ways. A Symbol can also be used to look up the thing it names, such as a class:

puts '==========='
class TestSymbol
  def test_method
    puts :My_Name.to_s
    puts '==========='
  end
end
o1 = Object.const_get(:TestSymbol).new
o1.test_method
o2 = eval(:TestSymbol.to_s).new
o2.test_method
o3 = Kernel.const_get(:TestSymbol).new
o3.test_method
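The same idea works for method names, not only class names. A small sketch using built-in classes only, so nothing here depends on the example class:

```ruby
# Look up a class by name
klass = Object.const_get(:String)
puts klass.new('ZigZag').length                  # 6

# Call a method by name
puts 'ZigZag'.public_send(:upcase)               # ZIGZAG

# Ask whether an object understands a method name
puts 'ZigZag'.respond_to?(:downcase)             # true

# Turn a method name into a block with &
puts %w[eeny meeny].map(&:capitalize).inspect    # ["Eeny", "Meeny"]
```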


Usage:

Symbols should be used whenever referring to a name (identifier or keyword), even if that name doesn’t exist in actual code yet.

  • Naming keyword options in a method argument list.
  • Naming enumerated values.
  • Naming options in an option hash table.

puts '==========='
def sample_method(arg1 = :arg1, argN = :argN)
  puts arg1
  puts argN
end
sample_method()
puts '--------------------'
sample_method('giving arg1', 'giving argN')
puts '==========='
state_abbrev = {}
state_abbrev[:ALABAMA] = 'AL'
state_abbrev[:CALIFORNIA] = 'CA'
state_abbrev[:DELAWARE] = 'DE'
state_abbrev[:FLORIDA] = 'FL'
state_abbrev[:KANSAS] = 'KS'
puts state_abbrev[:DELAWARE]
puts '--------------------'
# Hash#key finds the key for a value; the older Hash#index was removed in Ruby 3.0
puts state_abbrev.key('CA')
puts '==========='
weekend = {:sat => 'Saturday', :sun => 'Sunday'}
puts weekend[:sat]
puts '--------------------'
puts weekend.keys
puts '--------------------'
puts weekend.values
puts '==========='
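The option hash case deserves a small example of its own. What follows is only a sketch: the connect method, its option names, and its default values are all made up for the illustration:

```ruby
# Hypothetical helper: Symbols name the options, other values hold the data
def connect(options = {})
  defaults = { :host => 'localhost', :port => 3000, :mode => :read_only }
  settings = defaults.merge(options)   # caller's options override the defaults
  "#{settings[:host]}:#{settings[:port]} (#{settings[:mode]})"
end

puts connect                                         # localhost:3000 (read_only)
puts connect(:port => 8080, :mode => :read_write)    # localhost:8080 (read_write)
```

Because every call site uses the same few Symbols as keys, the option names are cheap to compare and never need to be modified as text, which is the situation where Symbols fit best.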


That is it about Ruby Symbols.