Immediate values

  • Thread starter Eustaquio Rangel de Oliveira Jr.

Eustaquio Rangel de Oliveira Jr.

Hi!

I heard that immediate values hold the object itself, not a reference to
it. Is that right?
I mean:

s1 = "test" # a String located at, e.g., 0xCC53D5DF
s2 = s1     # points to the same place as s1
s3 = "test" # ANOTHER String, located at, e.g., 0xC0DD54D0
n1 = 1      # a Fixnum, located at ... ?
n2 = n1     # points to the same place as n1
n3 = 1      # points to the same place as n1

So Fixnum objects (like true, false and nil) use the same object all over
the program, but Strings, for example, do not: even if the values are the
same, they are distinct objects, right?

In the end, aren't n1, n2 and n3 references to this one object allocated
there and shared by all? Aren't all variables references, even in the
Fixnum case, pointing to a single allocated object?

And I think the mark-and-sweep garbage collector treats Fixnum, true, false
and nil the same way as other objects, right?

Thanks! :)

- ----------------------------
Eustáquio "TaQ" Rangel
(e-mail address removed)
http://beam.to/taq
Usuário GNU/Linux no. 224050
 

Austin Ziegler

> I heard that immediate values hold the object itself, not a reference
> to it. Is that right?
>
> I mean:
>
> s1 = "test" # a String located at, e.g., 0xCC53D5DF
> s2 = s1     # points to the same place as s1
> s3 = "test" # ANOTHER String, located at, e.g., 0xC0DD54D0
> n1 = 1      # a Fixnum, located at ... ?
> n2 = n1     # points to the same place as n1
> n3 = 1      # points to the same place as n1
>
> So Fixnum objects (like true, false and nil) use the same object
> all over the program, but Strings, for example, do not: even if
> the values are the same, they are distinct objects, right?
>
> In the end, aren't n1, n2 and n3 references to this one
> object allocated there and shared by all? Aren't all variables
> references, even in the Fixnum case, pointing to a single
> allocated object?

Up until this paragraph I was with you and completely agree.
Variables are all references, even to Fixnum and Symbol objects.
It's an implementation detail that all Fixnum and Symbol instances
of a given value are references to the same object, IMO.

-austin
 

Kaspar Schiess

Hello,

I have already answered this on the IRC channel, but I will try to
collect the answers given there for the record:

Ruby variables hold references to objects. Objects are allocated on the
heap, so a variable saves a pointer to the heap.

n1 = "test"
n2 = n1

# n1 and n2 point to the _same_ object.
n1.object_id == n2.object_id # => true

So you do have to take care not to modify the object through n1, otherwise
you will be modifying what n2 refers to as well. This is actually a problem
less often than one would think.

# look here
n1 += 'test' # creates a new object and leaves n2 alone

# of course (as long as n1 and n2 still refer to the same object)
n1.sub!(/est/, 'ry') # modifies the shared object, visible through n1 and n2
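
The difference between rebinding a variable and mutating the shared object can be checked directly. This is a minimal sketch: the variable names m1, m2 and the helper locals are only illustrative, and String.new is used so the strings are mutable regardless of how string literals are configured.

```ruby
# Two names, one mutable String object.
n1 = String.new("test")
n2 = n1
same_before = n1.equal?(n2)   # both names refer to one object

n1 += "test"                  # builds a new String and rebinds n1 only
rebound = !n1.equal?(n2)      # n1 now refers to a different object
n2_after_plus = n2.dup        # what n2 still sees

m1 = String.new("test")
m2 = m1
m1.sub!(/est/, "ry")          # mutates the shared object in place

puts same_before, rebound, n2_after_plus, m2
```

If the in-place change should not be visible through the second name, taking a copy first (for example with dup) is the usual way out.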

Look also here: http://www.rubycentral.com/book/tut_classes.html,
starting at the title 'Variables'.

Of course, that is how you need to think about objects, rather than how
things look behind the facade. If you are interested in the underlying
reality, I guess you need to look at some code.

best regards,
kaspar
 

Robert Klemme

Austin Ziegler said:
> Up until this paragraph I was with you and completely agree.
> Variables are all references, even to Fixnum and Symbol objects.
> It's an implementation detail that all Fixnum and Symbol instances
> of a given value are references to the same object, IMO.

Totally agree! And it doesn't make a difference from the usage point of
view since instances of Symbol, Fixnum, TrueClass, FalseClass, NilClass
are immutable.

Addendum: it's especially noteworthy that string literals are treated
differently from true, false, Fixnums and Symbols. They create a new
string instance on each evaluation.

5.times { p ["foo", 'foo', :foo, 1, 1.2, 10000000000000, true, false,
nil].map {|o| o.object_id} }
[134663704, 134663692, 3938574, 3, 134665492, 134665228, 2, 0, 4]
[134661376, 134661316, 3938574, 3, 134665492, 134665228, 2, 0, 4]
[134660428, 134660392, 3938574, 3, 134665492, 134665228, 2, 0, 4]
[134659348, 134659336, 3938574, 3, 134665492, 134665228, 2, 0, 4]
[134656588, 134656372, 3938574, 3, 134665492, 134665228, 2, 0, 4]
=> 5
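
A rough equivalent of the experiment above that does not depend on the exact id values, which vary between Ruby versions. This is only a sketch: String.new stands in for a string literal so the result does not depend on whether string literals are frozen, and the variable names are illustrative.

```ruby
# Build the same set of objects three times and keep them alive,
# then compare the object ids across the three evaluations.
samples = Array.new(3) { [String.new("foo"), :foo, 1, true, false, nil] }
rows = samples.map { |objs| objs.map(&:object_id) }

string_ids    = rows.map(&:first)            # one id per freshly built String
immediate_ids = rows.map { |r| r.drop(1) }   # ids of :foo, 1, true, false, nil

puts "string ids all differ: #{string_ids.uniq.size == rows.size}"
puts "other ids are stable:  #{immediate_ids.uniq.size == 1}"
```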

Kind regards

robert
 

Florian Gross

Robert said:
> Totally agree! And it doesn't make a difference from the usage point of
> view since instances of Symbol, Fixnum, TrueClass, FalseClass, NilClass
> are immutable.

Note that there is a small difference for Symbols and Fixnums. You can
not define singleton methods on those objects. Maybe external singleton
classes that work like exivars would help with that.

> Addendum: it's especially noteworthy that string literals are treated
> differently from true, false, Fixnums and Symbols. They create a new
> string instance on each evaluation.

But note that literals with the same content will actually share the
String buffer.
 

Robert Klemme

Florian Gross said:
> Note that there is a small difference for Symbols and Fixnums. You can
> not define singleton methods on those objects.

Yep.

> Maybe external singleton
> classes that work like exivars would help with that.

Uh, oh... What's an "exivar"?

> But note that literals with the same content will actually share the
> String buffer.

True (yet another optimization). But only as long as they are not
modified. It's about the same behavior as if #dup were called on some string
held behind the scenes. Still, another object instance has to be allocated,
which is why in performance-critical parts you're usually better off
defining a frozen string constant if the string doesn't change anyway.

class Dummy
  SAMPLE = "foo".freeze

  def call_often(str)
    str.include? SAMPLE
  end
end

Kind regards

robert
 

Bertram Scharpf

Hi,

On Monday, 10 Jan 2005, 21:09:03 +0900, Kaspar Schiess wrote:
> Ruby variables hold references to objects.

Strictly speaking, Fixnums don't. They are treated in a special
way.

> If you are interested in the underlying
> reality I guess you need to look at some code.

And here it comes (paste to `irb'):

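max = 2**30
a = Array.new(5_000) { rand(max) } ; nil   # 5000 random Fixnums
a.all? { |i| i == i.object_id >> 1 }       # => true: the id is the value shifted left by one bit
a.any? { |i| (i.object_id & 0x1).zero? }   # => false: bit 0 of the id is always set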

Bertram
 

Eustaquio Rangel de Oliveira Jr.

Hi!

| "Immediate values are not pointers: Fixnum, Symbol, true, false, and nil
| are stored directly in VALUE. Fixnum values are stored as 31-bit numbers
| (or 63-bit on wider CPU architectures) that are formed by shifting the
| original number left 1 bit and then setting the least significant bit (bit
| 0) to ``1.'' When VALUE is used as a pointer to a specific Ruby structure,
| it is guaranteed always to have an LSB of zero; the other immediate values
| also have LSBs of zero. Thus, a simple bit test can tell you whether or not
| you have a Fixnum."
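
The shift-and-set-bit encoding in the quoted passage can be written out in a few lines. This is a sketch of the described arithmetic only, not MRI source; the helper names tag_fixnum, untag_fixnum and fixnum_value? are made up for illustration.

```ruby
# Encode an integer the way the passage describes a Fixnum VALUE.
def tag_fixnum(n)
  (n << 1) | 1        # shift left one bit, then set the least significant bit
end

# Recover the integer from such an encoded value.
def untag_fixnum(value)
  value >> 1          # arithmetic shift drops the tag bit again
end

# The "simple bit test" from the passage.
def fixnum_value?(value)
  (value & 1) == 1
end

[0, 1, 3, -1, 1000].each do |n|
  v = tag_fixnum(n)
  puts "#{n} -> #{v} (fixnum? #{fixnum_value?(v)}, back: #{untag_fixnum(v)})"
end
```

For n = 3 this gives 7, which matches the result of the string-based calculation in the irb session below.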

Forgive me if I misunderstood, but does that mean VALUEs are variables?

a = 1

Is a the VALUE, 32 bits long?

So, to find the object_id of *any* Fixnum, all I have to do is
something like:

[taq@~]irb
irb(main):001:0> i = 3
=> 3
irb(main):002:0> s = "0b" << i.to_s(2) << "1"
=> "0b111"
irb(main):003:0> s.to_i(2)
=> 7

The object_id of 3 will always be 7, right? But doesn't i point
somewhere in memory? Is i the object itself?
And when the LSB is 1, is it *always* a Fixnum?

| Naturally, it ignores immutable objects. You can create millions of Fixnum
| objects, yet they take no storage. Not so for String or Bignum objects:

OK, but how does it do that? I mean, I have 31-bit (32?) numbers in my
one thousand variables in the local scope. Where are they stored? They are
somewhere in memory, right? How can they take no storage? Are they
automatically killed when their scope ends? But before that, where are they? :)

Sorry for asking so much about this, but immediate values made me curious. :)

Regards,

- ----------------------------
Eustáquio "TaQ" Rangel
(e-mail address removed)
http://beam.to/taq
Usuário GNU/Linux no. 224050
 

Eustaquio Rangel de Oliveira Jr.


Btw, I'm trying to understand how these things work to see whether Ruby
avoids this kind of thing:

http://evanjones.ca/python-memory.html

Maybe that is on topic right now as well. ;-)

- ----------------------------
Eustáquio "TaQ" Rangel
(e-mail address removed)
http://beam.to/taq
Usuário GNU/Linux no. 224050
 

Florian Gross

Eustaquio said:
> Forgive me if I misunderstood, but does that mean VALUEs are variables?
>
> a = 1
>
> Is a the VALUE, 32 bits long?

a is mapped to a VALUE. VALUEs are pointers to Ruby Object Structs.
Immediate objects are not represented by any Ruby Object Struct -- they
are identified directly by the pointer value itself. If Ruby sees such
a special pointer it does not need to dereference it.

Because there is no actual Ruby Object Struct there are no flags (which
means you can't (un)taint them), no instance variables and no singleton
classes for immediate values. (false, true and nil act as if FalseClass,
TrueClass and NilClass were their meta classes.)
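
The missing singleton classes can be observed from Ruby itself. A minimal sketch, assuming a Ruby that has Object#singleton_class (1.9.2 or later, so newer than the version discussed in this thread); the helper name singleton_class_or_error is hypothetical.

```ruby
# Return the singleton class of obj, or the error class if Ruby refuses.
def singleton_class_or_error(obj)
  obj.singleton_class
rescue TypeError => e
  e.class
end

[1, :sym, nil, true, false].each do |obj|
  puts "#{obj.inspect}: #{singleton_class_or_error(obj)}"
end
```

On MRI the integer and the symbol are expected to report TypeError, while nil, true and false report NilClass, TrueClass and FalseClass.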

Floats and Bignums are not actually immediate values, but they are
immutable and act as if they were immediate by default in some other
contexts. (You can't define singleton methods on them because they use a
custom singleton_method_added callback that raises an exception. It's
implemented by Numeric#singleton_method_added.)

false, true, nil and the special undef value that is not visible
anywhere in Ruby are represented by VALUEs of 0, 2, 4 and 6.

Fixnums are represented by VALUEs with bit 0 set to 1. This means that
there can be up to 2147483648 possible values (-1073741824 ..
1073741823) on 32 bit and up to 9223372036854775808 possible values
(-4611686018427387904 .. 4611686018427387903) on 64 bit systems.
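
The figures above follow from spending one bit of the VALUE on the tag. A small sketch of that arithmetic; the helper name fixnum_range is made up, and the one-bit tag is the only assumption.

```ruby
# Count and range of tagged integers for a given VALUE width in bits.
def fixnum_range(value_bits)
  payload = value_bits - 1                      # bits left after the tag bit
  count   = 2**payload                          # number of representable values
  range   = (-(2**(payload - 1)))..(2**(payload - 1) - 1)
  [count, range]
end

[32, 64].each do |bits|
  count, range = fixnum_range(bits)
  puts "#{bits}-bit VALUE: #{count} values, #{range.first} .. #{range.last}"
end
```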

Symbols are represented by VALUEs with bits 0 to 7 set to 01110000 (written
from bit 0 upward, i.e. a low byte of 0x0e). This means that there can be up
to 16777216 different Symbols on 32 bit and up to 72057594037927936
different Symbols on 64 bit systems.
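
Taken together, the rules above amount to a small decision procedure. The sketch below applies them to some of the ids posted earlier in the thread, on the assumption that for immediate objects the posted id equals the raw VALUE. It reflects the Ruby version discussed here (later versions use different constants), and classify_value is a made-up helper name.

```ruby
# Classify a raw VALUE according to the scheme described in this thread.
def classify_value(v)
  symbol_low_byte = 0x0e                 # bits 0..7 "01110000", read from bit 0 upward
  return [:fixnum, v >> 1] if (v & 1) == 1
  case v
  when 0 then [:false]
  when 2 then [:true]
  when 4 then [:nil]
  when 6 then [:undef]
  else
    (v & 0xff) == symbol_low_byte ? [:symbol, v >> 8] : [:pointer]
  end
end

# Ids of :foo, 1, true, false and nil from the output posted earlier.
[3938574, 3, 2, 0, 4].each do |id|
  puts "#{id} -> #{classify_value(id).inspect}"
end
```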

If I'm wrong with any of those please correct me. It would also be
interesting if somebody could find out what values VALUEs that are not
immediate Objects can have.
 

Florian Gross

itsme213 said:
> You can try to work the angle that value-like things (Fixnums) are
> fundamentally different from objects, but I think there is a simpler
> explanation. Along the general lines some others have said in this thread,
> but with some differences.

Fixnums are Objects, even if they are not represented by an actual
Object structure in Ruby. You can do method calls on them, access their
instance variables etc.

> A variable always 'holds' ('contains', 'its value is') a _reference_ to an
> object. This is true for local variables, instance variables, encoded
> instance variables (see below), integer-indexed variables like array members,
> key-indexed variables like hash entries, global variables. In fact, it is
> even true for Constants; I like to think of all of these as 'slots'; some
> slots may be 'frozen', ... i.e. they can never refer to a different object
> (see below).

I'd just say that Objects are in most cases represented by VALUEs. I'm
not sure what you mean by "encoded instance variables".

> Most references are encoded as in-memory pointers to heap-allocated storage
> corresponding to that object. That heap-allocated storage, in turn, contains
> its own 'variables' (called instance variables), which also always 'hold'
> references to objects. Methods on these objects can refer to their instance
> variables (and to method args, and to globals and constants).

It does not only contain instance variables. There's also information
like klass, flags, string buffer pointers and more in the internal
Structs. (And not all Objects store their instance variables in those
internal Structs, see exivars.)

> Fixnums are encoded as 2's complement bit strings (for example).
> Thus, '0000' is an encoded _reference_ to the fixnum '0'.
> And '0001' is an encoded _reference_ to the fixnum '1'.

I don't quite get this. How is '0000' different from '0' and '0001' from
'1'? If you're talking about object ids you should have used object_id
= 1 + (fixnum << 1). (Just append a '1' to the binary representation.)

> Such encodings also encode a set of 'instance variables' that those objects
> would have explicitly stored had they been represented on the heap like
> normal objects. For example, the encoding of Fixnum '1' implicitly encodes
> references to its adjacent fixnums '0' and '2'. The number '0' existed well
> before the bit pattern '0001' ever appeared in your Ruby program.

I'm not sure if you're using an analogy there, but Fixnums certainly
don't store their neighbors at all. It's a well-known fact that 2+1 ==
3, you don't need to store that in an instance variable.

> Of course, methods on such objects need access to these 'instance
> variables', so the methods are also optimized so they can simply work with
> the encoded reference (rather than with the object itself), and again
> guarantee to return (possibly encoded) references to objects.

Methods take Objects and return Objects. They don't care if that Object
is encoded in a VALUE or actually backed by an actual Object struct.

> The discussion of Fixnums being 'immutable' is right in spirit, but in detail
> bears closer inspection. Certain 'encoded' instance variables of Fixnums are
> frozen. Specifically, their relationships to all other fixnums dictated by
> the laws of math. i.e. these encoded instance variables are automatically
> 'frozen' (there was a long earlier thread on this topic). After all, you
> would not want 2+1 to change from 3 to 73 in the middle of your program.

Again, not sure if this is supposed to be an analogy or simplistic view
of things, but the result of 2+1 is certainly not dictated by instance
variables and there are no 'special frozen instance variables' in Ruby.
And 2+1 can be changed from 3 to 73. After all 2+1 is just calling the
plus method on 2 with the argument 1. And methods can be changed:

class Fixnum
  alias :old_plus :+

  def +(other)
    return 73 if self == 2 and other == 1

    old_plus(other)
  end
end

> However, Fixnums can certainly have other mutable 'instance variables'; we
> just have to handle the internal implementation differently because the
> objects themselves are not heap allocated, so we need some other means to
> get to these 'inst-vars'.
>
> class Fixnum
>   @@foos = {}
>   def foo; return @@foos[self]; end
>   def foo=(x); @@foos[self]=x; end
> end

These are not instance variables, these are class variables.

Why not just use instance variables directly? (After all, Fixnums can
have instance variables. They are not stored in the Object struct of
Fixnums, of course, as Fixnums have no Object structs. So where do they
go? There's a global exivar table for Objects that cannot store their
instance variables in Object structs. They go there. You don't notice
this, of course, and that's a good thing.)
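
The general idea of keeping per-object data in a table outside the object can be sketched in plain Ruby. This only illustrates the concept and is not how Ruby's internal exivar table is implemented; the SideTable module and its methods are hypothetical names.

```ruby
# A side table that stores named values per object, keyed by identity,
# without touching the object itself.
module SideTable
  STORE = Hash.new { |hash, key| hash[key] = {} }.compare_by_identity

  def self.set(obj, name, value)
    STORE[obj][name] = value
  end

  def self.get(obj, name)
    STORE.fetch(obj, {})[name]   # fetch avoids creating an entry on read
  end
end

SideTable.set(1, :foo, 5)
puts SideTable.get(1, :foo).inspect
puts SideTable.get(2, :foo).inspect
```

One design consequence of a table like this is that entries stay alive until they are removed explicitly, since the table itself references them.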
 

Robert Klemme

itsme213 said:
> You can try to work the angle that value-like things (Fixnums) are
> fundamentally different from objects, but I think there is a simpler
> explanation. Along the general lines some others have said in this thread,
> but with some differences.
>
> A variable always 'holds' ('contains', 'its value is') a _reference_ to an
> object. This is true for local variables, instance variables, encoded
> instance variables (see below), integer-indexed variables like array members,
> key-indexed variables like hash entries, global variables. In fact, it is
> even true for Constants; I like to think of all of these as 'slots'; some
> slots may be 'frozen', ... i.e. they can never refer to a different object
> (see below).

Although I agree with the rest of your detailed explanation I beg to
differ on this one: "they" refers to variables and constants and *every*
variable and constant can be made to point to another object:
(irb):2: warning: already initialized constant Foo
=> 2
=> 2

You probably wanted to say that for some classes there is just one
instance of a specific value:
=> 3
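
The irb input lines did not survive in the post above. As an assumption about the kind of session being described, and not a reconstruction of the original, the two points can be sketched as follows. The local module demo and the constant name Foo are used for illustration only.

```ruby
# Rebinding a constant: Ruby prints a warning but allows it.
demo = Module.new
demo.const_set(:Foo, 1)
demo.const_set(:Foo, 2)            # warns: already initialized constant
rebound_value = demo.const_get(:Foo)

# One instance per value for small integers: the id never varies.
ids = Array.new(3) { 1.object_id }

puts rebound_value
puts ids.uniq.size
```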

So the fixnum representing the number 1 is always represented by the same
instance. (As you point out below, technically it's more complicated
because the "instance" is different from other classes' instances, but from
the perspective of the user this doesn't matter as long as he does not try
to define methods for 1.)

> Most references are encoded as in-memory pointers to heap-allocated storage
> corresponding to that object. That heap-allocated storage, in turn, contains
> its own 'variables' (called instance variables), which also always 'hold'
> references to objects. Methods on these objects can refer to their instance
> variables (and to method args, and to globals and constants).
>
> Some references are optimized and encoded differently.
>
> Fixnums are encoded as 2's complement bit strings (for example).
> Thus, '0000' is an encoded _reference_ to the fixnum '0'.
> And '0001' is an encoded _reference_ to the fixnum '1'.
> Such encodings also encode a set of 'instance variables' that those objects
> would have explicitly stored had they been represented on the heap like
> normal objects. For example, the encoding of Fixnum '1' implicitly encodes
> references to its adjacent fixnums '0' and '2'. The number '0' existed well
> before the bit pattern '0001' ever appeared in your Ruby program.
>
> Of course, methods on such objects need access to these 'instance
> variables', so the methods are also optimized so they can simply work with
> the encoded reference (rather than with the object itself), and again
> guarantee to return (possibly encoded) references to objects.
>
> Fixnum# + (other_fixnum)
>   return a (encoded, optimized) reference to another fixnum
>   e.g. '0001' + '0001' => '0010'
>
> The discussion of Fixnums being 'immutable' is right in spirit, but in detail
> bears closer inspection. Certain 'encoded' instance variables of Fixnums are
> frozen. Specifically, their relationships to all other fixnums dictated by
> the laws of math. i.e. these encoded instance variables are automatically
> 'frozen' (there was a long earlier thread on this topic). After all, you
> would not want 2+1 to change from 3 to 73 in the middle of your program.
>
> However, Fixnums can certainly have other mutable 'instance variables'; we
> just have to handle the internal implementation differently because the
> objects themselves are not heap allocated, so we need some other means to
> get to these 'inst-vars'.
>
> class Fixnum
>   @@foos = {}
>   def foo; return @@foos[self]; end
>   def foo=(x); @@foos[self]=x; end
> end
>
> irb(main):028:0> 1
> => 1
> irb(main):029:0> 1.foo
> => nil
> irb(main):030:0> 1.foo= 5
> => 5
> irb(main):031:0> 1.foo
> => 5
>
> Whatever works for you ...

Although this works I'd rather not have this kind of instance variables
for fixnums because for me fixnums are atoms (i.e. building blocks of the
language) without internal structure (ok, you may replace "atom" with
"quark"...). Also there is the issue with garbage collection of these
"instance variables"...

Kind regards

robert
 

ts

F> Symbols are represented by VALUEs with bit 0 to 7 set to 01110000. This
F> means that there can be up to 16777216 different Symbols on 32 bit and

2**23 - 1

F> up to 72057594037927936 different Symbols on 64 bit systems.



Guy Decoux
 

Florian Gross

ts said:
> F> Symbols are represented by VALUEs with bit 0 to 7 set to 01110000. This
> F> means that there can be up to 16777216 different Symbols on 32 bit and
>
> 2**23 - 1

Hm, why not 2 ** 24 - 1? (Is one bit used for a special purpose or am I
overlooking something?)

Thanks.
 

ts

F> Hm, why not 2 ** 24 - 1? (Is one bit used for a special purpose or am I
F> overlooking something?)

It must perform the transformations ID ==> Symbol and Symbol ==> ID, and
if I'm right it loses a bit (the "sign" bit) in doing so.

This gives one free bit, if one day you want to hack ruby :)


Guy Decoux
 

Ruth A. Kramer

Florian said:
> I'd just say that Objects are in most cases represented by VALUEs. I'm
> not sure what you mean by "encoded instance variables".

(Not "picking" on Florian in particular, just needed a convenient quote
to respond to.)

This thread confuses me, and I think it's at least partly because of the
"overloaded" use of the word "value". ;-)

For me, I'd like to separate address and value to the extent possible.

In my (surely incorrect) words (and based on my understanding of Ruby),
a variable representing an object holds the memory address of that
object. The value of that object is something different, for example,
if the object is the number 2, the value of that object is its numeric
value, 2.

Yes, you can say that the value of the variable holding the reference to
the object is its memory address, but, to me, that just increases the
(potential for) confusion.

Is there an agreed upon vocabulary for these things in Ruby? If so,
what is it, if not, should there be?

Randy Kramer
 

Florian Gross

Ruth said:
> (Not "picking" on Florian in particular, just needed a convenient quote
> to respond to.)
>
> This thread confuses me, and I think it's at least partly because of the
> "overloaded" use of the word "value". ;-)

Oh, note that my talking about VALUE was just about the way Objects are
represented. VALUEs are a frequently used way to refer to Objects. It's a
C type that Ruby uses internally. It's usually just a pointer to some
object data in the form of an RBasic-compatible struct. (Or a magic
number in the case of immediate objects.)

> For me, I'd like to separate address and value to the extent possible.

I guess one could say that addresses are represented by VALUEs.

I'm not sure of the relationship between values and VALUEs. Maybe you
could say that the value of a variable is indicated by a VALUE. But all
this VALUE stuff does not matter anyway when you're coding in Ruby. I
think it can safely be ignored until you write a C extension or decide
to hack at Ruby's internals via Ruby/DL. ;)

I hope I did not confuse anybody, even if this was not directed to me in
particular. :)
 

Florian Gross

ts said:
> F> Hm, why not 2 ** 24 - 1? (Is one bit used for a special purpose or am I
> F> overlooking something?)
>
> It must perform the transformations ID ==> Symbol and Symbol ==> ID, and
> if I'm right it loses a bit (the "sign" bit) in doing so.
>
> This gives one free bit, if one day you want to hack ruby :)

Hmmm, maybe this can one day be used for representing Characters in
an efficient way. 0 .. 8388607 might not be the full Unicode range, but it
would be a nifty optimization for the UCS2 space at least.

Now I'm only wondering if the number of Symbols expands to
36028797018963967 on 64 bit systems or if it is fixed independently of
native bit width.

Thanks for your correction and explanation!
 

why the lucky stiff

Ruth said:
> (Not "picking" on Florian in particular, just needed a convenient quote
> to respond to.)
>
> This thread confuses me, and I think it's at least partly because of the
> "overloaded" use of the word "value". ;-)

Hi, Ruth. So, when someone on the list uses uppercase VALUE, they are
referring to an object's symbol table id. The symbol table is the
global dictionary that pairs up a unique id with pointers to the actual
objects. VALUE is a C typedef.

Sometimes when I'm conversing with other languages, such as in my Syck
extension, I'll rename VALUE as SYMID, in the hopes of clearing things
up a bit.

So, yeah, it's a generic word, but it's also referring to the most
common and generic form of data used inside Ruby.

_why
 
