How and where Fixnum are created

  • Thread starter Eustaquio Rangel de Oliveira Jr.

Eustaquio Rangel de Oliveira Jr.

Hi.

I'm wondering how and where Fixnums are created.
I saw that ObjectSpace does not return immediate values, and I was curious
about why, after

n1, n2, n3, n4 = 1, 2, 3, 1

ObjectSpace.each_object(Fixnum) was returning nothing.
I know that Fixnums are immediate values, and that when "they are assigned or
passed as parameters, the actual object is passed, rather than a reference to
that object."

But after reading that "There is effectively only one Fixnum object instance
for any given integer value, so, for example, you cannot add a singleton
method to a Fixnum.", my question is whether "the actual object" and "only one
Fixnum object instance" mentioned above mean:

1 - An object created for each Fixnum value (like o1 for n1 AND n4, o2 for n2
and o3 for n3). But if so, why do they not appear in the ObjectSpace listing,
and how does the gc treat them?

n1(1) -------------+---> "hidden" fixnum object (object id 3)
n4(1) -------------+
n2(2) -----------------> another "hidden" fixnum object (object id 5)
n3(3) -----------------> another "hidden" fixnum object (object id 7)

2 - Or a single "internal" Fixnum object which is created ONCE, which holds
the VMT, and which is asked for its methods every time they are needed. In
this case n1, n2, n3 and n4 just store a number from -(2**30) to (2**30)-1 and
ask the "internal" Fixnum object for the methods to use with the value.

n1(1) -------------+----> "internal" fixnum (check vmt)
n2(1) -------------+
n3(1) -------------+
n4(1) -------------+

Or any (or some kind of mix of) the two options above?

Thanks.

----------------------------
Eustáquio "TaQ" Rangel
(e-mail address removed)
http://beam.to/taq
GNU/Linux user no. 224050

Robert Klemme

Eustaquio Rangel de Oliveira Jr. said:
Or any (or some kind of mix of) the two options above?

No, it's a third option: Fixnums are not really objects like others. With
Fixnum the object reference is the object itself. But you don't see that
directly because they are made to mostly work like other objects. You can
only observe it indirectly (like you did) in some places, namely

- absence from ObjectSpace
- prohibition of singleton methods

and maybe some more I don't recall because of my sleepiness. From a
pragmatic perspective this all doesn't really matter: since Fixnums are
immutable (no state changes, no singleton methods) anyway it doesn't matter
whether you have one instance representing a certain numeric value or
multiple. What in fact does matter in practice is that the implementation
of Fixnums in Ruby provides better performance than Fixnums that are
ordinary objects; in that case all numeric calculations would suffer because
they would incur the additional overhead of object creation and destruction.

HTH
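
The two observable effects listed above can be checked with a few lines.
This is only a minimal sketch, assuming a current CRuby, where the Fixnum
class discussed in this thread has been merged into Integer (Ruby 2.4 and
later); on the Ruby of this thread you would write Fixnum instead. The
helper names listed_in_object_space? and singleton_method_error are
made up for illustration.

```ruby
# Sketch only: Integer stands in for the thread's Fixnum (merged in Ruby 2.4+).

# 1. Absence from ObjectSpace: only heap-allocated integers (bignums) are
#    walked, so a small value never shows up in the listing.
def listed_in_object_space?(n)
  ObjectSpace.each_object(Integer).any? { |i| i.equal?(n) }
end

# 2. Prohibition of singleton methods: there is no per-object structure to
#    attach the method to, so Ruby raises TypeError for an immediate value.
#    Returns the exception, or nil if the definition succeeded.
def singleton_method_error(obj)
  def obj.double
    self * 2
  end
  nil
rescue TypeError => e
  e
end

puts listed_in_object_space?(42)              # false
puts singleton_method_error(42).class         # TypeError
puts singleton_method_error(Object.new).nil?  # true: ordinary objects allow it
```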

Kind regards

robert
 

Eustaquio Rangel de Oliveira Jr.

Hey Robert!

|> Or any (or some kind of mix of) the two options above?

| No, it's a third option: Fixnums are not really objects like others.
| With Fixnum the object reference is the object itself.

I was checking numeric.c and found

rb_cFixnum = rb_define_class("Fixnum", rb_cInteger);

and method definitions like

rb_define_method(rb_cFixnum, "to_s", fix_to_s, -1);

and others. So there are method definitions there, but does that mean the
object itself handles both the methods and its value?

| But you don't
| see that directly because they are made to mostly work like other
| objects. You can only observe it indirectly (like you did) in some
| places, namely
|
| - absence from ObjectSpace

Yeah, I needed to open gc.c to find

"Immediate objects (Fixnums, Symbols, true, false, and nil) are never
returned."

That gave me a kind of relief, because I was not understanding what was
happening. :)

| and maybe some more I don't recall because of my sleepiness.

Is sleep a rare thing for developers? I'm in that kind of situation too,
especially because I have a little baby girl who likes to go to sleep late
ehehe. :)

| From a
| pragmatic perspective this all doesn't really matter: since Fixnums are
| immutable (no state changes, no singleton methods) anyway it doesn't
| matter whether you have one instance representing a certain numeric
| value or multiple.

Sure, but is there just one or several? :)
Would multiple instances "eat" more memory in this case?

| What in fact does matter in practice is that the
| implementation of Fixnums in Ruby provides better performance than
| Fixnums that are ordinary objects; in that case all numeric calculations
| would suffer because they would incur the additional overhead of object
| creation and destruction. HTH

Yes, that is really a good thing for sure, the implementation is very good.
I'm wondering about this because I'll need to explain it to some developers
who do not know Ruby, and I'm sure they will ask me something about this
case (especially the Python ones). So I need to try to summarize the
behaviour for them.

What is still confusing me is that, as you said, there is no additional
overhead of object creation and destruction, so I can presume that we deal
directly with the values. But where are their methods?

Sorry to bore you, but I need some "light" on this (or more sleep and rest
to think better). :)

Thanks!


Daniel Brockman

Eustaquio,

All Ruby objects are aligned on four-byte boundaries in memory, so all
pointers are divisible by four. The rest of the pointer values are
used to store immediate values such as fixnums.

(This is a pretty standard way to store immediate values in dynamic
languages that lack static typing.)

So when you say this,

foo = Object.new

an object is created and its address (which will be divisible by four)
is stored in `foo'. But when you say this,

bar = 42

the integer 42 is doubled, increased by one, and the resulting value
is stored directly in `bar'. That's how Ruby stores fixnums.

Later, when you say this,

foo.moomin

Ruby sees that `foo' is divisible by four and so must be a pointer to
an object, and a regular method call is made. But when you say this,

bar.snufkin

it sees that `bar' contains 85, which is an odd integer --- not
divisible by four. The oddness means that `bar' is an immediate
Fixnum, so Ruby converts the value to its real meaning by subtracting
one and halving it, giving 42. Then the method `snufkin' is looked up
in the Fixnum class and invoked with `self' bound to 42.
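
The doubling and halving described above can be tried out with ordinary
integer arithmetic. The following is a rough sketch, not Ruby's actual
internals: tag, untag and immediate_fixnum? are illustrative names, and the
address used at the end is a made-up value that merely looks like an aligned
pointer.

```ruby
# Sketch of the tagging scheme, simulated with plain Integer arithmetic.

# Encode an integer as described above: double it, then add one.
def tag(n)
  (n << 1) | 1
end

# Decode: an arithmetic right shift drops the flag bit and halves the rest.
def untag(word)
  word >> 1
end

# An odd word cannot be an aligned pointer, so it must hold a fixnum.
def immediate_fixnum?(word)
  word.odd?
end

word = tag(42)
puts word                             # 85
puts immediate_fixnum?(word)          # true
puts untag(word)                      # 42
puts untag(tag(-3))                   # -3
puts immediate_fixnum?(0x0804_a3f0)   # false: hypothetical aligned address
```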

Okay, I realize that this explanation turned out pretty confusing.
For more information, you might search for ``ruby immediate values.''

I hope this helps at least a bit,
 

Dominik Bathon

If you look at ruby.h, you will find the following defines:

#define FIXNUM_MAX (LONG_MAX>>1)
#define FIXNUM_MIN RSHIFT((long)LONG_MIN,1)

#define FIXNUM_FLAG 0x01
#define INT2FIX(i) ((VALUE)(((long)(i))<<1 | FIXNUM_FLAG))
#define LONG2FIX(i) INT2FIX(i)

...

#define FIX2LONG(x) RSHIFT((long)x,1)
#define FIX2ULONG(x) (((unsigned long)(x))>>1)
#define FIXNUM_P(f) (((long)(f))&FIXNUM_FLAG)
#define POSFIXABLE(f) ((f) <= FIXNUM_MAX)
#define NEGFIXABLE(f) ((f) >= FIXNUM_MIN)
#define FIXABLE(f) (POSFIXABLE(f) && NEGFIXABLE(f))


They should be self-explanatory after Daniel's nice explanation, and
should clear it up further.
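
To see where the range mentioned at the start of the thread comes from, the
limit macros can be recomputed in Ruby. This is a sketch under one
assumption: a 32-bit C long, which is why long_bits defaults to 32 (most
current systems use 64). The method names are mine, chosen to mirror the
macros.

```ruby
# Sketch: the fixnum limits from ruby.h recomputed for a given width of long.

def fixnum_max(long_bits = 32)
  long_max = (1 << (long_bits - 1)) - 1
  long_max >> 1                      # FIXNUM_MAX: one bit goes to FIXNUM_FLAG
end

def fixnum_min(long_bits = 32)
  long_min = -(1 << (long_bits - 1))
  long_min >> 1                      # FIXNUM_MIN: arithmetic shift keeps the sign
end

# Mirrors FIXABLE(f): true when the value fits in a tagged word.
def fixable?(n, long_bits = 32)
  n >= fixnum_min(long_bits) && n <= fixnum_max(long_bits)
end

puts fixnum_max == 2**30 - 1   # true
puts fixnum_min == -(2**30)    # true
puts fixable?(2**30 - 1)       # true
puts fixable?(2**30)           # false: too large for a tagged 32-bit word
```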

You can also see how it works if you look at the object_id of Fixnums:

RangeError: 0xffffffaa is not id value
        from (irb):5:in `_id2ref'
        from (irb):5
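
The relation between a small integer and its object_id can be sketched as
follows. It assumes CRuby, where the object_id of a small integer is the
tagged word itself (2*n + 1); other Ruby implementations may number objects
differently. expected_id is a hypothetical helper.

```ruby
# Sketch (CRuby-specific assumption): object_id of a small integer is 2*n + 1.
def expected_id(n)
  (n << 1) | 1
end

[0, 1, 2, 3, 42, -1].each do |n|
  puts "#{n}.object_id = #{n.object_id} (2*n + 1 = #{expected_id(n)})"
end
```

With this scheme 1, 2 and 3 get the ids 3, 5 and 7, the same ids that
appear in the diagrams of the original question.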


Dominik
 

Eustáquio Rangel de Oliveira Jr.

| Eustaquio,

Hey Daniel. :)

| Later, when you say this,
| foo.moomin
| Ruby sees that `foo' is divisible by four and so must be a pointer to
| an object, and a regular method call is made. But when you say this,

Is a regular method called on an object whose memory holds only the
object's value, OR the value plus a representation of the object's
class, with methods and so on?

What is pointed to by, let's say, a variable that stores a reference to a
String like "this is a long string where I can write a lot of things"? Is
it just the string value, or a complete package with its value and methods?

And what the garbage collector releases when the object is no longer
needed is the answer to the question above (memory allocated for values
OR for values and methods), right?

For example, with

s1, s2, s3, s4 = "this","is","a","test"

do s1, s2, s3 and s4 share the String methods, with their references
pointing to memory locations holding only "this", "is", "a" and "test",
and look the methods up on a String representation

OR

does each reference point to "this" & object methods, "is" & object
methods, "a" & object methods and "test" & object methods?

So it's like

              +-------- gc works here ------------+
              |                                   |
     +--------+---------+  +---------+  +---------+--------+
     | memory allocated |  | String  |  | memory allocated |
     +------------------+  +---------+  +------------------+
     | value            |  | methods |  | value            |
     +------------------+  +---------+  +------------------+
s1 = | "this"           +--+ upcase  +--+ "is"             | = s2
     +------------------+  +---------+  +------------------+

or like

               +-------- gc works here -------+
               |                              |
     +---------+---------+           +--------+---------+
     | memory allocated  |           | memory allocated |
     +---------+---------+           +--------+---------+
     | value   | methods |           | value  | methods |
     +---------+---------+           +--------+---------+
s1 = | "this"  | upcase  |      s2 = | "is"   | upcase  |
     +---------+---------+           +--------+---------+

I think it should be the first one, especially since changing the String
class at runtime changes all String references, but I'm asking just to
be sure. :)

| bar.snufkin
| it sees that `bar' contains 85, which is an odd integer --- not
| divisible by four. The oddness means that `bar' is an immediate
| Fixnum, so Ruby converts the value to its real meaning by subtracting
| one and halving it, giving 42. Then the method `snufkin' is looked up
| in the Fixnum class and invoked with `self' bound to 42.

That gives us the real value of the Fixnum, as you explained, and in this
case the method (snufkin) is looked up on a generic (or "internal") Fixnum
class, as I supposed for String above, not on the reference (the reference
is the object itself), right? Kind of

n1, n2 = 1, 2

     +-------+   +---------+   +-------+
     | value |   | methods |   | value |
     +-------+   +---------+   +-------+
n1 = | 1     +---| succ    +---+ 2     | = n2
     +-------+   +---------+   +-------+

So we can do math directly on the value, as Robert said, without the
overhead of methods and so on, right?
Robert answered me about this, but his email is in my mailbox at work. :-(

| Okay, I realize that this explanation turned out pretty confusing.
| For more information, you might search for ``ruby immediate values.''

No, I really appreciated it. Thank you very much. Sorry to bore you
guys with this kind of thing, but I'm really curious. :)

I searched for that on the web, but I needed some "deeper" info, because
a lot of places just replicate the Pickaxe contents. :)

Best regards,


Daniel Brockman

> Hey Daniel. :)
>
> | Later, when you say this,
> | foo.moomin
> | Ruby sees that `foo' is divisible by four and so must be a pointer
> | to an object, and a regular method call is made.
>
> Is a regular method called on an object whose memory holds only the
> object's value, OR the value plus a representation of the object's
> class, with methods and so on?
>
> What is pointed to by, let's say, a variable that stores a reference
> to a String like "this is a long string where I can write a lot of
> things"? Is it just the string value, or a complete package with its
> value and methods?

Basically just the string data and a pointer to the String class.

> And what the garbage collector releases when the object is no longer
> needed is the answer to the question above (memory allocated for
> values OR for values and methods), right?

Just the memory for the data. The methods are still needed because
there are other strings around.

> For example, with
>
> s1, s2, s3, s4 = "this","is","a","test"
>
> do s1, s2, s3 and s4 share the String methods, with their references
> pointing to memory locations holding only "this", "is", "a" and
> "test", and look the methods up on a String representation

Well, a String object is not just a character array,

    struct RBasic {
        unsigned long flags;
        VALUE klass;
    };

    struct RString {
        struct RBasic basic;
        long len;
        char *ptr;
        union {
            long capa;
            VALUE shared;
        } aux;
    };

but on the other hand, the list of method pointers is certainly not
duplicated for each and every string out there.

[Good call showing actual source code, Dominik.]

> OR
>
> does each reference point to "this" & object methods, "is" & object
> methods, "a" & object methods and "test" & object methods?

No, this is not how it works. That would be terribly inefficient.

> [...]
> I think it should be the first one, especially since changing the
> String class at runtime changes all String references, but I'm
> asking just to be sure. :)

You are correct. :)
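
That the methods live in the String class, and not in each string, can be
observed from Ruby itself. A small sketch follows; shout is a made-up method
added only for this demonstration, and String.new is used so that the two
strings are certainly separate objects.

```ruby
# Sketch: two separate string objects sharing one method table.
s1 = String.new("this")
s2 = String.new("is")

puts s1.equal?(s2)               # false: two distinct heap objects
puts s1.class.equal?(s2.class)   # true: both point at the one String class
puts s1.method(:upcase).owner    # String

# Adding a method to String at runtime affects strings that already exist,
# because the lookup goes through the class. (shout is hypothetical.)
String.class_eval do
  def shout
    upcase + "!"
  end
end

puts s1.shout   # THIS!
puts s2.shout   # IS!
```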

> | bar.snufkin
> | it sees that `bar' contains 85, which is an odd integer --- not
> | divisible by four. The oddness means that `bar' is an immediate
> | Fixnum, so Ruby converts the value to its real meaning by
> | subtracting one and halving it, giving 42. Then the method
> | `snufkin' is looked up in the Fixnum class and invoked with `self'
> | bound to 42.
>
> That gives us the real value of the Fixnum, as you explained, and in
> this case the method (snufkin) is looked up on a generic (or
> "internal") Fixnum class, as I supposed for String above, not on the
> reference (the reference is the object itself), right? Kind of
>
> n1, n2 = 1, 2
>
>      +-------+   +---------+   +-------+
>      | value |   | methods |   | value |
>      +-------+   +---------+   +-------+
> n1 = | 1     +---| succ    +---+ 2     | = n2
>      +-------+   +---------+   +-------+

No, what happens is more like this:

                  +---------+
n1 = 0x00000003   | Fixnum  |
                  +---------+
n2 = 0x00000005   | succ    |
                  +---------+

The `n1' and `n2' variables don't reference any _actual_ objects.
Ruby pretends the objects are there so we can feel warm and fuzzy.
It would be too inefficient to allocate a separate memory block for
each and every small integer out there.
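
The absence of a separate memory block can be made visible by counting
allocations. This is only a sketch and relies on a CRuby-specific
assumption: that GC.stat(:total_allocated_objects) reports the number of
heap objects created so far. The allocations helper is hypothetical.

```ruby
# Sketch: small integers need no heap object, strings do.
n1, n4 = 1, 1
puts n1.equal?(n4)                             # true: the same tagged word
puts String.new("a").equal?(String.new("a"))   # false: two heap objects

# Counts heap objects allocated while the block runs (CRuby-specific).
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

int_allocs = allocations { 1000.times { |i| i + 1 } }
str_allocs = allocations { 1000.times { String.new("x") } }
puts "integer arithmetic: #{int_allocs} allocations"
puts "string creation:    #{str_allocs} allocations"
```

The exact counts vary between Ruby versions, but the string loop should
allocate at least one object per iteration while the integer loop allocates
next to nothing.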

> So we can do math directly on the value, as Robert said, without the
> overhead of methods and so on, right? Robert answered me about this,
> but his email is in my mailbox at work. :-(

Right. Converting a plain C integer to a mangled Ruby fixnum (INT2FIX)
or the other way around (FIX2INT) is really cheap.

> | Okay, I realize that this explanation turned out pretty confusing.
> | For more information, you might search for ``ruby immediate values.''
>
> No, I really appreciated it. Thank you very much. Sorry to bore you
> guys with this kind of thing, but I'm really curious. :)

I'm glad I could be of help. If this subject truly bored me, I would
probably not have responded to your question at all. The truth is
that I needed to freshen up a bit on this myself. :)

> I searched for that on the web, but I needed some "deeper" info,
> because a lot of places just replicate the Pickaxe contents. :)

Ah, yes, that's annoying. Man pages, in particular, tend to be
replicated across so many pages that they pollute search results.
One copy is quite enough please.

--
Daniel Brockman

So really, we all have to ask ourselves:
Am I waiting for RMS to do this? --TTN.
 

Eustáquio Rangel de Oliveira Jr.

Hey Daniel.

| Basically just the string data and a pointer to the String class.

Fine!

| Just the memory for the data. The methods are still needed because
| there are other strings around.

Ok!

| No, this is not how it works. That would be terribly inefficient.

Yes, I know, just asked to be sure ehehe. :)

|> I think it should be the first one, especially since changing the
|> String class at runtime changes all String references, but I'm
|> asking just to be sure. :)
| You are correct. :)

Cool! :)

| No, what happens is more like this:
|
|                   +---------+
| n1 = 0x00000003   | Fixnum  |
|                   +---------+
| n2 = 0x00000005   | succ    |
|                   +---------+
|
| The `n1' and `n2' variables don't reference any _actual_ objects.
| Ruby pretends the objects are there so we can feel warm and fuzzy.
| It would be too inefficient to allocate a separate memory block for
| each and every small integer out there.

Perfect!

| The truth is
| that I needed to freshen up a bit on this myself. :)

That's what is happening to me as I try to understand it more deeply. I
talked with some guys about Fixnum and they asked (and will ask more)
how it works internally, so it's curious how we learn (and remember, in
your case) about things by talking about them. :)

| Ah, yes, that's annoying. Man pages, in particular, tend to be
| replicated across so many pages that they pollute search results.
| One copy is quite enough please.

I have a 150-page Ruby tutorial here and will write about what we
discussed here. At least it can be a little different from the rest.

Thanks Daniel, Dominik and Robert for your explanations. I
think I got the idea of how it works. :)

Best regards,


Daniel Brockman

> | The truth is that I needed to freshen up a bit on this myself. :)
>
> That's what is happening to me as I try to understand it more deeply.
> I talked with some guys about Fixnum and they asked (and will ask
> more) how it works internally, so it's curious how we learn (and
> remember, in your case) about things by talking about them. :)

Indeed, teaching is probably the best way to learn. Like a comb
brushing through hair, a good student forces every strand of your
knowledge into perfection.

> | Ah, yes, that's annoying. Man pages, in particular, tend to
> | be replicated across so many pages that they pollute search
> | results. One copy is quite enough please.
>
> I have a 150-page Ruby tutorial here and will write about what we
> discussed here. At least it can be a little different from the rest.

Sounds cool, good luck with it!

> Thanks Daniel, Dominik and Robert for your explanations.
> I think I got the idea of how it works. :)

You're welcome of course. :)


Daniel Brockman

Whoa, I have no idea what happened to that message.
I apologize for the garbled lines.
 
