a Hash and a Set (sitting in a tree)

  • Thread starter Shot (Piotr Szotkowski)
  • Start date
S

Shot (Piotr Szotkowski)

--tgO4DUqb0GCKXXLG
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

(For those keeping track at home of my Ruby vs. Ruby/C vs. C++
adventures, I decided to go with pure Ruby and optimise afterwards
if/where needed and my supervisor agreed. I also started coding
yesterday, and would like to state that (a) Ruby test-driven development
is even more sexy than advertised and (b) humblelittlerubybook.com is
a really great book.)

On to my problem, then. :) First, the terminology: a block is a set of
vectors (say, integers), a blanket is a set of blocks, and a blanket
might be, but in most cases isn=E2=80=99t, encoded =E2=80=93 i.e., might ha=
ve its blocks
named. Thus an example, unencoded two-block blanket might look like
this: {{1, 9, 4, 7}, {1, 9, 3, 7}}

My obvious first approach was to create a Block class with two
properties, a @vectors Set and an @encoding Symbol, and a Blanket
class with a @blocks Set.

Now that I think about it, though, most of the computing will happen
with unencoded blankets, and in this case there=E2=80=99s no need for a Blo=
ck
class at all; just a Blanket with a Set of Sets. First, this (I presume)
would make things a bit faster and less memory hungry; second, the
encoding makes sense only at the Blanket level, really.

Ufortunately, an encoded blanket screams to be represented as a Hash of
Sets (keyed with the encoding Symbols), while an unencoded blanket is
best represented as a Set of Sets, and can=E2=80=99t be a Hash.=20

What would be the best approach to this situation? An EncodedBlanket
subclass of Blanket? A Blanket class that is a Hash of Symbols keyed
with Sets (i.e., the other way around, with nil hash values for
unencoded blankets)? Something else? Basically, what I need is
a hash that doesn=E2=80=99t have to have keys=E2=80=A6

-- Shot
--=20
Can't you feel the peace and contentment in this block of code? Ruby
is the language Buddha would have programmed in. -- Sean Russell

--tgO4DUqb0GCKXXLG
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)

iD8DBQFFtypji/mCfdEo8UoRAu1wAKCaflIbvGee7dDRh8uqhrCk1fOXcwCeIbs4
r2zh1UFwXIHGEywH5hZSOH8=
=XavL
-----END PGP SIGNATURE-----

--tgO4DUqb0GCKXXLG--
 
R

Robert Klemme

(For those keeping track at home of my Ruby vs. Ruby/C vs. C++
adventures, I decided to go with pure Ruby and optimise afterwards
if/where needed and my supervisor agreed. I also started coding
yesterday, and would like to state that (a) Ruby test-driven development
is even more sexy than advertised and (b) humblelittlerubybook.com is
a really great book.)

On to my problem, then. :) First, the terminology: a block is a set of
vectors (say, integers), a blanket is a set of blocks, and a blanket
might be, but in most cases isn’t, encoded – i.e., might have its blocks
named. Thus an example, unencoded two-block blanket might look like
this: {{1, 9, 4, 7}, {1, 9, 3, 7}}

My obvious first approach was to create a Block class with two
properties, a @vectors Set and an @encoding Symbol, and a Blanket
class with a @blocks Set.

Now that I think about it, though, most of the computing will happen
with unencoded blankets, and in this case there’s no need for a Block
class at all; just a Blanket with a Set of Sets. First, this (I presume)
would make things a bit faster and less memory hungry; second, the
encoding makes sense only at the Blanket level, really.

Ufortunately, an encoded blanket screams to be represented as a Hash of
Sets (keyed with the encoding Symbols), while an unencoded blanket is
best represented as a Set of Sets, and can’t be a Hash.

What would be the best approach to this situation? An EncodedBlanket
subclass of Blanket? A Blanket class that is a Hash of Symbols keyed
with Sets (i.e., the other way around, with nil hash values for
unencoded blankets)? Something else? Basically, what I need is
a hash that doesn’t have to have keys…

Initially I would defer considerations of performance. I'd start with
the cleanest design. At the moment I don't have any clue as to what
processing you are doing so that would probably determine how I would
set up classes. The most straightforward way would probably be to have
a BaseBlanket, an EncodedBlanket, a PlainBlanket (unencoded) and a Block.

Whether the BaseBlanket makes sense is probably determined by how much
an EncodedBlanket and a PlainBlanket have in common, i.e. how much
functionality they share. That then would go into the BaseBlanket.

Kind regards

robert
 
S

Shot (Piotr Szotkowski)

--5cJDuji44TeXTx+X
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Thanks a lot for your reply, Robert!

Robert Klemme:
Initially I would defer considerations of performance. I'd start with
the cleanest design. At the moment I don't have any clue as to what
processing you are doing so that would probably determine how I would
set up classes.

Mostly basic set operations on the blocks; Set#&, Set#|,
Set#- and Set#^ seem to cover quite a lot of the base.
The most straightforward way would probably be to have a BaseBlanket,
an EncodedBlanket, a PlainBlanket (unencoded) and a Block.

With this approach (i.e., moving the encoding information to the blanket
level, which actually makes sense) Block would be just a Set, nothing
more.
Whether the BaseBlanket makes sense is probably determined by how much
an EncodedBlanket and a PlainBlanket have in common, i.e. how much
functionality they share. That then would go into the BaseBlanket.

In theory, EncodedBlanket is just a Blanket with its blocks named,
so a Blanket class with an EncodedBlanket subclass should cover it.

The catch is, an unencoded blanket can be represented as a Set of
blocks, while and encoded blanket can=E2=80=99t; conversely, an encoded bla=
nket
can be represented as a Hash of (encoding =3D> block) pairs, while an
unencoded blanket can=E2=80=99t.

Maybe I should go with a single Blanket class with an Array of blocks
and a =E2=80=98synchronised=E2=80=99 Array of encodings that would simply b=
e nil for
unencoded blankets? I don=E2=80=99t need the ordering Array would bring in,
but a common class for un- and encoded blankets would simplify things
greatly.

-- Shot
--=20
May the --force be with you. -- <[tafkam]>

--5cJDuji44TeXTx+X
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)

iD8DBQFFt0Ybi/mCfdEo8UoRAhTOAKCTRtofIQyfB4lLRKukrrnvKGpv4ACfS5SR
d9VJ6qVpHuWhE8o8GpGyV6Y=
=ObD1
-----END PGP SIGNATURE-----

--5cJDuji44TeXTx+X--
 
R

Robert Klemme

Thanks a lot for your reply, Robert!

Robert Klemme:


Mostly basic set operations on the blocks; Set#&, Set#|,
Set#- and Set#^ seem to cover quite a lot of the base.

And how do you access or find blocks (and blankets)?
With this approach (i.e., moving the encoding information to the blanket
level, which actually makes sense) Block would be just a Set, nothing
more.

Well, but with a block you get clear semantics at the price of a low
overhead. The way you speak of blocks seems to indicate to me that
there is more to them which then would make them a good thing to have.
At least it's much easier to later add methods to Block than changing
the whole code from Set to Blanket if at some point in time you discover
that you need additional methods.
In theory, EncodedBlanket is just a Blanket with its blocks named,
so a Blanket class with an EncodedBlanket subclass should cover it.
Maybe.

The catch is, an unencoded blanket can be represented as a Set of
blocks, while and encoded blanket can’t; conversely, an encoded blanket
can be represented as a Hash of (encoding => block) pairs, while an
unencoded blanket can’t.

You should probably try to look at it more from the interface
perspective: you talk a lot about internal representation while I think
it's more important to get the interface straight. Questions I would as
are: what entities are there? How are they connected? How are they
used? etc. (You could do this on a whiteboard or with CRC cards as well.)
Maybe I should go with a single Blanket class with an Array of blocks
and a ‘synchronised’ Array of encodings that would simply be nil for
unencoded blankets? I don’t need the ordering Array would bring in,
but a common class for un- and encoded blankets would simplify things
greatly.

In that case it's you could also have a single Array containing two
element Arrays (Blanket, Encoding). Whether to use that, your version
or a Hash is mostly determined by how you access (i.e. do you need the
encoding as a key?).

Kind regards

robert
 
S

Shot (Piotr Szotkowski)

--qLni7iB6Dl8qUSwk
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Robert Klemme:
Well, but with a block you get clear
semantics at the price of a low overhead.

The catch was, I considered even just a 10% overhead to be worth
engineering the classes in a simpler, if less-obvious way. First,
though, I figured that if test-driven Ruby development was so easy
to learn, maybe benchmarking in Ruby is also easy, and I came with
these: http://shot.pl/blankets/

The results suggest the overhead is negligible (to the point of
SetBlanket being often slower than Blanket), but if you (or somebody
else) could take a quick look whether the test is worth anything I would
be most grateful.
The way you speak of blocks seems to indicate to me that there is more
to them which then would make them a good thing to have. At least it's
much easier to later add methods to Block than changing the whole code
from Set to Blanket if at some point in time you discover that you
need additional methods.

I agree, and I concede my previous approaches. :)

Now that I thought of it, Block could simply be a subclass of Set with
an additional instance variable of @encoding. [Googles a bit, finds
a way for a class to call it=E2=80=99s parent=E2=80=99s constructor, rejoic=
es.]
Something to the point of

class Block < Set
def initialize enum =3D [], encoding =3D nil
super enum
@encoding =3D encoding
end
end

Any pitfalls a Ruby newbie should be aware of with such an approach?
You should probably try to look at it more from the interface
perspective: you talk a lot about internal representation while
I think it's more important to get the interface straight.

The problem is that I need a base library with a Blanket class that
I could experiment on; I=E2=80=99m not yet sure what the final decomposition
algorithm will look like, I need a Blanket class that I could twist
and bend (almost all the algorithm parts are algebra operations on the
blankets as sets of blocks and blocks as sets of integers).
Questions I would as are: what entities are there?
How are they connected? How are they used? etc.

The blankets are sets of blocks which are sets of integers. Sometimes
the blankets are encoded, i.e., their blocks have labels. I need to be
able to multiply, merge and compare the blankets.

Thanks a lot for your time and your
answers, Robert; I really appreciate it!

-- Shot
--=20
It is difficult to produce a television documentary that is both
incisive and probing when every twelve minutes one is interrupted by
twelve dancing rabbits singing about toilet paper. -- Rod Serling

--qLni7iB6Dl8qUSwk
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)

iD8DBQFFu4VQi/mCfdEo8UoRAqUnAJ4wOkzAQ21mQdxCV27OGZIL0PMtJACfReBt
1QaGK56dDydStMT22sJM1uk=
=SA9V
-----END PGP SIGNATURE-----

--qLni7iB6Dl8qUSwk--
 
R

Robert Klemme

Robert Klemme:


The catch was, I considered even just a 10% overhead to be worth
engineering the classes in a simpler, if less-obvious way. First,
though, I figured that if test-driven Ruby development was so easy
to learn, maybe benchmarking in Ruby is also easy, and I came with
these: http://shot.pl/blankets/

Btw, you can more easily create your Arrays of randoms with the block form:

irb(main):001:0> Array.new(5) { rand 8 }
=> [0, 5, 6, 6, 6]

Much less typing. :)
Now that I thought of it, Block could simply be a subclass of Set with
an additional instance variable of @encoding. [Googles a bit, finds
a way for a class to call it’s parent’s constructor, rejoices.]
Something to the point of

class Block < Set
def initialize enum = [], encoding = nil
super enum
@encoding = encoding
end
end

Any pitfalls a Ruby newbie should be aware of with such an approach?

Hm... It seems generally considered that delegation is preferred over
inheritance with basic classes like Set. You have to be at least aware
that you inherit *all* methods of Set - this might or might not be what
you want. In this case I cannot see a Set method that you might not
want to inherit but one should be aware of this.
Thanks a lot for your time and your
answers, Robert; I really appreciate it!

You're welcome!

Kind regards

robert
 
S

Shot (Piotr Szotkowski)

--liqSWPDvh3eyfZ9k
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Robert Klemme:
Btw, you can more easily create your
Arrays of randoms with the block form:
irb(main):001:0> Array.new(5) { rand 8 }
=3D> [0, 5, 6, 6, 6]
Much less typing. :)

Thanks! The randomizer was a last-minute addition, in the middle of
writing the previous email, hence the clumsy approach. This being Ruby,
I knew there should be a more elegant way. :)
Hm... It seems generally considered that delegation is
preferred over inheritance with basic classes like Set.

I=E2=80=99m still trying to wrap my mind around delegation, but both the RD=
oc
and Pitchfork One examples seem too clever for me. Jay Fields=E2=80=99s Ruby
Delegation Module[1], on the other hand, looks quite obvious=E2=80=A6

[1] http://jayfields.blogspot.com/2006/03/ruby-delegation-module.html
You have to be at least aware that you inherit *all* methods of Set
- this might or might not be what you want. In this case I cannot
see a Set method that you might not want to inherit but one should
be aware of this.

In the case of blocks, I think inheritance doesn=E2=80=99t have any drawbac=
ks.
I might consider delegation (to Set) with the Blanket class, though.
Thanks again for pointing in the right (and/or interesting) directions!

-- Shot
--=20
The above comment may be extremely inflamatory. For your protection,
it has been rot13'd twice. -- JWhitlock, /.

--liqSWPDvh3eyfZ9k
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)

iD8DBQFFu9Mwi/mCfdEo8UoRArEqAKCquFBdMbs60Fv9r9lCQ9hLJAuRyQCfftrH
wuf6xS40tWAlBV5653fqwz4=
=m4cj
-----END PGP SIGNATURE-----

--liqSWPDvh3eyfZ9k--
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,537
Members
45,021
Latest member
AkilahJaim

Latest Threads

Top