Modified Single Table Inheritance

Discussion in 'Ruby' started by Ryan Glover, Dec 26, 2006.

  1. Ryan Glover

    Ryan Glover Guest

    Hello,

    I am developing an application that will have 100's of model classes
    each derived from a single source class. Each model will have 4 similar
    attributes but then have about a dozen unique ones each. I looked into
    using STI but balked at creating tables with a 1000 columns. I then
    looked at using inherits_from and creating an new table for each class
    that holds the new class columns. Problem is, I'll end up with 100's of
    tables, which also does not appeal to me.

    My third solution, the one I am asking advice on, is this.

    I create a single table with the 4 common columns and a dozen columns
    with names such as float1, float2, float3, ... , string1, string2, ..
    etc. Creating enough columns of each type to cover my largest subclass.
    Then I use STI and for each class I map the generic columns to the nicer
    names inside the subclasses. So in one class float1 may be miles/hour
    while in another class it might be turkeys/hectare.

    Does this sound like a reasonable approach?

    Also, how would I map the generic names to nice names in my subclasses?
    (I am a RoR noob)

    And of course, does anyone see a much better way to do what I have
    proposed?

    Thank you,
    Ryan Glover

    --
    Posted via http://www.ruby-forum.com/.
     
    Ryan Glover, Dec 26, 2006
    #1
    1. Advertising

  2. Ryan Glover

    Eric Hodel Guest

    On Dec 26, 2006, at 14:51, Ryan Glover wrote:

    > I am developing an application that will have 100's of model classes
    > each derived from a single source class. Each model will have 4
    > similar
    > attributes but then have about a dozen unique ones each. I looked
    > into
    > using STI but balked at creating tables with a 1000 columns. I then
    > looked at using inherits_from and creating an new table for each class
    > that holds the new class columns. Problem is, I'll end up with
    > 100's of
    > tables, which also does not appeal to me.


    I think you want the Rails mailing list.

    --
    Eric Hodel - - http://blog.segment7.net

    I LIT YOUR GEM ON FIRE!
     
    Eric Hodel, Dec 26, 2006
    #2
    1. Advertising

  3. Ryan Glover

    Mat Schaffer Guest

    On Dec 26, 2006, at 5:51 PM, Ryan Glover wrote:
    > My third solution, the one I am asking advice on, is this.
    >
    > I create a single table with the 4 common columns and a dozen columns
    > with names such as float1, float2, float3, ... , string1, string2, ..
    > etc. Creating enough columns of each type to cover my largest
    > subclass.
    > Then I use STI and for each class I map the generic columns to the
    > nicer
    > names inside the subclasses. So in one class float1 may be miles/hour
    > while in another class it might be turkeys/hectare.
    >
    > Does this sound like a reasonable approach?
    >
    > Also, how would I map the generic names to nice names in my
    > subclasses?
    > (I am a RoR noob)
    >
    > And of course, does anyone see a much better way to do what I have
    > proposed?


    Even though we're not technically on the RoR list, I thought this was
    still an interesting question for ORM, STI, OOP and probably some
    other acronyms too :)

    I'd be curious to hear just what it is that you're doing that needs
    100s of classes, first. Sounds like you might be taking OOP a bit
    too far for your purposes.

    But to answer your question, I like the concept but it brings up fear
    of bugs. Dropping meaningful column names will wreak havoc on any
    cases where you have to write a query by hand. But everything is a
    trade off. Personally, I'd try the 100s of tables route first and
    see how it panned out.

    Best of luck. Sounds like a tricky one.
    -Mat
     
    Mat Schaffer, Dec 27, 2006
    #3
  4. --------------enig769AF9E99DE70C00A818BF51
    Content-Type: text/plain; charset=UTF-8
    Content-Transfer-Encoding: quoted-printable

    Ryan Glover wrote:
    > Hello,
    >=20
    > I am developing an application that will have 100's of model classes
    > each derived from a single source class. Each model will have 4 similar=


    > attributes but then have about a dozen unique ones each.


    Why are they deriving from a single class anyway?

    If it's only for a few identical attributes without significant
    functionality to go along, I'd say it's a bad idea to use inheritance.
    And with latent typing in Ruby, you Rarely If Ever need inheritance
    anyway. Saves you the bother that is inheritance mapping at any rate.

    > Problem is, I'll end up with 100's of
    > tables, which also does not appeal to me.
    >=20


    I can't understand this. It's rather common that an ORM database schema
    has a number of tables in the same order of magnitude as there are model
    classes. My biased opinion is that data should be modelled on the
    database level where you have some theorethical foundations about what
    are "good shapes" for the data. It also doesn't hurt to at least make
    the DB schema somewhat document the data model - the data an application
    gathers / stores is the most likely part of it to persist in the long
    term. Of course, you're probably in a better position to judge this.

    > My third solution, the one I am asking advice on, is this.
    >=20
    > I create a single table with the 4 common columns and a dozen columns
    > with names such as float1, float2, float3, ... , string1, string2, ..
    > etc. Creating enough columns of each type to cover my largest subclass=

    =2E
    > Then I use STI and for each class I map the generic columns to the nice=

    r
    > names inside the subclasses. So in one class float1 may be miles/hour
    > while in another class it might be turkeys/hectare.
    >=20
    > Does this sound like a reasonable approach?
    >=20


    Database schemas like this tend to make it into The Daily WTF with some
    regularity.

    (Unfortunately, they also make it into production systems.)

    If there's a chance someday, someone, somewhy might want to access your
    data at a DB level (using a reporting tool maybe), there'd be no end to
    the grief.

    Or, if a bug in your code gets the database into an inconsistent state,
    and you have to drop down to SQL to patch things up.

    Avoid.

    > And of course, does anyone see a much better way to do what I have
    > proposed?
    >=20


    Use migrations to create the schema from scratch if you just want to
    avoid writing a lot of SQL up front?

    David Vallner


    --------------enig769AF9E99DE70C00A818BF51
    Content-Type: application/pgp-signature; name="signature.asc"
    Content-Description: OpenPGP digital signature
    Content-Disposition: attachment; filename="signature.asc"

    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.4.5 (MingW32)

    iD8DBQFFkxP8y6MhrS8astoRArEBAJ9HDubMhGUbZHdfCV+nscweMInkjACffcfQ
    vBgA0l4e+GbXt3GUNlmTtBE=
    =07NB
    -----END PGP SIGNATURE-----

    --------------enig769AF9E99DE70C00A818BF51--
     
    David Vallner, Dec 28, 2006
    #4
  5. Ryan Glover

    Ryan Glover Guest

    Hello David,

    > Why are they deriving from a single class anyway?


    All the objects are the same thing, so to speak. As a silly example of
    what I am trying to do, let's say that each user an array named fruit.
    There are many types of fruit and each have their own unique attributes
    (# of grapes, fuzziness index of peach, banana radius) but they are all
    fruit. My program is just like this, except I have hundreds of fruit.
    Perhaps even a thousand. Ideally I would like to be able to add more
    fruit as the program evolves just by adding a new fruit subclass and a
    new fruit view.

    >
    > I can't understand this. It's rather common that an ORM database schema
    > has a number of tables in the same order of magnitude as there are model
    > classes. My biased opinion is that data should be modelled on the
    > database level where you have some theorethical foundations about what
    > are "good shapes" for the data. It also doesn't hurt to at least make
    > the DB schema somewhat document the data model - the data an application
    > gathers / stores is the most likely part of it to persist in the long
    > term. Of course, you're probably in a better position to judge this.
    >


    I guess my problem with 100's perhaps a 1000 tables is db performance.
    I don't have oracle, only mysql or postgres. Performance wise, wouldnt
    it be better to have a few thousand rows in one table as opposed to a
    few dozen spread across 100's of tables?


    > Database schemas like this tend to make it into The Daily WTF with some
    > regularity.
    >
    > (Unfortunately, they also make it into production systems.)
    >
    > If there's a chance someday, someone, somewhy might want to access your
    > data at a DB level (using a reporting tool maybe), there'd be no end to
    > the grief.
    >
    > Or, if a bug in your code gets the database into an inconsistent state,
    > and you have to drop down to SQL to patch things up.
    >
    > Avoid.


    I appreciate all these warnings and I understand what you are saying.
    In this case I am client for the project so I'll take my lumps as they
    come.

    Thanks for your response,
    Ryan Glover



    --
    Posted via http://www.ruby-forum.com/.
     
    Ryan Glover, Dec 28, 2006
    #5
  6. --------------enig2D215BDA840148A7A161469D
    Content-Type: text/plain; charset=UTF-8
    Content-Transfer-Encoding: quoted-printable

    Ryan Glover wrote:
    >> I can't understand this. It's rather common that an ORM database schem=

    a
    >> has a number of tables in the same order of magnitude as there are mod=

    el
    >> classes. My biased opinion is that data should be modelled on the
    >> database level where you have some theorethical foundations about what=


    >> are "good shapes" for the data. It also doesn't hurt to at least make
    >> the DB schema somewhat document the data model - the data an applicati=

    on
    >> gathers / stores is the most likely part of it to persist in the long
    >> term. Of course, you're probably in a better position to judge this.
    >>

    >=20
    > I guess my problem with 100's perhaps a 1000 tables is db performance. =


    > I don't have oracle, only mysql or postgres. Performance wise, wouldnt=

    =20
    > it be better to have a few thousand rows in one table as opposed to a=20
    > few dozen spread across 100's of tables?
    >=20


    The history of computing knows only one answer to that question:
    benchmark. (And not even that's much of an answer.)

    However, I have very, very few doubts that separate tables would be
    faster in a read-mostly scenario. The records for that model are
    filtered at insert time instead of query time, and deciding between
    tables is probably way faster than on any ad hoc type index column.

    Generally, a well-designed schema will probably buy you more performance
    in the long run than any sort of peephole hand-optimizations. You want
    to minimize the number of round-trip hits on the database and the volume
    of the datasets retrieved for that (often conflicting goals), and a
    clear schema helps do just that.

    David Vallner


    --------------enig2D215BDA840148A7A161469D
    Content-Type: application/pgp-signature; name="signature.asc"
    Content-Description: OpenPGP digital signature
    Content-Disposition: attachment; filename="signature.asc"

    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.4.5 (MingW32)

    iD8DBQFFkyHHy6MhrS8astoRAuvFAJsHncs5yrEvFhi87+HgFGWuile6mgCeI/kw
    PqBjbS9atsfF2C6nhUUTLsg=
    =vwRq
    -----END PGP SIGNATURE-----

    --------------enig2D215BDA840148A7A161469D--
     
    David Vallner, Dec 28, 2006
    #6
  7. Ryan Glover

    Mat Schaffer Guest

    On Dec 27, 2006, at 8:19 PM, Ryan Glover wrote:
    > All the objects are the same thing, so to speak. As a silly
    > example of
    > what I am trying to do, let's say that each user an array named fruit.
    > There are many types of fruit and each have their own unique
    > attributes
    > (# of grapes, fuzziness index of peach, banana radius) but they are
    > all
    > fruit. My program is just like this, except I have hundreds of fruit.
    > Perhaps even a thousand. Ideally I would like to be able to add more
    > fruit as the program evolves just by adding a new fruit subclass and a
    > new fruit view.


    Well, I'm guessing fruit is just an example. But I think you could
    still group the fruits into subcategories or something to that effect
    and end up with reasonable inheritance. For example, berries
    (strawberry, raspberry) , pitted fruits (peach, concord grape), not
    sure what you'd call a banana but I'd bet an encyclopedia would have
    some info. But I think you get the idea. To get the specific kind
    of fruit, you could just use a type field and get single table
    inheritance on the subcategory.

    It'll depend on your data model. But I'm inclined to agree with the
    rest of group that there are either relationships or duck-typing
    advantages that you're not utilizing.

    And David has a point of schemas that include things like float1,
    etc... do often show up on the daily WTF. That's might be reason
    enough to stay away.

    Thinking another way, what about having a fruit table and a trait
    table. The trait table could reference a large number of
    trait_types. The trait table would reference a fruit, a trait_type
    and a value. You could probably get away with just 3 or 4 value
    columns in the trait table. Just a thought.
    -Mat
     
    Mat Schaffer, Dec 28, 2006
    #7
  8. Ryan Glover

    Paulo Köch Guest

     
    Paulo Köch, Dec 28, 2006
    #8
  9. Ryan Glover wrote:
    > Hello David,
    >
    >> Why are they deriving from a single class anyway?

    >
    > All the objects are the same thing, so to speak. As a silly example of
    > what I am trying to do, let's say that each user an array named fruit.
    > There are many types of fruit and each have their own unique attributes
    > (# of grapes, fuzziness index of peach, banana radius) but they are all
    > fruit. My program is just like this, except I have hundreds of fruit.
    > Perhaps even a thousand. Ideally I would like to be able to add more
    > fruit as the program evolves just by adding a new fruit subclass and a
    > new fruit view.
    >


    Deriving a subclass for each type of different fruit isn't a scalable or
    viable solution.

    I presume your app will use a database, so why not use it to do all the
    work.

    A table for fruit containing all the attributes that remain constant
    accross all fruit, (name, origin perhaps).

    A table of attributes: diameter, colour, fuzziness index, length,
    quantity_in_bunch. There could also be a field on this table to
    determine the type: numeric, string, boolean, etc.

    A join table to determine which fruit get which attribute.

    then when you get an order, the app generates an ordered fruit attribute
    for each attribute for that particular fruit and prompts the user to
    fill in the data.

    Any new fruit that come along in the future just get added to the
    database with the correct attributes, and there is no extra work to do
    in the app - it's all there.


    Does that sort of architecture meet the requirement?

    sr
     
    Stephen Richards, Dec 31, 2006
    #9
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Randall Parker
    Replies:
    1
    Views:
    608
    S. Justin Gengo
    Dec 12, 2005
  2. Chris
    Replies:
    1
    Views:
    13,644
    Oisin
    Mar 24, 2006
  3. Tobias
    Replies:
    1
    Views:
    155
    Tobias
    Jan 21, 2007
  4. Mike
    Replies:
    1
    Views:
    131
    Phrogz
    Mar 12, 2007
  5. Mothra
    Replies:
    10
    Views:
    478
    Anno Siegel
    Mar 11, 2005
Loading...

Share This Page