Why are arrays and hashes this way?

Discussion in 'Perl Misc' started by Xavier Noria, Apr 11, 2004.

  1. Xavier Noria

    Xavier Noria Guest

    When I introduce references the first thing I mention is that they
    allow us to build nested structures. However, the importance of that
    feature is a consequence of the fact that structures cannot be nested
    themselves.

    Does anybody know why structures were designed so that they could just
    hold scalars?

    -- fxn
     
    Xavier Noria, Apr 11, 2004
    #1

  2. Xavier Noria

    Ala Qumsieh Guest

    Xavier Noria wrote:

    > Does anybody know why structures were designed so that they could just
    > hold scalars?


    AFAIK, that was how Larry Wall originally designed it, and it stayed
    this way for backward compatibility. The introduction of references with
    Perl5 was specifically targeted at adding the ability to build nested
    data structures.

    --Ala
     
    Ala Qumsieh, Apr 11, 2004
    #2

  3. Xavier Noria

    Uri Guttman Guest

    >>>>> "AQ" == Ala Qumsieh <> writes:

    AQ> Xavier Noria wrote:
    >> Does anybody know why structures were designed so that they could just
    >> hold scalars?


    AQ> AFAIK, that was how Larry Wall originally designed it, and it stayed
    AQ> this way for backward compatibility. The introduction of reference
    AQ> with Perl5 was specifically targeted at adding the ability to build
    AQ> nested data structures.

    and even without that, it makes very good sense. the problem with
    storing a real hash where a scalar is, is how do you store it? the slot
    in an SV can hold a single item (a scalar) so what would you put there
    to represent a hash? and if any of those hash elements was a hash, all
    memory hell breaks out. in c, you can only do multidim arrays of known
    element size. with perl you can have each thing at any level be any
    thing of any size. so the win is major flexibility at a cost of
    understanding and dealing with refs. not a bad tradeoff IMO.
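
    A minimal sketch of the point above, with variable names invented for
    illustration: an array placed inside a list flattens, while a reference
    is a single scalar and so fits in one slot.

```perl
use strict;
use warnings;

my @inner = (1, 2);

# Placing an array inside a list flattens it: no nesting happens.
my @flat = (0, @inner, 3);

# A reference is one scalar, so it occupies exactly one slot.
my @nested = (0, \@inner, 3);

print scalar(@flat), "\n";     # 4 elements
print scalar(@nested), "\n";   # 3 elements
print $nested[1][1], "\n";     # 2, reached through the reference
```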

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
     
    Uri Guttman, Apr 11, 2004
    #3
  4. Uri Guttman wrote:
    >>>>>> "AQ" == Ala Qumsieh <> writes:
    >
    >> Xavier Noria wrote:
    > >> Does anybody know why structures were designed so that they could just
    > >> hold scalars?
    >
    > and even without that, it makes very good sense. the problem with
    > storing a real hash where a scalar is, is how do you store it? the
    > slot in an SV can hold a single item (a scalar) so what would you put
    > there to represent a hash? and if any of those hash elements was a
    > hash, all memory hell breaks out. in c, you can only do multidim
    > arrays of known element size. with perl you can have each thing at
    > any level be any thing of any size. so the win is major flexibility
    > at a cost of understanding and dealing with refs. not a bad tradeoff
    > IMO.


    First I was thinking the same but on second thought that is really an
    implementation detail of the compiler that can and should be hidden from the
    user of the language.
    While I agree that sometimes a language designer has to compromise because
    of implementation considerations, in general that should not be the guiding
    principle for designing a computer language.
    Otherwise we would still be stuck with C, a language whose features can
    easily be translated one-to-one into assembler code. And which I refuse to
    call "higher level" because it is too close to the computer architecture,
    thus leaving the programmer with the burden of translating his problem into
    too low a detail and thinking in computer architecture terms instead of in
    problem area terms.
    This may be fine for implementing an operating system, but it is an
    unnecessary burden for application programming.

    In some areas Perl went beyond C, e.g. wrt. memory management for arrays and
    hashes. They are just there, the programmer doesn't have to worry about
    pre-allocating memory, enlarging them, or releasing them when not needed any
    longer. It is a pity that Larry didn't go all the way and eliminate the
    concept of pointers (aka references) altogether. Technically there is no
    need for them, the compiler could manage them automatically and hide them
    from the user.

    The one exception would be how to realize a call-by-reference if there are
    no references in the language. However even that can be solved on the
    language level by an additional keyword like e.g. "VAR" like in Pascal which
    would indicate to the compiler that this parameter is call-by-reference
    rather than call-by-value. Again, no need for an explicit notion ("&foobar")
    of pointers/references in the technical sense.
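
    For comparison, a sketch of how call-by-reference looks in Perl 5 as it
    stands, with the reference taken explicitly by the caller. The sub name
    append_item is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: takes an array reference and modifies the
# caller's array in place.
sub append_item {
    my ($list_ref, $item) = @_;
    push @$list_ref, $item;
    return;
}

my @list = (1, 2);
append_item(\@list, 3);   # the backslash takes the reference explicitly

print "@list\n";          # 1 2 3
```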

    jue
     
    Jürgen Exner, Apr 11, 2004
    #4
  5. On Sun, 11 Apr 2004 06:47:27 GMT, Uri Guttman <>
    wrote:

    > AQ> AFAIK, that was how Larry Wall originally designed it, and it stayed
    > AQ> this way for backward compatibility. The introduction of reference
    > AQ> with Perl5 was specifically targeted at adding the ability to build
    > AQ> nested data structures.
    >
    >and even without that, it makes very good sense. the problem with


    Indeed!

    As a side note, even if I have not had any exposure to Perl4 but in
    terms of corrections to old(-fashioned)/obsolete scripts, I'm
    fascinated by the way these extremely powerful features were added in
    a manner not only backwards-compatible with previous releases of the
    language, but even consistent with them, that is with Perl's basic
    syntax that we all, presumably, appreciate so much!

    >storing a real hash where a scalar is, is how do you store it? the slot
    >in an SV can hold a single item (a scalar) so what would you put there
    >to represent a hash? and if any of those hash elements was a hash, all
    >memory hell breaks out. in c, you can only do multidim arrays of known


    Well, as far as the UI is concerned the "look and feel" of references
    is exactly this, i.e., loosely speaking (about a subset of the meanings
    that can be given to refs), of "arrays and hashes that can be stored
    into a single scalar variable".

    IMHO the only situation when the fake nature of refs as nested
    structures becomes evident is with "copy": it's not so bad in the end,
    and we all do such things routinely either with our own handmade
    solutions or by means of a cloning module, but that's it!

    Now that I come to think of this, IMHO it would be fine if an
    assignment operator existed (what about ':=') that does an automatic
    recursive copy/cloning of its RHS. Or even better, an operator to
    return such a clone... what about:

    <-

    Hmmm, no, would be hell for a parser to tell from "less than,
    minus",

    <--

    no, same cmt,

    <=

    no, already taken! (and context wouldn't help much here, I guess),

    <==

    Hmmm, well, what about this?

    my @AoA = ([1,0], [0,1]);
    my @new_AoA = <== @AoA;
    $new_AoA[0][1]=1; # it's now ([1,1], [0,1]), @AoA unchanged
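
    The effect sketched above can be had today without a new operator. A
    minimal example, assuming the core Storable module and its dclone
    function, which takes a reference and returns a deep copy:

```perl
use strict;
use warnings;
use Storable qw(dclone);

my @AoA = ([1, 0], [0, 1]);

# dclone takes a reference and returns a reference to a deep copy.
my @new_AoA = @{ dclone(\@AoA) };
$new_AoA[0][1] = 1;

print "$new_AoA[0][1] $AoA[0][1]\n";   # 1 0: the original is unchanged
```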

    >element size. with perl you can have each thing at any level be any
    >thing of any size. so the win is major flexibility at a cost of
    >understanding and dealing with refs. not a bad tradeoff IMO.

    ^^^^^^^^^^^^^^^^^^^^^^

    Definitely, IMO too!

    Also, it seems that in Perl6 dealing with references will be made much
    more transparent, won't it?


    Michele,
    whose judgement capabilities may be strongly biased/injured by our
    traditional Easter lunch...
    --
    you'll see that it shouldn't be so. AND, the writting as usuall is
    fantastic incompetent. To illustrate, i quote:
    - Xah Lee trolling on clpmisc,
    "perl bug File::Basename and Perl's nature"
     
    Michele Dondi, Apr 11, 2004
    #5
  6. Xavier Noria

    pkent Guest

    In article <>,
    (Xavier Noria) wrote:

    > When I introduce references the first thing I mention is that they
    > allow us to build nested structures. However, the importance of that
    > feature is a consequence of the fact that structures cannot be nested
    > themselves.


    They can, indeed. I haven't yet read the other 4 posts showing up in
    this thread but I'll guess that they will point out that:

    a) references are scalars
    b) the values in hashes and arrays are scalars
    therefore, using some Ancient Greek logic :)
    c) hashes and arrays may contain other hashes or arrays by holding
    references to them

    See the perlreftut page for more.

    E.g.

    my %nest = (
        foo => [
            1, 2, { x => 'y' }, [ 'r' ]
        ],
        bar => {
            baz  => 'qux',
            quux => [
                'l'
            ],
        }
    );
    print $nest{bar}{quux}[0] . "\n";
    print $nest{foo}[2]{x} . "\n";
    __END__

    P

    --
    pkent 77 at yahoo dot, er... what's the last bit, oh yes, com
    Remove the tea to reply
     
    pkent, Apr 11, 2004
    #6
  7. Xavier Noria

    pkent Guest

    In article <>,
    (Xavier Noria) wrote:

    > When I introduce references the first thing I mention is that they
    > allow us to build nested structures. However, the importance of that
    > feature is a consequence of the fact that structures cannot be nested
    > themselves.
    >
    > Does anybody know why structures were designed so that they could just
    > hold scalars?


    Oh, I see now. I misunderstood what you were getting at. Don't mind me,
    nothing to see here, move along now... :)

    P

    --
    pkent 77 at yahoo dot, er... what's the last bit, oh yes, com
    Remove the tea to reply
     
    pkent, Apr 11, 2004
    #7
  8. Xavier Noria

    Xavier Noria Guest

    Uri Guttman <> wrote in message news:<>...

    > and even without that, it makes very good sense. the problem with
    > storing a real hash where a scalar is, is how do you store it? the slot
    > in an SV can hold a single item (a scalar) so what would you put there
    > to represent a hash? and if any of those hash elements was a hash, all
    > memory hell breaks out. in c, you can only do multidim arrays of known
    > element size. with perl you can have each thing at any level be any
    > thing of any size. so the win is major flexibility at a cost of
    > understanding and dealing with refs. not a bad tradeoff IMO.


    In my opinion the reason cannot be only "because the slot is an SV".

    Why, then, are arrays and hashes data types that cannot be stored in
    SVs? I guess there was some choice made when those data types were
    defined that matters here. My question is why that initial choice was
    made that way. Efficiency? No particular reason but historical
    accident? Different goals than today for which those types were better
    suited?

    I think this is important to know. Since arrays and hashes are
    first-class citizens, it rather surprises newcomers (like me when I
    learned Perl5) that they cannot be nested. The class about structures should
    have some comment like "You see we have all these high-level
    structures so easily handled at the tips of our fingers, but because
    of <<historical reasons>> they cannot be nested. We'll learn how to do
    that when we see references."

    -- fxn
     
    Xavier Noria, Apr 12, 2004
    #8
  9. Xavier Noria

    Uri Guttman Guest

    >>>>> "XN" == Xavier Noria <> writes:

    XN> Uri Guttman <> wrote in message news:<>...

    >> and even without that, it makes very good sense. the problem with
    >> storing a real hash where a scalar is, is how do you store it? the slot
    >> in an SV can hold a single item (a scalar) so what would you put there
    >> to represent a hash? and if any of those hash elements was a hash, all
    >> memory hell breaks out. in c, you can only do multidim arrays of known
    >> element size. with perl you can have each thing at any level be any
    >> thing of any size. so the win is major flexibility at a cost of
    >> understanding and dealing with refs. not a bad tradeoff IMO.


    XN> In my opinion the reason cannot be only "because the slot is an SV".

    XN> I think this is important to know. Being arrays and hashes first-class
    XN> citizens it kind of surprises to newcomers (like me when I learned
    XN> Perl5) that they cannot be nested. The class about structures should
    XN> have some comment like "You see we have all these high-level
    XN> structures so easily handled at the tips of our fingers, but because
    XN> of <<historical reasons>> they cannot be nested. We'll learn how to do
    XN> that when we see references."

    then you need to learn some c. there is no easy way to truly nest stuff
    without pointers in c. any tree of mixed structures must use
    pointers. so now it comes to translating that to perl. how would you
    assign a hash to a scalar slot? currently a hash or an array in a scalar
    context (and this is mostly true in perl4) returns its size. do you make
    a full copy during the assignment? how do you handle looped data? with
    full copies and no references/pointers you can't have data loops. how
    would you pass things around to subs, again with full copies? what does
    it mean to assign an array which has arrays to another array? does it do
    a flattening or a deep copy? you have many more questions like this to
    answer and most of the answers suck for either efficiency reasons or
    behavioral ones. trust me, larry knows what he is doing and by making
    trees require refs he chose a good path. it meant very clean
    compatibility, it made semantics clean and easy to explain. the only issue
    is that it is a little harder for newbies to pick up the concepts of
    refs in trees. but as with much of perl, it may be harder the first time
    to learn it, but the payoff is massive time savings later for
    experienced hackers. perl is meant to save development time, not be a
    sop to newbies who want nested structures without having to think about
    things and all the ugliness they have.
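
    To make the "looped data" and scalar-context points concrete, here is a
    small sketch (the hash and array names are invented for the example).
    The cycle is only possible because each slot holds a reference rather
    than a full copy.

```perl
use strict;
use warnings;

my %parent = (name => 'parent');
my %child  = (name => 'child');

# Each hash stores a reference to the other, forming a cycle.
$parent{child} = \%child;
$child{parent} = \%parent;

# Following the loop leads back to the starting hash.
print $parent{child}{parent}{name}, "\n";   # parent

# An array in scalar context yields its element count, not a copy.
my @items = (10, 20, 30);
my $count = @items;
print "$count\n";                           # 3
```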

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
     
    Uri Guttman, Apr 12, 2004
    #9
  10. On Sun, 11 Apr 2004 15:34:55 +0100, pkent <>
    wrote:

    >c) hashes and arrays may contain other hashes or arrays by holding
    >references to them


    Definitely correct IMHO (see e.g. my other post in this thread) once
    s/may contain/may (fake very nicely to) contain/;


    Michele
    --
    you'll see that it shouldn't be so. AND, the writting as usuall is
    fantastic incompetent. To illustrate, i quote:
    - Xah Lee trolling on clpmisc,
    "perl bug File::Basename and Perl's nature"
     
    Michele Dondi, Apr 13, 2004
    #10
  11. Xavier Noria

    Guest

    (Xavier Noria) wrote:
    > Uri Guttman <> wrote in message
    > news:<>...
    >
    > > and even without that, it makes very good sense. the problem with
    > > storing a real hash where a scalar is, is how do you store it? the slot
    > > in an SV can hold a single item (a scalar) so what would you put there
    > > to represent a hash? and if any of those hash elements was a hash, all
    > > memory hell breaks out. in c, you can only do multidim arrays of known
    > > element size. with perl you can have each thing at any level be any
    > > thing of any size. so the win is major flexibility at a cost of
    > > understanding and dealing with refs. not a bad tradeoff IMO.

    >
    > In my opinion the reason cannot be only "because the slot is an SV".


    You are right, they could have changed that if they wanted to. Or just
    papered over it in the parser and left it the same behind the scenes.

    > Why then arrays and hashes are data types that cannot be stored in
    > SVs? I guess there was some choice made when those data types were
    > defined that matters here. My question is why that initial choice was
    > done like that.


    How would you assign a hash or an array to a scalar?

    $x=@a is already used to mean something different.
    ($x)=@a is already used to mean something different.

    What notation would you use?

    When it comes to dereferencing, what notation would you use?
    And you have to use some notation, because sometimes I want
    a shallow copy and sometimes I want a deep copy, so you have to give
    me the power to declare which one I want. I guess they could have
    made dereferencing the default behavior (and deny that that is what they
    are doing, by decreeing that they weren't references in the first place),
    and you would instead need a special notation for a non-dereferencing
    access. But regardless of which one is the default, the other one is still
    necessary as an option. Or would you forbid the whole concept of multiple
    handles into the same piece of data?
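
    A short sketch of the existing meanings mentioned above, plus the
    difference between a second handle and a shallow copy. Variable names
    are illustrative only.

```perl
use strict;
use warnings;

my @a = ('x', 'y', 'z');

my $count   = @a;     # scalar context: number of elements
my ($first) = @a;     # list context: first element

my $alias   = \@a;    # second handle onto the same array
my $shallow = [@a];   # new array holding copies of the top-level elements

$a[0] = 'changed';

print "$count $first\n";               # 3 x
print "$alias->[0] $shallow->[0]\n";   # changed x
```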

    > Efficiency?


    I doubt it. Behind the scenes it would probably be pretty much the same.
    (Unless you did forbid the concept of multiple handles into the same
    piece of data.)

    > No particular reason but historical
    > accident? Different goals than today for which those types were better
    > suited?


    Well, until I see you give a grammar/syntax which allows us to accomplish
    everything we can currently accomplish, I'll stick with the notion that
    they didn't do it because it is a bad idea.

    > I think this is important to know. Being arrays and hashes first-class
    > citizens it kind of surprises to newcomers (like me when I learned
    > Perl5) that they cannot be nested. The class about structures should
    > have some comment like "You see we have all these high-level
    > structures so easily handled at the tips of our fingers, but because
    > of <<historical reasons>> they cannot be nested. We'll learn how to do
    > that when we see references."


    "Nested structures have a certain irreducible complexity, and you ignore
    this complexity at your own peril. We need to thoroughly understand the
    data structures themselves before we delve into nesting them. We will
    learn how to appropriately deal with this complexity when we learn about
    references."

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
     
    , Apr 13, 2004
    #11
  12. Xavier Noria

    Xavier Noria Guest

    Thank you very much for your response Xho!

    Nevertheless both your post and Uri's seem to answer "why structures
    cannot be nested in Perl 5". That's helpful, but it is not the real
    question. The argument more or less goes: "Those semantics in Perl 5
    are the most reasonable choice because otherwise how could you achieve
    backwards compatibility?".

    But since that's a consequence of history, to answer the question "why
    structures cannot be nested today" we have in turn to answer why they
    didn't nest in previous versions of Perl: why they didn't nest from
    the start, being first-class citizens and containers. I don't mean to
    read Larry's mind, but maybe somebody around just knows.

    -- fxn
     
    Xavier Noria, Apr 14, 2004
    #12
  13. Xavier Noria () wrote:
    : When I introduce references the first thing I mention is that they
    : allow us to build nested structures. However, the importance of that
    : feature is a consequence of the fact that structures cannot be nested
    : themselves.

    : Does anybody know why structures were designed so that they could just
    : hold scalars?

    It's conceptually simple, it's consistent, it allows for all the necessary
    functionality of nested data structures, and the references themselves are
    a general purpose mechanism that provides a lot more than just nested
    structures.

    Anything else would be more complicated in virtually every situation other
    than doing deep copies of nested structures.
     
    Malcolm Dew-Jones, Apr 14, 2004
    #13
  14. Xavier Noria

    Juha Laiho Guest

    (Xavier Noria) said:
    >When I introduce references the first thing I mention is that they
    >allow us to build nested structures. However, the importance of that
    >feature is a consequence of the fact that structures cannot be nested
    >themselves.
    >
    >Does anybody know why structures were designed so that they could just
    >hold scalars?


    Making guesses:
    - space management for structures becomes easier
    - allows for more complex data structures - f.ex. to have structures
    A, B and C so that both B and C refer to a single instance of
    structure A (so, if you change something in the 'A' referred to in
    structure 'B', the same change is seen through 'C')
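
    The second guess can be sketched as follows, using hashes named after
    the structures A, B and C in the list above (the key names are invented
    for the example):

```perl
use strict;
use warnings;

my %A = (value => 1);
my %B = (shared => \%A);
my %C = (shared => \%A);

# Change A through B ...
$B{shared}{value} = 42;

# ... and the change is visible through C, since both hold a reference
# to the same hash.
print $C{shared}{value}, "\n";   # 42
```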

    --
    Wolf a.k.a. Juha Laiho Espoo, Finland
    (GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V
    PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
    "...cancel my subscription to the resurrection!" (Jim Morrison)
     
    Juha Laiho, Apr 25, 2004
    #14