No more than N elements of an array

Discussion in 'Perl Misc' started by Tim McDaniel, Jul 25, 2013.

  1. Tim McDaniel


    A cow-orker is coding something: he gets an array of results, but
    wants to take no more than the first 1000 elements. His suggestion was

    @results = @results[0 .. (MAXSEARCHRESULTS - 1, $#results)[MAXSEARCHRESULTS - 1 > $#results ]];

    He felt it was "awesome". I thought it was way too cute.

    Of course there's
    @results = @results[0 .. (MAXSEARCHRESULTS - 1 > $#results ? $#results : MAXSEARCHRESULTS - 1)];
    It has the same number of uses of variables, just in a different order.

    I considered plain @results[0 .. MAXSEARCHRESULTS - 1]. It works if
    the array is longer than the limit, but pads with undef if shorter:

    $ perl -e 'use strict; use warnings; my @a = (0, 1, 2, 3, 4, 5, 6); @a = @a[0..10]; print join(",", @a), "\n"'
    Use of uninitialized value $a[7] in join or string at -e line 1.
    Use of uninitialized value $a[8] in join or string at -e line 1.
    Use of uninitialized value $a[9] in join or string at -e line 1.
    Use of uninitialized value $a[10] in join or string at -e line 1.
    0,1,2,3,4,5,6,,,,

    I thought of splice.

    $ perl -e 'use strict; use warnings; my @a = (0, 1, 2, 3, 4, 5, 6); splice @a, 3; print join(",", @a), "\n"'
    0,1,2
    $ perl -e 'use strict; use warnings; my @a = (0, 1, 2, 3, 4, 5, 6); splice @a, 10; print join(",", @a), "\n"'
    splice() offset past end of array at -e line 1.
    0,1,2,3,4,5,6
    $ perl -e 'print $], "\n"'
    5.014002

    But that's on my ISP. On our office machines, no warning.

    $ perl -e 'use strict; use warnings; my @a = (0, 1, 2, 3, 4, 5, 6); splice @a, 3; print join(",", @a), "\n"'
    0,1,2
    $ perl -e 'use strict; use warnings; my @a = (0, 1, 2, 3, 4, 5, 6); splice @a, 10; print join(",", @a), "\n"'
    0,1,2,3,4,5,6
    $ perl -e 'print $], "\n"'
    5.016002
    $ perlfunc splice | cat
    ....
    If OFFSET is past the end of the array,
    Perl issues a warning, and splices at the end of the array.
    ....

    So was the warning removed for good, which would make the 5.16
    documentation simply outdated? Or was its absence a bug in 5.16, so
    that the warning comes back in 5.18?

    To protect against that,
    splice @results, MAXSEARCHRESULTS if @results > MAXSEARCHRESULTS;
    Doable (warning: that's untested code). But it's not as pretty.
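The guarded splice above can be checked with a small script. This is a minimal sketch, not production code: the limit of 3 and the helper name limit_results are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical limit, standing in for the 1000 from the post.
use constant MAXSEARCHRESULTS => 3;

# Return a copy of the list holding at most MAXSEARCHRESULTS elements.
sub limit_results {
    my @results = @_;
    # The guard keeps the offset inside the array, so splice never sees
    # an offset past the end, whatever the Perl version.
    splice @results, MAXSEARCHRESULTS if @results > MAXSEARCHRESULTS;
    return @results;
}

print join(",", limit_results(0 .. 6)), "\n";   # longer than the limit
print join(",", limit_results(0 .. 1)), "\n";   # shorter than the limit
```

The first call prints 0,1,2 and the second prints 0,1, with no padding and no warning.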

    Is there a cleaner way?

    (As it happens, he now has to check the limit in an if anyway for
    another purpose, so he knows that @results[0 .. MAXSEARCHRESULTS - 1]
    won't go past the end, so the question is now moot. I'm still
    curious.)

    --
    Tim McDaniel,
    Tim McDaniel, Jul 25, 2013
    #1

  2. (Tim McDaniel) writes:
    > A cow-orker is coding something: he gets an array of results, but
    > wants to take no more than the first 1000 elements. His suggestion was
    >
    > @results = @results[0 .. (MAXSEARCHRESULTS - 1, $#results)[MAXSEARCHRESULTS - 1 > $#results ]];
    >
    > He felt it was "awesome". I thought it was way too cute.


    I suggest 'awful' instead:

    $#results = MAXSEARCHRESULTS - 1 unless $#results < MAXSEARCHRESULTS;
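A minimal sketch of how that statement behaves on a long and a short array. The limit of 3 and the helper name truncate_in_place are assumptions for the example. The guard matters: an unguarded assignment to $#results would lengthen a short array and fill it with undef.

```perl
use strict;
use warnings;

use constant MAXSEARCHRESULTS => 3;   # hypothetical limit

# Takes an array reference and truncates the array in place.
sub truncate_in_place {
    my ($results) = @_;
    $#$results = MAXSEARCHRESULTS - 1 unless $#$results < MAXSEARCHRESULTS;
    return $results;
}

my @long  = (0 .. 6);
my @short = (0 .. 1);
truncate_in_place(\@long);
truncate_in_place(\@short);

# @long is cut to 3 elements, @short keeps its 2.
print scalar(@long), " ", scalar(@short), "\n";
```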
    Rainer Weikusat, Jul 25, 2013
    #2

  3. Rainer Weikusat <> writes:
    > (Tim McDaniel) writes:
    >> A cow-orker is coding something: he gets an array of results, but
    >> wants to take no more than the first 1000 elements. His suggestion was
    >>
    >> @results = @results[0 .. (MAXSEARCHRESULTS - 1, $#results)[MAXSEARCHRESULTS - 1 > $#results ]];
    >>
    >> He felt it was "awesome". I thought it was way too cute.

    >
    > I suggest 'awful' instead:
    >
    > $#results = MAXSEARCHRESULTS - 1 unless $#results < MAXSEARCHRESULTS;


    In case someone loves fancy calculations, this can also be written as

    $#results -= (@results - MAXSEARCHRESULTS) * (@results > MAXSEARCHRESULTS);

    :->
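For anyone who wants to convince themselves that the arithmetic form matches the guarded assignment, here is a small sketch that compares the two for every length from 0 to 6, assuming a made-up limit of 3.

```perl
use strict;
use warnings;

use constant MAXSEARCHRESULTS => 3;   # hypothetical limit

for my $n (0 .. 6) {
    my @guarded    = (1 .. $n);
    my @arithmetic = (1 .. $n);

    # Guarded form: only assign when there is something to cut.
    $#guarded = MAXSEARCHRESULTS - 1 unless $#guarded < MAXSEARCHRESULTS;

    # Arithmetic form: the comparison yields 1 or a false value that is
    # numerically 0, so the excess is subtracted only when it exists.
    $#arithmetic -= (@arithmetic - MAXSEARCHRESULTS) * (@arithmetic > MAXSEARCHRESULTS);

    die "mismatch at length $n" unless "@guarded" eq "@arithmetic";
}
print "both forms agree for lengths 0 through 6\n";
```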
    Rainer Weikusat, Jul 25, 2013
    #3
  4. On 7/25/2013 9:12 PM, Ben Morrow wrote:
    >
    > Quoth :
    >> A cow-orker is coding something: he gets an array of results, but
    >> wants to take no more than the first 1000 elements. His suggestion was
    >>
    >> @results = @results[0 .. (MAXSEARCHRESULTS - 1,
    >> $#results)[MAXSEARCHRESULTS - 1 > $#results ]];
    >>
    >> ...
    >> $ perl -e 'use strict; use warnings; my @a = (0, 1, 2, 3, 4, 5, 6);
    >> splice @a, 3; print join(",", @a), "\n"'
    >> 0,1,2
    >> $ perl -e 'use strict; use warnings; my @a = (0, 1, 2, 3, 4, 5, 6);
    >> splice @a, 10; print join(",", @a), "\n"'

    >> But it's not as pretty.
    >> ...
    >> Is there a cleaner way?

    >
    > Perhaps
    >
    > @results = splice @results, 0, MAXSEARCHRESULTS;
    >


    If fewer strokes were to factor into cleanliness, even:

    @results = @results[0 .. MAXSEARCHRESULTS - 1];


    But, as you grow array size and MAXSEARCHRESULTS, it gets filthy slow...

    Setting $#results = MAXSEARCHRESULTS - 1 undoubtedly comes out of the
    wash purest and fastest.
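The speed claim is easy to measure rather than guess at. Below is a rough sketch using the core Benchmark module; the array size of 5000 and the labels are assumptions, and the numbers will vary by machine and Perl version, so no results are shown. Every variant first copies the source array, so that cost is the same for all three.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

use constant MAXSEARCHRESULTS => 1000;      # limit from the original post
my @source = (1 .. 5000);                   # hypothetical result set

my %candidates = (
    slice => sub {
        my @results = @source;
        @results = @results[0 .. MAXSEARCHRESULTS - 1];
        return scalar @results;
    },
    splice => sub {
        my @results = @source;
        splice @results, MAXSEARCHRESULTS;
        return scalar @results;
    },
    arylen => sub {
        my @results = @source;
        $#results = MAXSEARCHRESULTS - 1;
        return scalar @results;
    },
);

# Sanity check: every variant must keep exactly MAXSEARCHRESULTS elements.
for my $name (sort keys %candidates) {
    my $kept = $candidates{$name}->();
    die "$name kept $kept elements" unless $kept == MAXSEARCHRESULTS;
}

# Run each variant 500 times and print a comparison table.
cmpthese(500, \%candidates);
```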

    --
    Charles DeRykus
    Charles DeRykus, Jul 26, 2013
    #4
  5. On 7/26/2013 1:00 AM, Ben Morrow wrote:
    >
    > Quoth Charles DeRykus <>:
    >> On 7/25/2013 9:12 PM, Ben Morrow wrote:
    >>>
    >>> Quoth :
    >>>> A cow-orker is coding something: he gets an array of results, but
    >>>> wants to take no more than the first 1000 elements. His suggestion was

    > [...]
    >>>> Is there a cleaner way?
    >>>
    >>> Perhaps
    >>>
    >>> @results = splice @results, 0, MAXSEARCHRESULTS;

    >>
    >> If fewer strokes were to factor into cleanliness, even:
    >>
    >> @results = @results[0 .. MAXSEARCHRESULTS - 1];

    >
    > Tim already pointed out that this returns extraneous undefs if @results
    > is too short.


    Sigh, I missed it.

    You could tweak it with [0 .. min($#results, MAXSEARCHRESULTS - 1)] but,
    aside from the purist's objection to adding a module, it should be DOA
    anyway because of its inefficiency (I'm guessing that it does more
    copying and so is slower).

    >
    >> But, as you grow array size and MAXSEARCHRESULTS, it gets filthy slow...
    >>
    >> Setting $#results = MAXSEARCHRESULTS - 1 undoubtedly comes out of the
    >> wash purest and fastest.

    >
    > It's probably easiest, though turning off the warning and using Tim's
    >
    > splice @results, MAXSEARCHRESULTS;
    >
    > is probably better, on balance.


    Turning off a warning category seems slightly unclean to me... even
    though it's only because of the earlier version.
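If the warning does have to be silenced, it can at least be confined to one block. A minimal sketch, assuming the warning sits in the 'misc' category as perldiag lists it; the limit of 10 is made up, and the __WARN__ handler is there only to count what gets through.

```perl
use strict;
use warnings;

use constant MAXSEARCHRESULTS => 10;   # hypothetical limit

my @results = (0 .. 6);                # shorter than the limit
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

{
    # Assumption: "splice() offset past end of array" is a 'misc'
    # warning. The pragma is lexical, so it ends with this block.
    no warnings 'misc';
    splice @results, MAXSEARCHRESULTS;
}

print scalar(@results), " elements, ", scalar(@warnings), " warnings\n";
```

All other warnings stay enabled for the rest of the file.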


    > Using $#ary as an lvalue has some
    > permanent side-effects on the array; you can see them with Devel::Peek.
    >
    > [The side-effects are to do with the fact that $#ary is a scalar lvalue
    > and \$#ary should return a ref to the same scalar every time, so we need
    > an actual permanent scalar somewhere, which turns out to get stored in
    > the array's magic.]
    >


    Interesting. If I understand correctly, is there any real downside
    other than the extra storage in magic? I thought I remembered that
    truncating via $#ary doesn't return the memory to the process, unlike
    undef @ary.
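Both points can be inspected with the core Devel::Peek module. This is only a sketch for looking, not a statement of what will be found: the dump format differs between Perl versions, so no output is shown. Dump writes to STDERR. Comparing the FILL and MAX fields before and after hints at whether the slots are still allocated, and a MAGIC section that shows up only in the second dump would be the side effect mentioned in the quoted text.

```perl
use strict;
use warnings;
use Devel::Peek qw(Dump);

my @results = (0 .. 6);

print STDERR "--- before assigning to \$#results ---\n";
Dump(\@results);

$#results = 2;    # truncate to three elements through the $# lvalue

print STDERR "--- after assigning to \$#results ---\n";
Dump(\@results);

print scalar(@results), "\n";
```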

    --
    Charles DeRykus
    Charles DeRykus, Jul 26, 2013
    #5
  6. On 2013-07-26 09:19, Charles DeRykus <> wrote:
    > On 7/26/2013 1:00 AM, Ben Morrow wrote:
    >> Quoth Charles DeRykus <>:
    >>> @results = @results[0 .. MAXSEARCHRESULTS - 1];

    >>
    >> Tim already pointed out that this returns extraneous undefs if @results
    >> is too short.

    >
    > Sigh, I missed it.
    >
    > You could tweak it with [0 .. min($#results, MAXSEARCHRESULTS - 1)] but,
    > aside from the purist's objection to adding a module, it should be DOA
    > anyway because of its inefficiency (I'm guessing that it does more
    > copying and so is slower).


    Why should @results[0 .. min($#results, MAXSEARCHRESULTS - 1)] do more
    copying than @results[0 .. MAXSEARCHRESULTS - 1]?

    hp


    --
       _  | Peter J. Holzer    | Curse of electronic word processing:
    |_|_) | Sysadmin WSR       | you keep polishing away at your text until
    | |   |                    | the parts of the sentence of the sentence
    __/   | http://www.hjp.at/ | no longer fits together. -- Ralph Babel
    Peter J. Holzer, Jul 26, 2013
    #6
  7. On 7/26/2013 2:32 AM, Peter J. Holzer wrote:
    > On 2013-07-26 09:19, Charles DeRykus <> wrote:
    >> On 7/26/2013 1:00 AM, Ben Morrow wrote:
    >>> Quoth Charles DeRykus <>:
    >>>> @results = @results[0 .. MAXSEARCHRESULTS - 1];
    >>>
    >>> Tim already pointed out that this returns extraneous undefs if @results
    >>> is too short.

    >>
    >> Sigh, I missed it.
    >>
    >> You could tweak it with [0 .. min($#results, MAXSEARCHRESULTS - 1)] but,
    >> aside from the purist's objection to adding a module, it should be DOA
    >> anyway because of its inefficiency (I'm guessing that it does more
    >> copying and so is slower).

    >
    > Why should @results[0 .. min($#results, MAXSEARCHRESULTS - 1)] do more
    > copying than @results[0 .. MAXSEARCHRESULTS - 1]?
    >


    The min(..) tweak was to eliminate the padding with undef that occurs if
    MAXSEARCHRESULTS > $#results. In general, I was referring to a
    supposition (guessing as I put it) that @results = @results[...] will
    likely do more copying than say splice(@results,MAXSEARCHRESULTS). At
    any rate, it's slower.

    --
    Charles DeRykus
    Charles DeRykus, Jul 26, 2013
    #7
  8. Tim McDaniel


    In article <kstev5$71k$>,
    Charles DeRykus <> wrote:
    >You could tweak it with [0 .. min($#results, MAXSEARCHRESULTS - 1)]


    There's a min sub somewhere?

    $ perl -e 'my $a = min(5,8); print $a, "\n"'
    Undefined subroutine &main::min called at -e line 1.

    Since this is my ork-place, I don't have control of modules.
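For what it's worth, min does not need a third-party install: List::Util has shipped with Perl itself since 5.8, so it only has to be imported. A minimal sketch of the clamped slice, with a made-up limit of 10:

```perl
use strict;
use warnings;
use List::Util qw(min);    # bundled with Perl itself since 5.8

use constant MAXSEARCHRESULTS => 10;   # hypothetical limit

my @results = (0 .. 6);                # shorter than the limit

# Clamp the upper index so the slice never reaches past the last element.
@results = @results[0 .. min($#results, MAXSEARCHRESULTS - 1)];

print join(",", @results), "\n";
```

This prints 0,1,2,3,4,5,6 with no undef padding and no warnings.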

    --
    Tim McDaniel,
    Tim McDaniel, Jul 26, 2013
    #8
  9. Ivan Shmakov

    min (), and Perl modules

    >>>>> Tim McDaniel <> writes:
    >>>>> Charles DeRykus <> wrote:


    >> You could tweak it with [0 .. min($#results, MAXSEARCHRESULTS - 1)]


    > There's a min sub somewhere?


    > $ perl -e 'my $a = min(5,8); print $a, "\n"'
    > Undefined subroutine &main::min called at -e line 1.


    > Since this is my ork-place, I don't have control of modules.


    How so? Doesn't $ export PERLLIB="$HOME"/.perl/modules help,
    for instance?

    --cut: ~/.bash_profile --
    CPAN=${HOME}/.cpan
    cpan_pfx=${CPAN}/prefix
    perl_ver=5.14.2
    PERLLIB=${cpan_pfx}/lib/perl/5.14.2\
    :${cpan_pfx}/share/perl/5.14.2\
    :${cpan_pfx}/lib/perl5\
    :${cpan_pfx}/lib/perl\
    :${cpan_pfx}/share/perl5\
    :${cpan_pfx}/lib/perl/5.14\
    :${cpan_pfx}/share/perl/5.14\
    :${cpan_pfx}/lib/perl5/x86_64-linux-gnu-thread-multi

    export CPAN PERLLIB
    --cut: ~/.bash_profile --

    --
    FSF associate member #7257
    Ivan Shmakov, Jul 26, 2013
    #9
  10. Ben Morrow <> writes:
    > Quoth Charles DeRykus <>:
    >> On 7/25/2013 9:12 PM, Ben Morrow wrote:


    [...]

    >> Setting $#results = MAXSEARCHRESULTS - 1 undoubtedly comes out of the
    >> wash purest and fastest.

    >
    > It's probably easiest, though turning off the warning and using Tim's
    >
    > splice @results, MAXSEARCHRESULTS;
    >
    > is probably better, on balance. Using $#ary as an lvalue has some
    > permanent side-effects on the array; you can see them with Devel::Peek.
    >
    > [The side-effects are to do with the fact that $#ary is a scalar lvalue
    > and \$#ary should return a ref to the same scalar every time, so we need
    > an actual permanent scalar somewhere, which turns out to get stored in
    > the array's magic.]


    While there is little reason to prefer one or the other, I
    nevertheless want to make an argument in favor of assigning to $#ary:

    'Splicing' usually refers to connecting things together. This can
    still be seen in the '4 argument splice' which 'works' the contents of
    a list into an array. It is a more general 'array element manipulation
    operator' in Perl but statements like

    "splice(@a, @a, 0, $x, $y) is equivalent to push(@a, $x, $y)"
    [paraphrase of a part of 'perldoc -f splice']

    remind me in an uncanny way of something I read about 'lambda
    calculus' a while ago: The statement basically was "This (short string
    of incomprehensible symbols) can be simplified to that (extremely long
    string of incomprehensible symbols)". At this point, I concluded that
    either I or the author of the text had obviously lost the plot and
    that I'd rather continue to live in my own little world than endure
    more revelations of this kind.

    The splice-operation I quoted above is
    similar to this, expressing a relatively simple 'well-known' operation
    in a more complicated way than necessary by invoking splice with two
    additional arguments (compared to push) in order to work around the
    'actual' semantics of the 4-argument splice, namely, replace some run
    of array elements with a run of other "datas" (datums?). Making the
    simple appear complicated may be good for achieving a "Wow!" effect
    but it isn't a good strategy for software: Things tend to get
    complicated on their own and the more complicated the simple stuff
    already is, the less complicated the system as a whole can become
    before it collapses under its own weight.

    That splice can be made to perform many different array manipulation
    tasks IMO means it is ill-defined and should usually be avoided. For
    the case at hand, namely, truncating an array without knowing if it
    needs to be truncated, the 'clever way to use splice' is actually so
    'ill' that perl even issues a warning for it. While I don't usually
    use Perl runtime warnings and would recommend disregarding them most
    of the time, this is at least a clear hint that someone considered
    this to be a rather bizarre way to express a particular operation.
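The equivalence paraphrased from 'perldoc -f splice' above can be checked directly. A small sketch with made-up variable names:

```perl
use strict;
use warnings;

my @via_push   = (1, 2, 3);
my @via_splice = (1, 2, 3);
my ($x, $y) = ('x', 'y');

push @via_push, $x, $y;

# OFFSET is @via_splice in scalar context (its length, 3) and LENGTH
# is 0, so nothing is removed and the list goes in at the end.
splice @via_splice, @via_splice, 0, $x, $y;

print "@via_push\n";
print "@via_splice\n";
```

Both lines print 1 2 3 x y.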

    In contrast to this, 'assigning to $#ary' has the defined meaning of
    'change the length of the array', maybe in a way peculiar to Perl (I
    don't know of anything similar in another language) but "Perl written
    in a way peculiar to Perl" is, in my opinion, not generally a bad
    thing, more so if this means 'the code becomes simpler'. I do
    consider

    $#a = SOMETHING - 1 unless $#a < SOMETHING;

    simpler than the more 'elegant' splice(@a, SOMETHING) which
    performs the same 'Do we actually need to do something?' check
    internally as part of validating its arguments, because it plainly
    states the intent of the code: Get rid of the excess elements unless
    there aren't any.

    I don't think that 'But my arrays get the measles when I do that!'
    is a valid counterargument: The magic scalar needs to be created in
    order to perform this operation via assignment and once it has been
    created, it makes sense to keep it around unless memory is very tight:
    If the array is short-lived, freeing it at the same time its container
    dies instead of immediately won't make a noticeable difference but if
    it is long-lived, it will likely be needed again, at least because
    this codepath will be taken again, and then, the proxy scalar doesn't
    need to be created again.
    Rainer Weikusat, Jul 26, 2013
    #10
  11. Tim McDaniel


    Re: min (), and Perl modules

    In article <>,
    Ivan Shmakov <> wrote:
    > Tim McDaniel <> writes:
    > > Since this is my ork-place, I don't have control of modules.

    >
    > How so? Doesn't $ export PERLLIB="$HOME"/.perl/modules help,
    > for instance?


    This is not for my personal laptop. This is for production code on a
    Web server. I can check in new files, so I could download some module
    and put it in our section of the tree. But they have reasonable
    suspicions of third-party code and it's generally better to use
    builtins.

    Or, of course, it would be easy for me to code a min sub.
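A hand-written min along those lines might look like the sketch below. The limit of 3 is made up for the example.

```perl
use strict;
use warnings;

# Smallest of a list of numbers; returns undef for an empty list.
sub min {
    my $min = shift;
    for my $candidate (@_) {
        $min = $candidate if $candidate < $min;
    }
    return $min;
}

use constant MAXSEARCHRESULTS => 3;    # hypothetical limit

my @results = (0 .. 6);
@results = @results[0 .. min($#results, MAXSEARCHRESULTS - 1)];
print join(",", @results), "\n";
```

This prints 0,1,2.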

    --
    Tim McDaniel,
    Tim McDaniel, Jul 26, 2013
    #11
  12. Ben Morrow <> writes:
    > Quoth Rainer Weikusat <>:
    >> Ben Morrow <> writes:
    >> > Quoth Charles DeRykus <>:

    >>
    >> >> Setting $#results = MAXSEARCHRESULTS - 1 undoubtedly comes out of the
    >> >> wash purest and fastest.
    >> >
    >> > It's probably easiest, though turning off the warning and using Tim's
    >> >
    >> > splice @results, MAXSEARCHRESULTS;
    >> >
    >> > is probably better, on balance. Using $#ary as an lvalue has some
    >> > permanent side-effects on the array; you can see them with Devel::Peek.

    >>
    >> While there is little reason to prefer one or the other, I
    >> nevertheless want to make an argument in favor of assigning to $#ary:
    >>
    >> 'Splicing' usually refers to connecting things together. This can
    >> still be seen in the '4 argument splice' which 'works' the contents of
    >> a list into an array.

    >
    > 'Splice' is not an ideal name for the operation; however, it's no worse
    > than 'substr', which is exactly the same operator on strings.


    They're sort-of the inverse of each other in this respect: The
    4-argument splice actually splices something (in a sense at least -- I
    figure that whoever invented this name wasn't a conscript mariner in
    some navy :), the other three don't: They seem to have 'grown' on the
    splice implementation because it happened to be a suitable environment
    for them. For substr, the same three cases actually extract substrings
    from a string while the 4-argument one does something different.

    >> It is a more general 'array element manipulation
    >> operator' in Perl but statements like
    >>
    >> "splice(@a, @a, 0, $x, $y) is equivalent to push(@a, $x, $y)"
    >> [paraphrase of a part of 'perldoc -f splice']

    > [...]
    >> The splice-operation I quoted above is
    >> similar to this, expressing a relatively simple 'well-known' operation
    >> in a more complicated way than necessary by invoking splice with two
    >> additional arguments (compared to push) in order to work around the
    >> 'actual' semantics of the 4-argument splice, namely, replace some run
    >> of array elements with a run of other "datas" (datums?). Making the
    >> simple appear complicated may be good for achieving a "Wow!" effect
    >> but it isn't a good strategy for software:

    >
    > Did it occur to you that this was not intended to explain what 'push'
    > does, but rather to help explain what 'splice' does?


    I think it was intended to explain what splice can be made to do by
    feeding 'cleverly selected arguments' to it.

    > I would agree with you that the push is a simpler expression than
    > the splice, but for example the similar equivalence
    >
    >     shift(@a)          splice(@a, 0, 1)
    >
    > shows you how to use splice to shift multiple elements at once.


    Conceptually, the Perl splice can be thought of as a combination of
    two 'primitive operations', namely a

    remove(@array, $offset, $length)

    which removes @array[$offset .. $offset + $length - 1] from @array and
    a

    insert(@array, $offset, @list)

    which inserts the elements of @list into @array starting at offset
    $offset. The latter can be expressed in Perl as

    splice(@array, $offset, 0, @list)

    (another 'clever abuse'). There doesn't seem to be any good reason for
    combining both in this way except that this means the implementation
    can be 'smart' wrt changing the array length for the 'splicing'
    splice case. Where Perl provides another 'built-in' way to perform a
    particular array manipulation, ie, shift/unshift, push/pop and
    assignment to $#array for truncation, the 'generic splice' should IMHO
    be avoided because of this 'optimized combo-opness'.
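The two 'primitive operations' described above can be sketched as thin wrappers around splice. The names remove_at and insert_at are hypothetical; neither is a Perl builtin, and both take an array reference here rather than an array.

```perl
use strict;
use warnings;

# Hypothetical helpers; both modify the referenced array in place.

sub remove_at {
    my ($array, $offset, $length) = @_;
    return splice @$array, $offset, $length;   # returns the removed elements
}

sub insert_at {
    my ($array, $offset, @list) = @_;
    splice @$array, $offset, 0, @list;         # LENGTH 0: remove nothing
    return;
}

my @array   = ('a' .. 'f');
my @removed = remove_at(\@array, 1, 2);        # takes out b and c
insert_at(\@array, 1, 'X', 'Y', 'Z');

print "@removed\n";
print "@array\n";
```

This prints b c and then a X Y Z d e f.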
    Rainer Weikusat, Jul 30, 2013
    #12
  13. On 7/30/2013 9:52 AM, Rainer Weikusat wrote:
    ....
    >
    > Conceptually, the Perl splice can be thought of as a combination of
    > two 'primitive operations', namely a
    >
    > remove(@array, $offset, $length)
    >
    > which removes @array[$offset .. $offset + $length - 1] from @array and
    > a
    > ...


    "Splicing" in a bit of a tangent here... the unsuspecting might think,
    however briefly, that 'delete' on an array could pinch hit for 'remove'
    above. Except that 'delete' on arrays was DWIM-challenged and almost
    never what you wanted.

    'delete' on arrays has been euthanized, ie, 'deprecated' (which is
    euthanasia...just dragging it on for years).

    A remove/zap/delete for arrays though would fill the gap nicely, maybe a
    List::Util function that is passed an array and a list of indices, e.g.,
    some faster equivalent of:

    sub zap(+@) {   # remove the elements at the given indices, in place
        my $ary = shift; die "not an array ref" unless ref $ary eq 'ARRAY';
        splice( @$ary, $_, 1 ) for sort { $b <=> $a } @_;   # highest index first
    }

    called like, eg, zap( @results, MAXSEARCHRESULTS..$#results)
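A runnable sketch of the zap idea, assuming Perl 5.14 or later for the + prototype. The indices are sorted numerically from highest to lowest so that each removal leaves the remaining targets where they were. The limit of 3 is made up.

```perl
use strict;
use warnings;

# Remove the elements at the given indices from an array, in place.
sub zap(+@) {
    my $array = shift;
    die "not an array ref" unless ref $array eq 'ARRAY';
    # Highest index first, so earlier removals don't shift later targets.
    splice @$array, $_, 1 for sort { $b <=> $a } @_;
    return;
}

use constant MAXSEARCHRESULTS => 3;    # hypothetical limit

my @results = (0 .. 6);
zap(@results, MAXSEARCHRESULTS .. $#results);
print join(",", @results), "\n";

my @letters = ('a' .. 'e');
zap(@letters, 3, 0);                   # indices in any order
print join(",", @letters), "\n";
```

The first call leaves 0,1,2 and the second leaves b,c,e.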


    --
    Charles DeRykus
    Charles DeRykus, Jul 31, 2013
    #13