split is not convenient

Discussion in 'Perl Misc' started by Todd, Dec 12, 2007.

  1. Todd

    Todd Guest

    Hi,


    When I use split, several cases occur frequently:

    case1:

    split # default to split ' ', $_

    case2:

    split /xxx/ # default to split /xxx/, $_

    case3:

    split ' ', expr # ????


    I think case3 is the most frequently used, so why do I have to type ' '
    every time? Why is there no simpler syntax for this case?


    Thanks,
    Todd
     
    Todd, Dec 12, 2007
    #1

  2. >>>>> "Todd" == Todd <> writes:

    Todd> case3:

    Todd> split ' ', expr # ????


    Todd> I think case3 is the most frequently used, so why do I have to type ' '
    Todd> every time? Why is there no simpler syntax for this case?

    Because split is positional, and it's only 4 characters to type that, and it's
    not *that* common.

    print "Just another Perl hacker,"; # the original

    --
    Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
    <> <URL:http://www.stonehenge.com/merlyn/>
    Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
    See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
     
    Randal L. Schwartz, Dec 12, 2007
    #2

  3. Todd <> wrote in news:20d30e4a-8ded-4a3a-a99b-
    :

    > case2:
    >
    > split /xxx/ # default to split /xxx/, $_
    >
    > case3:
    >
    > split ' ', expr # ????
    >
    >
    > I think case3 is the most frequently used, so why do I have to type ' '
    > every time? Why is there no simpler syntax for this case?



    How would you decide whether the following is case 2 or a shortened case 3?

    split 'hello';

    The designers of the API made a choice that if only one argument is
    specified, then it will be interpreted as the pattern.
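
    A minimal sketch of that rule, using made-up sample strings: with a
    single argument, split treats it as the pattern and takes the string
    from $_; the string to split only ever comes second.

```perl
use strict;
use warnings;

# One argument: it is the pattern, and the string to split comes from $_.
$_ = 'say hello world';
my @one_arg = split 'hello';          # same as split /hello/, $_

# Two arguments: pattern first, then the string.
my @two_args = split ' ', 'hello';    # splits the string 'hello' on whitespace

print scalar(@one_arg),  " fields: [", join('][', @one_arg),  "]\n";
print scalar(@two_args), " fields: [", join('][', @two_args), "]\n";
```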

    Sinan

    --
    A. Sinan Unur <>
    (remove .invalid and reverse each component for email address)
    clpmisc guidelines: <URL:http://www.augustmail.com/~tadmc/clpmisc.shtml>
     
    A. Sinan Unur, Dec 12, 2007
    #3
  4. On Wed, 12 Dec 2007 08:07:57 -0800, Todd wrote:
    > split ' ', expr # ????
    >
    >
    > I think case3 is the most frequently used, so why do I have to type ' ' every
    > time? Why is there no simpler syntax for this case?


    AFAIK, Perl built-in functions that have default arguments only ever
    allow you to omit the right-most argument(s).

    That would make it a question of which of these is used the most in general
    (written in full for clarity):

    split /some pattern/,$_;

    or

    split ' ',$some_string;

    And I can definitely tell you I use the former much more often than the
    latter.

    For instance:

    while (<STDIN>) {
        chomp;
        my ($n,$v) = split /=/;
    }

    I probably already use split /=/; more often than split;, split ' ',
    $string; and split / /, $string; combined.

    Joost.
     
    Joost Diepenmaat, Dec 12, 2007
    #4
  5. Todd wrote:
    > When I use split, several cases occur frequently:
    > case1:
    > split # default to split ' ', $_
    >
    > case2:
    > split /xxx/ # default to split /xxx/, $_
    >
    > case3:
    > split ' ', expr # ????
    >
    > I think case3 is the most frequently used, so why do I have to type ' '
    > every time? Why is there no simpler syntax for this case?


    One argument not mentioned by others so far is that when a Perl
    function allows an argument to be omitted, the default substituted
    for it is $_. This is the typical and expected Perl behaviour.
    Interpreting a single argument as the text string rather than the search
    pattern would be unexpected and contradictory to the typical behaviour of
    all other Perl functions.

    jue
     
    Jürgen Exner, Dec 12, 2007
    #5
  6. Todd

    Guest

    Todd <> wrote:
    > Hi,
    >
    > When I use split, several cases occur frequently:
    >
    > case1:
    >
    > split # default to split ' ', $_
    >
    > case2:
    >
    > split /xxx/ # default to split /xxx/, $_
    >
    > case3:
    >
    > split ' ', expr # ????
    >
    > I think case3 is the most frequently used,


    Not by me. When I use split on something other than $_, I almost
    always use a pattern other than ' ', which I guess would be case4.

    After a brief grep through some code, it looks like the order from most
    used to least used would be case2 > case1 > case4 > case3

    > so why do I have to type ' '
    > every time? Why is there no simpler syntax for this case?



    You could define a subroutine named splitd (d for default, as in ' ' being
    the default pattern).
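
    A minimal sketch of such a helper. The name splitd comes from the
    suggestion above; falling back to $_ when no argument is given is an
    assumption added here, and none of this is a Perl built-in.

```perl
use strict;
use warnings;

# splitd: hypothetical helper, not part of Perl itself.
# Splits its argument (or $_ if no argument is given) on whitespace,
# using the special ' ' pattern that also skips leading whitespace.
sub splitd {
    my $string = @_ ? $_[0] : $_;
    return split ' ', $string;
}

my @fields = splitd("  alpha beta\tgamma\n");
print join(',', @fields), "\n";
```

    Because it is an ordinary sub, it saves typing the ' ' but costs a
    function call and is only meaningful when called in list context.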

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    The costs of publication of this article were defrayed in part by the
    payment of page charges. This article must therefore be hereby marked
    advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
    this fact.
     
    , Dec 12, 2007
    #6
  7. Todd

    Ben Morrow Guest

    Quoth "Jürgen Exner" <>:
    >
    > One argument not mentioned by others so far is that when a Perl
    > function allows an argument to be omitted, the default substituted
    > for it is $_. This is the typical and expected Perl behaviour.
    > Interpreting a single argument as the text string rather than the search
    > pattern would be unexpected and contradictory to the typical behaviour of
    > all other Perl functions.


    Uh huh... like shift, or readline, or the return value of split in
    scalar context? There are many things I like about Perl; consistency is
    not one of them.

    :)

    Ben
     
    Ben Morrow, Dec 12, 2007
    #7
  8. Ben Morrow wrote:
    > Quoth "Jürgen Exner" <>:
    >>
    >> One argument not mentioned by others so far is that when a Perl
    >> function allows an argument to be omitted, the default substituted
    >> for it is $_. This is the typical and expected Perl
    >> behaviour. Interpreting a single argument as the text string rather
    >> than the search pattern would be unexpected and contradictory to the
    >> typical behaviour of all other Perl functions.

    >
    > Uh huh... like shift,


    shift is not defined on scalars. When the argument is omitted it shifts @_,
    which is as close as possible to $_.

    > or readline,


    The only argument is mandatory, you cannot omit it and thus cannot default
    to $_.

    > or the return value of split in scalar context?


    Doesn't apply. We are talking about optional function/operator _arguments_
    which when omitted will default to $_. Return values are not arguments.

    jue
     
    Jürgen Exner, Dec 12, 2007
    #8
  9. Todd

    Ben Morrow Guest

    Quoth "Jürgen Exner" <>:
    > Ben Morrow wrote:
    >
    > > or readline,

    >
    > The only argument is mandatory, you cannot omit it and thus cannot default
    > to $_.


    Yes. Sorry, that was a thinko; I meant <> and thought that readline had
    the same default behaviour. To that list we can add -t, 1-arg open,
    close (indeed, lots of the IO functions, including the mess that is
    eof), substr, caller, bless, die/warn, exit, localtime, ...

    I'm not saying any of these defaults are not sensible and useful, just
    that they're far from consistent.

    Ben
     
    Ben Morrow, Dec 12, 2007
    #9
  10. On Wed, 12 Dec 2007 23:00:28 +0000, Ben Morrow wrote:

    > Quoth "Jürgen Exner" <>:
    >> Ben Morrow wrote:
    >>
    >> > or readline,

    >>
    >> The only argument is mandatory, you cannot omit it and thus cannot
    >> default to $_.

    >
    > Yes. Sorry, that was a thinko; I meant <> and thought that readline had
    > the same default behaviour.


    Don't forget while(<>) { ... } :)

    > To that list we can add -t


    which for /more-or-less/ obvious reasons defaults to _

    > 1-arg open, close (indeed, lots of the IO functions, including the mess
    > that is eof), substr, caller, bless, die/warn, exit, localtime, ...
    >
    > I'm not saying any of these defaults are not sensible and useful, just
    > that they're far from consistent.


    You're definitely right there.

    Joost.
     
    Joost Diepenmaat, Dec 12, 2007
    #10
  11. On Wed, 12 Dec 2007 23:23:37 +0000, Joost Diepenmaat wrote:

    > which for /more-or-less/ obvious reasons defaults to _


    Arg. no it doesn't. For other obvious reasons.

    Joost.
     
    Joost Diepenmaat, Dec 12, 2007
    #11
  12. Todd

    Mark Seger Guest

    Todd wrote:
    > Hi,
    >
    >
    > When I use split, several cases occur frequently:
    >
    > case1:
    >
    > split # default to split ' ', $_
    >
    > case2:
    >
    > split /xxx/ # default to split /xxx/, $_
    >
    > case3:
    >
    > split ' ', expr # ????
    >
    >
    > I think case3 is the most frequently used, so why do I have to type ' '
    > every time? Why is there no simpler syntax for this case?


    Maybe I've just been programming for too many years and looked at too
    much code written by people who like to take shortcuts, but I think your
    cases 1/2 should never be used! In fact I avoid any syntax that has a
    default such as "while (<>)" which is why I always use "while
    ($var=<>)". After all, you never know when someone is going to insert a
    line of code that blows away your default with a different one. In
    other words it's all about supportability!

    It's also common for someone to pick up a piece of code in a language
    they're not familiar with and try to figure out what it does with
    minimal effort and having something that is not obvious is a real pain.
    Are people that lazy that typing a few extra characters to make what
    they're doing more explicit too much of a burden? How is this any
    different from commenting code? Or don't people do that any more either?

    -mark
     
    Mark Seger, Dec 13, 2007
    #12
  13. Ben Morrow <> wrote:
    >
    > Quoth "Jürgen Exner" <>:
    >>
    >> One argument not mentioned by others so far is that when a Perl
    >> function allows an argument to be omitted, the default substituted
    >> for it is $_. This is the typical and expected Perl behaviour.
    >> Interpreting a single argument as the text string rather than the search
    >> pattern would be unexpected and contradictory to the typical behaviour of
    >> all other Perl functions.

    >
    > Uh huh... like shift, or readline, or the return value of split in
    > scalar context?



    Split has "dual defaults" for its 2 most frequent arguments. It
    uses $_ for one of them.

    My favorite (anti-favorite?) example of a non-$_ default is chdir().


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
     
    Tad J McClellan, Dec 13, 2007
    #13
  14. Todd

    Todd Guest

    Mark Seger wrote:
    > Maybe I've just been programming for too many years and looked at too
    > much code written by people who like to take shortcuts, but I think your
    > cases 1/2 should never be used! In fact I avoid any syntax that has a
    > default such as "while (<>)" which is why I always use "while
    > ($var=<>)". After all, you never know when someone is going to insert a
    > line of code that blows away your default with a different one. In
    > other words it's all about supportability!
    >
    > It's also common for someone to pick up a piece of code in a language
    > they're not familiar with and try to figure out what it does with
    > minimal effort and having something that is not obvious is a real pain.
    > Are people that lazy that typing a few extra characters to make what
    > they're doing more explicit too much of a burden? How is this any
    > different from commenting code? Or don't people do that any more either?
    >
    > -mark


    Hi Mark:

    I doubt your "too many years" of programming experience:

    while ($var = <>) is not the same as while (<>); you'll be bitten by it some
    day.

    So why should I care about the other things you mentioned...

    -Todd
     
    Todd, Dec 13, 2007
    #14
  15. Todd

    Todd Guest

    Hi all,

    It seems nobody here has mentioned the magic that
    split ' ', expr
    is the same as
    split /\s+/, expr


    So let me google it and present some code snippets from
    http://www.google.com/codesearch?q=lang:perl split ' '&btnG=Search&hl=en&lr=
    :


    www.cpan.org/authors/id/J/JS/JSIRACUSA/Rose-HTML-Objects-0.548.tar.gz - Perl -
    libwww/config/makedefs.pl - 69 identical

    67: while(<$INPUT>) {
            local($name) = split(" ",$_);
            &NumberDefs'numberEach($name);
        }


    ftp.mozilla.org/.../mozilla1.7b/src/mozilla-source-1.7b-source.tar.bz2 - Mozilla - Perl
    FAQ-OMatic-2.717/t-informal/split.pl - 6 identical

    58: my @words = split(/(\s+)/, $text);
        my $title = join('', splice(@words, 0, 13));


    www.cpan.org/authors/id/T/TT/TTY/kurila-0_02.tar.gz - Artistic - Perl - More from kurila-0_02.tar.gz >>
    perl5.00402-bindist04-bc/distfiles/perl5.004_02/t/op/split.t - 30 identical

    33: $_ = join(':', 'foo', split(/ /,'a b c'), 'bar');
        if ($_ eq "foo:a:b::c:bar") {print "ok 7\n";} else {print "not ok 7 $_\n";}


    purl.oclc.org/.../OPENSRC/downloads/webutils/tars/HTML-MetaExtor-1.08.tar - Perl - Perl
    perl/lib/pod/Text.pm - 10 identical

    207: # $needspace = 0; # Assume this.
         # s/\n/ /g;
         ($Cmd, $_) = split(' ', $_, 2);
         # clear_noremap(1);

    -Todd
     
    Todd, Dec 13, 2007
    #15
  16. Todd wrote:
    >
    > It seems nobody here has mentioned the magic that
    > split ' ', expr
    > is the same as
    > split /\s+/, expr


    No it is not:

    $ perl -le'
    $_ = q[ a b c
    ];
    print q[split " "];
    print ">$_<" for split " ";
    print q[split /\s+/];
    print ">$_<" for split /\s+/;
    '
    split " "
    >a<
    >b<
    >c<

    split /\s+/
    ><
    >a<
    >b<
    >c<




    John
    --
    use Perl;
    program
    fulfillment
     
    John W. Krahn, Dec 13, 2007
    #16
  17. Mark Seger wrote:
    > Maybe I've just been programming for too many years and looked at too
    > much code written by people who like to take shortcuts, but I think
    > your cases 1/2 should never be used! In fact I avoid any syntax that
    > has a default such as "while (<>)" which is why I always use "while
    > ($var=<>)". After all, you never know when someone is going to
    > insert a line of code that blows away your default with a different
    > one. In other words it's all about supportability!
    >
    > It's also common for someone to pick up a piece of code in a language
    > they're not familiar with and try to figure out what it does with
    > minimal effort and having something that is not obvious is a real
    > pain. Are people that lazy that typing a few extra characters to
    > make what they're doing more explicit too much of a burden?


    I am somewhat torn. You certainly have a very valid point and taking
    shortcuts that lead to unreadable code is definitely not a good idea.

    On the other hand a lot of the elegance of Perl comes from its default "work
    space". It is so ingrained in the Perl world that I would almost call it a
    prerequisite for programming in Perl. If you don't know how to take advantage
    of $_ and @_ then maybe you shouldn't program in Perl or maintain Perl code.

    > How is
    > this any different from commenting code? Or don't people do that any
    > more either?


    Now that is actually comparing apples and oranges. I think you would agree
    that a comment like
    $index++ #increments $index by 1
    is nonsense. Correct, but still nonsense. Comments should describe _what_ is
    being done, not _how_ it is done. So the comment above should be something
    like "examine next personal file" or whatever action this increment
    indicates.

    Now, if you were using
    shift @_;
    instead of relying on the default as in
    shift;
    then you would just re-state information that is already there anyway, just
    like with the stupid comment above. If this shift moves on to the next
    personal file, then by all means comment it as such. But replacing shift;
    with shift @_; doesn't add any new information for the code maintenance
    person as long as he has at least a basic understanding of the language.

    jue
     
    Jürgen Exner, Dec 13, 2007
    #17
  18. Todd

    Todd Guest

    John W. Krahn wrote:
    > Todd wrote:
    > > It seems nobody here has mentioned the magic that
    > > split ' ', expr
    > > is the same as
    > > split /\s+/, expr

    >
    > No it is not:
    >
    > $ perl


    #! /bin/perl -l

    $_ = q[ a b c];

    $,=q/,/;
    print split ' ';
    print split /\s+/;

    __END__

    a,b,c
    ,a,b,c

    Thanks, John, for pointing this out; so split ' ' is more natural than
    split /\s+/.

    -Todd
     
    Todd, Dec 13, 2007
    #18
  19. Jürgen Exner <> wrote:
    > Mark Seger wrote:


    >> In fact I avoid any syntax that
    >> has a default such as "while (<>)" which is why I always use "while
    >> ($var=<>)".



    I agree completely with this one.

    Actually, it should most likely be:

    while ( my $var = <> )


    >> Are people that lazy that typing a few extra characters to
    >> make what they're doing more explicit too much of a burden?



    Many are. I am not one of them though. :)


    > I am somewhat torn. You certainly have a very valid point and taking
    > shortcuts that lead to unreadable code is definitely not a good idea.



    Understanding the code is not the big reason for avoiding while (<>).

    Stomping on $_ without even local()izing it is the big reason IMO, it
    can lead to classic "action at a distance".
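
    A small sketch of that "action at a distance", under made-up names:
    count_lines is a hypothetical helper and the in-memory filehandle is
    only there to keep the example self-contained. The caller's foreach
    aliases $_ to each array element, and the helper's bare while (<$fh>)
    then writes to that same $_.

```perl
use strict;
use warnings;

# count_lines: hypothetical helper that reads with a bare while (<$fh>),
# which assigns each line to the global $_ without localizing it.
sub count_lines {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory file: $!";
    my $count = 0;
    while (<$fh>) { $count++ }
    close $fh;
    return $count;
}

my @names = ('alice', 'bob');
for (@names) {                 # $_ is aliased to each element of @names
    count_lines("x\ny\n");     # the helper's loop overwrites $_ ...
}

# ... so the caller's array elements were clobbered through the alias.
print defined $_ ? "kept: $_\n" : "clobbered\n" for @names;
```

    Reading into a lexical variable inside the helper avoids touching $_
    at all, which is the practical argument for the named-variable form.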


    >> How is
    >> this any different from commenting code? Or don't people do that any
    >> more either?

    >
    > Now that is actually comparing apples and oranges. I think you would agree
    > that a comment like
    > $index++ #increments $index by 1
    > is nonsense.



    It makes perfect sense.


    > Correct, but still nonsense.



    Correct, but without value

    I could buy that one though. Or even


    Correct, but with _negative_ value

    (because now the code must be kept in sync with the comment)


    > Comments should describe _what_ is
    > being done,



    I think you are misremembering the classic advice regarding comments.

    The code already describes what is being done. Saying the same thing
    twice does not add any value.

    Comments should describe _why_ it is being done.


    > not _how_ it is done. So the comment above should be something
    > like "examine next personal file" or whatever action this increment
    > indicates.



    That comment is a "why" rather than a "what".

    Q: Why is the programmer incrementing $index

    A: To move on to processing the next personal file


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
     
    Tad J McClellan, Dec 13, 2007
    #19
  20. On Thu, 13 Dec 2007 06:09:38 -0600, Tad J McClellan wrote:

    > Jürgen Exner <> wrote:
    >> Mark Seger wrote:

    >
    >>> In fact I avoid any syntax that
    >>> has a default such as "while (<>)" which is why I always use "while
    >>> ($var=<>)".

    >
    >
    > I agree completely with this one.
    >
    > Actually, it should most likely be:
    >
    > while ( my $var = <> )
    >


    As has been mentioned somewhere above, that's simply incorrect code in
    most circumstances. while (<handle>) { ... } is a special case and if you
    really must use a named variable the construct should be

    while (defined(my $var = <>)) { ... }
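
    A minimal sketch of that loop, assuming a lexical in-memory filehandle
    instead of <> and made-up input whose last line is "0" with no newline,
    the one value a plain truth test would mistake for end of input. As far
    as perlop documents it, Perl applies the same defined() test implicitly
    when the while condition is nothing but an assignment from a readline,
    so the sketch runs both spellings side by side; the helper names are
    hypothetical.

```perl
use strict;
use warnings;

# Sample input (made up): the final line "0" is defined but false.
my $data = "first\n0";

# read_lines_explicit: hypothetical helper with the explicit defined() test.
sub read_lines_explicit {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory file: $!";
    my @lines;
    while (defined(my $line = <$fh>)) {
        chomp $line;
        push @lines, $line;
    }
    close $fh;
    return @lines;
}

# read_lines_implicit: the same loop without defined(); perlop documents
# that Perl adds the defined() test itself for this exact form.
sub read_lines_implicit {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory file: $!";
    my @lines;
    while (my $line = <$fh>) {
        chomp $line;
        push @lines, $line;
    }
    close $fh;
    return @lines;
}

print "explicit: ", join(',', read_lines_explicit($data)), "\n";
print "implicit: ", join(',', read_lines_implicit($data)), "\n";
```

    The implicit test only applies when the assignment is the whole
    condition; once it is combined with anything else, the explicit
    defined() is the form that stays correct.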

    Joost.
     
    Joost Diepenmaat, Dec 13, 2007
    #20