Split, variable delimiter

Discussion in 'Perl Misc' started by Heath, Feb 21, 2006.

  1. Heath

    Heath Guest

    Hello,

    I'm using perl v5.8.7.

    When $_ is set to something like "\t\t\t4.34\t4.65\t3.25\t9.54\n",
    I do the following:

    my @line = split ' ';
    print "$#line, $line[0]\n";

    And get what I expect:

    3, 4.34

    But when I do this:

    my $delim = ' ';
    my @line = split $delim;
    print "$#line, $line[0]\n";
    if ($line[0] eq $_) {print "Equal\n"; }

    I get this:

    0, 4.34 4.65
    3.25 9.54

    Equal

    Which is not exactly what I'm after. So, why are these two
    snippets behaving differently? What can I do to make them
    equivalent?

    I'm probably missing something obvious. If so, please be nice. ;)
    Heath, Feb 21, 2006
    #1

  2. Heath

    Guest

    Heath wrote:
    > my $delim = ' ';
    > my @line = split $delim;


    You still need delims around your delim. Try something like /$delim/

    --
    http://DavidFilmer.com
    , Feb 21, 2006
    #2

  3. Heath wrote:
    > Hello,
    >
    > I'm using perl v5.8.7.
    >
    > When $_ is set to something like "\t\t\t4.34\t4.65\t3.25\t9.54\n",
    > I do the following:
    >
    > my @line = split ' ';
    > print "$#line, $line[0]\n";
    >
    > And get what I expect:
    >
    > 3, 4.34
    >
    > But when I do this:
    >
    > my $delim = ' ';
    > my @line = split $delim;
    > print "$#line, $line[0]\n";
    > if ($line[0] eq $_) {print "Equal\n"; }
    >
    > I get this:
    >
    > 0, 4.34 4.65
    > 3.25 9.54
    >
    > Equal
    >
    > Which is not exactly what I'm after. So, why are these two
    > snippets behaving differently? What can I do to make them
    > equivalent?


    first, look at: perldoc -f split | more
    or, browser-friendly:
    http://www.physiol.ox.ac.uk/Computing/Online_Documentation/Perl-5.8.6/functions/split.html

    see the part about the special case of split ' '

    i would say the best thing to do would be not to use split ' ' at all.
    ==============
    use strict; use warnings;

    $_ = "\t\t\t4.34\t4.65\t3.25\t9.54\n";

    $_ =~ s/^\s+//;

    my @line = split /\s/;
    print "$#line, $line[0]\n";

    my $delim = '\s';
    my @line2 = split /$delim/;
    print "$#line2, $line2[0]\n";
    it_says_BALLS_on_your forehead, Feb 21, 2006
    #3
  4. Heath

    Heath Guest

    I've tried using $delim, /$delim/, "$delim", '$delim', and even
    "\'$delim\'". All do the same as when using $delim.
    Heath, Feb 21, 2006
    #4
  5. Heath

    Heath Guest

    it_says_BALLS_on_your forehead wrote:
    > first, look at: perldoc -f split | more
    > or, browser-friendly:
    > http://www.physiol.ox.ac.uk/Computing/Online_Documentation/Perl-5.8.6/functions/split.html
    >
    > see the part about the special case of split ' '


    Yes, I read through that before I ever posted. The behavior I'm
    after is that of [split ' ']. I don't get that behavior when I pass
    the space char to split via a variable. I would simply just like to
    know why that is and how I can get that behavior by passing a variable,
    if it is possible at all.

    > i would say the best thing to do would be not to use split ' ' at all.
    > ==============
    > use strict; use warnings;
    >
    > $_ = "\t\t\t4.34\t4.65\t3.25\t9.54\n";
    >
    > $_ =~ s/^\s+//;
    >
    > my @line = split /\s/;
    > print "$#line, $line[0]\n";
    >
    > my $delim = '\s';
    > my @line2 = split /$delim/;
    > print "$#line2, $line2[0]\n";


    This works great, but it accomplishes the exact same thing as:

    ==============
    use strict; use warnings;

    $_ = "\t\t\t4.34\t4.65\t3.25\t9.54\n";

    my @line = split;
    print "$#line, $line[0]\n";
    ==============

    All I need is a value to assign to $delim such that a [split $delim]
    will give me the same behavior as a [split].
    Heath, Feb 21, 2006
    #5
  6. Heath

    Uri Guttman Guest

    >>>>> "H" == Heath <> writes:

    H> Yes, I read through that before I ever posted. The behavior I'm
    H> after is that of [split ' ']. I don't get that behavior when I pass
    H> the space char to split via a variable. I would simply just like to
    H> know why that is and how I can get that behavior by passing a variable,
    H> if it is possible at all.

    rtfm some more:

    As a special case, specifying a PATTERN of space
    (' ') will split on white space just as "split" with
    no arguments does. Thus, "split(' ')" can be used
    to emulate awk's default behavior, ...

    note that PATTERN is the actual literal passed to split so it can't be a
    variable. otherwise how could it tell / / from ' ' from $foo = ' '? this
    is a very odd way to get that special behavior as it is inband and very
    special cased. if you must have that vs other splits on demand, use a
    sub to handle your cases and do 2 different splits based on
    $foo eq ' '. or use code refs for each split type. many ways to handle it.
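
The sub-based approach described above can be sketched roughly as follows. This is only an illustration: the name split_fields is hypothetical, and it assumes that any delimiter other than a single space is meant as a regex.

```perl
use strict;
use warnings;

# Hypothetical helper (not a built-in): a single-space delimiter selects the
# awk-style split via the literal ' ', anything else is used as a regex.
sub split_fields {
    my ($delim, $string) = @_;
    return split ' ', $string if $delim eq ' ';   # literal ' ' triggers the special case
    return split /$delim/, $string;               # ordinary pattern split
}

my $data = "\t\t\t4.34\t4.65\t3.25\t9.54\n";

my @awk  = split_fields(' ',  $data);   # leading whitespace is skipped
my @tabs = split_fields('\t', $data);   # one separator per tab, leading empty fields kept

print scalar(@awk),  " fields, first is '$awk[0]'\n";
print scalar(@tabs), " fields, first is '$tabs[0]'\n";
```

Because the special case is written as a literal inside the sub, the caller can keep the delimiter in a variable.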

    H> All I need is a value to assign to $delim such that a [split $delim]
    H> will give me the same behavior as a [split].

    can't be done. so choose another solution.

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
    Uri Guttman, Feb 21, 2006
    #6
  7. Heath

    robic0 Guest

    On 21 Feb 2006 10:11:56 -0800, "Heath" <> wrote:

    >Hello,
    >
    > I'm using perl v5.8.7.
    >
    > When $_ is set to something like "\t\t\t4.34\t4.65\t3.25\t9.54\n",
    > I do the following:
    >
    > my @line = split ' ';
    > print "$#line, $line[0]\n";
    >
    > And get what I expect:
    >
    > 3, 4.34
    >
    > But when I do this:
    >
    > my $delim = ' ';
    > my @line = split $delim;
    > print "$#line, $line[0]\n";
    > if ($line[0] eq $_) {print "Equal\n"; }
    >
    > I get this:
    >
    > 0, 4.34 4.65
    > 3.25 9.54
    >
    > Equal
    >
    > Which is not exactly what I'm after. So, why are these two
    > snippets behaving differently? What can I do to make them
    > equivalent?
    >
    > I'm probably missing something obvious. If so, please be nice. ;)



    It is a bug (I mean a feature) of split. According to the docs
    the Perl parser seems to look for the single quoted space ' ' and that
    differentiates it from a space " " as a pattern. So obviously the single
    quote is as significant as the space. However, the 3 character string is
    parsed as a first level parse. So you can't even assign $delim = "' '" or
    $delim = "" (null string) ... which is a special case as well as the ' '.

    It would appear to be a very useful case since it populates every array position
    with a non-whitespace. If you want to run dynamic patterns, the ' ' case will have
    to be an exclusion, tested for and hardcoded.

    ie: @line = ($delim eq ' ') ? split(' ') : split($delim);

    Quote:
    As a special case, specifying a PATTERN of space (' ') will split on white space
    just as split with no arguments does. Thus, split(' ') can be used to emulate awk's
    default behavior, whereas split(/ /) will give you as many null initial fields as
    there are leading spaces. A split on /\s+/ is like a split(' ') except that any
    leading whitespace produces a null first field. A split with no arguments really
    does a split(' ', $_) internally.

    Some code:

    use strict;
    use warnings;


    $_ = "\t\t\t4.34\t4.65\t3.25\t9.54\n";

    my @line = split /' '/;
    print "$#line", @line, "\n";
    if ($line[0] eq $_) {print "Equal\n"; }

    my $dlim = ' ';
    @line = split /$dlim/;
    print "$#line", @line, "\n";
    if ($line[0] eq $_) {print "Equal\n"; }

    print "-------------\n";

    @line = split ' ';
    print "$#line", @line, "\n";
    if ($line[0] eq $_) {print "Equal\n"; }

    $dlim = '\s+';
    @line = split /$dlim/;
    print "$#line", @line, "\n";
    if ($line[0] eq $_) {print "Equal\n"; }

    print "---------\nbut\n";
    $dlim = ' ';
    @line = split /$dlim/;
    print "$#line", @line, "\n";
    if ($line[0] eq $_) {print "Equal\n"; }

    __END__
    Output:
    0 4.34 4.65 3.25 9.54

    Equal
    0 4.34 4.65 3.25 9.54

    Equal
    -------------
    34.344.653.259.54
    44.344.653.259.54
    ---------
    but
    0 4.34 4.65 3.25 9.54

    Equal
    robic0, Feb 21, 2006
    #7
  8. Heath

    robic0 Guest

    On Tue, 21 Feb 2006 15:58:06 -0500, Uri Guttman <> wrote:

    >>>>>> "H" == Heath <> writes:

    >
    > H> Yes, I read through that before I ever posted. The behavior I'm
    > H> after is that of [split ' ']. I don't get that behavior when I pass
    > H> the space char to split via a variable. I would simply just like to
    > H> know why that is and how I can get that behavior by passing a variable,
    > H> if it is possible at all.
    >
    >rtfm some more:
    >
    > As a special case, specifying a PATTERN of space
    > (' ') will split on white space just as "split" with
    > no arguments does. Thus, "split(' ')" can be used
    > to emulate awk's default behavior, ...
    >
    >note that PATTERN is the actual literal passed to split so it can't be a
    >variable. otherwise how could it tell / / from ' ' from $foo = ' '? this


    I don't know, I would consider this a bug, aka, left out check. Within split, if
    the $foo name is passed as a literal name, the contents have to be obtained.
    So if $foo = "' '", it should be fairly obvious what the meaning is.

    But I don't think some intrinsics work that way. I think as far as the Pattern
    in split, the parser looks for a split ' ' or split pattern and internally changes
    the call to a different function with different parameters, than any other form of split.
    There may be several internal split functions.
    Since it has to be parsed anyway, it's easier to redirect different "forms" to predefined
    functions that handle specific ones. Thereby speeding up the processor.

    >is a very odd way to get that special behavior as it is inband and very
    >special cased. if you must have that vs other splits on demand, use a
    >sub to handle your cased and do 2 different splits based on $foo eq '
    >'. or use code refs for each split type. many ways to handle it.
    >
    > H> All I need is a value to assign to $delim such that a [split $delim]
    > H> will give me the same behavior as a [split].
    >
    >can't be done. so choose another solution.
    >
    >uri
    robic0, Feb 21, 2006
    #8
  9. Heath

    Uri Guttman Guest

    >>>>> "r" == robic0 <robic0> writes:


    r> I don't know, I would consider this a bug, aka, left out
    r> check. Within split, if the $foo name is passed as a literal name,
    r> the contents have to be obtained. So if $foo = "' '", it should be
    r> fairly obvious what the meaning is.

    i consider you a genomic bug.

    r> But I don't think some intrinsics work that way. I think as far as
    r> the Pattern in split, the parser looks for a split ' ' or split
    r> pattern and internally changes the call to a different function
    r> with different parameters, than any other form of split. There may
    r> be several internal split functions. Since it has to be parsed
    r> anyway, its easier to redirect different "forms" to predefined
    r> functions that handle specific ones. Thereby speeding up the
    r> processor.

    speeding up the processor? what kind of crack are you smoking? this
    whole discussion has nothing to do with the speed of split. the various
    special behaviors of split do not need separate implementations.

    just another useless reply to a troll,

    uri

    Uri Guttman, Feb 21, 2006
    #9
  10. Heath

    robic0 Guest

    On Tue, 21 Feb 2006 18:52:51 -0500, Uri Guttman <> wrote:

    >>>>>> "r" == robic0 <robic0> writes:

    >
    >
    > r> I don't know, I would consider this a bug, aka, left out
    > r> check. Within split, if the $foo name is passed as a literal name,
    > r> the contents have to be obtained. So if $foo = "' '", it should be
    > r> fairly obvious what the meaning is.
    >
    >i consider you a genomic bug.
    >
    > r> But I don't think some intrinsics work that way. I think as far as
    > r> the Pattern in split, the parser looks for a split ' ' or split
    > r> pattern and internally changes the call to a different function
    > r> with different parameters, than any other form of split. There may
    > r> be several internal split functions. Since it has to be parsed
    > r> anyway, its easier to redirect different "forms" to predefined
    > r> functions that handle specific ones. Thereby speeding up the
    > r> processor.
    >
    >speeding up the processor? what kind of crack are you smoking? this
    >whole discussion has nothing to do with the speed of split. the various
    >special behaviors of split do not need seperate implmentations.
    >

    I'm speechless.. You just discounted all compiled and semi-compiled (fixup)
    languages. You must think Perl core is written in Perl.

    The "Processor" is commonly known as the "engine", the core. Perl follows
    that and has multiple core implementations of intrinsics, it does a modified
    compile at loadtime and further compiles at runtime. Try compiling
    C/C++ code with standard library calls, then look at the assembly.

    >just another useless reply to a troll,
    >
    >uri
    robic0, Feb 22, 2006
    #10
  11. Heath

    thrill5 Guest

    Can be done. Set $delim to "\s" not ' '.

    Scott

    "Uri Guttman" <> wrote in message
    news:...
    >>>>>> "H" == Heath <> writes:

    >
    > H> Yes, I read through that before I ever posted. The behavior I'm
    > H> after is that of [split ' ']. I don't get that behavior when I pass
    > H> the space char to split via a variable. I would simply just like to
    > H> know why that is and how I can get that behavior by passing a
    > variable,
    > H> if it is possible at all.
    >
    > rtfm some more:
    >
    > As a special case, specifying a PATTERN of space
    > (' ') will split on white space just as "split" with
    > no arguments does. Thus, "split(' ')" can be used
    > to emulate awk's default behavior, ...
    >
    > note that PATTERN is the actual literal passed to split so it can't be a
    > variable. otherwise how could it tell / / from ' ' from $foo = ' '? this
    > is a very odd way to get that special behavior as it is inband and very
    > special cased. if you must have that vs other splits on demand, use a
    > sub to handle your cased and do 2 different splits based on $foo eq '
    > '. or use code refs for each split type. many ways to handle it.
    >
    > H> All I need is a value to assign to $delim such that a [split $delim]
    > H> will give me the same behavior as a [split].
    >
    > can't be done. so choose another solution.
    >
    > uri
    thrill5, Feb 22, 2006
    #11
  12. Heath

    Guest

    robic0 wrote:

    >
    > It is a bug (I mean a feature) of split. According to the docs


    Uh, what part of the docs are you referring to?

    > the Perl parser seems to look for the single quoted space ' ' and that
    > differentiates it from a space " " as a pattern.


    It does not. ' ', " ", q{ }, qq{ } are all the same in this context. They
    are differentiated from / /, qr{ }, and $x where $x eq ' ';

    > So obviously the single
    > quote is as significant as the space.


    $ perl -le 'my $x=" foo bar "; print ":$_:" foreach split " ", $x'
    :foo:
    :bar:
    $ perl -le 'my $x=" foo bar "; print ":$_:" foreach split q{ }, $x'
    :foo:
    :bar:
    $ perl -le 'my $x=" foo bar "; print ":$_:" foreach split qq{ }, $x'
    :foo:
    :bar:
    $ perl -le 'my $x=" foo bar "; print ":$_:" foreach split qr{ }, $x'
    ::
    :foo:
    :bar:
    $ perl -le 'my $x=" foo bar "; print ":$_:" foreach split / /, $x'
    ::
    :foo:
    :bar:

    Xho

    , Feb 22, 2006
    #12
  13. Heath

    robic0 Guest

    On 22 Feb 2006 01:58:43 GMT, wrote:

    >robic0 wrote:
    >
    >>
    >> It is a bug (I mean a feature) of split. According to the docs

    >
    >Uh, what part of the docs are you refering to?
    >
    >> the Perl parser seems to look for the single quoted space ' ' and that
    >> differentiates it from a space " " as a pattern.

    >
    >It does not. ' ', " ", q{ }, qq{ } are all the same in this context. They
    >are differentiated from / /, qr{ }, and $x where $x eq ' ';
    >

    split ' '; is NOT the same context as split " "; nor
    $dl = ' '; split $dl;

    The discussion is about the special parsing done for the "split ' '" context,
    it is not about what, for example, split $dl; where $dl = ' '; means.

    Wake up man..........

    Here are the quotes of some earlier posts (on which the argument hinges).
    If you don't want to read this, the gist is it's a compiler with multiple-form
    intrinsic functions.......
    ==============================================

    On Tue, 21 Feb 2006 15:58:06 -0500, Uri Guttman <> wrote:

    >>>>>> "H" == Heath <> writes:

    >
    > H> Yes, I read through that before I ever posted. The behavior I'm
    > H> after is that of [split ' ']. I don't get that behavior when I pass
    > H> the space char to split via a variable. I would simply just like to
    > H> know why that is and how I can get that behavior by passing a variable,
    > H> if it is possible at all.
    >
    >rtfm some more:
    >
    > As a special case, specifying a PATTERN of space
    > (' ') will split on white space just as "split" with
    > no arguments does. Thus, "split(' ')" can be used
    > to emulate awk's default behavior, ...
    >
    >note that PATTERN is the actual literal passed to split so it can't be a
    >variable. otherwise how could it tell / / from ' ' from $foo = ' '? this


    I don't know, I would consider this a bug, aka, left out check. Within split, if
    the $foo name is passed as a literal name, the contents have to be obtained.
    So if $foo = "' '", it should be fairly obvious what the meaning is.

    But I don't think some intrinsics work that way. I think as far as the Pattern
    in split, the parser looks for a split ' ' or split pattern and internally changes
    the call to a different function with different parameters, than any other form of split.
    There may be several internal split functions.
    Since it has to be parsed anyway, its easier to redirect different "forms" to predefined
    functions that handle specific ones. Thereby speeding up the processor.

    >is a very odd way to get that special behavior as it is inband and very
    >special cased. if you must have that vs other splits on demand, use a
    >sub to handle your cased and do 2 different splits based on $foo eq '
    >'. or use code refs for each split type. many ways to handle it.
    >
    > H> All I need is a value to assign to $delim such that a [split $delim]
    > H> will give me the same behavior as a [split].
    >
    >can't be done. so choose another solution.
    >
    >uri

    ====================

    On Tue, 21 Feb 2006 18:52:51 -0500, Uri Guttman <> wrote:

    >>>>>> "r" == robic0 <robic0> writes:

    >
    >
    > r> I don't know, I would consider this a bug, aka, left out
    > r> check. Within split, if the $foo name is passed as a literal name,
    > r> the contents have to be obtained. So if $foo = "' '", it should be
    > r> fairly obvious what the meaning is.
    >
    >i consider you a genomic bug.
    >
    > r> But I don't think some intrinsics work that way. I think as far as
    > r> the Pattern in split, the parser looks for a split ' ' or split
    > r> pattern and internally changes the call to a different function
    > r> with different parameters, than any other form of split. There may
    > r> be several internal split functions. Since it has to be parsed
    > r> anyway, its easier to redirect different "forms" to predefined
    > r> functions that handle specific ones. Thereby speeding up the
    > r> processor.
    >
    >speeding up the processor? what kind of crack are you smoking? this
    >whole discussion has nothing to do with the speed of split. the various
    >special behaviors of split do not need seperate implmentations.
    >

    I'm speechless.. You just discounted all compiled and semi-compiled (fixup)
    languages. You must think Perl core is written in Perl.

    The "Processor" is commonly known as the "engine", the core. Perl follows
    that and has multiple core implementations of intrinsics, it does a modified
    compile at loadtime and further compiles at runtime. Try compiling
    C/C++ code with standard library calls, then look at the assembly.

    >just another useless reply to a troll,
    >
    >uri



    On 21 Feb 2006 10:11:56 -0800, "Heath" <> wrote:

    >Hello,
    >
    > I'm using perl v5.8.7.
    >
    > When $_ is set to something like "\t\t\t4.34\t4.65\t3.25\t9.54\n",
    > I do the following:
    >
    > my @line = split ' ';
    > print "$#line, $line[0]\n";
    >
    > And get what I expect:
    >
    > 3, 4.34
    >
    > But when I do this:
    >
    > my $delim = ' ';
    > my @line = split $delim;
    > print "$#line, $line[0]\n";
    > if ($line[0] eq $_) {print "Equal\n"; }
    >
    > I get this:
    >
    > 0, 4.34 4.65
    > 3.25 9.54
    >
    > Equal
    >
    > Which is not exactly what I'm after. So, why are these two
    > snippets behaving differently? What can I do to make them
    > equivalent?
    >
    > I'm probably missing something obvious. If so, please be nice. ;)



    It is a bug (I mean a feature) of split. According to the docs
    the Perl parser seems to look for the single quoted space ' ' and that
    differentiates it from a space " " as a pattern. So obviously the single
    quote is as significant as the space. However, the 3 character string is
    parsed as a first level parse. So you can't even assign $delim = "' '" or
    $delim = "" (null string) ... which is a special case as well ass the ' '.

    It would appear to be a very useful case since it populates every array position
    with a non-whitespace. If you want to run dynamic patterns, the ' ' case will have
    to be an exclusion, tested for and hardcoded.

    ie: if ($delim ne ' ') {split $delim;} else {split ' '}

    Quote:
    As a special case, specifying a PATTERN of space (' ') will split on white space
    just as split with no arguments does. Thus, split(' ') can be used to emulate awk's
    default behavior, whereas split(/ /) will give you as many null initial fields as
    there are leading spaces. A split on /\s+/ is like a split(' ') except that any
    leading whitespace produces a null first field. A split with no arguments really
    does a split(' ', $_) internally.
    robic0, Feb 22, 2006
    #13
  14. Heath

    robic0 Guest

    On Tue, 21 Feb 2006 18:17:24 -0800, robic0 wrote:

    >On 22 Feb 2006 01:58:43 GMT, wrote:
    >
    >>robic0 wrote:
    >>
    >>>
    >>> It is a bug (I mean a feature) of split. According to the docs

    >>
    >>Uh, what part of the docs are you refering to?
    >>
    >>> the Perl parser seems to look for the single quoted space ' ' and that
    >>> differentiates it from a space " " as a pattern.

    >>
    >>It does not. ' ', " ", q{ }, qq{ } are all the same in this context. They
    >>are differentiated from / /, qr{ }, and $x where $x eq ' ';
    >>

    >split ' '; is NOT the same context as split " "; nor
    >$dl = ' '; split $dl;
    >
    >The discussion is about the special parsing done for "split ' '" context,
    >it is not about what, for exapmple split $dl; where $dl = ' '; means.
    >
    >Wake up man..........
    >
    >Heres the posting quotes of some internal posts (which the argument hinges):
    >If you don't want to read this, the jist is its a compiler with multiple form
    >intrisick functions.......
    >==============================================
    >
    >On Tue, 21 Feb 2006 15:58:06 -0500, Uri Guttman <> wrote:
    >
    >>>>>>> "H" == Heath <> writes:

    >>
    >> H> Yes, I read through that before I ever posted. The behavior I'm
    >> H> after is that of [split ' ']. I don't get that behavior when I pass
    >> H> the space char to split via a variable. I would simply just like to
    >> H> know why that is and how I can get that behavior by passing a variable,
    >> H> if it is possible at all.
    >>
    >>rtfm some more:
    >>
    >> As a special case, specifying a PATTERN of space
    >> (' ') will split on white space just as "split" with
    >> no arguments does. Thus, "split(' ')" can be used
    >> to emulate awk's default behavior, ...
    >>
    >>note that PATTERN is the actual literal passed to split so it can't be a
    >>variable. otherwise how could it tell / / from ' ' from $foo = ' '? this

    >
    >I don't know, I would consider this a bug, aka, left out check. Within split, if
    >the $foo name is passed as a literal name, the contents have to be obtained.
    >So if $foo = "' '", it should be fairly obvious what the meaning is.
    >
    >But I don't think some intrinsics work that way. I think as far as the Pattern
    >in split, the parser looks for a split ' ' or split pattern and internally changes
    >the call to a different function with different parameters, than any other form of split.
    >There may be several internal split functions.
    >Since it has to be parsed anyway, its easier to redirect different "forms" to predefined
    >functions that handle specific ones. Thereby speeding up the processor.
    >
    >>is a very odd way to get that special behavior as it is inband and very
    >>special cased. if you must have that vs other splits on demand, use a
    >>sub to handle your cased and do 2 different splits based on $foo eq '
    >>'. or use code refs for each split type. many ways to handle it.
    >>
    >> H> All I need is a value to assign to $delim such that a [split $delim]
    >> H> will give me the same behavior as a [split].
    >>
    >>can't be done. so choose another solution.
    >>
    >>uri

    >====================
    >
    >On Tue, 21 Feb 2006 18:52:51 -0500, Uri Guttman <> wrote:
    >
    >>>>>>> "r" == robic0 <robic0> writes:

    >>
    >>
    >> r> I don't know, I would consider this a bug, aka, left out
    >> r> check. Within split, if the $foo name is passed as a literal name,
    >> r> the contents have to be obtained. So if $foo = "' '", it should be
    >> r> fairly obvious what the meaning is.
    >>
    >>i consider you a genomic bug.
    >>
    >> r> But I don't think some intrinsics work that way. I think as far as
    >> r> the Pattern in split, the parser looks for a split ' ' or split
    >> r> pattern and internally changes the call to a different function
    >> r> with different parameters, than any other form of split. There may
    >> r> be several internal split functions. Since it has to be parsed
    >> r> anyway, its easier to redirect different "forms" to predefined
    >> r> functions that handle specific ones. Thereby speeding up the
    >> r> processor.
    >>
    >>speeding up the processor? what kind of crack are you smoking? this
    >>whole discussion has nothing to do with the speed of split. the various
    >>special behaviors of split do not need seperate implmentations.
    >>

    >I'm speechless.. You just discounted all compiled and semi-compiled (fixup)
    >languages. You must think Perl core is written in Perl.
    >
    >The "Processor" is commonly known as the "engine", the core. Perl follows
    >that and has multiple core implementations of intrinsics, it does a modified
    >compile at loadtime and further compiles at runtime.


    OR it does just a HARDCODED replacement of the "split ' '" form!
    Which is a hack patch of some feeble minded Perl programmer.
    There are no other options at all!!
    Eh?
    robic0, Feb 22, 2006
    #14
  15. Heath

    Uri Guttman Guest

    >>>>> "t" == thrill5 <> writes:

    t> Can be done. Set $delim to "\s" not ' '.

    rtfm some more please. first off you want \s+ to get something like
    split ' '. and ' ' is not only special cased for splitting on any
    whitespace but it emulates awk's behavior of skipping leading white
    space.

    If EXPR is omitted, splits the $_ string. If PATTERN is also
    omitted, splits on whitespace (after skipping any leading
    whitespace). Anything matching PATTERN is taken to be a
    delimiter separating the fields. (Note that the delimiter may
    be longer than one character.)

    A "split" on "/\s+/" is like a "split(' ')" except that any
    leading whitespace produces a null first field. A "split" with
    no arguments really does a "split(' ', $_)" internally.

    hmm, seems to me that my point about not being able to pass in ' ' as a
    value still holds up. did you think if it were just a special value
    then they would not document it as such? it is not even a value since
    the split is done with \s+ it is not splitting on ' ' (which is a
    single space). so do you have another comment on this?
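
The difference between \s, \s+ and the literal ' ' described above can be checked against the original poster's data. A small sketch; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $data = "\t\t\t4.34\t4.65\t3.25\t9.54\n";

my @single = split /\s/,  $data;  # every whitespace character is its own separator
my @plus   = split /\s+/, $data;  # runs collapse, but the leading run yields an empty first field
my @awk    = split ' ',   $data;  # awk-style: leading whitespace is skipped

for my $case ( [ '/\s/' => \@single ], [ '/\s+/' => \@plus ], [ q{' '} => \@awk ] ) {
    my ($label, $fields) = @$case;
    printf "%-6s %d fields: %s\n", $label, scalar @$fields,
        join ',', map "[$_]", @$fields;
}
```

In all three cases the trailing newline acts as a separator, and the empty field it would leave at the end is dropped by split.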

    uri

    Uri Guttman, Feb 22, 2006
    #15
  16. Heath

    Uri Guttman Guest

    >>>>> "r" == robic0 <robic0> writes:

    r> I'm speechless.. You just discounted all compiled and semi-compiled
    r> (fixup) languages. You must think Perl core is written in Perl.

    albeit you were keyboardless as well. we must look into fixing that.

    r> The "Processor" is commonly known as the "engine", the core. Perl
    r> follows that and has multiple core implementations of intrinsics,
    r> it does a modified compile at loadtime and further compiles at
    r> runtime. Try compiling C/C++ code with standard library calls, then
    r> look at the assembly.

    as that statement shows a total lack of knowledge about perl's
    internals, i will treat your comment with the respect it deserves:

    BLUBLBBLBABLUBALLABEELALABEBBE!

    and please don't ever try to teach me (hell, anyone) about computer
    stuff. i have farted away more computer knowledge than you could ever
    breathe in. but you should try to breathe in some anyhow as it will help
    you for sure.

    uri

    Uri Guttman, Feb 22, 2006
    #16
  17. Heath

    Bo Lindbergh Guest

    In article <>,
    Uri Guttman <> wrote:

    > rtfm some more:
    >
    > As a special case, specifying a PATTERN of space
    > (' ') will split on white space just as "split" with
    > no arguments does. Thus, "split(' ')" can be used
    > to emulate awk's default behavior, ...
    >
    > note that PATTERN is the actual literal passed to split so it can't be a
    > variable. otherwise how could it tell / / from ' ' from $foo = ' '?


    By way of the qr construct. In a hypothetical future Perl version

    my $delim=' ';
    my @fields=split($delim);

    would be equivalent to

    my @fields=split(' ');

    while

    my $delim=qr/ /;
    my @fields=split($delim);

    would be equivalent to

    my @fields=split(/ /);

    and everyone would be happy. Right?
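
    Until something like that exists, the same distinction can be approximated with a small wrapper. This is only a sketch: split_fields is a hypothetical helper, not anything built into Perl. It assumes that a plain string ' ' should select the awk-style mode and that a qr// object or any other string should be used as an ordinary pattern.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl): a plain string ' ' selects
# awk-style splitting, anything else is used as the pattern.
sub split_fields {
    my ($delim, $text) = @_;

    # The literal ' ' written here is what triggers the special case.
    return split(' ', $text) if !ref($delim) && $delim eq ' ';

    # A qr// object or any other string is treated as a regex.
    return split($delim, $text);
}

my @awk   = split_fields(' ',   "\t\t\t4.34\t4.65\t3.25\t9.54\n");
my @plain = split_fields(qr/ /, ' A B');

print "$#awk, $awk[0]\n";
print scalar(@plain), "\n";
```

    Because the special case is decided inside the helper, callers can keep the delimiter in a variable and still get the whitespace-skipping behaviour from the original question.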


    /Bo Lindbergh
    Bo Lindbergh, Feb 22, 2006
    #17
  18. Uri Guttman writes:

    > rtfm some more please. first off you want \s+ to get something like
    > split ' '. and ' ' is not only special cased for splitting on any
    > whitespace but it emulates awk's behavior of skipping leading white
    > space.


    The documentation explains this quite well. However, it never says
    that ' ' can't be passed in a variable, as far as I could find. Since
    this is such a special case (Is there any other case in perl when this
    is true?), perhaps the documentation should have a few lines added to
    make that clear. Something like:

    This special case only works when a single space is given to split
    literally. Passing a single space in a variable causes split to
    split on a single space, instead of using this special behavior.

    $delim = ' ';
    @s = split $delim, ' A B CD E';

    Results in @s = ( '', 'A', 'B', 'CD', 'E' );

    I'm sure that could be worded more accurately.
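
    For comparison, here is a minimal sketch of the two behaviours using only literal patterns, so the result does not depend on how a given perl treats a variable. As far as I know, perl 5.18 and later treat any expression that evaluates to ' ' like the literal, so the variable form above behaves differently there. The variable names are illustrative.

```perl
use strict;
use warnings;

my $text = ' A B CD E';

# Literal single-space string: awk-style mode. Splits on runs of
# whitespace and skips leading whitespace.
my @awk_fields = split ' ', $text;

# Regex matching exactly one space: the leading space produces an
# empty first field.
my @regex_fields = split / /, $text;

print scalar(@awk_fields), " fields with ' '\n";
print scalar(@regex_fields), " fields with / /\n";
```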


    --
    Aaron --
    http://360.yahoo.com/aaron_baugher
    Aaron Baugher, Feb 22, 2006
    #18
  19. Heath

    Guest

    robic0 wrote:
    > On Tue, 21 Feb 2006 18:17:24 -0800, robic0 wrote:
    >
    > >On 22 Feb 2006 01:58:43 GMT, wrote:
    > >
    > >>robic0 wrote:
    > >>
    > >>>
    > >>> It is a bug (I mean a feature) of split. According to the docs
    > >>
    > >>Uh, what part of the docs are you referring to?
    > >>
    > >>> the Perl parser seems to look for the single quoted space ' ' and
    > >>> that differentiates it from a space " " as a pattern.
    > >>
    > >>It does not. ' ', " ", q{ }, qq{ } are all the same in this context.
    > >>They are differentiated from / /, qr{ }, and $x where $x eq ' ';
    > >>

    > >split ' '; is NOT the same context as split " ";


    Well, at least you were smart enough to snip the examples proving you were
    wrong before you re-iterated your lies.
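
    For anyone who wants to check the quoted claim, a short sketch along these lines should do. The input is the sample line from the original question and the variable names are only illustrative.

```perl
use strict;
use warnings;

my $line = "\t\t\t4.34\t4.65\t3.25\t9.54\n";

# Four ways of writing the same constant one-space string.
my @single = split ' ',   $line;
my @double = split " ",   $line;
my @q      = split q{ },  $line;
my @qq     = split qq{ }, $line;

# A regex for one space: $line contains no spaces, so nothing splits.
my @regex  = split / /,   $line;

print scalar(@$_), "\n" for \@single, \@double, \@q, \@qq, \@regex;
```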

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
    , Feb 22, 2006
    #19
  20. Heath

    Uri Guttman Guest

    >>>>> "AB" == Aaron Baugher writes:

    AB> Uri Guttman writes:
    >> rtfm some more please. first off you want \s+ to get something like
    >> split ' '. and ' ' is not only special cased for splitting on any
    >> whitespace but it emulates awk's behavior of skipping leading white
    >> space.


    AB> The documentation explains this quite well. However, it never says
    AB> that ' ' can't be passed in a variable, as far as I could find. Since
    AB> this is such a special case (Is there any other case in perl when this
    AB> is true?), perhaps the documentation should have a few lines added to
    AB> make that clear. Something like:

    AB> This special case only works when a single space is given to split
    AB> literally. Passing a single space in a variable causes split to
    AB> split on a single space, instead of using this special behavior.

    AB> $delim = ' ';
    AB> @s = split $delim, ' A B CD E';

    AB> Results in @s = ( '', 'A', 'B', 'CD', 'E' );

    AB> I'm sure that could be worded more accurately.

    i agree the docs are not the best on this special case.

    and as they say patches welcome and perl5porters is over there --->

    :)

    uri

    Uri Guttman, Feb 22, 2006
    #20