resolve single line with multiple items into multiple lines, single items

Discussion in 'Perl Misc' started by ela, Apr 5, 2009.

  1. ela

    ela Guest

    Old line (columns tab-delimited):

    Col1 Col2 Col3 ... Coln
    A B1@B2 C ... N1@N2@N3

    New lines
    A B1 C .. N1
    A B1 C .. N2
    A B1 C .. N3
    A B2 C .. N1
    A B2 C .. N2
    A B2 C .. N3

    The problem: pattern matching can recognize "@", but how do I write the
    code generically so that it picks up all of N1, N2 and N3 when the number
    of items isn't known beforehand?
     
    ela, Apr 5, 2009
    #1

  2. Willem

    Willem Guest

    Re: resolve single line with multiple items into multiple lines, single items

    ela wrote:
    ) Old line(columns tab-delimited):
    )
    ) Col1 Col2 Col3 ... Coln
    ) A B1@B2 C ... N1@N2@N3
    )
    ) New lines
    ) A B1 C .. N1
    ) A B1 C .. N2
    ) A B1 C .. N3
    ) A B2 C .. N1
    ) A B2 C .. N2
    ) A B2 C .. N3
    )
    ) The problem is: although pattern matching can recognize "@", but how to
    ) write the code generically so to get all N1, N2 and N3, such that the number
    ) of items aren't known beforehand?

    Well obviously first you create an array of arrays for the rows and columns.

    And then how about something which looks a bit like:

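    # @columns is the array of arrays (one array ref per row);
    # $n is the index of the last column.
    for my $i (0 .. $n) {
        @columns = map {
            my @row = @$_;
            map {
                # one new row (as an array ref) per item in column $i
                [ @row[0 .. $i-1], $_, @row[$i+1 .. $n] ]
            } split('@', $row[$i]);
        } @columns;
    }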

    But of course this is overly complex and can probably be reduced to a
    clever one-liner...
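
    For anyone who wants to run the loop above, here is a minimal
    self-contained sketch. The sub name expand_row and the sample line are
    assumptions made for illustration; they are not from Willem's post.

```perl
use strict;
use warnings;

# Hypothetical helper: expand one tab-delimited line into every combination
# of its '@'-separated items. Returns a list of array refs, one per new row.
sub expand_row {
    my ($line) = @_;
    chomp $line;
    my @rows = ( [ split /\t/, $line, -1 ] );   # start with the single original row
    my $n    = $#{ $rows[0] };                  # index of the last column

    for my $i (0 .. $n) {
        # Replace every row by one copy per item found in column $i.
        @rows = map {
            my @row = @$_;
            map { [ @row[0 .. $i - 1], $_, @row[$i + 1 .. $n] ] }
                split /\@/, $row[$i];
        } @rows;
    }
    return @rows;
}

print join("\t", @$_), "\n" for expand_row("A\tB1\@B2\tC\tN1\@N2\@N3\n");
```

    One caveat with this sketch: split() returns an empty list for an empty
    string, so a row with a blank column would vanish. Real data with empty
    fields would need a guard for that case.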


    SaSW, Willem
    --
    Disclaimer: I am in no way responsible for any of the statements
    made in the above text. For all I know I might be
    drugged or something..
    No I'm not paranoid. You all think I'm paranoid, don't you !
    #EOT
     
    Willem, Apr 5, 2009
    #2

  3. ela <> wrote:
    > Old line(columns tab-delimited):
    >
    > Col1 Col2 Col3 ... Coln
    > A B1@B2 C ... N1@N2@N3
    >
    > New lines
    > A B1 C .. N1
    > A B1 C .. N2
    > A B1 C .. N3
    > A B2 C .. N1
    > A B2 C .. N2
    > A B2 C .. N3
    >
    > The problem is: although pattern matching can recognize "@", but how to
    > write the code generically so to get all N1, N2 and N3, such that the number
    > of items aren't known beforehand?



    -------------------------
    #!/usr/bin/perl
    use warnings;
    use strict;

    $_ = "A\tB1\@B2\tC\tN1\@N2\@N3\n";

    #1 while s/(.*?)([^\t]+)\@([^\t\n]+)(.*\n)/$1$2$4$1$3$4/;

    1 while s/(.*?) # before pair to expand
    ([^\t]+) # left value
    \@
    ([^\t\n]+) # right value
    (.*\n) # after pair to expand
    /$1$2$4$1$3$4/x;

    print;
    -------------------------
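
    To make the "1 while s///" idiom easier to follow, here is a sketch that
    runs the same substitution but records the buffer after every pass. It is
    not part of Tad's post, and the sub name expand_steps is made up for
    illustration. Each successful pass replaces one line that still holds an
    '@' with two lines, and the loop ends when the pattern no longer matches.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the substitution repeatedly and return
# the buffer as it looked after each successful pass.
sub expand_steps {
    my ($buf) = @_;
    my @steps;
    while ( $buf =~ s/(.*?)      # before pair to expand
                      ([^\t]+)   # left value
                      \@
                      ([^\t\n]+) # right value
                      (.*\n)     # after pair to expand
                     /$1$2$4$1$3$4/x ) {
        push @steps, $buf;
    }
    return @steps;
}

my @steps = expand_steps("A\tB1\@B2\tC\tN1\@N2\@N3\n");
for my $n (0 .. $#steps) {
    print "--- after pass ", $n + 1, " ---\n", $steps[$n];
}
```

    Running it on the sample line shows the line count growing by one per
    pass until every '@' has been expanded.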


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
     
    Tad J McClellan, Apr 5, 2009
    #3
  4. ela

    ela Guest

    I really thank Willem & McClellan, who proposed solns. Yet, there are too
    many symbols that can't be Googled, I fail to understand their codes....
     
    ela, Apr 5, 2009
    #4
  5. "ela" <> wrote:
    >Old line(columns tab-delimited):
    >
    >Col1 Col2 Col3 ... Coln
    >A B1@B2 C ... N1@N2@N3
    >
    >New lines
    >A B1 C .. N1
    >A B1 C .. N2
    >A B1 C .. N3
    >A B2 C .. N1
    >A B2 C .. N2
    >A B2 C .. N3
    >
    >The problem is: although pattern matching can recognize "@", but how to
    >write the code generically so to get all N1, N2 and N3, such that the number
    >of items aren't known beforehand?


    split() the line at tabs (to get the individual columns), then foreach()
    column split() at '@' to get the list of individual values.

    This automatically leads to a nested loop, which you can use nicely to
    print the lines in the desired order.
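
    A rough sketch of the two split() calls described above. The sub name
    parse_line is hypothetical. It only builds the per-column lists of values;
    turning those into output lines is a separate step.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one tab-delimited line into an array of arrays,
# one inner array per column holding that column's '@'-separated values.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    my @columns;
    foreach my $field ( split /\t/, $line, -1 ) {   # -1 keeps trailing empty columns
        my @values = split /\@/, $field;
        @values = ('') unless @values;              # an empty column still counts as one value
        push @columns, \@values;
    }
    return \@columns;
}

my $columns = parse_line("A\tB1\@B2\tC\tN1\@N2\@N3\n");
foreach my $i ( 0 .. $#$columns ) {
    print "column $i: @{ $columns->[$i] }\n";
}
```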

    jue
     
    Jürgen Exner, Apr 5, 2009
    #5
  6. ela <> wrote:
    > I really thank Willem & McClellan, who proposed solns. Yet, there are too
    > many symbols that can't be Googled,



    Good, because you don't want to find random crap on the interweb.

    You want to find focused and accurate information on your own hard disk.

    perldoc perlrequick

    perldoc perlretut

    perldoc perlre


    > I fail to understand their codes....



    If you ask specific questions about specific bits of code
    (after trying to find out in the std docs first), you will
    likely get help here.

    "I fail to understand their codes" is too general for us
    to be able to help you.


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
     
    Tad J McClellan, Apr 5, 2009
    #6
  7. ela

    ela Guest

    > split() line at tab (to get individual column), then foreach() column
    > split() at '@' to get list of individual values.
    >
    > This automatically leads to a nested loop, which you can use nicely to
    > print the lines in the desired order.
    >
    > jue


    It seems that this is also a direction, can foreach() be recursively used?
    Because I don't want to write "n" foreach()'s.
     
    ela, Apr 5, 2009
    #7
  8. "ela" <> wrote:
    >> split() line at tab (to get individual column), then foreach() column
    >> split() at '@' to get list of individual values.
    >>
    >> This automatically leads to a nested loop, which you can use nicely to
    >> print the lines in the desired order.

    >
    >It seems that this is also a direction, can foreach() be recursively used?


    ???
    Recursion and loops are two different ways to achieve the same result:
    repeating the execution of some code with modified data. Yes, of course
    you can mix them as you like, but why would you want to?

    >Because I don't want to write "n" foreach()'s.


    Having said that, I spoke too hastily. Nested foreach() loops are great for
    getting the individual values and storing them, e.g. in an AoA.
    But creating the output within the same loop is very awkward, and you
    will be far better off storing the data first and then using a second loop,
    as suggested by others, or a recursive algorithm.
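
    One way the recursive algorithm mentioned above might look, as a sketch
    only. It assumes the data is already stored as an array of arrays (one
    inner array of values per column); the sub name expand is hypothetical.
    A single foreach() handles any number of columns because the recursion
    depth, not the source code, grows with the column count.

```perl
use strict;
use warnings;

# Hypothetical helper: recursively build every combination of column values.
# $columns is an array ref of array refs; @prefix holds the values chosen so far.
sub expand {
    my ($columns, @prefix) = @_;

    # Base case: one value has been picked for every column.
    return join("\t", @prefix) if @prefix == @$columns;

    # Recursive case: try each value of the next column in turn.
    my @lines;
    foreach my $value ( @{ $columns->[ scalar @prefix ] } ) {
        push @lines, expand($columns, @prefix, $value);
    }
    return @lines;
}

my @columns = ( ['A'], ['B1', 'B2'], ['C'], ['N1', 'N2', 'N3'] );
print "$_\n" for expand(\@columns);
```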

    jue
     
    Jürgen Exner, Apr 5, 2009
    #8
  9. ela

    Guest

    On Sun, 5 Apr 2009 14:50:04 +0800, "ela" <> wrote:

    >Old line(columns tab-delimited):
    >
    >Col1 Col2 Col3 ... Coln
    >A B1@B2 C ... N1@N2@N3
    >
    >New lines
    >A B1 C .. N1
    >A B1 C .. N2
    >A B1 C .. N3
    >A B2 C .. N1
    >A B2 C .. N2
    >A B2 C .. N3
    >
    >The problem is: although pattern matching can recognize "@", but how to
    >write the code generically so to get all N1, N2 and N3, such that the number
    >of items aren't known beforehand?
    >


    I just saw this. I didn't read the other posted responses, which may have
    actually solved this apparently easy problem.

    From now on, not only will you Chinese, pigeon-English speaking, non-Perl
    programming, American dollar sucking folks have to provide some DOLLA'S,
    for the solution (that is what you want isin't it, source and all?),
    but you will have to LEARN ENOUGH CORRECT ENLISH TO PROPERLY EXPLAIN THE
    PROBLEM !!!!

    If this takes hiring an Amrican (English first language) translator, then all the
    better. Dok, dac, toa, dit, do, don, just don't cut it.

    -sln
     
    , Apr 6, 2009
    #9
  10. ccc31807

    ccc31807 Guest

    Re: resolve single line with multiple items into multiple lines, single items

    Some will say this is a simple-minded solution, and maybe it is, but
    FWIW here's my contribution. This decomposes your data into a data
    structure in memory. It's dynamic in the sense that it doesn't matter
    how many records you have or where the @'s are, as long as you have
    only two levels. All you have to do then is print it out. I have used
    Dumper simply because I'm too lazy to finish it.

    CODE:
    use strict;
    use warnings;
    use Data::Dumper;

    while (<DATA>)
    {
        my @rest = split /\t/;
        my $num = @rest;
        for (my $i = 0; $i < $num; $i++)
        {
            if ($rest[$i] =~ /@/)
            {
                $rest[$i] = [split /@/, $rest[$i]];
            }
            print qq(\t$rest[$i]\n);
        }
        print "\nData Structure via Dumper is:\n";
        print Dumper(@rest);
    }

    exit(0);

    __DATA__
    A B1@B2 C d e f N1@N2@N3

    OUTPUT:

    C:\PerlLearn>perl multiple.plx
    A
    ARRAY(0x235348)
    C
    d
    e
    f
    ARRAY(0x182471c)

    Data Structure via Dumper is:
    $VAR1 = 'A';
    $VAR2 = [
    'B1',
    'B2'
    ];
    $VAR3 = 'C';
    $VAR4 = 'd';
    $VAR5 = 'e';
    $VAR6 = 'f';
    $VAR7 = [
    'N1',
    'N2',
    'N3
    '
    ];

    C:\PerlLearn>
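
    The printing step left unfinished above could be sketched roughly like
    this. The sub name rows_from_record is made up, and the sketch assumes a
    record shaped like the Dumper output (plain strings mixed with array
    refs) with the trailing newline already chomped off.

```perl
use strict;
use warnings;

# Hypothetical helper: given a record whose fields are either plain strings
# or array refs of alternatives, return every combination as a tab-joined line.
sub rows_from_record {
    my @fields = map { ref($_) ? $_ : [$_] } @_;  # treat a plain string as a one-item list
    return () if grep { !@$_ } @fields;           # an empty list of alternatives yields no rows
    my @index = (0) x @fields;                    # current position in each field
    my @lines;

    while (1) {
        push @lines, join "\t", map { $fields[$_][ $index[$_] ] } 0 .. $#fields;

        # Advance like an odometer: bump the rightmost field, carry to the left.
        my $i = $#fields;
        while ( $i >= 0 && ++$index[$i] == @{ $fields[$i] } ) {
            $index[$i] = 0;
            $i--;
        }
        last if $i < 0;                           # every field wrapped around: done
    }
    return @lines;
}

print "$_\n" for rows_from_record( 'A', [ 'B1', 'B2' ], 'C', [ 'N1', 'N2', 'N3' ] );
```

    The counter approach avoids both recursion and a fixed number of nested
    loops, at the cost of a little index bookkeeping.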
     
    ccc31807, Apr 6, 2009
    #10
  11. Uri Guttman

    Uri Guttman Guest

    Re: resolve single line with multiple items into multiple lines, single items

    >>>>> "BM" == Ben Morrow <> writes:

    >> exit(0);


    BM> There is no need to exit() from a Perl program under normal
    BM> circumstances. Falling off the end will exit successfully.

    i like to have explicit exits in my main program. i usually keep the top
    level inline code very short with a few key lexicals and top sub calls
    and then exit(). then come the subs in some semblance of order. arg
    parsing and help/usage subs always go to the bottom out of the way. this
    is how i teach to write scripts so they are easy to develop AND
    read. and the explicit exit tells you the top level code is done and you
    don't have to scan for it (or fall to the bottom) to see any more main
    level code.

    uri

    --
    Uri Guttman ------ -------- http://www.sysarch.com --
    ----- Perl Code Review , Architecture, Development, Training, Support ------
    --------- Free Perl Training --- http://perlhunter.com/college.html ---------
    --------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
     
    Uri Guttman, Apr 6, 2009
    #11
  12. ccc31807

    ccc31807 Guest

    Re: resolve single line with multiple items into multiple lines, single items

    On Apr 6, 1:49 pm, Uri Guttman <> wrote:
    > i like to have explicit exits in my main program. i usually keep the top
    > level inline code very short with a few key lexicals and top sub calls
    > and then exit(). then come the subs in some semblance of order. arg
    > parsing and help/usage subs always go to the bottom out of the way. this
    > is how i teach to write scripts so they are easy to develop AND
    > read. and the explicit exit tells you the top level code is done and you
    > don't have to scan for it (or fall to the bottom) to see any more main
    > level code.


    I agree fully.

    As a matter of style, you can write functions that only receive
    arguments and return values with no side effects or assignments
    within the functions, or you can write functions that make
    assignments and have side effects.

    Philosophically, I'm inclined to the first style, and attempt to write
    in that style.

    In practice, I normally write in the second style, so that my 'main'
    program is very short and consists only of a sequence of function calls
    (followed by exit(0)). The bulk of the work, including variable
    assignments, is done by my user-defined functions.

    I'm tending now to use a lot of modules, so that my 'main' program
    still consists of sequences of function calls, my user defined
    functions still avoid side effects and assignments as much as
    possible, and the dirty work is done in the modules. I don't
    particularly like this, and my style will probably continue to change.

    Your thoughts?

    CC
     
    ccc31807, Apr 6, 2009
    #12
  13. Uri Guttman

    Uri Guttman Guest

    Re: resolve single line with multiple items into multiple lines, single items

    >>>>> "c" == ccc31807 <> writes:

    c> As a matter of style, you can write functions that only receive
    c> arguments and return values with no side effects or assignments
    c> within the functions, or you can write functions that make
    c> assignments and have side effects.

    it varies. in some cases a few top level lexicals are ok by me.

    c> I'm tending now to use a lot of modules, so that my 'main' program
    c> still consists of sequences of function calls, my user defined
    c> functions still avoid side effects and assignments as much as
    c> possible, and the dirty work is done in the modules. I don't
    c> particularly like this, and my style will probably continue to change.

    you can always pass in a main hash ref to keep all the top level
    stuff. as i said, it varies based on my mood and the complexity of the
    program's top level.

    uri

    --
    Uri Guttman ------ -------- http://www.sysarch.com --
    ----- Perl Code Review , Architecture, Development, Training, Support ------
    --------- Free Perl Training --- http://perlhunter.com/college.html ---------
    --------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
     
    Uri Guttman, Apr 6, 2009
    #13