Splitting and keeping key/value

Discussion in 'Perl Misc' started by Sandman, Sep 26, 2006.

  1. Sandman

    Sandman Guest

    Indata:
    -------------------------------------
    Date: 2006-04-03
    Message: Wonderful! Let's
    meet there! I'll call you
    later
    Sent by: John
    -------------------------------------

    I want this parsed into:

    Array (
    [Date] => "2006-04-03",
    [Message] => "Wonderful! Let's\nmeet there! ....."
    [Sent by] => "John"
    )

    By defining keywords that data should be split in, in this case
    "Date", "Message", "Sent by" and that those should be the first word
    on the line and they should be followed by a ":". The Message part in
    my actual indata is at no risk of containing any of these keywords.

    Any cute ideas on how to solve that? Thanks in advance. :)


    --
    Sandman[.net]
    Sandman, Sep 26, 2006
    #1

  2. Paul Lalli

    Paul Lalli Guest

    Sandman wrote:
    > Indata:
    > -------------------------------------
    > Date: 2006-04-03
    > Message: Wonderful! Let's
    > meet there! I'll call you
    > later
    > Sent by: John
    > -------------------------------------


    Please speak Perl, not some bizarre pseudo-code. Do you mean:
    my $Indata = "Date: 2006-04-03
    Message: Wonderful! Let's
    meet there! I'll call you
    later
    Sent by: John";

    or do you mean:
    my @Indata = (
    "Date: 2006-04-03\n",
    "Message: Wonderful! Let's\n",
    "meet there! I'll call you\n",
    "later\n",
    "Sent by: John\n"
    );

    ?

    The difference is important.


    > I want this parsed into:
    >
    > Array (
    > [Date] => "2006-04-03",
    > [Message] => "Wonderful! Let's\nmeet there! ....."
    > [Sent by] => "John"
    > )


    Is this some sort of pseudo-PHP? Are you aware you posted to a Perl
    newsgroup? Do you mean you want:

    my %hash = (
    'Date' => '2006-04-03',
    'Message' => "Wonderful! Let's\nmeet there! ...",
    'Sent by' => 'John',
    );

    ?


    > By defining keywords that data should be split in, in this case
    > "Date", "Message", "Sent by" and that those should be the first word
    > on the line and they should be followed by a ":". The Message part in
    > my actual indata is at no risk of containing any of these keywords.
    >
    > Any cute ideas on how to solve that?


    I don't know how cute it is, but yes, I could solve that using regular
    expressions. Have you made any attempts to solve it yourself yet? If
    you post your best attempt, and describe how that attempt is not
    working for you, we can probably help you fix it.

    Paul Lalli
    Paul Lalli, Sep 26, 2006
    #2

  3. Sandman

    Sandman Guest

    In article <>,
    "Paul Lalli" <> wrote:

    > Sandman wrote:
    > > Indata:
    > > -------------------------------------
    > > Date: 2006-04-03
    > > Message: Wonderful! Let's
    > > meet there! I'll call you
    > > later
    > > Sent by: John
    > > -------------------------------------

    >
    > Please speak Perl, not some bizarre pseudo-code. Do you mean:
    > my $Indata = "Date: 2006-04-03
    > Message: Wonderful! Let's
    > meet there! I'll call you
    > later
    > Sent by: John";
    >
    > or do you mean:
    > my @Indata = (
    > "Date: 2006-04-03\n",
    > "Message: Wonderful! Let's\n",
    > "meet there! I'll call you\n",
    > "later\n",
    > "Sent by: John\n"
    > );
    >
    > ?
    >
    > The difference is important.


    My indata is a textfile. Sorry.

    > > I want this parsed into:
    > >
    > > Array (
    > > [Date] => "2006-04-03",
    > > [Message] => "Wonderful! Let's\nmeet there! ....."
    > > [Sent by] => "John"
    > > )

    >
    > Is this some sort of pseudo-PHP?


    No.

    > Are you aware you posted to a Perl newsgroup?


    Yes. Did you or did you not understand the array composition I was
    looking for? If you didn't, I would be glad to explain it further so
    as to avoid confusion.

    > > By defining keywords that data should be split in, in this case
    > > "Date", "Message", "Sent by" and that those should be the first word
    > > on the line and they should be followed by a ":". The Message part in
    > > my actual indata is at no risk of containing any of these keywords.
    > >
    > > Any cute ideas on how to solve that?

    >
    > I don't know how cute it is, but yes, I could solve that using regular
    > expressions. Have you made any attempts to solve it yourself yet? If
    > you post your best attempt, and describe how that attempt is not
    > working for you, we can probably help you fix it.


    No, I am currently parsing it by:

    if ($body=~m/Message: (.*?)\n/){
    my $message = $1;
    }

    But I want a more modular approach.



    --
    Sandman[.net]
    Sandman, Sep 26, 2006
    #3
  4. Paul Lalli

    Paul Lalli Guest

    Sandman wrote:
    > In article <>,
    > "Paul Lalli" <> wrote:
    >
    > > Sandman wrote:
    > > > Indata:
    > > > -------------------------------------
    > > > Date: 2006-04-03
    > > > Message: Wonderful! Let's
    > > > meet there! I'll call you
    > > > later
    > > > Sent by: John
    > > > -------------------------------------

    > >
    > > Please speak Perl, not some bizarre pseudo-code. Do you mean:
    > > my $Indata = "Date: 2006-04-03
    > > Message: Wonderful! Let's
    > > meet there! I'll call you
    > > later
    > > Sent by: John";
    > >
    > > or do you mean:
    > > my @Indata = (
    > > "Date: 2006-04-03\n",
    > > "Message: Wonderful! Let's\n",
    > > "meet there! I'll call you\n",
    > > "later\n",
    > > "Sent by: John\n"
    > > );
    > >
    > > ?
    > >
    > > The difference is important.

    >
    > My indata is a textfile. Sorry.


    That completely fails to answer the question. How are you storing this
    data *within your program*.


    > > > I want this parsed into:
    > > >
    > > > Array (
    > > > [Date] => "2006-04-03",
    > > > [Message] => "Wonderful! Let's\nmeet there! ....."
    > > > [Sent by] => "John"
    > > > )

    > >
    > > Is this some sort of pseudo-PHP?

    >
    > No.
    >
    > > Are you aware you posted to a Perl newsgroup?

    >
    > Yes. Did you or did you not understand the array composition I was
    > looking for?


    No, I can only *guess* as to what you meant. My guess may or may not
    be correct.

    > If you didn't, I would be glad to explain it further so as to avoid confusion.


    To avoid confusion, just "speak Perl". That way there is no guessing.
    Show us an actual Perl data structure that is the result you are
    desiring.

    > > > By defining keywords that data should be split in, in this case
    > > > "Date", "Message", "Sent by" and that those should be the first word
    > > > on the line and they should be followed by a ":". The Message part in
    > > > my actual indata is at no risk of containing any of these keywords.
    > > >
    > > > Any cute ideas on how to solve that?

    > >
    > > I don't know how cute it is, but yes, I could solve that using regular
    > > expressions. Have you made any attempts to solve it yourself yet? If
    > > you post your best attempt, and describe how that attempt is not
    > > working for you, we can probably help you fix it.

    >
    > No, I am currently parsing it by:
    >
    > if ($body=~m/Message: (.*?)\n/){
    > my $message = $1;
    > }
    >
    > But I want a more modular approach.


    Presumably, you want an approach that works, too, since the above
    doesn't. Even assuming you have more in your if() statement, which
    adds the message to your structure, that would stop $1 at the first
    line of the Message, rather than where the message actually ends.

    Consider matching all non-colons up to an internal end-of-line (take a
    look at the /m modifier for RegExps)

    Code the attempt, and let us know if it doesn't work.
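
A minimal sketch of the regular-expression approach hinted at above, assuming the whole record is already held in one scalar. The names `$data`, `@keywords` and `%hash` are illustrative and do not come from the thread. The pattern also relies on the original poster's statement that no line inside the message starts with one of the keywords.

```perl
use strict;
use warnings;

# Hypothetical keyword list; fields are added or removed here only.
my @keywords = ('Date', 'Message', 'Sent by');

# Sample record from the first post, held in a single scalar (an assumption).
my $data = "Date: 2006-04-03\n"
         . "Message: Wonderful! Let's\n"
         . "meet there! I'll call you\n"
         . "later\n"
         . "Sent by: John\n";

# Build an alternation of the keywords; quotemeta protects any
# regex metacharacters a keyword might contain.
my $alt = join '|', map { quotemeta($_) } @keywords;

# /m lets ^ match at the start of every line, /s lets . cross newlines,
# /g in list context returns every (keyword, value) pair.
# Each value runs lazily up to the next keyword line or the end of the string.
my %hash = $data =~ /^($alt):[ \t]*(.*?)(?=^(?:$alt):|\z)/msg;

# Drop the trailing newline left at the end of each value.
s/\s+\z// for values %hash;

print "$_ => $hash{$_}\n" for sort keys %hash;
```

Since the keywords live in one list, supporting another field should only require adding it to `@keywords`.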

    Paul Lalli
    Paul Lalli, Sep 26, 2006
    #4
  5. On 2006-09-26 12:21, Paul Lalli <> wrote:
    > Sandman wrote:
    >> In article <>,
    >> "Paul Lalli" <> wrote:
    >> > Sandman wrote:
    >> > > Indata:
    >> > > -------------------------------------
    >> > > Date: 2006-04-03
    >> > > Message: Wonderful! Let's
    >> > > meet there! I'll call you
    >> > > later
    >> > > Sent by: John
    >> > > -------------------------------------
    >> >
    >> > Please speak Perl, not some bizarre pseudo-code. Do you mean:
    >> > my $Indata = "Date: 2006-04-03
    >> > Message: Wonderful! Let's
    >> > meet there! I'll call you
    >> > later
    >> > Sent by: John";
    >> >
    >> > or do you mean:
    >> > my @Indata = (
    >> > "Date: 2006-04-03\n",
    >> > "Message: Wonderful! Let's\n",
    >> > "meet there! I'll call you\n",
    >> > "later\n",
    >> > "Sent by: John\n"
    >> > );
    >> >
    >> > ?
    >> >
    >> > The difference is important.

    >>
    >> My indata is a textfile. Sorry.

    >
    > That completely fails to answer the question. How are you storing this
    > data *within your program*.


    There is no reason why that data should be stored within the program at
    all. The file can be read line by line and the array/hash/whatever
    datastructure can be constructed on the fly. Slurping the whole file
    into memory may make constructing the desired data structure easier
    (hard to tell from the vague descriptions Sandman gave us), but it is
    certainly not required.
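
The line-by-line variant described above could be sketched as follows. This is only an illustration: an in-memory filehandle stands in for the real text file or STDIN, and `@keywords`, `%record` and `$current` are made-up names.

```perl
use strict;
use warnings;

# Hypothetical keyword list.
my @keywords = ('Date', 'Message', 'Sent by');
my $alt = join '|', map { quotemeta($_) } @keywords;

# Stand-in for the real input; in practice this would be a file or STDIN.
my $text = "Date: 2006-04-03\n"
         . "Message: Wonderful! Let's\n"
         . "meet there! I'll call you\n"
         . "later\n"
         . "Sent by: John\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my %record;     # the hash being built on the fly
my $current;    # keyword whose value is currently being collected

while (my $line = <$fh>) {
    chomp $line;
    if ($line =~ /^($alt):\s*(.*)/) {
        # A keyword line starts a new field.
        $current = $1;
        $record{$current} = $2;
    }
    elsif (defined $current) {
        # Any other line continues the value of the most recent keyword.
        $record{$current} .= "\n$line";
    }
}
close $fh;

print "$_ => $record{$_}\n" for sort keys %record;
```

Only one line is held in memory at a time, at the cost of tracking the current keyword by hand.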

    hp

    --
    Peter J. Holzer | Sysadmin WSR | http://www.hjp.at/
    > Why should one invent something that doesn't exist?
    What else would be the point of inventing?
    -- P. Einstein and V. Gringmuth in desd
    Peter J. Holzer, Sep 26, 2006
    #5
  6. Sandman

    Sandman Guest

    In article <>,
    "Paul Lalli" <> wrote:

    > > My indata is a textfile. Sorry.

    >
    > That completely fails to answer the question. How are you storing this
    > data *within your program*.


    If you don't want to help, that's fine. No need to be aggressive. The
    way it's stored within the program isn't important. If you assume it's
    stored as the content of a variable, work with that. If you don't want
    to make any assumptions, don't hit the reply button.

    I've been in this group for way too long to be bothered with people
    who would rather nitpick about syntax than actually try to help. For
    instance, a good response from you would have been something along the
    lines of:

    Well, if you have the above in, for example, $data, then I would
    probably do something like <code>

    And my reply then might have been

    Thanks, it's not a variable, but read from STDIN, but I can adapt
    your solution to my indata, thanks for helping me out!

    Thanks for listening.


    --
    Sandman[.net]
    Sandman, Sep 26, 2006
    #6
  7. Sandman

    Sandman Guest

    In article <3h9Sg.6492$>,
    "Mumia W. (reading news)" <>
    wrote:

    > I would use the substitution operator s/// to repeatedly suck off
    > keyword and value segments and place them in a hash. The /e option to
    > s/// allows you to execute complicated expressions, and that's what I would
    > use here.
    >
    > Try it yourself.


    Yeah, that's pretty much how I've been doing it. I just thought that
    there was a more modular approach. I'll try some more. Thanks :)
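
One possible reading of the s///e suggestion quoted above, as a hedged sketch rather than the code Mumia had in mind. It assumes the record sits in a scalar, and `$rest`, `@keywords` and `%hash` are illustrative names.

```perl
use strict;
use warnings;

# Hypothetical keyword list.
my @keywords = ('Date', 'Message', 'Sent by');
my $alt = join '|', map { quotemeta($_) } @keywords;

# Working copy of the record; it is consumed piece by piece.
my $rest = "Date: 2006-04-03\n"
         . "Message: Wonderful! Let's\n"
         . "meet there! I'll call you\n"
         . "later\n"
         . "Sent by: John\n";

my %hash;

# Each pass strips one "Keyword: value" segment from the front of $rest.
# With /e the replacement is Perl code: it stores the pair in %hash and
# evaluates to '', so the matched segment is deleted.
# The loop ends when no keyword is left at the start of the string.
1 while $rest =~ s/\A($alt):[ \t]*(.*?)\n?(?=^(?:$alt):|\z)/ $hash{$1} = $2; '' /mse;

print "$_ => $hash{$_}\n" for sort keys %hash;
warn "Unparsed input left over: $rest" if length $rest;
```

Anything still in `$rest` after the loop is text that did not start with a known keyword, which makes malformed input easy to spot.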



    --
    Sandman[.net]
    Sandman, Sep 26, 2006
    #7
  8. -berlin.de

    -berlin.de Guest

    Sandman <> wrote in comp.lang.perl.misc:
    > In article <3h9Sg.6492$>,
    > "Mumia W. (reading news)" <>
    > wrote:
    >
    > > I would use the substitution operator s/// to repeatedly suck off
    > > keyword and value segments and place them in a hash. The /e option to
    > > s/// allows you to execute complicated expressions, and that's what I would
    > > use here.
    > >
    > > Try it yourself.

    >
    > Yeah, that's pretty much how I've been doing it. I just thought that
    > there was a more modular approach. I'll try some more. Thanks :)


    You've said that twice now. "Modular" means consisting of independent
    components. How does that apply here?

    Anno
    -berlin.de, Sep 26, 2006
    #8
  9. Paul Lalli

    Paul Lalli Guest

    Sandman wrote:
    > In article <>,
    > "Paul Lalli" <> wrote:
    >
    > > > My indata is a textfile. Sorry.

    > >
    > > That completely fails to answer the question. How are you storing this
    > > data *within your program*.

    >
    > If you don't want to help, that's fine. No need to be aggressive.


    While I was not being aggressive, I rather disagree that there was no
    need to be. For some reason, you seem completely unwilling to help
    anyone to help you without multiple prodding.

    > The
    > way it's stored within the program isn't important


    Of course it is. If it's stored in a scalar variable, there are
    certain operations you can do on it. If it's stored as a list of
    lines, there are other operations you can do on it. How is that not
    relevant?

    >. If you assume it's
    > stored as the content of a variable, work with that. If you don't want
    > to make any assumptions, don't hit the reply button.


    My point is that there is NO REASON to make any assumptions, neither on
    my part nor on yours. You clearly are reading the file at some point
    in your current script, so why not just tell us *how* you're doing so?!

    > I've been in this group for way too long to be bothered with people
    > who would rather nitpick about syntax than actually try to help.


    I was trying to help. I was trying to help you see how to ask a
    question that would be likely to produce a response that would solve
    your problem. How is that not helpful?

    > For
    > instance, a good response from you would have been something along the
    > lines of:
    >
    > Well, if you have the above in, for example, $data, then I would
    > probably do something like <code>


    No, that would be a REALLY REALLY bad response, because it would
    encourage you to continue to post badly formed questions with no
    attempt to solve the problem on your own, and would only increase the
    number of people who refuse to help you. That would NOT help you in
    the long run at all.

    Paul Lalli
    Paul Lalli, Sep 26, 2006
    #9
  10. Paul Lalli

    Paul Lalli Guest

    Peter J. Holzer wrote:
    > On 2006-09-26 12:21, Paul Lalli <> wrote:
    > > Sandman wrote:
    > >> In article <>,
    > >> "Paul Lalli" <> wrote:
    > >> > Sandman wrote:
    > >> > > Indata:
    > >> > > -------------------------------------
    > >> > > Date: 2006-04-03
    > >> > > Message: Wonderful! Let's
    > >> > > meet there! I'll call you
    > >> > > later
    > >> > > Sent by: John


    > >> My indata is a textfile. Sorry.

    > >
    > > That completely fails to answer the question. How are you storing this
    > > data *within your program*.

    >
    > There is no reason why that data should be stored within the program at
    > all. The file can be read line by line and the array/hash/whatever
    > datastructure can be constructed on the fly. Slurping the whole file
    > into memory may make constructing the desired data structure easier
    > (hard to tell from the vague descriptions Sandman gave us), but it is
    > certainly not required.


    I was working on the assumption that the text file really is just 5
    lines as the OP showed. In that case, the "penalty" for slurping the
    entire file is less than negligible, and the benefits of not having to
    parse each line looking for the end of the record, storing the previous
    line, joining multiple lines to complete the record, etc, are far more
    than worth it.
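
For a small file, the slurping approach argued for above might look like this sketch. It is an illustration under stated assumptions: the in-memory filehandle replaces the real file, and `$body`, `@keywords` and `%hash` are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical keyword list.
my @keywords = ('Date', 'Message', 'Sent by');
my $alt = join '|', map { quotemeta($_) } @keywords;

# Stand-in for the real text file.
my $text = "Date: 2006-04-03\n"
         . "Message: Wonderful! Let's\n"
         . "meet there! I'll call you\n"
         . "later\n"
         . "Sent by: John\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

# Slurp: with $/ undefined, one read returns the whole file.
my $body = do { local $/; <$fh> };
close $fh;

my %hash;

# Split in front of every line that starts with "Keyword:".
# The lookahead keeps the keyword as part of the following chunk.
for my $chunk (split /^(?=(?:$alt):)/m, $body) {
    # Limit of 2: only the first colon separates key from value.
    my ($key, $value) = split /:[ \t]*/, $chunk, 2;
    $value =~ s/\s+\z//;    # drop the trailing newline
    $hash{$key} = $value;
}

print "$_ => $hash{$_}\n" for sort keys %hash;
```

With the whole record in one string there is no need to remember the previous line or to join continuation lines by hand.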

    Paul Lalli
    Paul Lalli, Sep 26, 2006
    #10
  11. Sandman

    Sandman Guest

    In article <>,
    -berlin.de wrote:

    > > > I would use the substitution operator s/// to repeatedly suck off
    > > > keyword and value segments and place them in a hash. The /e option to
    > > > s/// allows you to execute complicated expressions, and that's what I would
    > > > use here.
    > > >
    > > > Try it yourself.

    > >
    > > Yeah, that's pretty much how I've been doing it. I just thought that
    > > there was a more modular approach. I'll try some more. Thanks :)

    >
    > You've said that twice now. "Modular" means consisting of independent
    > components. How does that apply here?


    In programming, "modular" really hasn't got a strict definition. I
    used it to mean that I could add and subtract dependencies in the
    script at will, without having to change the code.

    Plus, I'm from Sweden.


    --
    Sandman[.net]
    Sandman, Sep 26, 2006
    #11
  12. Sandman

    Sandman Guest

    In article <>,
    "Paul Lalli" <> wrote:

    > > If you don't want to help, that's fine. No need to be aggressive.

    >
    > While I was not being aggressive, I rather disagree that there was no
    > need to be. For some reason, you seem completely unwilling to help
    > anyone to help you without multiple prodding.


    Ok, then leave it at that. No problem for me. Thanks anyway.


    --
    Sandman[.net]
    Sandman, Sep 26, 2006
    #12
  13. -berlin.de

    -berlin.de Guest

    Sandman <> wrote in comp.lang.perl.misc:
    > In article <>,
    > -berlin.de wrote:
    >
    > > > > I would use the substitution operator s/// to repeatedly suck off
    > > > > keyword and value segments and place them in a hash. The /e option to
    > > > > s/// allows you to execute complicated expressions, and that's what I would
    > > > > use here.
    > > > >
    > > > > Try it yourself.
    > > >
    > > > Yeah, that's pretty much how I've been doing it. I just thought that
    > > > there was a more modular approach. I'll try some more. Thanks :)

    > >
    > > You've said that twice now. "Modular" means consisting of independent
    > > components. How does that apply here?

    >
    > In programming, "modular" really hasn't got a strict definition.


    If that is what you think then don't use the term. It can only
    add to the confusion.

    > I
    > used it to mean that I could add and subtract dependencies in the
    > script at will, without having to change the code.


    So you expect us to divine what meaning you have assigned to the
    term for the moment? Great attempt at communication!

    > Plus, I'm from Sweden.


    Then don't teach us about English. The term modular has a quite
    well-defined meaning, especially in programming.

    Anno
    -berlin.de, Sep 26, 2006
    #13
  14. Sandman <> wrote:
    > In article <>,
    > "Paul Lalli" <> wrote:
    >
    >> > My indata is a textfile. Sorry.

    >>
    >> That completely fails to answer the question. How are you storing this
    >> data *within your program*.

    >
    > If you don't want to help, that's fine.



    If you don't want to be helped, that's fine too.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Sep 26, 2006
    #14
  15. Sandman

    Sandman Guest

    In article <>,
    Tad McClellan <> wrote:

    > Sandman <> wrote:
    > > In article <>,
    > > "Paul Lalli" <> wrote:
    > >
    > >> > My indata is a textfile. Sorry.
    > >>
    > >> That completely fails to answer the question. How are you storing this
    > >> data *within your program*.

    > >
    > > If you don't want to help, that's fine.

    >
    >
    > If you don't want to be helped, that's fine too.


    Indeed.


    --
    Sandman[.net]
    Sandman, Sep 26, 2006
    #15
  16. Sandman

    Sandman Guest

    In article <>,
    -berlin.de wrote:

    > > In programming, "modular" really hasn't got a strict definition.

    >
    > If that is what you think then don't use the term. It can only
    > add to the confusion.
    >
    > > I
    > > used it to mean that I could add and subtract dependencies in the
    > > script at will, without having to change the code.

    >
    > So you expect us to divine what meaning you have assigned to the
    > term for the moment? Great attempt at communication!


    I keep getting reminded of what a bunch of idiots this group harbors
    when I spend too much time in groups where people help each other out.

    *plonk*


    --
    Sandman[.net]
    Sandman, Sep 26, 2006
    #16
  17. Sandman <> wrote:

    > I keep getting reminded



    An easy way to avoid that would be to stop coming back.


    > of what a bunch of idiots this group harbors



    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Sep 26, 2006
    #17
  18. Sandman

    Sandman Guest

    In article <>,
    Tad McClellan <> wrote:

    > > I keep getting reminded

    >
    > An easy way to avoid that would be to stop coming back.


    Indeed. But I could also get lucky and come upon someone that's
    helpful. Not that big of a chance in this group, I know. But it has
    happened before.



    --
    Sandman[.net]
    Sandman, Sep 27, 2006
    #18
