editing delimited record files

Discussion in 'Perl Misc' started by garth_rockett@yahoo.com, Oct 5, 2005.

  1. Guest

    I am a Perl newbie ... absolute newbie. I need to parse and validate the
    data in a text file containing delimited records. I might need to
    sanitize some of the data and edit and save the file as I read through
    it. For example, removing negative signs from integer fields,
    truncating the decimal point and any digits following it in fields
    which are meant to be integers, and converting all the different
    allowable date formats into one common format.

    I need pointers on an efficient strategy to do this. What tools can I
    look into? I would learn up what you'd suggest ... so puh-lease ...
    help!!


    Cheers,
    Andy
    , Oct 5, 2005
    #1

  2. John W. Krahn

    wrote:
    > I am a Perl newbie ... absolute newbie. I need to parse and validate the
    > data in a text file containing delimited records. I might need to
    > sanitize some of the data and edit and save the file as I read through
    > it. For example, removing negative signs from integer fields,
    > truncating decimal points and digits following it in fields which are
    > meant to be integers, converting all different allowable date formats
    > into one common format.
    >
    > I need pointers on an efficient strategy to do this. What tools can I
    > look into. I would learn up what you'd suggest ... so puh-lease ...
    > help!!



    http://www.manning.com/books/cross


    John
    --
    use Perl;
    program
    fulfillment
    John W. Krahn, Oct 5, 2005
    #2

  3. Paul Lalli Guest

    wrote:
    > I am a Perl newbie ... absolute newbie. I need to parse and validate the
    > data in a text file containing delimited records. I might need to
    > sanitize some of the data and edit and save the file as I read through
    > it. For example, removing negative signs from integer fields,
    > truncating decimal points and digits following it in fields which are
    > meant to be integers, converting all different allowable date formats
    > into one common format.
    >
    > I need pointers on an efficient strategy to do this. What tools can I
    > look into. I would learn up what you'd suggest ... so puh-lease ...
    > help!!


    http://learn.perl.org

    Once you've gotten Perl installed:
    perldoc -f open
    perldoc -f readline
    perldoc perlretut
    perldoc -f print

    Paul Lalli
    Paul Lalli, Oct 5, 2005
    #3
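The four perldoc pointers above (open, readline, regular expressions, print) fit together roughly as in this minimal sketch. The pipe delimiter, the sample records and the helper name copy_records are assumptions made for illustration; the thread never says what the delimiter is.

```perl
use strict;
use warnings;

# Read delimited records from $in, one per line, and write them to $out.
# The pipe delimiter is an assumption; the thread never names one.
sub copy_records {
    my ($in, $out) = @_;
    my $count = 0;
    while (my $line = readline $in) {
        chomp $line;
        # A limit of -1 keeps trailing empty fields that split would drop.
        my @fields = split /\|/, $line, -1;
        # ... validate or edit @fields here ...
        print {$out} join('|', @fields), "\n";
        $count++;
    }
    return $count;
}

# Demo on in-memory filehandles; real code would open named files, e.g.
#   open my $in, '<', 'records.txt' or die "Cannot open records.txt: $!";
my $input  = "1|Andy|2005-10-05\n2||\n";
my $output = '';
open my $in,  '<', \$input  or die "Cannot open input: $!";
open my $out, '>', \$output or die "Cannot open output: $!";
my $n = copy_records($in, $out);
close $out or die "Cannot close output: $!";
print $output;
```

A plain split like this is only adequate when fields can never contain the delimiter or quotes; for quoted CSV a parsing module is the safer choice.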
  4. Matt Garrish Guest

    "Paul Lalli" <> wrote in message
    news:...
    > wrote:
    >> I am a Perl newbie ... absolute newbie. I need to parse and validate the
    >> data in a text file containing delimited records. I might need to
    >> sanitize some of the data and edit and save the file as I read through
    >> it. For example, removing negative signs from integer fields,
    >> truncating decimal points and digits following it in fields which are
    >> meant to be integers, converting all different allowable date formats
    >> into one common format.
    >>
    >> I need pointers on an efficient strategy to do this. What tools can I
    >> look into. I would learn up what you'd suggest ... so puh-lease ...
    >> help!!

    >
    > http://learn.perl.org
    >
    > Once you've gotten Perl installed:
    > perldoc -f open
    > perldoc -f readline
    > perldoc perlretut
    > perldoc -f print
    >


    A module like Text-CSV might be a better option for a newbie than pointing
    them to regular expression parsing.

    Matt
    Matt Garrish, Oct 5, 2005
    #4
  5. Paul Lalli Guest

    Matt Garrish wrote:
    > > wrote:
    > >> I am a Perl newbie ... absolute newbie. I need to parse and validate the
    > >> data in a text file containing delimited records. I might need to
    > >> sanitize some of the data and edit and save the file as I read through
    > >> it. For example, removing negative signs from integer fields,
    > >> truncating decimal points and digits following it in fields which are
    > >> meant to be integers, converting all different allowable date formats
    > >> into one common format.

    > A module like Text-CSV might be a better option for a newbie than pointing
    > them to regular expression parsing.


    Really? Text::CSV would help with removing negative signs, truncating
    decimals, or converting date formats?

    Paul Lalli
    Paul Lalli, Oct 5, 2005
    #5
  6. Tad McClellan

    Matt Garrish <> wrote:
    >
    > "Paul Lalli" <> wrote in message
    > news:...
    >> wrote:
    >>> I am a Perl newbie ... absolute newbie. I need to parse and validate the
    >>> data in a text file containing delimited records. I might need to
    >>> sanitize some of the data and edit and save the file as I read through
    >>> it. For example, removing negative signs from integer fields,
    >>> truncating decimal points and digits following it in fields which are
    >>> meant to be integers, converting all different allowable date formats
    >>> into one common format.
    >>>
    >>> I need pointers on an efficient strategy to do this. What tools can I
    >>> look into. I would learn up what you'd suggest ... so puh-lease ...
    >>> help!!

    >>
    >> http://learn.perl.org
    >>
    >> Once you've gotten Perl installed:
    >> perldoc -f open
    >> perldoc -f readline
    >> perldoc perlretut
    >> perldoc -f print
    >>

    >
    > A module like Text-CSV might be a better option for a newbie than pointing
    > them to regular expression parsing.



    Since Text::CSV won't help with the validate/sanitize requirements,
    it is entirely possible/likely that regexes will be needed too.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Oct 5, 2005
    #6
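As a rough illustration of the regex side of the validate/sanitize requirement, here is a sketch of two field cleaners. The function names and the set of accepted date formats (YYYY-MM-DD, YYYY/MM/DD, and MM/DD/YYYY read in US order) are assumptions; the original poster never listed the allowable formats.

```perl
use strict;
use warnings;

# Strip a leading minus sign, then drop the decimal point and any digits
# after it. Returns undef if what remains is not a plain run of digits.
sub sanitize_int {
    my ($value) = @_;
    $value =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
    $value =~ s/^-//;            # remove the negative sign
    $value =~ s/\.\d*$//;        # truncate the fractional part
    return $value =~ /^\d+$/ ? $value : undef;
}

# Convert the assumed input formats to YYYY-MM-DD.
# Returns an undefined value for anything unrecognised.
sub normalize_date {
    my ($value) = @_;
    my ($y, $m, $d);
    if ($value =~ m{^(\d{4})[-/](\d{1,2})[-/](\d{1,2})$}) {
        ($y, $m, $d) = ($1, $2, $3);
    }
    elsif ($value =~ m{^(\d{1,2})/(\d{1,2})/(\d{4})$}) {
        ($m, $d, $y) = ($1, $2, $3);    # assumed US month/day order
    }
    else {
        return;
    }
    # Coarse range check only; it does not know month lengths.
    return if $m < 1 || $m > 12 || $d < 1 || $d > 31;
    return sprintf '%04d-%02d-%02d', $y, $m, $d;
}

print sanitize_int('-12.70'), "\n";
print normalize_date('10/5/2005'), "\n";
```

Returning an undefined value for bad input lets the caller decide whether to reject the record or log it and carry on.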
  7. Matt Garrish Guest

    "Paul Lalli" <> wrote in message
    news:...
    > Matt Garrish wrote:
    >> > wrote:
    >> >> I am a Perl newbie ... absolute newbie. I need to parse and validate the
    >> >> data in a text file containing delimited records. I might need to
    >> >> sanitize some of the data and edit and save the file as I read through
    >> >> it. For example, removing negative signs from integer fields,
    >> >> truncating decimal points and digits following it in fields which are
    >> >> meant to be integers, converting all different allowable date formats
    >> >> into one common format.

    >> A module like Text-CSV might be a better option for a newbie than
    >> pointing
    >> them to regular expression parsing.

    >
    > Really? Text::CSV would help with removing negative signs, truncating
    > decimals, or converting date formats?
    >


    How did you get from parsing the file into chunks to performing data
    validation? I could equally well ask you how removing negative signs,
    truncating decimals and converting dates would help break up the file, but
    that sort of illogic won't get us too far, will it?

    Matt
    Matt Garrish, Oct 6, 2005
    #7
  8. Paul Lalli Guest

    Matt Garrish wrote:
    > "Paul Lalli" <> wrote in message
    > news:...
    > > Matt Garrish wrote:
    > >> > wrote:
    > >> >> I am a Perl newbie ... absolute newbie. I need to parse and validate the
    > >> >> data in a text file containing delimited records. I might need to
    > >> >> sanitize some of the data and edit and save the file as I read through
    > >> >> it. For example, removing negative signs from integer fields,
    > >> >> truncating decimal points and digits following it in fields which are
    > >> >> meant to be integers, converting all different allowable date formats
    > >> >> into one common format.
    > >> A module like Text-CSV might be a better option for a newbie than
    > >> pointing
    > >> them to regular expression parsing.

    > >
    > > Really? Text::CSV would help with removing negative signs, truncating
    > > decimals, or converting date formats?
    > >

    >
    > How did you get from parsing the file into chunks into performing data
    > validation? I could equally well ask you how removing negative signs,
    > truncating decimals and converting dates would help break up the file, but
    > that sort of illogic won't get us too far will it?


    ... at least one of us is either confused or stubborn to the point of
    absurdity[1]. The OP asked for help doing two things: parsing the
    file, and validating/"sanitizing" the resulting fields, listing a few
    examples of the type of validation/sanitization he wants to accomplish.
    My response included perldocs about regular expressions, intended to
    help with the second part of the goal. You responded that I should
    have recommended Text::CSV *instead of* regular expressions. I
    questioned how Text::CSV would be able to fulfill this second goal.

    End result - the OP should look at both. Text::CSV could be used for
    parsing the data into fields, and regular expressions for validating
    the resulting fields.

    Paul Lalli

    [1] And I make no assumption that that one isn't me.
    Paul Lalli, Oct 6, 2005
    #8
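The division of labour settled on above (Text::CSV for splitting, regular expressions for cleaning) might look something like the following sketch. It assumes comma-separated input and that Text::CSV has been installed from CPAN, since it is not a core module; clean_line and the sample record are made up for illustration.

```perl
use strict;
use warnings;
use Text::CSV;    # CPAN module, not part of core Perl

my $csv = Text::CSV->new({ binary => 1 })
    or die "Cannot use Text::CSV: " . Text::CSV->error_diag;

# Parse one CSV line, clean the integer field at index $int_col with
# regexes, and return the rebuilt line.
sub clean_line {
    my ($line, $int_col) = @_;
    $csv->parse($line)
        or die "Cannot parse line: " . $csv->error_diag;
    my @fields = $csv->fields;
    $fields[$int_col] =~ s/^-//;        # remove the negative sign
    $fields[$int_col] =~ s/\.\d*$//;    # truncate the fractional part
    $csv->combine(@fields)
        or die "Cannot combine fields: " . $csv->error_diag;
    return $csv->string;
}

# The first field contains a comma inside quotes, which a bare split
# on "," would break apart.
print clean_line('"Smith, Pat",-42.9,2005-10-05', 1), "\n";
```

The module handles quoting and embedded delimiters in both directions, so the regexes only ever see one field at a time.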
  9. Guest


    > My response included perldocs about regular expressions, intended to
    > help with the second part of the goal. You responded that I should
    > have recommended Text::CSV *instead of* regular expressions. I
    > questioned how Text::CSV would be able to fulfill this second goal.


    I am comfortable with sed and grep style regular expressions, so I
    believe Perl regexps will not be such a giant leap-of-faith. I might be
    wrong but I am looking forward to using the regexps.

    >
    > End result - the OP should look at both. Text::CSV could be used for
    > parsing the data into fields, and regular expressions for validating
    > the resulting fields.


    I also saw the $^I variable, which, when set to a string, can be
    used for in-place editing of text files, something which is actually at
    the heart of the problem. I understand that this might mean more I/O, but
    is there a better way to do it? My first impression was that a fair
    amount of economy of code can be achieved using $^I for in-place
    editing. Being a Perl newbie, I would like to write as much Perl as I
    can, but as little of it for Production code as I can.

    > Paul Lalli
    >
    > [1] And I make no assumption that that one isn't me.


    Thank you.

    Cheers,
    Andy
    , Oct 6, 2005
    #9
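For reference, in-place editing with $^I could be sketched as below. The sample file, the pipe delimiter, the choice of the second column as the integer field, and the helper name edit_in_place are all assumptions made so the example can run on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small sample file so the sketch is self-contained.
my ($tmp, $path) = tempfile();
print {$tmp} "1|-42.9\n2|17\n";
close $tmp or die "Cannot close $path: $!";

# Rewrite $file in place, keeping the original as "$file.bak".
sub edit_in_place {
    my ($file) = @_;
    # With $^I set, <> reads each file named in @ARGV and a bare print
    # writes to the replacement file instead of the terminal.
    local ($^I, @ARGV) = ('.bak', $file);
    while (my $line = <>) {
        chomp $line;
        my @fields = split /\|/, $line, -1;
        if (@fields > 1) {
            $fields[1] =~ s/^-//;        # remove the negative sign
            $fields[1] =~ s/\.\d*$//;    # truncate the fractional part
        }
        print join('|', @fields), "\n";
    }
    return;
}

edit_in_place($path);

# Read the edited file back to show the result, then clean up.
open my $fh, '<', $path or die "Cannot open $path: $!";
my $result = do { local $/; <$fh> };
close $fh or die "Cannot close $path: $!";
print $result;
unlink $path, "$path.bak";
```

Under the hood Perl still reads the old file and writes a new one, so the I/O is about the same as doing it by hand; what $^I saves is the open/rename bookkeeping. Setting it to a non-empty string such as '.bak' keeps a backup in case a substitution goes wrong.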
  10. Matt Garrish Guest

    "Paul Lalli" <> wrote in message
    news:...
    >
    > Matt Garrish wrote:
    >> "Paul Lalli" <> wrote in message
    >> news:...
    >> > Matt Garrish wrote:
    >> > Really? Text::CSV would help with removing negative signs, truncating
    >> > decimals, or converting date formats?
    >> >

    >>
    >> How did you get from parsing the file into chunks into performing data
    >> validation? I could equally well ask you how removing negative signs,
    >> truncating decimals and converting dates would help break up the file,
    >> but
    >> that sort of illogic won't get us too far will it?

    >
    > ... at least one of us is either confused or stubborn to the point of
    > absurdity[1]. The OP asked for help doing two things: parsing the
    > file, and validating/"sanitizing" the resulting fields, listing a few
    > examples of the type of validation/sanitization he wants to accomplish.
    > My response included perldocs about regular expressions, intended to
    > help with the second part of the goal. You responded that I should
    > have recommended Text::CSV *instead of* regular expressions. I
    > questioned how Text::CSV would be able to fulfill this second goal.
    >


    I read your first post and the implication - which I didn't think was
    intended but was there nonetheless - was that you were suggesting regular
    expressions as the means of parsing and validating (i.e., open, read line,
    regular expressions, print, with no mention of parsing). I only meant to
    point out that there exist better means of splitting a delimited file; I did
    not mean to imply that you were wrong about regular expressions as a means
    of validation if that's how you took it. Let's just assume our wires got
    crossed somewhere and leave it at that...

    Matt
    Matt Garrish, Oct 7, 2005
    #10
