Best way to extract numeric values from a report?

Discussion in 'Perl Misc' started by bobmct, May 7, 2009.

  1. bobmct

    bobmct Guest

    I'm reasonably familiar with perl as I have been programming with it
    for a number of years.

    I am trying to extract information from a text based report file and
    create a file of data records for use as input to some other perl
    programs.

    The report file in question has about 13 columns of financial amounts
    such as -1,234,567.89 and I wish to extract each column and output to
    a record in a signed, efficient format for later use by these other
    perl programs.

    I am currently using unpack and pack but I am not having luck with the
    floating point values.

    What are some suggestions as to how to extract and export these
    values? Any good suggestions/recommendations/examples greatly
    appreciated.

    signed: just another perl monger
     
    bobmct, May 7, 2009
    #1
    1. Advertising

  2. bobmct

    J. Gleixner Guest

    bobmct wrote:
    > I'm reasonably familiar with perl as I have been programming with it
    > for a number of years.
    >
    > I am trying to extract information from a text based report file and
    > create a file of data records for use as input to some other perl
    > programs.
    >
    > The report file in question has about 13 columns of financial amounts
    > such as -1,234,567.89 and I wish to extract each column and output to
    > a record in a signed, efficient format for later use by these other
    > perl programs.


    What separates the columns?

    Use whatever that is, with split, and you should be on your way.

    >
    > I am currently using unpack and pack but I am not having luck with the
    > floating point values.


    Show us what you've tried.

    >
    > What are some suggestions as to how to extract and export these
    > values? Any good suggestions/recommendations/examples greatly
    > appreciated.



    Treating it as a string, using a regular expression, would
    be much easier than pack, IMHO.
     
    J. Gleixner, May 7, 2009
    #2
    1. Advertising

  3. bobmct <> wrote:
    >I am trying to extract information from a text based report file and
    >create a file of data records for use as input to some other perl
    >programs.
    >
    >The report file in question has about 13 columns of financial amounts
    >such as -1,234,567.89 and I wish to extract each column and output to
    >a record in a signed, efficient format for later use by these other
    >perl programs.


    How are the columns identified/separated? By fixed position within the
    line (i.e. characters 12-18 are always column 4) or by a known
    separator, e.g. a comma, a semicolon, or a space character?

    If the former then unpack is the tool of choice, if the latter then
    split() will work nicely.
    And if it's a CSV-format then one of the CSV-modules will work best,
    because they will deal with all the oddities like quoted values or
    escaped characters automatically.

    jue
     
    Jürgen Exner, May 8, 2009
    #3
  4. bobmct

    ccc31807 Guest

    On May 7, 5:48 pm, bobmct <> wrote:
    > I am trying to extract information from a text based report file and
    > create a file of data records for use as input to some other perl
    > programs.


    First, I assume that you are dealing with some kind of delimited
    format. The algorithm is easy -- here is some pseudo-pseudo code:

    open INFILE, '<', 'text_based_report_file.txt';
    while(<INFILE>)
    {
    chomp;
    my ($c1, $c2, etc. all the way to $c13) = split /delimiter/, $_;
    #you can now access your values like $c1, $c2, ... $c13)
    }
    close INFILE;

    > The report file in question has about 13 columns of financial amounts
    > such as -1,234,567.89 and I wish to extract each column and output to
    > a record in a signed, efficient format for later use by these other
    > perl programs.


    It's not clear to me exactly how you want your output, but assuming
    that it's a signed float and all you need to do is delete the commas,
    this will do it:
    $c1 =~ s/,//g;

    > What are some suggestions as to how to extract and export these
    > values?  Any good suggestions/recommendations/examples greatly
    > appreciated.


    How do you want to export the values? Seems to me that if you input
    them in a delimited file your work is done, that you don't need to
    export the values. If you need to 'export' the values into some data
    structure in memory, use an array of array references. You can then
    access each individual value like this:
    $arr_ref->[0][0] #gets the first column in the first row
    $arr_ref->[1][12] # gets the last column in the second row

    CC
     
    ccc31807, May 8, 2009
    #4
  5. On 2009-05-08 00:23, ccc31807 <> wrote:
    > On May 7, 5:48 pm, bobmct <> wrote:
    >> The report file in question has about 13 columns of financial amounts
    >> such as -1,234,567.89 and I wish to extract each column and output to
    >> a record in a signed, efficient format for later use by these other
    >> perl programs.

    >
    > It's not clear to me exactly how you want your output, but assuming
    > that it's a signed float and all you need to do is delete the commas,
    > this will do it:
    > $c1 =~ s/,//g;


    For monetary values, you often want to store the amount in the smallest
    denomination. So -1,234,567.89 should be stored as -123456789 cents, not
    as -1234567.89 Dollars. The reason for this is that decimal fractions
    cannot be represented exactly in binary, so the amount is rounded to the
    nearest multiple of a power of two (which happens to be
    -1234567.88999999989755451679229736328125 in this case).

    If the amounts are always given with two digits after the decimal point,
    you can just strip the decimal point, too:

    $c1 =~ s/[,.]//g;
    or
    $c1 =~ tr/.,//d;

    If the number of digits after the decimal point varies, you need to
    normalize them first. Be careful!

    hp
     
    Peter J. Holzer, May 8, 2009
    #5
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Replies:
    5
    Views:
    943
    X-Centric
    Jun 30, 2005
  2. darrel
    Replies:
    4
    Views:
    825
    darrel
    Jul 19, 2007
  3. jobs

    int to numeric numeric(18,2) ?

    jobs, Jul 21, 2007, in forum: ASP .Net
    Replies:
    2
    Views:
    968
    =?ISO-8859-1?Q?G=F6ran_Andersson?=
    Jul 22, 2007
  4. Vishal G

    Extract Numeric values from string

    Vishal G, Sep 11, 2008, in forum: Perl Misc
    Replies:
    14
    Views:
    408
    Jürgen Exner
    Sep 12, 2008
  5. Replies:
    6
    Views:
    167
    Luca Cerone
    Sep 28, 2013
Loading...

Share This Page