Text::CSV problem

Discussion in 'Perl Misc' started by Natxo Asenjo, Oct 15, 2008.

  1. Natxo Asenjo

    Natxo Asenjo Guest

    hi,

    I need to check the status of some scheduled tasks on a Windows server. At
    my $JOB we use Nagios, so I thought, let's write a plugin (I could not
    find anything at the Nagios Exchange).

    Windows 2k3 has a command, schtasks. I can dump the status of everything
    like this:

    schtasks /query /fo csv /v > file.csv

    The /fo switch sets the output format and the /v switch makes it verbose. This is
    the only way to know if the task has run or not.

    The output file looks like this (output truncated):

    (1st line)
    "HostName","TaskName","Next Run Time","Status","Logon
    Mode","Last Run Time","Last Result","Creator","Schedule","Task
    To Run","Start In","Comment","Scheduled Task State","Scheduled
    Type","Start Time","Start Date","End Date","Days","Months","Run
    As User","Delete Task If Not Rescheduled","Stop Task
    If Runs X Hours and X Mins","Repeat: Every","Repeat:
    Until: Time","Repeat: Until: Duration","Repeat:
    Stop If Still Running","Idle Time","Power Management"

    (2nd line)
    "server","jobname","09:00:00, 16-10-2008","","Interactive
    only","09:00:00, 07-10-2008","0","user","At 09:00
    every day, starting 18-01-2008","C:\Program Files\SQLyog
    Enterprise\sja.exe "afdgroep_progbeh.xml" -l"C:\Documents and
    Settings\user\Application Data\SQLyog\sja.log"
    -s"C:\Documents and Settings\user\Application
    Data\SQLyog\sjasession.xml"","N/A","N/A","Enabled","Daily
    ","09:00:00","18-01-2008","N/A","Everyday","N/A","domain\administrator","Disabled","Disabled","Disabled","Disabled","Disabled","Disabled","Disabled","Disabled

    (no, I did not write this scheduled job)

    Using Text::CSV I can parse the first line, but it fails on the
    second:

    #!perl
    use warnings;
    use strict;
    use Text::CSV;

    my $csv_file = "c:/tmp/dump.csv";

    open (CSV, "<", $csv_file) or die "$!\n" ;

    my $csv_object = Text::CSV->new();

    while (<CSV>) {
        if ($csv_object->parse($_)) {
            my @columns = $csv_object->fields();
            print "@columns\n" ;
        }
        else {
            my $error = $csv_object->error_diag();
            print "oeps: $error\n";
        }
    }

    again sorry, all truncated (very long lines)
    C:\tmp>test.pl
    HostName TaskName Next Run Time Status Logon Mode Last Run Time Last
    Result Crea
    tor Schedule Task To Run Start In Comment Scheduled Task State Scheduled
    Type St
    art Time Start Date End Date Days Months Run As User Delete Task If Not
    Reschedu
    led Stop Task If Runs X Hours and X Mins Repeat: Every Repeat: Until:
    Time Repea
    t: Until: Duration Repeat: Stop If Still Running Idle Time Power
    Management
    oeps:
    oeps:
    oeps:
    oeps:
    oeps:
    oeps:
    oeps:
    oeps:
    oeps:
    oeps:
    oeps:
    oeps:
    oeps:
    oeps:
    oeps:
    oeps:
    oeps:

    I think it has to do with the long paths in the task to run field,
    because when I try the same code at another machine with a 'normal'
    (shorter) path to run, I get the desired output.

    TIA
    --
    Groeten,
    J.I.Asenjo
    Natxo Asenjo, Oct 15, 2008
    #1

  2. Natxo Asenjo wrote:
    > hi,
    >


    Try this:

    #!/usr/bin/perl
    use warnings;
    use strict;

    my $csv_file = "c:/tmp/dump.csv";
    open (CSV, "<", $csv_file) or die "$!\n" ;
    while (my $line = <CSV>)
    {
        chomp $line; # remove \n
        next unless($line); # skip blank lines
        my @columns = split(',', $line);
        print join(" ", @columns), "\n";
    }
    close CSV;
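
    A caveat on splitting at every comma, as a minimal sketch: the schtasks
    output quoted in this thread has commas inside quoted fields (for example
    "09:00:00, 16-10-2008"), so a plain split cuts those fields in two. The
    sample line below is shortened from the thread's data and joined onto one
    line, which is an assumption about how the real file looks. The
    quote-aware regex is only an illustration for fields without embedded
    quotes, not a general CSV parser.

```perl
use strict;
use warnings;

# One record shaped like the schtasks output quoted in this thread
# (shortened; assumed to sit on a single line in the real file).
# The third field contains a comma inside its quotes.
my $line = '"server","jobname","09:00:00, 16-10-2008","","Interactive only"';

# Naive approach: split on every comma, quoted or not.
my @naive = split /,/, $line;

# Quote-aware approach for this simple case (no embedded quotes):
# capture the text between each pair of double quotes.
my @quoted = $line =~ /"([^"]*)"/g;

printf "naive split: %d fields\n", scalar @naive;
printf "quote-aware: %d fields\n", scalar @quoted;
print  "third field: $quoted[2]\n";
```

    The naive split also leaves the surrounding double quotes on every value,
    which is one more reason to prefer a real CSV parser where the data allows it.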

    --
    Petr Vileta, Czech republic
    (My server rejects all messages from Yahoo and Hotmail.
    Send me your mail from another non-spammer site please.)
    Please reply to <petr AT practisoft DOT cz>
    Petr Vileta \(fidokomik\), Oct 15, 2008
    #2

  3. Natxo Asenjo <> wrote in
    news:48f5e447$0$183$4all.nl:

    > (2nd line)
    > "server","jobname","09:00:00, 16-10-2008","","Interactive
    > only","09:00:00, 07-10-2008","0","user","At 09:00
    > every day, starting 18-01-2008","C:\Program Files\SQLyog
    > Enterprise\sja.exe "afdgroep_progbeh.xml" -l"C:\Documents and
    > Settings\user\Application Data\SQLyog\sja.log"
    > -s"C:\Documents and Settings\user\Application
    > Data\SQLyog\sjasession.xml"","N/A","N/A","Enabled","Daily
    > ","09:00:00","18-01-2008","N/A","Everyday","N/A",


    ....

    > I think it has to do with the long paths in the task to run field,
    > because when I try the same code at another machine with a 'normal'
    > (shorter) path to run, I get the desired output.


    Do adopt the habit of reading the documentation for the module(s) you
    are using:

    http://search.cpan.org/~makamaka/Text-CSV-1.09/lib/Text/CSV.pm

    <blockquote>
    allow_loose_quotes

    By default, parsing fields that have quote_char characters inside an
    unquoted field, like

    1,foo "bar" baz,42

    would result in a parse error. Though it is still bad practice to
    allow this format, we cannot help there are some vendors that make their
    applications spit out lines styled like this.

    In case there is really bad CSV data, like

    1,"foo "bar" baz",42

    or

    1,""foo bar baz"",42

    there is a way to get that parsed, and leave the quotes inside the
    quoted field as-is. This can be achieved by setting allow_loose_quotes
    AND making sure that the escape_char is not equal to quote_char.
    </blockquote>
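
    As a minimal sketch of what the quoted documentation describes, the
    snippet below sets allow_loose_quotes and picks an escape_char that
    differs from the quote character, then parses the "really bad CSV" line
    from the excerpt. The choice of backslash as escape_char is an assumption
    made for illustration only; it may misfire on data that contains
    backslashes, such as the Windows paths in the original post, so another
    character might be needed there.

```perl
use strict;
use warnings;
use Text::CSV;

# escape_char must differ from quote_char (the default is '"') for loose
# quotes inside a quoted field to be accepted. Backslash is an arbitrary,
# illustrative choice.
my $csv = Text::CSV->new({
    binary             => 1,
    allow_loose_quotes => 1,
    escape_char        => '\\',
}) or die "Cannot create parser: " . Text::CSV->error_diag();

# The "really bad CSV" line from the documentation excerpt.
my $line = '1,"foo "bar" baz",42';

my @fields;
if ($csv->parse($line)) {
    @fields = $csv->fields();
    print "field $_: $fields[$_]\n" for 0 .. $#fields;
}
else {
    # In list context error_diag returns the error code and message first.
    my ($code, $message) = $csv->error_diag();
    print "parse failed: $code $message\n";
}
```

    Calling error_diag in list context, as in the else branch, gives a numeric
    code and a message, which is more informative than printing it as a bare string.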

    Sinan

    --
    A. Sinan Unur <>
    (remove .invalid and reverse each component for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://www.rehabitation.com/clpmisc/
    A. Sinan Unur, Oct 15, 2008
    #3
  4. Natxo Asenjo

    Guest

    On 15 Oct 2008 12:38:31 GMT, Natxo Asenjo <> wrote:

    >I think it has to do with the long paths in the task to run field,
    >because when I try the same code at another machine with a 'normal'
    >(shorter) path to run, I get the desired output.


    You have a shit csv generator !!!
    Your csv generator did not escape the literal double quotes inside one of the 2nd record's fields.
    That field is 'Task To Run'. In addition, you are missing field 'Start In'.

    This is an indication of corruption or bad generator code.
    IMO, because your records span lines, there is only one end-of-record (EOR) test:
    the intersection of an even number of double quotes and an end of line (eol).
    Maybe there is a pattern discernible by CSV parsers.
    If there is, the easiest test is to try to bring it into Excel.
    I don't see one however.

    The flaw is that this:

    "server","jobname","09:00:00, 16-10-2008","","Interactive
    only","09:00:00, 07-10-2008","0","user","At 09:00
    every day, starting 18-01-2008","C:\Program Files\SQLyog

    contains the intersection of an even number of "'s with eol,
    but is not EOR.

    The flaw is really in your csv generator. I kept the even-number-of-quotes
    condition and added a '" at eol' condition to counteract
    your shit csv generator. This, even though it will work in all
    valid cases, should NOT be the job of the parser; it should be the
    CSV generator's job! I don't like adding this condition at all.
    It's not a pragmatic approach.
    (see below)

    sln

    #############
    # Csv4 Regex
    #############

    use strict;
    use warnings;

    my $fname = 'c:\temp\junk.csv';
    # Parenthesized three-argument open, so that "or die" really fires on failure
    open(my $csv_fh, '<', $fname) or die "can't open $fname: $!";

    my ($row, $tmp) = ('', '');
    my ($parsing, $records, $quotes) = (1, 1, 0);

    while ($parsing)
    {
        ## Buffer until a full row
        ## -------------------------
        if (!defined($_ = <$csv_fh>)) {
            $parsing = 0; # eof, parse what's left
        } else {
            $tmp = $_;

            ## this block will trim newlines ---
            $tmp =~ s/\s+$//s;
            next if (!length($tmp));
            $row .= " $tmp";
            ## ---

            ## this block will keep newlines ---
            # $row .= $tmp;
            ## ---

            $quotes += $tmp =~ tr/"//;

            if (!($quotes % 2 == 0 && # Even number of double quotes?
                  $tmp =~ /"$/))      # " at eol? <-- WHAT A SHIT CSV GENERATOR !!!!!!!
            {   # Row is not complete yet, keep buffering ...
                next;
            }
        }

        print " (".$records++.") ----------\n";

        ## Parse the row
        ## -------------------
        # Each field is "..." followed by a comma or by the end of the row
        while ($row =~ /\s*"\s*(.*?)\s*"\s*,|\s*"\s*(.*?)\s*"\s*$/gs) # span lines
        {
            my $val = $1;
            $val = $2 unless (defined $val);
            # clean up double quotes
            $val =~ s/""/"/g;
            print "val = $val\n";
        }
        $row = '';
        $quotes = 0;
    }
    close $csv_fh;

    __END__

    Output:


    C:\temp>perl csv4.pl
    (1) ----------
    val = HostName
    val = TaskName
    val = Next Run Time
    val = Status
    val = Logon Mode
    val = Last Run Time
    val = Last Result
    val = Creator
    val = Schedule
    val = Task To Run
    val = Start In
    val = Comment
    val = Scheduled Task State
    val = Scheduled Type
    val = Start Time
    val = Start Date
    val = End Date
    val = Days
    val = Months
    val = Run As User
    val = Delete Task If Not Rescheduled
    val = Stop Task If Runs X Hours and X Mins
    val = Repeat: Every
    val = Repeat: Until: Time
    val = Repeat: Until: Duration
    val = Repeat: Stop If Still Running
    val = Idle Time
    val = Power Management
    (2) ----------
    val = server
    val = jobname
    val = 09:00:00, 16-10-2008
    val =
    val = Interactive only
    val = 09:00:00, 07-10-2008
    val = 0
    val = user
    val = At 09:00 every day, starting 18-01-2008
    val = C:\Program Files\SQLyog Enterprise\sja.exe "afdgroep_progbeh.xml" -l"C:\Do
    cuments and Settings\user\Application Data\SQLyog\sja.log" -s"C:\Documents and S
    ettings\user\Application Data\SQLyog\sjasession.xml"
    val = N/A
    val = N/A
    val = Enabled
    val = Daily
    val = 09:00:00
    val = 18-01-2008
    val = N/A
    val = Everyday
    val = N/A
    val = domain\administrator
    val = Disabled
    val = Disabled
    val = Disabled
    val = Disabled
    val = Disabled
    val = Disabled
    val = Disabled

    C:\temp>

    -----------------------

    "server","jobname","09:00:00, 16-10-2008","","Interactive
    only","09:00:00, 07-10-2008","0","user","At 09:00
    every day, starting 18-01-2008","C:\Program Files\SQLyog

    ^^^^^^^^^^ this is the main culprit, even number of " with eol (but NOT EOR)

    "C:\Program Files\SQLyog Enterprise\sja.exe "afdgroep_progbeh.xml" -l"C:\Documents and Settings\user\Application Data\SQLyog\sja.log" -s"C:\Documents and Settings\user\Application Data\SQLyog\sjasession.xml"",

    ^^^^^^^^^^^ this is the other culprit, double quotes not escaped

    "N/A",
    "N/A",
    "Enabled",
    "Daily",
    "09:00:00",
    "18-01-2008",
    "N/A",
    "Everyday",
    "N/A",
    "domain\administrator",
    "Disabled",
    "Disabled",
    "Disabled",
    "Disabled",
    "Disabled",
    "Disabled",
    "Disabled",
    "Disabled
    , Oct 15, 2008
    #4
  5. <> wrote:


    > $tmp =~ s/\s+$//s;
                      ^
                      ^

    What is the point of the s modifier here?


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
    Tad J McClellan, Oct 16, 2008
    #5
  6. Natxo Asenjo

    Guest

    On Wed, 15 Oct 2008 20:22:15 -0500, Tad J McClellan <> wrote:

    > <> wrote:
    >
    >
    >> $tmp =~ s/\s+$//s;
    >                  ^
    >                  ^
    >
    >What is the point of the s modifier here?


    It is a typo. \s includes newline. You know that
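
    To make the point about the modifier concrete, here is a small sketch
    (the sample strings are made up for illustration). The /s modifier only
    changes what "." matches; it has no effect on \s, which already matches
    newlines, so the substitution behaves the same with or without it.

```perl
use strict;
use warnings;

# Two copies of the same line, ending in a Windows-style line terminator.
my $with_s    = "some,row\r\n";
my $without_s = "some,row\r\n";

$with_s    =~ s/\s+$//s;   # the form used in the script earlier in the thread
$without_s =~ s/\s+$//;    # same pattern, no modifier

print "identical results\n" if $with_s eq $without_s;

# Where /s does matter: whether "." may match a newline.
my $two_lines = "a\nb";
my $dot_plain = $two_lines =~ /a.b/  ? 1 : 0;   # "." does not match "\n"
my $dot_s     = $two_lines =~ /a.b/s ? 1 : 0;   # "." matches "\n" under /s
print "plain: $dot_plain, with /s: $dot_s\n";
```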
    , Oct 16, 2008
    #6
