Text::CSV problem


Natxo Asenjo

hi,

I need to check the status of some scheduled tasks on a Windows server. At
my $JOB we use Nagios, so I thought, let's write a plugin (I could not
find anything at the nagiosexchange).

windows 2k3 has a command schtasks. I can dump the status of everything
like this:

schtasks /query /fo csv /v > file.csv

the /fo switch is for the format and /v switch makes it verbose. this is
the only way to know if the task has run or not.

The output file looks like this (output truncated):

(1st line)
"HostName","TaskName","Next Run Time","Status","Logon
Mode","Last Run Time","Last Result","Creator","Schedule","Task
To Run","Start In","Comment","Scheduled Task State","Scheduled
Type","Start Time","Start Date","End Date","Days","Months","Run
As User","Delete Task If Not Rescheduled","Stop Task
If Runs X Hours and X Mins","Repeat: Every","Repeat:
Until: Time","Repeat: Until: Duration","Repeat:
Stop If Still Running","Idle Time","Power Management"

(2nd line)
"server","jobname","09:00:00, 16-10-2008","","Interactive
only","09:00:00, 07-10-2008","0","user","At 09:00
every day, starting 18-01-2008","C:\Program Files\SQLyog
Enterprise\sja.exe "afdgroep_progbeh.xml" -l"C:\Documents and
Settings\user\Application Data\SQLyog\sja.log"
-s"C:\Documents and Settings\user\Application
Data\SQLyog\sjasession.xml"","N/A","N/A","Enabled","Daily
","09:00:00","18-01-2008","N/A","Everyday","N/A","domain\administrator","Disabled","Disabled","Disabled","Disabled","Disabled","Disabled","Disabled","Disabled

(no, I did not write this scheduled job)

Using Text::CSV I can parse the first line, but it fails on the
second:

#!perl
use warnings;
use strict;
use Text::CSV;

my $csv_file = "c:/tmp/dump.csv";

open (CSV, "<", $csv_file) or die "$!\n" ;

my $csv_object = Text::CSV->new();

while (<CSV>) {
    if ($csv_object->parse($_)) {
        my @columns = $csv_object->fields();
        print "@columns\n" ;
    }
    else {
        my $error = $csv_object->error_diag();
        print "oeps: $error\n";
    }
}

again sorry, all truncated (very long lines)
C:\tmp>test.pl
HostName TaskName Next Run Time Status Logon Mode Last Run Time Last
Result Crea
tor Schedule Task To Run Start In Comment Scheduled Task State Scheduled
Type St
art Time Start Date End Date Days Months Run As User Delete Task If Not
Reschedu
led Stop Task If Runs X Hours and X Mins Repeat: Every Repeat: Until:
Time Repea
t: Until: Duration Repeat: Stop If Still Running Idle Time Power
Management
oeps:
oeps:
oeps:
oeps:
oeps:
oeps:
oeps:
oeps:
oeps:
oeps:
oeps:
oeps:
oeps:
oeps:
oeps:
oeps:
oeps:

I think it has to do with the long paths in the task to run field,
because when I try the same code at another machine with a 'normal'
(shorter) path to run, I get the desired output.

TIA
 

Petr Vileta (fidokomik)

Natxo said:

Try this:

#!/usr/bin/perl
use warnings;
use strict;

my $csv_file = "c:/tmp/dump.csv";
open (CSV, "<", $csv_file) or die "$!\n" ;
while (my $line = <CSV>)
{
    chomp $line; # remove \n
    next unless($line); # skip blank lines
    my @columns = split(',', $line);
    print join(" ", @columns), "\n";
}
close CSV;
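
One caveat with a plain split on commas: it also cuts inside quoted fields that contain a comma themselves, such as the "Next Run Time" value in the dump above. A minimal sketch of that effect, using only the first four fields of the second record as sample input:

```perl
use strict;
use warnings;

# Sample: the first four fields of the second record from the original post.
my $line = '"server","jobname","09:00:00, 16-10-2008",""';

# split() knows nothing about quoting, so the comma inside the
# quoted date field is treated as a separator as well.
my @pieces = split(',', $line);

print scalar(@pieces), " pieces\n";
print "[$_]\n" for @pieces;
```

The four quoted fields come back as five pieces, and the surrounding double quotes are still attached to each piece.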
 

A. Sinan Unur

(2nd line)
"server","jobname","09:00:00, 16-10-2008","","Interactive
only","09:00:00, 07-10-2008","0","user","At 09:00
every day, starting 18-01-2008","C:\Program Files\SQLyog
Enterprise\sja.exe "afdgroep_progbeh.xml" -l"C:\Documents and
Settings\user\Application Data\SQLyog\sja.log"
-s"C:\Documents and Settings\user\Application
Data\SQLyog\sjasession.xml"","N/A","N/A","Enabled","Daily
","09:00:00","18-01-2008","N/A","Everyday","N/A",
....

I think it has to do with the long paths in the task to run field,
because when I try the same code at another machine with a 'normal'
(shorter) path to run, I get the desired output.

Do adopt the habit of reading the documentation for the module(s) you
are using:

http://search.cpan.org/~makamaka/Text-CSV-1.09/lib/Text/CSV.pm

<blockquote>
allow_loose_quotes

By default, parsing fields that have quote_char characters inside an
unquoted field, like

1,foo "bar" baz,42

would result in a parse error. Though it is still bad practice to
allow this format, we cannot help there are some vendors that make their
applications spit out lines styled like this.

In case there is really bad CSV data, like

1,"foo "bar" baz",42

or

1,""foo bar baz"",42

there is a way to get that parsed, and leave the quotes inside the
quoted field as-is. This can be achieved by setting allow_loose_quotes
AND making sure that the escape_char is not equal to quote_char.
</blockquote>
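
A minimal sketch of what that paragraph describes, using the sample line from the documentation rather than the real dump. The helper name parse_loose is made up for illustration, and backslash is chosen as escape_char only because it differs from the quote character. That choice is an assumption: data full of Windows paths would probably need some other character that never occurs in the file, since backslashes inside the fields would otherwise be read as escapes.

```perl
#!perl
use strict;
use warnings;
use Text::CSV;

# allow_loose_quotes plus an escape_char that differs from quote_char,
# as described in the documentation excerpt above.
# Assumption: backslash is only a placeholder choice for escape_char.
my $csv = Text::CSV->new({
    allow_loose_quotes => 1,
    escape_char        => '\\',
}) or die "Cannot create Text::CSV object: " . Text::CSV->error_diag();

# Hypothetical helper: returns the parsed fields, or an empty list on failure.
sub parse_loose {
    my ($line) = @_;
    return $csv->parse($line) ? $csv->fields() : ();
}

# Sample line taken from the documentation excerpt.
my @fields = parse_loose('1,"foo "bar" baz",42');
if (@fields) {
    print join(' | ', @fields), "\n";
}
else {
    # Concatenation forces scalar context, which yields the message text.
    print "parse failed: " . $csv->error_diag() . "\n";
}
```

Whether this is enough for the schtasks output depends on what else is in the file, so it is worth printing error_diag() for every line that still fails.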

Sinan

--
A. Sinan Unur <[email protected]>
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/
 

sln

You have a shit CSV generator!!!
Your CSV generator did not escape the literal double quotes inside one of the fields of the 2nd record.
That field is 'Task To Run'. In addition, you are missing field 'Start In'.

This is an indication of corruption or bad generator code.
IMO, because the record spans lines, there is only one EOR:
the point where an even number of double quotes meets eol.
Maybe there is a pattern discernible by CSV parsers.
If there is, the easiest test is to try to bring it into Excel.
I don't see one, however.

The flaw is that this:

"server","jobname","09:00:00, 16-10-2008","","Interactive
only","09:00:00, 07-10-2008","0","user","At 09:00
every day, starting 18-01-2008","C:\Program Files\SQLyog

contains the intersection of even number of "'s with eol,
but does not equal EOR.

The flaw is really in your CSV generator. I kept the even-number-of-quotes
condition and added a '" at eol' condition to counteract
your shit CSV generator. Even though it will work in all
valid cases, this should NOT be the job of the parser; it should be the
CSV generator's job! I don't like adding this condition at all.
It's not a pragmatic approach.
(see below)

sln

#############
# Csv4 Regex
#############

use strict;
use warnings;

my $fname = 'c:\temp\junk.csv';
# Parenthesized 3-arg open, so that "or die" applies to open and not to $fname
open(CSV, '<', $fname) or die "can't open $fname: $!";

my ($row, $tmp) = ('','');
my ($parsing, $records, $quotes) = (1,1,0);

while ($parsing)
{
    ## Buffer until a full row
    ## -------------------------
    if (!defined($_ = <CSV>)) {
        $parsing = 0; # eof, parse what's left
        last if (!length($row)); # nothing buffered, nothing to parse
    } else {
        $tmp = $_;

        ## this block will trim newlines ---
        $tmp =~ s/\s+$//s;
        next if (!length($tmp));
        $row .= " $tmp";
        ## ---

        ## this block will keep newlines ---
        # $row .= $tmp;
        ## ---

        $quotes += $tmp =~ tr/"//;

        if (!($quotes % 2 == 0 &&  # Even number of double quotes?
              $tmp =~ /"$/))       # " at eol? <-- WHAT A SHIT CSV GENERATOR !!!!!!!
        {   # Row not complete yet, keep buffering ...
            next;
        }
    }

    print " (".$records++.") ----------\n";

    ## Parse the row
    ## -------------------
    while ($row =~ /\s*"\s*(.*?)\s*"\s*,|\s*"\s*(.*?)\s*"\s*$/gs) # span lines
    {
        my $val = $1;
        $val = $2 unless (defined $val);
        # clean up double quotes
        $val =~ s/""/"/g;
        print "val = $val\n";
    }
    $row = '';
    $quotes = 0;
}
close CSV;

__END__

Output:


C:\temp>perl csv4.pl
(1) ----------
val = HostName
val = TaskName
val = Next Run Time
val = Status
val = Logon Mode
val = Last Run Time
val = Last Result
val = Creator
val = Schedule
val = Task To Run
val = Start In
val = Comment
val = Scheduled Task State
val = Scheduled Type
val = Start Time
val = Start Date
val = End Date
val = Days
val = Months
val = Run As User
val = Delete Task If Not Rescheduled
val = Stop Task If Runs X Hours and X Mins
val = Repeat: Every
val = Repeat: Until: Time
val = Repeat: Until: Duration
val = Repeat: Stop If Still Running
val = Idle Time
val = Power Management
(2) ----------
val = server
val = jobname
val = 09:00:00, 16-10-2008
val =
val = Interactive only
val = 09:00:00, 07-10-2008
val = 0
val = user
val = At 09:00 every day, starting 18-01-2008
val = C:\Program Files\SQLyog Enterprise\sja.exe "afdgroep_progbeh.xml" -l"C:\Do
cuments and Settings\user\Application Data\SQLyog\sja.log" -s"C:\Documents and S
ettings\user\Application Data\SQLyog\sjasession.xml"
val = N/A
val = N/A
val = Enabled
val = Daily
val = 09:00:00
val = 18-01-2008
val = N/A
val = Everyday
val = N/A
val = domain\administrator
val = Disabled
val = Disabled
val = Disabled
val = Disabled
val = Disabled
val = Disabled
val = Disabled

C:\temp>

-----------------------

"server","jobname","09:00:00, 16-10-2008","","Interactive
only","09:00:00, 07-10-2008","0","user","At 09:00
every day, starting 18-01-2008","C:\Program Files\SQLyog

^^^^^^^^^^ this is the main culprit, even number of " with eol (but NOT EOR)

"C:\Program Files\SQLyogEnterprise\sja.exe "afdgroep_progbeh.xml" -l"C:\Documents andSettings\user\Application Data\SQLyog\sja.log"-s"C:\Documents and
Settings\user\ApplicationData\SQLyog\sjasession.xml"",

^^^^^^^^^^^ this is the other culprit, double quotes not escaped

"N/A",
"N/A",
"Enabled",
"Daily",
"09:00:00",
"18-01-2008",
"N/A",
"Everyday",
"N/A",
"domain\administrator",
"Disabled",
"Disabled",
"Disabled",
"Disabled",
"Disabled",
"Disabled",
"Disabled",
"Disabled
 