csv parse bug...

Prasad Gadgil

Hi,

If I have wrongly posted this here instead of reporting this as a
module bug, please excuse me.

I am facing an odd bug when using the parse() function in the Text::CSV
module. This is an absolutely textbook use of the function, yet it fails
for some 30 records out of ~ 25000 records.

I have checked, and have found a very strange pattern. One of the
fields can have embedded commas and is qualified by double quotes
(like "CN=XX ,OU=XX,OU=XX,DC=XX,DC=XX"). If a pattern like " ,OU"
appears in it, the parse fails... which means if there is a space before
the embedded comma in one of the comma separated fields, the parse
fails.

This is very puzzling. Can someone help explain?

The code where I exit with the parse error is pasted below...


## for each LDAP record (every $ads_profile in @ads_datalines)
foreach my $ads_profile (@ads_datalines) {
    #print STDERR "cycling through ldap records\n";
    # @temparray = (DN,sAMAccountName,givenName,sn,telephoneNumber,department,mail)
    my @temparray = ();
    my $csv = Text::CSV->new;
    if ($csv->parse($ads_profile)) {
        @temparray = $csv->fields;
    } else {
        my $err = $csv->error_input;
        print "parse() failed on argument: ", $err, " ", $ads_profile, "\n";
    }

......
 
A. Sinan Unur

> If I have wrongly posted this here instead of reporting this as a
> module bug, please excuse me.

You have posted in the right group, but in the wrong way.

You need to produce a small but complete script, together with some
data, that still displays the problem.

That way, we can easily test alternative hypotheses about your problem.

Please consult the posting guidelines for this group. Read especially
about the __DATA__ filehandle.

> I am facing an odd bug when using parse() function in Text::CSV
> module. This is an absolutely textbook use of the function which fails
> for some 30 records out of ~ 25000 records.

Obviously, we don't need 25,000 records. All we need is some data where
things work the way you expect them to and some where they don't, along
with a short but complete script we can run just by copying and pasting.

> I have checked, and have found a very strange pattern. In one of the
> possible fields which will have embedded commas qualified by double
> quotes (like "CN=XX ,OU=XX,OU=XX,DC=XX,DC=XX"), if a pattern like
> " ,OU" appears, the parse fails... which means if there is a space
> before the embedded comma in one of the comma separated fields, the
> parse fails.

I am not quite sure about the exact format of the data for which you
claim the parser fails. As I said, show us code, show us real data.

> The code where I exit from the code with parse error is pasted
> below...

But it is not a complete script we can run by copying and pasting.

Sinan
 
xhoster

Prasad Gadgil said:
> Hi,
>
> If I have wrongly posted this here instead of reporting this as a
> module bug, please excuse me.
>
> I am facing an odd bug when using parse() function in Text::CSV module.
> This is an absolutely textbook use of the function which fails for some
> 30 records out of ~ 25000 records.
>
> I have checked, and have found a very strange pattern. In one of the
> possible fields which will have embedded commas qualified by double
> quotes (like "CN=XX ,OU=XX,OU=XX,DC=XX,DC=XX"), if a pattern like
> " ,OU" appears, the parse fails...

Do you think it might be useful for us to know what kind of failure
it reports?

> which means if there is a space before
> the embedded comma in one of the comma separated fields, the parse
> fails.

Not in my hands, it doesn't.

[~/perl_misc]$ perl
use Text::CSV;
my $x='foo,"CN=XX ,OU=XX,OU=XX,DC=XX,DC=XX",bar';
my $csv=Text::CSV->new();
if ($csv->parse($x)) {
print join "\t", $csv->fields();
} else {
die $csv->error_input;
};
__END__
foo CN=XX ,OU=XX,OU=XX,DC=XX,DC=XX bar
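
On the question of what kind of failure is reported: error_input only returns the offending line. A minimal sketch of getting more detail follows, assuming a Text::CSV release recent enough to provide error_diag (older releases may only offer error_input). The helper name parse_or_explain is hypothetical and only used for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical helper: parse one line and return (fields_ref, undef) on
# success, or (undef, message) on failure.
sub parse_or_explain {
    my ($csv, $line) = @_;
    return ([ $csv->fields ], undef) if $csv->parse($line);

    # Assumption: error_diag is available. In list context it returns the
    # error code, the error message and the position of the error.
    my ($code, $msg, $pos) = $csv->error_diag;
    $pos = '?' unless defined $pos;
    return (undef, "error $code ($msg) at position $pos in: " . $csv->error_input);
}

my $csv = Text::CSV->new;
my ($fields, $why) =
    parse_or_explain($csv, 'foo,"CN=XX ,OU=XX,OU=XX,DC=XX,DC=XX",bar');
print defined $fields ? join("\t", @$fields) : $why, "\n";
```

Running a failing record through the same helper should print the error code and message instead of the fields, which is usually enough to tell a quoting problem from a character-set problem.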


Xho
 
Prasad Gadgil

Hi,

Thanks Sinan, Xho.
I used Xho's example but pasted in one of the erroneous entries. I find
that I do get the error. However, when I printed the erroneous entry on
die(), it showed a junk character "á" at the problematic space. This is
not visible in the input data but is only displayed when the error is
printed. The error is as follows ->

---
parse() failed on input: "CN=Gill Abrahamá,DC=mydomain,DC=com",,name,dept,,process, at code1.pl line 10.
---

I am pasting my code below. I am not sure whether this mysterious
invisible character will survive the paste, but I am trying all the
same. How do I eliminate such problematic characters?

Regards,
Prasad


#!/usr/local/bin/perl
use Text::CSV;
my $x='"CN=Gill Abraham ,DC=mydomain,DC=com",,name,dept,,process,';
my $csv=Text::CSV->new();
if ($csv->parse($x)) {
print join "\t", $csv->fields();
} else {

die "parse() failed on input: ", $csv->error_input;
};
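
One way to investigate is to make the invisible character visible before parsing. The thread does not confirm what the character is. The sketch below works on the assumption that it is the single byte 0xA0, which is a non-breaking space in Latin-1 and is shown as "á" in DOS code pages such as 437. The helper names show_hidden and clean_nbsp are hypothetical, and only core Perl is used.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the line in which every character
# outside printable ASCII (tab excepted) is rewritten as \xNN, so that
# invisible characters show up when the line is printed.
sub show_hidden {
    my ($line) = @_;
    (my $visible = $line) =~ s/([^\x20-\x7E\t])/sprintf('\\x%02X', ord $1)/ge;
    return $visible;
}

# Hypothetical helper: return a copy of the line with 0xA0 replaced by an
# ordinary space. Assumption: 0xA0 is the only problematic character.
sub clean_nbsp {
    my ($line) = @_;
    (my $clean = $line) =~ tr/\xA0/ /;
    return $clean;
}

# Sample record with an explicit 0xA0 where the thread shows "á".
my $record = qq{"CN=Gill Abraham\xA0,DC=mydomain,DC=com",,name,dept,,process,};
print show_hidden($record), "\n";
print show_hidden(clean_nbsp($record)), "\n";
```

If the data legitimately contains such characters, cleaning may not be what you want. Text::CSV documents a binary attribute, set with Text::CSV->new({ binary => 1 }), that allows binary characters inside quoted fields, and that may be the better fit.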
 
