Dealing with modified CSV data format

nun

I wrote a script which parses CSV data files that were provided to me
in the following format.

WL,WA6B,SCRAPE,N,5.99,6.85,8.56,4.98
WL,W30,RULE6"",N,7.90,47.90,6.05,3.15

Note the comma field separator and that some fields have double quotes
in the data.

My script read this data in as follows and all was well:

###############################
# reading data in from file
my (@AoA);
while ( <> ) {
    chomp;
    push @AoA, [ split /,/ ];
}
################################

Now suddenly I'm being given the data in a slightly different format,
still comma separated but now all fields are enclosed by double quotes,
even those which have double quotes in the data, such as:

"WL","WA6B","SCRAPE","N","5.99","6.85","8.56","4.98"
"WL","W30","RULE6""","N","7.90","47.90","6.05","3.15"


Is there an easy way I can modify my code to deal with this new format,
or must I start over with something like Text::CSV instead of split? I'm
no Perl guru and am hoping for an easy cure :)

DB
 
Paul Lalli

Why not just keep it the way it is, but then remove all the beginning/
ending quotes from each field after you've split?

push @AoA, [ split /,/ ];
s/^"// and s/"$// for @{$AoA[-1]};
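A minimal sketch of that idea applied to the two sample lines from the question. The helper name split_and_strip is hypothetical, and the sketch assumes no field ever contains a comma inside its quotes, since split /,/ would cut such a field in two. As in the one-liner above, the closing quote is only removed when an opening quote was found, so fields from the old unquoted format are left alone.

```perl
use strict;
use warnings;

# Hypothetical helper: split one line on commas, then strip the enclosing
# double quotes from each field. Assumes no commas inside quoted fields.
sub split_and_strip {
    my ($line) = @_;
    my @fields = split /,/, $line;
    for my $field (@fields) {
        # Remove the closing quote only if an opening quote was removed,
        # so an old-format field such as RULE6"" stays untouched.
        $field =~ s/"$// if $field =~ s/^"//;
    }
    return @fields;
}

my @AoA;
for my $line (
    '"WL","WA6B","SCRAPE","N","5.99","6.85","8.56","4.98"',
    '"WL","W30","RULE6""","N","7.90","47.90","6.05","3.15"',
) {
    push @AoA, [ split_and_strip($line) ];
}

print join('|', @$_), "\n" for @AoA;
# WL|WA6B|SCRAPE|N|5.99|6.85|8.56|4.98
# WL|W30|RULE6""|N|7.90|47.90|6.05|3.15
```

The third field of the second line comes out as RULE6"", the same value the old unquoted files carried, so code further down the script should see the same data as before.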

Paul Lalli
 
nun

That works, and fits the bill as easy - thanks!

DB
 
John W. Krahn

$ perl -le'
$_ = q[WL,WA6B,SCRAPE,N,5.99,6.85,8.56,4.98];
my @x = split /,/;
print "@x";
$_ = q["WL","WA6B","SCRAPE","N","5.99","6.85","8.56","4.98"];
my @y = map /\A"(.+)"\z/ ? $1 : $_, split /,/;
print "@y";
'
WL WA6B SCRAPE N 5.99 6.85 8.56 4.98
WL WA6B SCRAPE N 5.99 6.85 8.56 4.98
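Both split-based answers rely on no field containing a comma, and they leave doubled quotes inside a field as they are. If either assumption stops holding, something along the following lines might serve as a stopgap. It is only a sketch: parse_quoted_line is a hypothetical helper, it expects a line that has already been chomped, and it follows the common CSV convention that "" inside a quoted field stands for one literal quote. Under that convention the sample field "RULE6""" decodes to RULE6" rather than RULE6"", which may or may not match what the old files meant. For anything beyond simple data, Text::CSV from CPAN is likely the safer route.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one chomped line whose fields may be quoted.
# Inside a quoted field, "" means a literal " and commas are plain data.
sub parse_quoted_line {
    my ($line) = @_;
    my @fields;
    while ($line =~ /\G(?:"((?:[^"]|"")*)"|([^,]*))(,|\z)/g) {
        my ($quoted, $bare, $sep) = ($1, $2, $3);
        if (defined $quoted) {
            $quoted =~ s/""/"/g;    # collapse doubled quotes
            push @fields, $quoted;
        }
        else {
            push @fields, $bare;    # unquoted field, taken as is
        }
        last if $sep eq '';         # matched end of line, stop here
    }
    return @fields;
}

my @row = parse_quoted_line(
    '"WL","W30","RULE6""","N","7.90","47.90","6.05","3.15"'
);
print join('|', @row), "\n";
# WL|W30|RULE6"|N|7.90|47.90|6.05|3.15
```

Inside the original read loop this would replace the split call, as in push @AoA, [ parse_quoted_line($_) ]; after the chomp.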




John
 
