help with regex

George Mpouras

I need to discover all possible field names in a key/value file.
The properties of the file are unknown, so I have to be a little creative.
Values can optionally contain whitespace inside "...".
Each key/value pair is separated from the next pair by a space.
Do you think the following is OK?



#!/usr/bin/perl
use strict;
use warnings;

while (<DATA>) {
    chomp;

    while ( /([^=]+)=("[^"]+"|\S+)/g ) {
        my ($key, $val) = ($1, $2);
        $val =~ s/^["\s]*(.*?)["\s]*$/$1/;
        print "*$key* *$val*\n";
    }

    print "--------\n";
}


__DATA__
f1=hello f2= f3="foo" f4="hello world"
f6="day" f7="day & night" f8=100
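
As a quick check of the pattern in the post above, the sketch below wraps the same two regexes in a sub (the name `original_pairs` is made up for illustration) and runs them on the first sample line. It suggests two things to watch for: the key class `[^=]+` also matches the space before a key, and a pair with an empty value such as `f2=` is skipped because both value alternatives need at least one character.

```perl
use strict;
use warnings;

# Same two patterns as in the post above; only the wrapping sub is new.
sub original_pairs {
    my ($line) = @_;
    my @pairs;
    while ( $line =~ /([^=]+)=("[^"]+"|\S+)/g ) {
        my ( $key, $val ) = ( $1, $2 );
        $val =~ s/^["\s]*(.*?)["\s]*$/$1/;    # strip quotes and outer whitespace
        push @pairs, [ $key, $val ];
    }
    return @pairs;
}

for my $pair ( original_pairs('f1=hello f2= f3="foo" f4="hello world"') ) {
    print "*$pair->[0]* *$pair->[1]*\n";
}
# Prints:
#   *f1* *hello*
#   * f3* *foo*
#   * f4* *hello world*
```

Note the leading space inside `* f3*` and `* f4*`, and that `f2` never appears.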
 
Tim McDaniel

> I must discover all possible field names of a key/value file.
> The properties of the file are unknown

If by "properties" you mean the layout, format, et cetera,
then how can anyone advise you on a proper way to parse it
when neither you nor we know what's valid?
 
Peter Gordon

You don't say whether the sample data covers every possible form.
If it does, the code below decodes it.

#!/usr/bin/perl
use strict;
use warnings;
use 5.14.0;

my %lines;
while ( <DATA> ) {
    chomp;
    last if /^$/;    # Stop at a blank line at the end of the data.
    while ( /(f\d+)=(.*?)(?: f\d+=|$)/ ) {
        my $key   = $1;
        my $value = $2;
        # Strip the key/value pair off the string.
        # \Q...\E keeps any regex metacharacters in the value literal.
        s/\Q$key=$value\E(.*)/$1/;
        $value =~ s/"(.*)"/$1/;    # Strip off any surrounding double quotes.
        $lines{$key} = $value;
    }
}
say "The Hash";
foreach my $key ( sort keys %lines ) {
    say "$key: $lines{$key}";
}
__DATA__
f1=hello f2= f3="foo" f4="hello world"
f6="day" f7="day & night" f8=100
 
George Mpouras

Key names can be any string without spaces, not just f\d+.
f100 etc. was only an example, so the regex
/(f\d+)=(.*?)(?: f\d+=|$)/
does not match correctly.
 
George Mpouras

You are correct, the specs are loose:
lines contain multiple key/value pairs separated by a space,
keys do not contain spaces,
values may contain spaces inside double quotes.
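
One way to read the clarified spec above is sketched below. It assumes, without confirmation from the thread, that a key is any run of characters that are neither whitespace nor "=", and that a value is either a double-quoted string (possibly empty), a run of non-whitespace characters, or nothing at all. The sub name `parse_pairs` and the `%seen` hash are made up for illustration.

```perl
use strict;
use warnings;

# Return a list of [key, value] pairs found in one line.
# Assumed grammar (a guess based on the clarified spec):
#   key   = one or more characters that are neither whitespace nor "="
#   value = "..." (may contain spaces), or non-whitespace text, or empty
sub parse_pairs {
    my ($line) = @_;
    my @pairs;
    while ( $line =~ /([^\s=]+)=(?:"([^"]*)"|(\S*))/g ) {
        my $key = $1;
        my $val = defined $2 ? $2 : $3;    # quoted branch, else bare branch
        push @pairs, [ $key, $val ];
    }
    return @pairs;
}

my @sample = (
    'f1=hello f2= f3="foo" f4="hello world"',
    'f6="day" f7="day & night" f8=100',
);

my %seen;    # field name => number of occurrences
for my $line (@sample) {
    for my $pair ( parse_pairs($line) ) {
        my ( $key, $val ) = @$pair;
        $seen{$key}++;
        print "*$key* *$val*\n";
    }
    print "--------\n";
}
print "Fields: @{[ sort keys %seen ]}\n";
```

Excluding whitespace from the key class keeps the separator out of the key, and allowing an empty value keeps pairs like `f2=` in the result, so the final line lists every field name seen in the sample.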
 
Tim McDaniel

> key names can be whatever string with no spaces not f\d+

The examples were the only spec you gave, so you can understand why
people coded to it.
> f100 etc was an example

An example provided by the teacher of the class?

Settings separated by space: do you mean exactly one space, or one or more
whitespace characters?
Is there always an "="? That is, you can't have "foo bar"; it must be
"foo= bar="?
Can there be whitespace around "=", as in "foo = bar"?
Can there be leading and/or trailing whitespace on the line?
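
The answers to these questions change the parse, which the sketch below tries to show. It compares two hypothetical patterns, neither confirmed by the thread: `$STRICT` allows no whitespace around "=", while `$LOOSE` allows it. The names `$STRICT`, `$LOOSE`, and `pairs_with` are made up for illustration.

```perl
use strict;
use warnings;

# STRICT: no whitespace around "=", so "f2= f3=x" is an empty f2, then f3.
my $STRICT = qr/([^\s=]+)=(?:"([^"]*)"|(\S*))/;
# LOOSE: whitespace allowed around "=", so "foo = bar" is a single pair.
my $LOOSE = qr/([^\s=]+)\s*=\s*(?:"([^"]*)"|(\S*))/;

# Return the [key, value] pairs that the given pattern finds in a line.
sub pairs_with {
    my ( $re, $line ) = @_;
    my @pairs;
    while ( $line =~ /$re/g ) {
        push @pairs, [ $1, defined $2 ? $2 : $3 ];
    }
    return @pairs;
}

for my $line ( 'foo = bar', 'f2= f3=x' ) {
    for my $mode ( [ strict => $STRICT ], [ loose => $LOOSE ] ) {
        my ( $name, $re ) = @$mode;
        my @shown = map { "$_->[0] => '$_->[1]'" } pairs_with( $re, $line );
        print "$name, '$line': ",
            ( @shown ? join( ', ', @shown ) : '(no pairs)' ), "\n";
    }
}
```

With the strict pattern, `foo = bar` yields no pairs at all; with the loose pattern, `f2= f3=x` collapses into a single pair whose value is `f3=x`. Both readings cannot be right for the same file, so the format has to be pinned down before choosing a pattern.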
 
