extract words from string

Jack S

Hi, I'm playing with Perl for the first time and want to use it to extract
some words from a string.
How the string looks:
PrimeraDiv 05.09.1993 Ath._Bilbao Albacete 4-1 ( - )

Or:
PrimeraDiv 23.05.2004 Zaragoza Barcelona 2-1 ( 1-1 )

Now, I want to extract the second word (in this case a date),
the third and fourth words (team names), the result, and the half-time result
(between parentheses). Sometimes the half-time result isn't available and I want
to replace it with the string "NA".
I know how to get the data from the file (which is where the strings are
located):

open(LOG, "c:\\SPA_1993-1994.txt");
@indata = <LOG>;

But how can I use regular expressions to match the words I described? And, most
importantly, how can I extract them?

Thanks
 
Paul Lalli

When you say "word" here, you really mean "white-space delimited fields",
yes? I would use a simple split.

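my (undef, $date, $team1, $team2, $score, undef, $half) = split ' ', $string;
# split ' ' skips leading whitespace and treats runs of whitespace as one separator
$half = "NA" unless defined $half && $half =~ /\d/; # replace halftime if no numbers.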

The first undef accounts for the first field you want to throw away
(PrimeraDiv). The second accounts for the opening paren you want to throw
away.

perldoc -f split
for more info on the split function

and you should probably read up on regular expressions:
perldoc perlre
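
Since the question asked specifically how to match and extract with a regular expression, here is a minimal sketch of that route, assuming every line has the same layout as the two samples in the question. The sub name parse_result is hypothetical, not something from the posts above. A match in list context returns the captured groups, which is what does the extraction.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one result line with a single regex.
# Returns (date, home, away, full-time, half-time) or an empty list.
sub parse_result {
    my ($line) = @_;
    my ($date, $home, $away, $full, $half) = $line =~ m{
        ^ \S+ \s+                   # competition name, ignored
        (\d\d\.\d\d\.\d{4}) \s+     # date as DD.MM.YYYY
        (\S+) \s+                   # home team
        (\S+) \s+                   # away team
        (\d+-\d+) \s+               # full-time score
        \( \s* ([^\s)]*) \s* \)     # half-time score inside the parentheses
    }x or return;
    $half = 'NA' unless $half =~ /\d/;   # "-" or empty means not available
    return ($date, $home, $away, $full, $half);
}

for my $line (
    'PrimeraDiv 05.09.1993 Ath._Bilbao Albacete 4-1 ( - )',
    'PrimeraDiv 23.05.2004 Zaragoza Barcelona 2-1 ( 1-1 )',
) {
    my @fields = parse_result($line) or next;   # skip lines that do not match
    print join(' | ', @fields), "\n";
}
```

The /x flag only allows the whitespace and comments inside the pattern; it does not change what is matched. Lines that do not fit the pattern are skipped rather than half-parsed, which is the main thing the regex buys you over split.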

Paul Lalli
 
Tore Aursand

This should get you started (untested):

#!/usr/bin/perl
#
use strict;
use warnings;

open( LOG, '<', 'c:\SPA_1993-1994.txt' ) or die "$!\n";
while ( <LOG> ) {
    chomp;
    # The limit of 6 keeps the whole half-time part, e.g. "( 1-1 )", in $halftime.
    my ( $div, $date, $home, $away, $fulltime, $halftime ) = split( /\s+/, $_, 6 );
    # ...
}
close( LOG );

Not really anything you need to spend regular expressions on.
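
To fill in the "# ..." step, here is one possible sketch of the split-with-limit approach that also handles the "NA" replacement. The sub name split_result is hypothetical, and the sample lines are held in an array instead of being read from the file so the snippet runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: split one line into six fields, then tidy the last one.
sub split_result {
    my ($line) = @_;
    chomp $line;
    my (undef, $date, $home, $away, $full, $half) = split ' ', $line, 6;
    return unless defined $half;          # blank or short line: nothing to report
    $half =~ s/[()\s]//g;                 # "( 1-1 )" becomes "1-1", "( - )" becomes "-"
    $half = 'NA' unless $half =~ /\d/;    # no digits means no half-time result
    return ($date, $home, $away, $full, $half);
}

# Sample lines from the question; in real use these would come from the file.
my @lines = (
    "PrimeraDiv 05.09.1993 Ath._Bilbao Albacete 4-1 ( - )\n",
    "PrimeraDiv 23.05.2004 Zaragoza Barcelona 2-1 ( 1-1 )\n",
);

for my $line (@lines) {
    my @fields = split_result($line) or next;
    print join("\t", @fields), "\n";
}
```

Because the limit stops splitting after the fifth separator, the sixth field still contains the parentheses and inner spaces, so they have to be stripped before testing for digits.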
 
