How to remove the part of a string between two marks

Beno

How can I remove the part of a string between two marks?
For example, from the string "<DIV> </DIV> <SPAN ID=TIME> 12:30 </SPAN> <SOME TEXT>"
extract the text "12:30".

I managed it by first splitting the string on <SPAN ID=TIME>
and then splitting the result on </SPAN>.

It works, but it is complicated:

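$string = "<DIV> </DIV> <SPAN ID=TIME> 12:30 </SPAN> <SOME TEXT>";
# Split on the opening tag; element 1 is everything after it.
@part1 = split(/ <SPAN ID=TIME> /, $string);
# Split that remainder on the closing tag; element 0 is the wanted text.
@part2 = split(/ <\/SPAN> /, $part1[1]);
print $part2[0];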

I know that there is an easier way in Perl, but how?

Please help.
Thanks.

Beno
 
Wolf Behrenhoff

Beno said:
How can I remove the part of a string between two marks?
For example, from the string "<DIV> </DIV> <SPAN ID=TIME> 12:30 </SPAN> <SOME TEXT>"
extract the text "12:30".

I managed it by first splitting the string on <SPAN ID=TIME>
and then splitting the result on </SPAN>.

This looks like HTML.

perldoc -q html

- Wolf
 
smallpond

Beno said:
Not to remove, I need to extract the text 12:30
How can I remove the part of a string between two marks?
For example, from the string "<DIV> </DIV> <SPAN ID=TIME> 12:30 </SPAN> <SOME TEXT>"
extract the text "12:30".
I managed it by first splitting the string on <SPAN ID=TIME>
and then splitting the result on </SPAN>.
It works, but it is complicated:
$string = "<DIV> </DIV> <SPAN ID=TIME> 12:30 </SPAN> <SOME TEXT>";
@part1 = split(/ <SPAN ID=TIME> /, $string);
@part2 = split(/ <\/SPAN> /, $part1[1]);
print $part2[0];
I know that there is an easier way in Perl, but how?

$h = "<DIV> </DIV> <SPAN ID=TIME> 12:30 </SPAN> <SOME TEXT>";
$h =~ /<SPAN ID=TIME> (.*) <\/SPAN>/;
print $1;

12:30
 
Jürgen Exner

Beno said:
Not to remove, I need to extract the text 12:30
How can I remove the part of a string between two marks?
For example, from the string "<DIV> </DIV> <SPAN ID=TIME> 12:30 </SPAN> <SOME TEXT>"
extract the text "12:30" [...]
I know that there is an easier way in Perl, but how?

For a very simple-minded approach use

if ($string =~ m+<SPAN ID=TIME> (.*) </SPAN>+) {
    print $1;
}

Of course this will fail as soon as the SPAN tag changes, even if it is
just a change in whitespace, or someone decides to enclose TIME in
quotes, or changes SPAN to lower case, and so on.
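
To illustrate those failure modes, here is a rough sketch of a pattern that tolerates case changes, optional quotes around the id, and variable whitespace. The helper name extract_span is made up for this example, and it is still only a regex, so nested tags or extra attributes will defeat it.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the trimmed text
# inside <span id=...>, or undef when nothing matches.
sub extract_span {
    my ($html, $id) = @_;
    return $html =~ m{
        <span \s+ id \s* = \s* (["']?) \Q$id\E \1 \s* >   # opening tag, quotes optional
        \s* (.*?) \s*                                      # body, surrounding space dropped
        </span \s* >                                       # closing tag
    }xis ? $2 : undef;
}

# Both spellings of the tag give the same result.
for my $html ('<DIV> </DIV> <SPAN ID=TIME> 12:30 </SPAN> <SOME TEXT>',
              '<span id="time">12:30</span>') {
    print extract_span($html, 'TIME'), "\n";
}
```

The /i flag handles upper and lower case, /s lets the body span several lines, and /x allows the pattern to be laid out with comments.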

The correct way is -as has been said a gazillion times in this NG- use
an HTML parser to parse HTML and then simply retrieve the body of the
SPAN tag.
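
A minimal sketch of that approach, assuming the CPAN module HTML::TokeParser (part of the HTML-Parser distribution) is installed. The function name span_text_by_id is invented for this example, and other parser modules would do the job equally well.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: returns the text of the first <span> whose id
# matches $id (compared case-insensitively), or nothing if none is found.
sub span_text_by_id {
    my ($html, $id) = @_;
    # A scalar reference makes the parser read the string itself.
    my $parser = HTML::TokeParser->new(\$html) or return;
    while (my $tag = $parser->get_tag('span')) {
        my $attr = $tag->[1];    # hash ref of attributes, names lower-cased
        next unless defined $attr->{id} && lc $attr->{id} eq lc $id;
        # Collect the text up to the closing tag, with whitespace trimmed.
        return $parser->get_trimmed_text('/span');
    }
    return;
}

my $string = '<DIV> </DIV> <SPAN ID=TIME> 12:30 </SPAN> <SOME TEXT>';
print span_text_by_id($string, 'TIME'), "\n";
```

Because the parser normalizes tag and attribute names, the same call keeps working if SPAN becomes lower case or TIME is put in quotes.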

jue
 
