replacing tags between tags

beartiger

Suppose I wanted to make sure that between any:

<blockquote></blockquote>

*all* the <p>s were replaced with <br>s.

How would I do that?

E.g.:

<blockquote>
That time of year thou mayst in me behold<p>
When yellow leaves, or none, or few, do hang<p>
Upon those boughs which shake against the cold,<p>
Bare ruin'd choirs, where late the sweet birds sang.<p>
</blockquote>

Would become:

<blockquote>
That time of year thou mayst in me behold<br>
When yellow leaves, or none, or few, do hang<br>
Upon those boughs which shake against the cold,<br>
Bare ruin'd choirs, where late the sweet birds sang.<br>
</blockquote>

But all other <p>s outside of <blockquote>s would remain <p>s.

J
 
John Bokma

Suppose I wanted to make sure that between any:

<blockquote></blockquote>

*all* the <p>s were replaced with <br>s.

How would I do that?

Remove the <p>'s and use <pre>:

<blockquote>
<pre>
That time of year thou mayst in me behold
When yellow leaves, or none, or few, do hang
Upon those boughs which shake against the cold,
Bare ruin'd choirs, where late the sweet birds sang.
</pre>
</blockquote>

You could use s/// to do this, but it might fail. Better to parse the HTML,
fix it, and write it out.
 
beartiger

John said:
Remove the <p>'s and use <pre>:

<blockquote>
<pre>
That time of year thou mayst in me behold
When yellow leaves, or none, or few, do hang
Upon those boughs which shake against the cold,
Bare ruin'd choirs, where late the sweet birds sang.
</pre>
</blockquote>

You could use s/// to do this, but it might fail. Better to parse the HTML,
fix it, and write it out.

That answers the specific example, but I was looking for something to
answer the general case.


Thanks,
John
 
Jürgen Exner

John Bokma wrote: [...]
You could use s/// to do this, but it might fail. Better to parse
the HTML, fix it, and write it out.

That answers the specific example, but I was looking for something to
answer the general case.

Why do you think parsing the HTML would _not_ work in the general case?

jue
 
beartiger

Jürgen Exner said:
John Bokma wrote: [...]
You could use s/// to do this, but it might fail. Better to parse
the HTML, fix it, and write it out.

That answers the specific example, but I was looking for something to
answer the general case.

Why do you think parsing the HTML would _not_ work in the general case?

I don't. Would you please illustrate what you mean?


J
 
William James

Suppose I wanted to make sure that between any:

<blockquote></blockquote>

*all* the <p>s were replaced with <br>s.

How would I do that?

E.g.:

<blockquote>
That time of year thou mayst in me behold<p>
When yellow leaves, or none, or few, do hang<p>
Upon those boughs which shake against the cold,<p>
Bare ruin'd choirs, where late the sweet birds sang.<p>
</blockquote>

Would become:

<blockquote>
That time of year thou mayst in me behold<br>
When yellow leaves, or none, or few, do hang<br>
Upon those boughs which shake against the cold,<br>
Bare ruin'd choirs, where late the sweet birds sang.<br>
</blockquote>

But all other <p>s outside of <blockquote>s would remain <p>s.

J

I suggest using Ruby.

text = DATA.read
# Within each <blockquote>...</blockquote> span, turn every <p> into <br>.
text.gsub!( %r{<blockquote>.*?</blockquote>}m ){ |str|
  str.gsub( /<p>/, "<br>" )
}
puts text

__END__
<p>Now begins the plaint.</p>
<blockquote>
That time of year thou mayst in me behold<p>
When yellow leaves, or none, or few, do hang<p>
Upon those boughs which shake against the cold,<p>
Bare ruin'd choirs, where late the sweet birds sang.<p>
</blockquote>
<p>Another:</p>
<blockquote>
Stars, I have seen them fall,<p>
But when they drop and die<p>
No star is lost at all<p>
From all the star-sown sky.<p>
</blockquote>

--------------------------------------------------------------

Output:

<p>Now begins the plaint.</p>
<blockquote>
That time of year thou mayst in me behold<br>
When yellow leaves, or none, or few, do hang<br>
Upon those boughs which shake against the cold,<br>
Bare ruin'd choirs, where late the sweet birds sang.<br>
</blockquote>
<p>Another:</p>
<blockquote>
Stars, I have seen them fall,<br>
But when they drop and die<br>
No star is lost at all<br>
From all the star-sown sky.<br>
</blockquote>
 
beartiger

I ended up using a simple all-purpose tag parser, a la:

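sub parse_tagged_text
{
    my $text = shift;
    my @parsed;

    # Use length() so that a remaining text of "0" is still processed.
    while (length $text)
    {
        # Next token: either a complete <...> tag or a run of non-tag text.
        my ($token) = $text =~ /^(<[^>]*>|[^<]+)/;

        # An unterminated "<" matches neither alternative; bail out
        # instead of looping forever.
        if (!defined $token)
        {
            print "<!-- parse_tagged_text looped -->\n";
            print substr($text, 0, 50) . "\n";
            exit;
        }

        push(@parsed, $token);
        $text = substr($text, length $token);
    }

    return @parsed;
}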

Seems to work great.


Thanks to all,

J
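
The post above shows only the tokenizer, not the replacement step. What follows is a minimal sketch of how such a token list might be used for the original task. It is written in Ruby, which already appears in this thread, so that it runs with the standard library alone. The method names `tokenize` and `p_to_br_in_blockquotes` are hypothetical and not from the original post. Like the Perl routine, this is not a real HTML parser: comments, scripts, and a `>` inside an attribute value are not handled, and closing `</p>` tags are left untouched.

```ruby
# Split text into tokens: each token is a complete <...> tag or a run of text.
# A stray "<" with no closing ">" becomes a one-character token, so joining
# the tokens always reproduces the input.
def tokenize(text)
  text.scan(/<[^>]*>|[^<]+|</)
end

# Replace opening <p> tags with <br>, but only while inside a <blockquote>.
def p_to_br_in_blockquotes(text)
  depth = 0 # number of <blockquote> elements currently open
  tokens = tokenize(text).map do |tok|
    case tok
    when /\A<blockquote\b/i
      depth += 1
      tok
    when /\A<\/blockquote\b/i
      depth -= 1 if depth > 0
      tok
    when /\A<p\b[^>]*>\z/i
      # Matches <p>, <P> and <p align="left">, but not <pre>.
      depth > 0 ? "<br>" : tok
    else
      tok
    end
  end
  tokens.join
end

sample = "<p>x</p><blockquote>a<p><blockquote>b<P></blockquote>" \
         "c<p class=\"v\"></blockquote><p>y"
puts p_to_br_in_blockquotes(sample)
```

Counting depth rather than matching from one `<blockquote>` to the next `</blockquote>` is what lets nested blockquotes work; a dedicated HTML parser remains the more robust choice for real-world input.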
 
Jürgen Exner

Jürgen Exner said:
John Bokma wrote: [...]
You could use s/// to do this, but it might fail. Better to parse
the HTML, fix it, and write it out.

That answers the specific example, but I was looking for something
to answer the general case.

Why do you think parsing the HTML would _not_ work in the general
case?

I don't. Would you please illustrate what you mean?

Well, John wrote:
<quote>Better to parse the HTML, fix it, and write it out.</quote>

You replied:
<quote>That answers the specific example, but I was looking for something
to answer the general case.</quote>

To me that seems to imply that you believe parsing the HTML would work only
for the specific example but not for the general case. If this was not what
you meant, then I obviously misunderstood what you wrote.

Anyway, this topic has been discussed a gazillion times before. To parse
HTML, use a proper HTML parser because, contrary to popular belief, parsing
HTML is not trivial. For further details please see DejaNews and the FAQ
(perldoc -q HTML: "How do I remove HTML from a string?").

jue
 
Jürgen Exner

William said:
I suggest using Ruby.

Which of course is wildly off-topic in a Perl NG

text = DATA.read
text.gsub!( %r{<blockquote>.*?</blockquote>}m ){ |str|
str.gsub( /<p>/, "<br>" )

and fails for the same reasons as any other simple-minded approach to parsing
HTML using REs.

jue
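
To make the objection concrete, here is a small sketch (not from any of the posters) that runs the substitution quoted above against two inputs it does not handle. The wrapper name `regex_fix` and both sample strings are made up for illustration.

```ruby
# The regex-based substitution from the thread, wrapped in a method.
def regex_fix(html)
  html.gsub(%r{<blockquote>.*?</blockquote>}m) { |block| block.gsub(/<p>/, "<br>") }
end

# Case 1: nested blockquotes. The non-greedy match ends at the first
# </blockquote>, so the <p> after the inner block is never rewritten.
nested = "<blockquote>a<p><blockquote>b<p></blockquote>c<p></blockquote>"
puts regex_fix(nested)

# Case 2: tags not spelled exactly "<blockquote>" and "<p>" (upper case,
# attributes) are not matched at all, so the input comes back unchanged.
variant = "<BLOCKQUOTE class=\"poem\">a<P>b<p align=\"left\"></BLOCKQUOTE>"
puts regex_fix(variant)
```

In the first case the `<p>` following `c` is still inside the outer blockquote but survives as `<p>`; in the second nothing is replaced.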
 
