how to remove <BR> using replace function?

localpricemaps

I have some HTML that looks like this:

<address style="color:#">34 main,<br> Boston, MA</address>

and I am trying to use the replace function to get rid of the <Br> that
I scrape out using this code:

for oText in incident.fetchText( oRE):
    strTitle += oText.strip()
strTitle = string.replace(strTitle,'<br>','')

but it doesn't seem to remove the <br>.

Any ideas?
 
Albert Leibbrandt

Rinzwind said:
Works for me.

Though I don't like the 2 spaces it gives ;)

So use a regex and replace both the double spaces and the <br>.

cheers
albert
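
A minimal sketch of that two-step idea, written for current Python rather than the 2.4 interpreter used in this thread. The function name `strip_br_two_step` is made up for illustration, and it assumes each <br> is first replaced by a space (which is one way the double spaces could arise):

```python
import re

def strip_br_two_step(text):
    # Step 1: replace each literal <br> (any letter case) with a space.
    without_br = re.sub(r'<br>', ' ', text, flags=re.IGNORECASE)
    # Step 2: collapse any run of two or more spaces into a single space.
    return re.sub(r' {2,}', ' ', without_br)

print(strip_br_two_step('34 main,<br> Boston, MA'))  # 34 main, Boston, MA
```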
 
Duncan Booth

Rinzwind said:
Works for me.



Though I don't like the 2 spaces it gives ;)
Although I generally advise against overuse of regular expressions, this is
one situation where regular expressions might be useful: the situation is
simple enough not to warrant a parser, but apart from the whitespace a <br>
tag could have attributes or be written in xhtml style <br />. Also judging
by the inconsistency between the OP's subject line and his original [ ... ]

nobr = re.compile('\W*<br.*?>\W*', re.I)

'an unfortunate in the middle'
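
To illustrate the point about attributes, XHTML style and letter case, here is a small sketch in current Python. It uses the `\s` form of the pattern that is settled on later in the thread; the sample strings (other than the OP's address line) and the choice to substitute a single space are assumptions for the demo:

```python
import re

# Case-insensitive pattern: optional whitespace, a <br ...> tag, optional whitespace.
nobr = re.compile(r'\s*<br.*?>\s*', re.I)

samples = [
    '34 main,<br> Boston, MA',              # plain tag, as in the OP's HTML
    '34 main,<BR> Boston, MA',              # upper case, as in the subject line
    '34 main,<br /> Boston, MA',            # XHTML style
    '34 main,<br clear="all"> Boston, MA',  # tag with an attribute
]

# Replace each tag, plus the whitespace around it, with one space.
cleaned = [nobr.sub(' ', s) for s in samples]
for line in cleaned:
    print(line)  # every sample prints: 34 main, Boston, MA
```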
 
bruno at modulix

i have some html that looks like this


<address style="color:#">34 main,<br> Boston, MA</address>

and i am trying to use the replace function to get rid of the <Br> that
i scrape out using this code:

for oText in incident.fetchText( oRE):
    strTitle += oText.strip()

Why concatenating?
strTitle = string.replace(strTitle,'<br>','')

Use strTitle.replace('<br>', '') instead. And BTW, Hungarian notation is
evil, so:
for text in incident.fetchText(...):
    title = text.strip().replace('<br>', '')


but it doesn't seem to remove the <br>

It does:

Python 2.4.2 (#1, Feb 9 2006, 02:40:32)
[GCC 3.4.5 (Gentoo 3.4.5, ssp-3.4.5-1.0, pie-8.7.9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
The problem is obviously not with str.replace(), as you could have
figured out by yourself very easily.
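
A quick check along those lines, in current Python, using the address text from the original post. The mixed-case variant is an assumption added to show that str.replace() is case-sensitive, which is one possible (unconfirmed) reason a tag might survive:

```python
html = '34 main,<br> Boston, MA'

# str.replace() returns a new string; the original is left untouched.
print(html.replace('<br>', ''))   # 34 main, Boston, MA

# The match is case-sensitive, so a tag written as <Br> is not removed.
mixed = '34 main,<Br> Boston, MA'
print(mixed.replace('<br>', ''))  # 34 main,<Br> Boston, MA
```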
any ideas?

Yes: post the minimal *running* code that exhibits the problem.

Your problem is probably elsewhere, and given some of your previous posts
here ('problems writing tuple to log file' and 'indentation messing up
my tuple?'), I'd say that a programming 101 course should be your first
move.
 
Sion Arrowsmith

Duncan Booth said:
Although I generally advise against overuse of regular expressions, this is
one situation where regular expressions might be useful: [ ... ]

Agreed (on both counts), but r'\s*<br.*?>\s*' might be better
(consider what happens with "an unfortunate... <br> in the middle"
if you use \W rather than \s).
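
The difference can be seen by running both patterns over that example string. This is a sketch in current Python; replacing the match with a single space is an assumption:

```python
import re

text = 'an unfortunate... <br> in the middle'

# \W matches any non-word character, so the dots before the tag are eaten too.
with_W = re.sub(r'\W*<br.*?>\W*', ' ', text, flags=re.I)
print(with_W)  # an unfortunate in the middle

# \s matches whitespace only, so the punctuation survives.
with_s = re.sub(r'\s*<br.*?>\s*', ' ', text, flags=re.I)
print(with_s)  # an unfortunate... in the middle
```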
 
Duncan Booth

Sion said:
Duncan Booth said:
Although I generally advise against overuse of regular expressions,
this is one situation where regular expressions might be useful: [ ... ]
nobr = re.compile('\W*<br.*?>\W*', re.I)

Agreed (on both counts), but r'\s*<br.*?>\s*' might be better
(consider what happens with "an unfortunate... <br> in the middle"
if you use \W rather than \s).

Yes, I don't really know why I wrote \W when I obviously meant \s. Thanks
for correcting that.

Even better might be r'(\s*<br.*?>)+\s*' to get multiple runs of <br> tags.
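
A short comparison of the two patterns on consecutive tags, as a sketch in current Python. The input string and the single-space replacement are assumptions for the demo:

```python
import re

text = 'line one<br> <br />\n<BR>line two'

# Grouped pattern: a whole run of <br> tags is one match, so one space results.
merged = re.sub(r'(\s*<br.*?>)+\s*', ' ', text, flags=re.I)
print(repr(merged))    # 'line one line two'

# Single-tag pattern: each tag is replaced separately, leaving three spaces.
separate = re.sub(r'\s*<br.*?>\s*', ' ', text, flags=re.I)
print(repr(separate))  # 'line one   line two'
```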
 
