"Disabling" raw string to print newlines

kuratkull

Hello,

***************
import urllib2
import re
import string
import sys

url = "http://www.macgyver.com/"
request = urllib2.Request(url)
opener = urllib2.build_opener()
html = opener.open(request).read()

match = re.compile("<PRE>(.+)</PRE>", re.DOTALL)

out = match.findall(html)

print out
**************

I would like to print out the string with its formatting, but as I read, the
string is made into a raw string when using re.
How can I disable or bypass this?

I googled for an hour and couldn't find a solution.

Thank you in advance.
 
Diez B. Roggisch

You have a misconception here. A raw string in Python differs *only* as a
literal - that is, you can write

r"fooo\bar"

where you'd have to write

"fooo\\bar"

with "normal" string literals. However, the result of both is a byte-string
object that is exactly equal. So whatever out contains, it has nothing to
do with raw strings.
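
The point about literals can be checked directly. This is only a minimal sketch, written in Python 3 syntax, and the variable names are made up for illustration:

```python
# A raw literal and an escaped "normal" literal produce equal strings.
raw = r"fooo\bar"
normal = "fooo\\bar"

assert raw == normal
assert len(raw) == 8  # f, o, o, o, backslash, b, a, r

# Without the r prefix or the doubled backslash, "\b" is read as a
# single backspace character, so the result is a different string.
unescaped = "fooo\bar"
assert unescaped != raw
assert len(unescaped) == 7
```

Once the literal has been evaluated, nothing about the resulting object records whether it was written with an r prefix.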

But what you probably mean is that printing a list with print calls
repr() on each contained object, so newlines show up as "\n" escapes
instead of line breaks. So instead of doing

print out

do

print "\n".join(out)

or such.
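
The difference is easy to see with a small stand-in for the findall() result. This is a sketch in Python 3 syntax; the list contents are hypothetical, not taken from the scraped page:

```python
# Hypothetical stand-in for the list that findall() returns.
out = ["first line\nsecond line", "another match"]

# Converting the list to text uses repr() on each element,
# so the newline appears as the two characters backslash and n.
as_list = str(out)

# Joining the elements gives the strings themselves, real newlines included.
joined = "\n".join(out)

print(as_list)
print(joined)
```

Printing out[0] works the same way as the join: it hands print a string rather than a list, so no repr() is involved.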

Diez
 
Paul McGuire

Change your print statement to:

print out[0]

-- Paul
 
Paul McGuire

Since you have no control over spacing and line breaks in the input,
you can reformat using the textwrap module. First replace all "\n"s
with " ", then use re.sub to replace multiple spaces with a single
space, then call textwrap.fill to reformat the line into lines up to
'n' characters long (I chose 50 in the sample below, but you can
choose any line length you like).

import textwrap

out = match.findall(html)
out = out[0].replace("\n", " ")   # first <PRE> block, newlines to spaces
out = re.sub(r"\s+", " ", out)    # collapse runs of whitespace

print textwrap.fill(out, 50)


-- Paul
 
Paul McGuire

On May 19, 4:54 am, (e-mail address removed) wrote:
> Hello,

<snip code example scraping a QOTD from www.mcgyver.com>


One last try - .replace("\n"," ") is unnecessary, textwrap.fill takes
care of removing extra newlines already.

import textwrap

out = match.findall(html)
out = out[0]                      # first <PRE> block
out = re.sub(r"\s+", " ", out)    # \s also matches "\n"

print textwrap.fill(out, 50)
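
For anyone who wants to try the reflow step without fetching the page, here is a self-contained sketch in Python 3 syntax. The sample text and the width of 20 are assumptions chosen for illustration; the real input would be out[0] from the scraped page:

```python
import re
import textwrap

# Hypothetical stand-in for out[0]; the real text comes from the page.
text = "A quote   with\nirregular   spacing\n\nand line breaks."

# \s matches newlines as well as spaces and tabs, so a single
# substitution collapses every run of whitespace to one space.
normalized = re.sub(r"\s+", " ", text).strip()

# Re-wrap the single long line into lines of at most 20 characters.
wrapped = textwrap.fill(normalized, 20)
print(wrapped)
```

The words and their order are unchanged; only the whitespace between them is rewritten.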

-- Paul
 
kuratkull

This worked like a charm! :)
I used Python about a year ago and I have forgotten some of its
properties.

Thanks to both of you!
-kuratkull
 
