PyTextile Question

Josh English

I am working with an XML database and have large chunks of text in certain child and grandchildren nodes.

Because I like my XML well-formatted, wrapped at 70 characters with child elements indented, I end up with a lot of extra whitespace in the node.text string. (I parse with ElementTree.)

I thought about using PyTextile to convert this text to HTML for a nicer display in a wx.html.HtmlWindow (I don't need much in the way of fancy HTML for this application).

However, when I convert my multiple-paragraph text object with textile, my original line breaks are preserved. Since I'm going to HTML, I don't want my line breaks preserved.

Example (may be munged, formatting-wise):
<pre>
<action>
<description>This is a long multi-line description
with several paragraphs and hopefully, eventually,
proper HTML P-tags.

This is a new paragraph. It should be surrounded
by its own P-tag.

Hopefully (again), I won't have a bunch of unwanted
BR tags thrown in.
</description>
</action>
</pre>

I've tried several ways of pre-processing the text in the node, but pytextile still gives me line breaks.

Any suggestions? Is there a good tutorial for PyTextile that I haven't found?

Thanks.

Josh
 
dinkypumpkin

However, when I convert my multiple-paragraph text object with textile, my original line breaks are preserved. Since I'm going to HTML, I don't want my line breaks preserved.

I think any Textile implementation will preserve line breaks within a paragraph by converting them to BR tags. Both RedCloth and PyTextile do, anyway.

I've tried several ways of pre-processing the text in the node, but pytextile still gives me line breaks.

Below is a test script that shows one way I've dealt with this issue in the past by reformatting paragraphs to remove embedded line breaks. YMMV.

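import re
import textile

# Original text: hard-wrapped lines, paragraphs separated by blank lines.
s1 = """This is a long multi-line description
with several paragraphs and hopefully, eventually,
proper HTML P-tags.

This is a new paragraph. It should be surrounded
by its own P-tag.

Hopefully (again), I won't have a bunch of unwanted
BR tags thrown in."""
print("INPUT1:")
print(s1)
print("OUTPUT1:")
html1 = textile.textile(s1)
print(html1)

# Replace each single newline (plus surrounding spaces/tabs) with one space.
# If a second newline follows, group 1 captures it and is emitted twice,
# so blank-line paragraph breaks survive as "\n\n".
s2 = re.sub(r'[ \t]*\n[ \t]*(\n?)[ \t]*', r' \1\1', s1)
print("INPUT2:")
print(s2)
print("OUTPUT2:")
html2 = textile.textile(s2)
print(html2)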
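
For text that comes straight out of an ElementTree node, the same idea can be written without the back-reference trick. The following is only a sketch using the standard library: the helper name unwrap_paragraphs and the sample XML are made up for illustration, and it assumes paragraphs are separated by at least one blank line. It splits the node text into paragraphs, collapses all whitespace inside each one (including the indentation added by wrapping), and rejoins them with a single blank line.

```python
import re
import xml.etree.ElementTree as ET


def unwrap_paragraphs(text):
    """Join hard-wrapped lines in each paragraph; keep paragraph breaks.

    Hypothetical helper, not part of PyTextile or ElementTree.
    """
    # Two or more newlines (each optionally followed by spaces or tabs)
    # mark a paragraph boundary.
    paragraphs = re.split(r'(?:\n[ \t]*){2,}', text.strip())
    # str.split() with no arguments collapses every run of whitespace,
    # so wrapped and indented lines become one space-separated line.
    cleaned = (" ".join(p.split()) for p in paragraphs)
    return "\n\n".join(p for p in cleaned if p)


# Sample document, indented the way the question describes.
xml_source = """<action>
  <description>This is a long multi-line description
    with several paragraphs and hopefully, eventually,
    proper HTML P-tags.

    This is a new paragraph. It should be surrounded
    by its own P-tag.
  </description>
</action>"""

root = ET.fromstring(xml_source)
clean = unwrap_paragraphs(root.findtext("description"))
print(clean)
```

The cleaned string has one line per paragraph, so it can be passed to textile.textile() in place of s2 above.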
 
