[ANN] bluecloth-2.0.0 released

  • Thread starter Michael Granger

matt neuburg

Michael Granger said:
I tried to post this announcement twice yesterday, but both seem to
have gotten lost. Anyway, I'd like to announce a brand new version of
BlueCloth. You can read the announcement here:

http://deveiate.org/bluecloth2-announcement.html

Cool, but my response to this is the same as my response to RDiscount.
I'm deeply invested in years of using the original Perl Markdown. I'd
love to get off that, but in order for me to do so, it isn't enough that
a Markdown clone pass some abstract tests; it must generate HTML that is
functionally identical to the HTML that Perl Markdown generates from the
same original text. Discount, and therefore BlueCloth 2, does not.

Just to be perfectly clear, I am using a Perl script, Markdown.pl, that
is marked as follows:

$VERSION = '1.0.1';
# Tue 14 Dec 2004

So, let's proceed to some examples:

BlueCloth.new("I'm testing  ").to_html
#=> =====
<p>I'm testing<br/>
</p>

That <br/> is functionally significant (it causes extra vertical
whitespace), and Perl Markdown does *not* generate it. BlueCloth is
apparently treating the extra spaces at the end of the input string as
somehow significant.

This next one is a little more involved; I'll use a here document to
display my input text:

s = <<END
* testing

        pre

 more li
END
puts BlueCloth.new(s).to_html
#=> ====
<ul>
<li><p>testing</p>

<pre><code>pre
</code></pre></li>
</ul>


<p> more li</p>

That's a little hard to read (I suppose I could have run it thru tidy),
but the thing to notice is that although the <pre> block is part of the
<li> block, the last <p> block is not. But here's what Perl Markdown
gives:

<ul>
<li><p>testing</p>

<pre><code>pre
</code></pre>

<p>more li</p></li>
</ul>

As you can see, the last <p> block (containing "more li") *is* part of
the <li> block. Since that is what Perl Markdown does, and since I have
lots of text that relies upon Markdown behaving in that way, I naturally
incline to the view that that is the "correct" answer and that
BlueCloth's output is "wrong".

m.
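
One way to make "functionally identical" checkable is to compare the two outputs after discarding whitespace that sits purely between tags. The following is only a rough sketch: `normalize_html` is a hypothetical helper, not part of BlueCloth or Markdown.pl, and it does not parse HTML (whitespace between inline tags can matter, so it is only reasonable for block-level output like the lists above).

```ruby
# Hypothetical helper: collapse whitespace that sits purely between tags,
# so blank lines between block elements do not count as differences.
# Crude on purpose; it does not parse HTML.
def normalize_html(html)
  html.gsub(/>\s+</, "><").strip
end

# The two outputs quoted in the post above, written with explicit newlines.
bluecloth_html = "<ul>\n<li><p>testing</p>\n\n<pre><code>pre\n</code></pre></li>\n</ul>\n\n\n<p> more li</p>\n"
markdown_pl_html = "<ul>\n<li><p>testing</p>\n\n<pre><code>pre\n</code></pre>\n\n<p>more li</p></li>\n</ul>\n"

same = normalize_html(bluecloth_html) == normalize_html(markdown_pl_html)
puts(same ? "functionally identical" : "outputs differ")
```

For these two strings the comparison reports a difference, because the final paragraph sits outside the `<li>` in one output and inside it in the other; blank-line differences alone would not trigger it.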
 

Yossef Mendelssohn

matt neuburg said:
Since that is what Perl Markdown does, and since I have
lots of text that relies upon Markdown behaving in that way, I naturally
incline to the view that that is the "correct" answer and that
BlueCloth's output is "wrong".

I don't think you necessarily need to use Perl Markdown as an
authority in this case. It seems to me that the Markdown Web Dingus
(http://daringfireball.net/projects/markdown/dingus) could be seen as
canonical, and it agrees with the output you give for Perl.

But apparently Discount gives different output (and therefore so do
RDiscount and BlueCloth), output that satisfies MarkdownTest. So which
one is correct?
 

Ryan Davis

matt neuburg said:
BlueCloth.new("I'm testing  ").to_html
#=> =====
<p>I'm testing<br/>
</p>

That <br/> is functionally significant (it causes extra vertical
whitespace), and Perl Markdown does *not* generate it. BlueCloth is
apparently treating the extra spaces at the end of the input string as
somehow significant.

It may not, but according to the Markdown syntax documentation it should:

"When you do want to insert a <br /> break tag using Markdown, you end
a line with two or more spaces, then type return."
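
Read literally, the quoted rule needs both parts: two or more spaces, and then a return. A minimal sketch of that literal reading follows; `hard_breaks` is a hypothetical name used for illustration, and this is not the actual code of Markdown.pl, Discount, or BlueCloth.

```ruby
# Sketch of the rule as quoted: two or more spaces followed by a newline
# become a <br /> tag. Trailing spaces with no newline after them are
# left untouched under this reading.
def hard_breaks(text)
  text.gsub(/ {2,}\n/, "<br />\n")
end

p hard_breaks("line one  \nline two")  # => "line one<br />\nline two"
p hard_breaks("I'm testing  ")         # => "I'm testing  "
```

Under this reading the test string above gets no break tag, because nothing follows its trailing spaces.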
 

Michael Granger

matt neuburg said:
Cool, but my response to this is the same as my response to RDiscount.
I'm deeply invested in years of using the original Perl Markdown. I'd
love to get off that, but in order for me to do so, it isn't enough that
a Markdown clone pass some abstract tests; it must generate HTML that is
functionally identical to the HTML that Perl Markdown generates from the
same original text. Discount, and therefore BlueCloth 2, does not.

You're absolutely correct -- Discount (and therefore anything based on
it) generates HTML according to the Markdown Syntax Documentation
(http://daringfireball.net/projects/markdown/syntax) and the MarkdownTest
test suite released by John Gruber
(http://six.pairlist.net/pipermail/markdown-discuss/2006-June/000079.html),
and not necessarily according to what Markdown.pl generates. If
you're counting on exactly reproducing Markdown.pl's output, you
should definitely use Markdown.pl.

matt neuburg said:
BlueCloth.new("I'm testing  ").to_html
#=> =====
<p>I'm testing<br/>
</p>

That <br/> is functionally significant (it causes extra vertical
whitespace), and Perl Markdown does *not* generate it. BlueCloth is
apparently treating the extra spaces at the end of the input string as
somehow significant.

Right, I'm aware of BR's functional significance in HTML. The Syntax
documentation cited above states (under the Paragraphs and Line Breaks
section):

"When you do want to insert a <br /> break tag using
Markdown, you end a line with two or more spaces, then
type return."

Your test case doesn't have a trailing newline, and the Syntax
document doesn't say what should happen with a line that ends with two
spaces at the end of a document. Clearly Markdown.pl counts the
'return' part of that description as significant, and Discount does
not. Perhaps a case could be made to include a test for the break tag
rule only applying to the middle of a paragraph in MarkdownTest, and
if so I suspect David Parsons would make Discount conform to the test.
If you have a bunch of documents that end with two spaces and no
newline, I can see how you might not want to use Discount-based
transformers. I personally do not, so I view this as an anomaly and a
tradeoff I am willing to make.

matt neuburg said:
This next one is a little more involved; I'll use a here document to
display my input text:

s = <<END
* testing

        pre

 more li
[...]
As you can see, the last <p> block (containing "more li") *is* part of
the <li> block. Since that is what Perl Markdown does, and since I have
lots of text that relies upon Markdown behaving in that way, I naturally
incline to the view that that is the "correct" answer and that
BlueCloth's output is "wrong".

You are certainly welcome to your own view, but again, referring to
the Syntax Documentation:

"List items may consist of multiple paragraphs. Each subsequent
paragraph in a list item must be indented by either 4 spaces
or one tab:"

Your "more li" line is *not* indented by either 4 spaces or one tab,
so I'm guessing you're counting on the "lazy" indentation of
subsequent lines of the same paragraph. To me, two blank lines and an
intervening PRE calls into question whether or not the next line is
actually part of the previous LI or not when it's indented by a single
space. I'm sure Markdown.pl agrees with your assessment, and that's
why it marks it up the way it does.
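
The quoted indentation rule can be sketched as a small check. This is an illustration only: `under_indented_paragraphs` is a hypothetical helper, it assumes the whole input is a single list item, and it is not how any of the implementations discussed here actually parse lists.

```ruby
# Return the lines that start a new paragraph (they follow a blank line)
# but are not indented by at least 4 spaces or one tab, as the quoted
# rule requires for continuation paragraphs inside a list item.
def under_indented_paragraphs(markdown)
  lines = markdown.lines.map(&:chomp)
  flagged = []
  lines.each_with_index do |line, i|
    next if i.zero? || line.strip.empty?
    starts_paragraph = lines[i - 1].strip.empty?
    flagged << line if starts_paragraph && !line.start_with?("    ", "\t")
  end
  flagged
end

# The list item from the example, with "more li" indented by one space.
item = "* testing\n\n        pre\n\n more li\n"
p under_indented_paragraphs(item)  # => [" more li"]
```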

So if by "correct" you mean "does exactly what Perl Markdown does
despite what it says in the documentation", then yes, BlueCloth is
wrong. I incline to the view that a Markdown implementation should
follow what it says in the documentation and pass the test suite set
out by the creator of Markdown (the syntax), which BlueCloth does. I'm
certainly not suggesting that you should give up your reliance on
Markdown.pl's output if you don't mind forking a Perl interpreter
every time you want to transform your text to HTML.

I'm sharing my source because I've made something for myself that I
think might be useful to others. If it isn't useful to you, either
keep doing what does work for you or consider contributing some value
back to the system by providing fixes. Anything else is just sound and
fury.
 

matt neuburg

Michael Granger said:
Right, I'm aware of BR's functional significance in HTML. The Syntax
documentation cited above states (under the Paragraphs and Line Breaks
section):

"When you do want to insert a <br /> break tag using
Markdown, you end a line with two or more spaces, then
type return."

Your test case doesn't have a trailing newline, and the Syntax
document doesn't say what should happen with a line that ends with two
spaces at the end of a document

Right, but I was just simplifying; in real life the example did have two
trailing newlines. In other words, the way I discovered this issue was
from a situation more like the following:

require 'rubygems'
require 'bluecloth'
puts BlueCloth.new("
two spaces follow
two spaces follow

done
").to_html

Markdown puts a <br> after the first line but not after the second.
Discount puts a <br> after both.

Michael Granger said:
[...] Discount does
not. Perhaps a case could be made to include a test for the break tag
rule only applying to the middle of a paragraph in MarkdownTest

Yes, I think so. But really, that case isn't worth worrying about too
much; the two spaces were actually just a mistake, and easily
eliminated.
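
For documents where such stray spaces are scattered around, one possible cleanup is to strip runs of two or more spaces only where a paragraph ends, leaving mid-paragraph hard breaks alone. A sketch, with `strip_paragraph_final_spaces` as a hypothetical helper name:

```ruby
# Remove runs of two or more trailing spaces only where the line is the
# last one in its paragraph: followed by a blank line or by the end of
# the input. Mid-paragraph trailing spaces (real hard breaks) are kept.
def strip_paragraph_final_spaces(text)
  text.gsub(/ {2,}(?=\n[ \t]*\n|\n?\z)/, "")
end

input = "two spaces follow  \ntwo spaces follow  \n\ndone\n"
p strip_paragraph_final_spaces(input)
# => "two spaces follow  \ntwo spaces follow\n\ndone\n"
```

After this pass, the literal "spaces, then return" reading and the Discount behaviour described above should no longer have a paragraph-final case to disagree about.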
matt neuburg said:
This next one is a little more involved; I'll use a here document to
display my input text:

s = <<END
* testing

        pre

 more li
[...]
As you can see, the last <p> block (containing "more li") *is* part of
the <li> block. Since that is what Perl Markdown does, and since I have
lots of text that relies upon Markdown behaving in that way, I naturally
incline to the view that that is the "correct" answer and that
BlueCloth's output is "wrong".

Michael Granger said:
You are certainly welcome to your own view, but again, referring to
the Syntax Documentation:

"List items may consist of multiple paragraphs. Each subsequent
paragraph in a list item must be indented by either 4 spaces
or one tab:"

Your "more li" line is *not* indented by either 4 spaces or one tab

That's true, but what would be really helpful is if you would tell me
what text to start with to generate the result I'm after. Here's the
schema for the desired result:

<ul><li>
<p>testing</p>
<pre><code>li</code></pre>
<p>more li</p>
</li></ul>

I can add spaces to cause the "more li" to be included in the <li>
block, but I haven't found a way to do that *and* wrap "more li" in a
<p> block with BlueCloth.

So, for example, this works the way I expect:

puts BlueCloth.new("
* testing

        pre

    more li

    still more li

done").to_html

In that example, both "more li" and "still more li" are each wrapped in
a <p> tag and they are both within the <li> tag that started in the
first line. But if I delete "still more li", I can't get "more li" all
by itself to be wrapped in a <p> tag and within the <li> tag. So surely
Discount here disagrees with itself in a way that could be taken as
troublesome, on the basis of simple considerations of consistency. If
you eliminate the "still more li" line, I am obeying the lines you
quote: "Each subsequent paragraph in a list item must be indented by ...
4 spaces"; yet I am not getting a paragraph in the output.

Michael Granger said:
out by the creator of Markdown (the syntax), which BlueCloth does. I'm
certainly not suggesting that you should give up your reliance on
Markdown.pl's output if you don't mind forking a Perl interpreter
every time you want to transform your text to HTML.

Well, I do mind it. That's why I want to switch away! But in order to do
so, I have to be able to generate the HTML I'm already generating.

Michael Granger said:
If it isn't useful to you, either
keep doing what does work for you or consider contributing some value
back to the system by providing fixes. Anything else is just sound and
fury.

So there's no such thing as a conceivably legitimate bug report, and
there's no such thing as asking for help?

m.
 

Michael Granger

matt neuburg said:
So there's no such thing as a conceivably legitimate bug report, and
there's no such thing as asking for help?


You're right, bug reports are also valuable contributions, and asking
for help is never a bad thing. I didn't read your response as either
of those, but if it was I apologize.

I'd suggest phrasing future bug reports as "hey I found a few bugs in
your release" instead of "hey, I'm not going to use your stuff because
I prefer what I'm already using." :)

I'll open tickets for the inconsistencies you found, and try to
upstream them if I can figure out how to fix them.
 
