RDoc and encoding

Discussion in 'Ruby' started by Claus Folke Brobak, Jan 10, 2011.

  1. Hi,

    Running Ruby/JRuby 1.8.7 on Windows XP.

    Until now I have been using the RDoc version built into the Ruby
    Standard Library. That is version 1.0.1. Now I am trying out RDoc 3.4,
    installed via a gem.

    I have run into a problem with the double quote character. Example code:

    RDoc 1.0.1

    require 'rdoc/markup/simple_markup'
    require 'rdoc/markup/simple_markup/to_html'

    sm = SM::SimpleMarkup.new()
    th = SM::ToHtml.new()
    puts sm.convert('æææ"øøø"ååå', th)

    Output:

    <p>
    æææ&quot;øøø&quot;ååå
    </p>

    RDoc 3.4

    require 'rubygems'
    require 'rdoc/markup/to_html'

    puts RDoc::Markup::ToHtml.new().convert('æææ"øøø"ååå')

    Output:

    <p>æææâ€œøøøâ€ååå</p>

    It seems as if RDoc 3.4 is adding a double quote in UTF-8 encoding
    instead of "&quot;". Running on Windows XP, the normal encoding is
    Windows-1252. If I look at the HTML and tell the browser that it is
    UTF-8 encoded, the double quotes are displayed correctly. Then, however,
    the Danish national characters (æøå) are not displayed as they should
    be.

    Do you think I have hit a bug in RDoc 3.4, or am I missing something?

    Claus

    --
    Posted via http://www.ruby-forum.com/.
    Claus Folke Brobak, Jan 10, 2011
    #1
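
The display problem described in the first post can be reproduced in a few lines. This is a minimal sketch, assuming Ruby 1.9 or later (String#encode and String#force_encoding are not available on the 1.8.7 setup from the post); the variable names are illustrative only.

```ruby
# Sketch only: requires Ruby 1.9+, not the 1.8.7 setup from the post.
left_quote = "\u201C"                # LEFT DOUBLE QUOTATION MARK
utf8_bytes = left_quote.bytes.to_a   # the three bytes UTF-8 uses for it

# Reinterpret those UTF-8 bytes as Windows-1252 (what a Windows-1252 console
# or browser does), then convert back to UTF-8 so the result can be printed.
misread = left_quote.dup.force_encoding("Windows-1252").encode("UTF-8")

# The opposite mistake: the Windows-1252 bytes for the Danish letters
# are not a valid UTF-8 byte sequence.
danish_1252 = "\u00E6\u00F8\u00E5".encode("Windows-1252")
valid_as_utf8 = danish_1252.dup.force_encoding("UTF-8").valid_encoding?

puts utf8_bytes.map { |b| format("%02X", b) }.join(" ")
puts misread
puts valid_as_utf8
```

The first misreading produces the â€œ sequence seen in the RDoc 3.4 output. The second check shows why declaring the page as UTF-8 fixes the quotes but breaks æøå when the rest of the text is actually Windows-1252.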

  2. Claus Folke Brobak

    Eric Hodel Guest

    On Jan 10, 2011, at 04:08, Claus Folke Brobak wrote:

    > Do you think I have hit a bug in RDoc 3.4, or am I missing something?


    Transcoding is not supported in RDoc on ruby 1.8.7. Upgrade to Ruby
    1.9.

    My primary platform for developing RDoc is Ruby 1.9. Ruby 1.8.6 is
    unsupported and 1.8.7 gets second tier status and will not support
    transcoding.
    Eric Hodel, Jan 10, 2011
    #2

  3. Eric Hodel wrote in post #973761:
    >
    > Transcoding is not supported in RDoc on ruby 1.8.7. Upgrade to Ruby
    > 1.9.


    I don't think it is a matter of transcoding. I would have thought the
    output would remain in the Windows-1252 encoding of the input.

    As far as I can figure out, RDoc always "thinks" the input is in UTF-8
    encoding. This is probably rarely the case on Windows.

    Can you explain the use of a double quote in UTF-8 encoding instead of
    "&quot;" in the generated HTML?

    Claus

    --
    Posted via http://www.ruby-forum.com/.
    Claus Folke Brobak, Jan 10, 2011
    #3
  4. Claus Folke Brobak

    Eric Hodel Guest

    On Jan 10, 2011, at 13:20, Claus Folke Brobak wrote:

    > Eric Hodel wrote in post #973761:
    >>
    >> Transcoding is not supported in RDoc on ruby 1.8.7. Upgrade to Ruby
    >> 1.9.

    >
    > I don't think it is a matter of transcoding. I would have thought the
    > output would remain in the Windows-1252 encoding of the input.
    >
    > As far as I can figure out, RDoc always "thinks" the input is in UTF-8
    > encoding. This is probably rarely the case on Windows.


    With Ruby 1.8 this is true. If you upgrade to Ruby 1.9, RDoc 3 can
    automatically determine the output encoding and transcode for you. You
    can also override it with --encoding.

    > Can you explain the use of a double quote in UTF-8 encoding instead of
    > "&quot;" in the generated HTML?


    RDoc now performs "prettier" replacements of characters, such as matching
    opening and closing quotes. Such characters are not available in all
    output encodings, so transcoding is performed.
    Eric Hodel, Jan 10, 2011
    #4
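
The last point in the reply, that curly quotes do not exist in every output encoding, can be sketched with plain String#encode. This is not RDoc's own code, only an illustration assuming Ruby 1.9 or later; the &quot; fallback is an assumption chosen to mirror the old RDoc 1.0.1 output.

```ruby
# Sketch only: plain String#encode on Ruby 1.9+, not RDoc's internals.
html = "<p>\u00E6\u00E6\u00E6\u201C\u00F8\u00F8\u00F8\u201D\u00E5\u00E5\u00E5</p>"

# Windows-1252 contains both curly quotes (bytes 0x93 and 0x94),
# so transcoding the UTF-8 string succeeds.
cp1252 = html.encode("Windows-1252")

# ISO-8859-1 has no curly quotes, so a plain encode raises.
error_class = begin
  html.encode("ISO-8859-1")
  nil
rescue Encoding::UndefinedConversionError => e
  e.class
end

# With a fallback table (hypothetical choice), the curly quotes degrade
# to the &quot; entity instead of raising.
fallback = { "\u201C" => "&quot;", "\u201D" => "&quot;" }
latin1 = html.encode("ISO-8859-1", fallback: fallback)

puts cp1252.bytes.map { |b| format("%02X", b) }.join(" ")
puts error_class
puts latin1.encode("UTF-8")
```

On Ruby 1.8.7 there is no String#encode, which is consistent with the reply's statement that transcoding is only supported from Ruby 1.9 on.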
