[QUIZ] Quoted Printable (#23)

Discussion in 'Ruby' started by Ruby Quiz, Mar 11, 2005.

  1. Ruby Quiz

    Ruby Quiz Guest

    The three rules of Ruby Quiz:

    1. Please do not post any solutions or spoiler discussion for this quiz until
    48 hours have passed from the time on this message.

    2. Support Ruby Quiz by submitting ideas as often as you can:

    http://www.rubyquiz.com/

    3. Enjoy!

    -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

    The quoted printable encoding is used primarily in email, though it has
    recently seen some use in XML areas as well. The encoding is simple to
    translate to and from.

    This week's quiz is to build a filter that handles quoted printable translation.

    Your script should be a standard Unix filter, reading from files listed on the
    command-line or STDIN and writing to STDOUT. In normal operation, the script
    should encode all text read in the quoted printable format. However, your
    script should also support a -d command-line option and when present, text
    should be decoded from quoted printable instead. Finally, your script should
    understand a -x command-line option and when given, it should encode <, > and &
    for use with XML.

    Here are the rules we will use, from the quoted printable format:

    1. Bytes with ASCII values from 33 (exclamation point) through 60 (less
    than) and values from 62 (greater than) through 126 (tilde) should be
    passed through the encoding process unchanged. Note that the -x switch
    modifies this rule slightly, as stated above.

    2. Other bytes are to be encoded as an equals sign (=) followed by two
    hexadecimal digits. For example, when -x is active less than (<) will
    become =3C. Use only capital letters for hex digits.

    3. The exceptions are spaces and tabs. They should remain unencoded as
    long as any non-whitespace character follows them on the line. Spaces
    and tabs at the end of a line must be encoded per rule 2 above.

    4. Native line endings should be translated to carriage return-line feed
    pairs.

    5. Quoted printable lines are limited to 76 characters of length (not
    counting the line ending pair). Longer lines must be divided up. Any
    line endings added by the encoding process should be preceded by an
    equals sign, so the decoder will know to remove them. The equals sign
    must be the last character on the line, followed immediately by the line
    end pair. Such an equals sign does count as a non-whitespace character
    for rule 3, allowing preceding spaces and tabs to remain unencoded.
    The equals sign must fit inside the 76 character limit.
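
    The five rules above can be sketched as a single method. This is only a
    minimal illustration, not a reference solution: the names qp_encode_line
    and LIMIT are made up for the example, and the method assumes the native
    line ending has already been removed from the line it is given.

```ruby
LIMIT = 76 # hypothetical name for the rule 5 line length limit

# Encode one line (native line ending already removed) and return the
# quoted printable text, CRLF line endings included.
def qp_encode_line(line, xml = false)
  # Rules 1 and 2: pass 33..60 and 62..126 through, escape the rest.
  # Spaces (32) and tabs (9) are kept for now; rule 3 is applied next.
  tokens = line.bytes.map do |b|
    keep = ((33..126).include?(b) && b != 61) || b == 9 || b == 32
    keep = false if xml && [38, 60, 62].include?(b) # & < >
    keep ? b.chr : format("=%02X", b)
  end

  # Rule 3: escape spaces and tabs that end the line.
  i = tokens.length - 1
  while i >= 0 && [" ", "\t"].include?(tokens[i])
    tokens[i] = format("=%02X", tokens[i].ord)
    i -= 1
  end

  # Rule 5: insert "=" soft breaks so that no output line is longer
  # than LIMIT and no =XX escape is split across lines.
  out = []
  width = 0
  tokens.each_with_index do |tok, idx|
    # Every token but the last must leave room for a trailing "=".
    room = idx == tokens.length - 1 ? LIMIT : LIMIT - 1
    if width + tok.length > room
      out << "=\r\n"
      width = 0
    end
    out << tok
    width += tok.length
  end

  out.join + "\r\n" # rule 4: CRLF line ending
end
```

    With this sketch, qp_encode_line("a=b") returns "a=3Db\r\n", and
    qp_encode_line("hi \t") returns "hi=20=09\r\n".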

    To decode, just reverse the process.
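
    Reversing the process might look something like the following. Again this
    is a sketch under assumptions: the name qp_decode and its newline
    parameter are hypothetical, and only uppercase hex escapes are recognized,
    as the rules above require.

```ruby
# Decode quoted printable text and return the original bytes.
# The name qp_decode and the newline parameter are hypothetical.
def qp_decode(text, newline = "\n")
  out = []
  text.b.each_line do |line|
    had_eol = line.end_with?("\n")
    body = line.sub(/\r?\n\z/, "")
    # A trailing "=" is a soft break added by the encoder: drop it and
    # do not emit a line ending. Check this before decoding escapes so
    # that a decoded "=3D" is never mistaken for a soft break.
    soft = body.end_with?("=")
    body = body[0...-1] if soft
    out << body.gsub(/=([0-9A-F]{2})/) { [$1.hex].pack("C") }
    out << newline if had_eol && !soft
  end
  out.join
end
```

    Here qp_decode("xxx=\r\nyyy\r\n") returns "xxxyyy\n", because the first
    line ends in a soft break.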
     
    Ruby Quiz, Mar 11, 2005
    #1

  2. Ruby Quiz

    Glenn Parker Guest

    Re: [SOLUTION] Quoted Printable (#23)

    Note: I assumed it would be cheating to use the builtin quoted printable
    facilities.

    I found it somewhat frustrating that String#each_byte does not return
    any useful value (see encode_str).
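
    In current Ruby versions, String#each_byte called without a block returns
    an Enumerator, so the accumulator variable can be avoided. A small sketch,
    with escape_all as a hypothetical name:

```ruby
# Hypothetical helper: turn every byte of str into an "=XX" escape
# with uppercase hex digits.
def escape_all(str)
  str.each_byte.map { |b| format("=%02X", b) }.join
end

puts escape_all("= ") # prints "=3D=20"
```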

    I found it a bit more frustrating that String#chomp! is greedier than
    you might expect, discarding all sorts of potential line endings,
    instead of limiting itself to $/.
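
    A short illustration of the difference, assuming the default record
    separator ($/ set to "\n"). The variable names are only for the example:

```ruby
# With the default record separator, chomp strips "\n", "\r\n", and a
# lone "\r".
samples = ["text\n", "text\r\n", "text\r"]
chomped = samples.map { |s| s.chomp }

# Stripping only a literal "\n" leaves any carriage return in place.
strict = samples.map { |s| s.sub(/\n\z/, "") }

p chomped # prints ["text", "text", "text"]
p strict  # prints ["text", "text\r", "text\r"]
```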

    I would also suggest adding support for GetoptLong#[] to query
    options directly, instead of requiring a full iteration.



    #!/usr/bin/env ruby -w

    require 'getoptlong'

    MaxLength = 76

    def main
      opts = GetoptLong.new(
        [ "-d", GetoptLong::NO_ARGUMENT ],
        [ "-x", GetoptLong::NO_ARGUMENT ]
      )
      $opt_decode = false
      $opt_xml = false
      opts.each do |opt, arg|
        case opt
        when "-d" then $opt_decode = true
        when "-x" then $opt_xml = true
        end
      end

      if $opt_decode
        decode_input
      else
        encode_input
      end
    end

    def encode_input
      STDOUT.binmode # We need to control the line-endings.
      while (line = gets) do
        # Note: String#chomp! swallows more than just $/.
        line.sub!(/#{$/}$/o, "")
        # Encode the entire line.
        line.gsub!(/[^\t -<>-~]+/) { |str| encode_str(str) }
        line.gsub!(/[&<>]+/) { |str| encode_str(str) } if $opt_xml
        line.sub!(/\s*$/) { |str| encode_str(str) }
        # Split the line up as needed.
        while line.length > MaxLength
          # Avoid splitting inside an =XX escape near the limit.
          # String#index returns nil when no "=" is found.
          eq = line.index("=", MaxLength - 4)
          split = eq ? eq - 1 : MaxLength - 2
          split = (MaxLength - 2) if split > MaxLength - 2
          print line[0..split], "=\r\n"
          line = line[(split + 1)..-1]
        end
        print line, "\r\n"
      end
    end

    def encode_str(str)
      encoded = ""
      str.each_byte { |c| encoded << "=%02X" % c }
      encoded
    end

    def decode_input
      while (line = gets) do
        line.chomp!
        # Detect a soft break before decoding, so that a decoded "=3D"
        # at the end of a line is not mistaken for one.
        soft_break = (line[-1] == ?=)
        line = line[0..-2] if soft_break
        line.gsub!(/=([\dA-F]{2})/) { $1.hex.chr }
        print line
        print $/ unless soft_break
      end
    end

    main


    --
    Glenn Parker | glenn.parker-AT-comcast.net | <http://www.tetrafoil.com/>
     
    Glenn Parker, Mar 13, 2005
    #2

  3. Re: [SOLUTION] Quoted Printable (#23)

    On Mar 13, 2005, at 12:57 PM, Glenn Parker wrote:

    > Note: I assumed it would be cheating to use the builtin quoted
    > printable facilities.


    I must sheepishly admit that I was unaware of Ruby's converter when
    I made the quiz. It was pointed out to me in a private email after I
    posted it. The converter isn't a complete solution to the quiz, but it
    gets you very close.

    Is it cheating to use Ruby features? Never. Feel free, then poke a
    little fun at the quiz editor because you're smarter than he is. All
    part of the fun.

    Sorry for the oversight.

    James Edward Gray II
     
    James Edward Gray II, Mar 13, 2005
    #3
  4. Ruby Quiz

    Dave Burt Guest

    Re: [SOLUTION] Quoted Printable (#23)

    Hi,

    Testing. I found building a test suite before doing the code really helpful on
    this one, to get my head around the intricacies of the encoding. Actually
    thinking through the edge cases and working out expected results was necessary
    for me to develop this solution.

    Now, of course, this would have been a lot easier if I'd just been able to find
    the "builtin quoted printable facilities." What builtin quoted printable
    facilities?

    Anyway, here is my result:
    http://www.dave.burt.id.au/ruby/quoted-printable.rb

    And the tester:
    http://www.dave.burt.id.au/ruby/test-quoted-printable.rb

    The testing program generates test methods and test data dynamically.

    The public interface to my solution looks like this:

    module QuotedPrintable

      WHITESPACE = [?\t, ?\ ]
      WHITESPACE_REGEXP = /[\t ]/
      WHITESPACE_ESCAPED_REGEXP = /=09|=20/

      # bytes that do not need to be escaped
      PRINTABLES = ((?!..?~).to_a + WHITESPACE) - [?=]

      MAX_LINE_WIDTH = 76

      NEWLINE = "\r\n"

      # additional bytes to escape for safety in an EBCDIC document
      EBCDIC_EXCEPTIONS = %w' ! " # $ @ [ \ ] ^ ` { | } ~ '
      EBCDIC_PRINTABLES = PRINTABLES - EBCDIC_EXCEPTIONS
      # additional bytes to escape for safety in an XML document
      XML_EXCEPTIONS = %w' < > & '
      XML_PRINTABLES = PRINTABLES - XML_EXCEPTIONS

      # Encode self to the quoted-printable transfer encoding
      def to_quoted_printable(printables = QuotedPrintable::PRINTABLES)

      # Decode self from the quoted-printable transfer encoding
      def from_quoted_printable


      # Functions that do quoted-printable encoding and decoding
      class << self

        # Return the quoted-printable escaped representation of the given byte
        # (byte must be a Fixnum between 0 and 255)
        def encode_byte(byte)

        # Return the byte corresponding to the given quoted-printable escape
        # sequence as a String. If it's not valid, return nil.
        def decode_sequence(escape_sequence)

        # Return the given string encoded as quoted-printable, including the
        # canonical \r\n line terminators.
        def encode_string(string, printables = PRINTABLES)

        # Consider the given string quoted-printable encoded, and decode it,
        # including translating line terminators to the native default.
        def decode_string(string)

    # Add quoted-printable conversions to String
    class String
      include QuotedPrintable # to_quoted_printable, from_quoted_printable
    end

    Cheers,
    Dave
     
    Dave Burt, Mar 14, 2005
    #4
  5. Re: [SOLUTION] Quoted Printable (#23)

    On Mar 14, 2005, at 9:41 AM, Dave Burt wrote:

    > Now, of course, this would have been a lot easier if I'd just been
    > able to find the "builtin quoted printable facilities." What builtin
    > quoted printable facilities?


    Look up the "M" format for Array.pack.
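
    For readers who have not used it, a minimal round trip through the "M"
    directive looks like this. Note that pack emits bare "\n" line endings
    rather than the CRLF pairs the quiz asks for:

```ruby
# Encode with Array#pack and decode with String#unpack, both using
# the "M" (quoted printable) directive.
encoded = ["a=b\n"].pack("M")
decoded = encoded.unpack("M").first

p encoded # prints "a=3Db\n"
p decoded # prints "a=b\n"
```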

    James Edward Gray II
     
    James Edward Gray II, Mar 14, 2005
    #5
  6. Ruby Quiz

    Dave Burt Guest

    Re: [SOLUTION] Quoted Printable (#23)

    >> What builtin quoted printable facilities?
    >
    > Look up the "M" format for Array.pack.


    So here's the cheat solution:

    class String
      def to_quoted_printable(*args)
        [self].pack("M").gsub(/\n/, "\r\n")
      end
      def from_quoted_printable
        self.gsub(/\r\n/, "\n").unpack("M").first
      end
    end

    (Just add my original if __FILE__ block to make it almost quiz-compatible)

    And here's how it fares against my test suite:

    Loaded suite TC_QuotedPrintable
    Started
    .............FF.FFFFFFF..
    Finished in 0.39 seconds.

    So it's 10 times the speed of my original one (against random binary data), but
    it chops lines too early, ending up with 73- instead of 76-character lines. Of
    course, this one won't do XML.
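
    One rough way to observe the line widths pack("M") produces is to encode a
    long run of plain text and measure the result. This is only a quick check
    with made-up variable names; no expected figure is given here, since the
    exact width depends on pack's own line length setting.

```ruby
# Encode 300 plain characters and record the width of every output
# line, soft-break "=" included.
long_text = "x" * 300
lines = [long_text].pack("M").split("\n")
widths = lines.map { |l| l.length }

p widths.max # the longest encoded line, counting the trailing "="
```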

    Interestingly, if I use a gsub! instead of a loop with sub!s in my soft_break!
    method, I get a 5x speedup... and fail the same tests.

    Cheers,
    Dave
     
    Dave Burt, Mar 15, 2005
    #6
  7. Re: [SOLUTION] Quoted Printable (#23)

    (from Dave's solution)

    if __FILE__ == $0
    require 'optparse'

    # Look, James, I'm opt-parsing! :)
    ...

    I'm so proud! :D

    James Edward Gray II
     
    James Edward Gray II, Mar 15, 2005
    #7
