raw string tail escape revisited

Discussion in 'Python' started by Bengt Richter, Aug 9, 2003.

  1. Why wouldn't quote-stuffing solve the problem, and let you treat \ as
    an ordinary character? In a raw string, it's no good for preventing
    end-of-quoting anyway, unless you want the literal \ in front of the quote
    you are escaping.

    Quote-stuffing is a variation on the old quote-doubling, extended to
    deal with triple quotes as well (which makes it a little like HDLC bit stuffing).

    IOW, treat \ as an ordinary character, and then if you don't want the
    string to end, just stuff one quote character of the starting kind after
    the otherwise terminating sequence. You could do this with single quoting
    or triple quoting, where of course you'd need it less for triple quotes.
    E.g., using uppercase R as a prefix for this kind of raw string syntax,

    R'\' # just fine
    R'C:\' # one of the motivations
    R'''' # dumb way to do "'"
    R""" <just about anything> ->[""""]<-makes 3 quotes, and we end with \"""
    R""" ->[""""""""]<-two stuffing-extended triple quotes make 6 quotes."""

    The tokenizer would recognize a stuffed quote mark and just discard it if present,
    otherwise recognize end of string.

    Just had this idea. Do I need more coffee? What did I forget?

    Regards,
    Bengt Richter
    Bengt Richter, Aug 9, 2003
    #1
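The stuffing rule described in this post can be sketched as a small decoder. This is only an illustration under stated assumptions: `decode_stuffed` is a hypothetical helper, not part of Python or its tokenizer; it receives the literal with the proposed `R` prefix already removed, and it assumes three opening quotes always select a triple delimiter, as they do in today's tokenizer.

```python
def decode_stuffed(literal):
    """Decode a hypothetical quote-stuffed raw literal (prefix already removed).

    Returns (value, number_of_source_characters_consumed).
    """
    quote = literal[:1]
    if quote not in ("'", '"'):
        raise ValueError("literal must start with a quote character")
    # Assumption: three opening quotes select a triple delimiter.
    delim = quote * 3 if literal.startswith(quote * 3) else quote
    pos = len(delim)
    value = []
    while pos < len(literal):
        if literal.startswith(delim, pos):
            after = pos + len(delim)
            if literal.startswith(quote, after):
                # Stuffed quote: keep the delimiter text as data, drop the extra quote.
                value.append(delim)
                pos = after + 1
                continue
            return "".join(value), after  # genuine end of string
        value.append(literal[pos])  # a backslash is an ordinary character here
        pos += 1
    raise ValueError("unterminated literal")


print(decode_stuffed("'\\'"))                      # body of R'\'
print(decode_stuffed("'C:\\'"))                    # body of R'C:\'
print(decode_stuffed('""" ->[""""]<- \\"""'))      # stuffed triple quote, ends with \
```

Under these assumptions the first two calls yield values ending in a single backslash, which is the motivating case. Feeding the sketch `''''` raises `ValueError`, because the first three quotes are taken as a triple-quote opener.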

  2. Jeff Epler (Guest)

    Just say "NO" Re: raw string tail escape revisited

    Well, one problem is that this is incompatible with all existing
    R-strings, which have been in Python for comparative ages. So we'd be
    forced to implement them as B'' strings (for Bengt). 16 ways to declare
    string literals (single and triple, ' and ", standard, r, u, and ur)
    are bad enough; I don't want to add another 8 (single and triple, ' and
    ", b and ub) to the mix.
    $ python -c 'import this' | grep "only one"

    Secondly, the price in the tokenizer for an R-string vs a regular string is
    essentially zero, since after the leading r, u or ur is parsed, the
    regular rule for parsing any string is used. Your rule will require
    near-duplication of a 60-line segment of Parser/tokenizer.c and a new
    function similar to PyString_DecodeEscape, probably another 60 lines of
    C.

    Finally, I'm not convinced by your claim that triple-quotes and
    quote-stuffing work well together. Right now, if the parser sees
    R'''' # dumb way to do "'"
    it'll still be in the midst of parsing a triple-quoted raw string. How
    will you be able to write a B''' string that begins with a ' if this
    rule is followed? So there must be strings that you can't write with
    B-quoting, just like there are strings you can't write with R-quoting
    (but this time the problem is with strings that start with quotes
    instead of ending with backslashes).

    Jeff
    Jeff Epler, Aug 9, 2003
    #2
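The tokenizer behaviour this reply relies on can be checked against the existing raw-string rules. The sketch below only uses the built-in `compile` and `eval`; `compiles` is a hypothetical helper name introduced for illustration.

```python
def compiles(source):
    """Return True if `source` is accepted as a Python expression."""
    try:
        compile(source, "<demo>", "eval")
    except SyntaxError:
        return False
    return True


# An existing raw string cannot end in a single backslash: the \' keeps the
# quote from terminating the literal, so r'C:\' is unterminated.
print(compiles(r"r'C:\'"))

# Four quotes open a triple-quoted raw string holding one quote so far,
# which is then never closed.
print(compiles("r''''"))

# The backslash that protects a quote stays in the value.
print(eval(r"r'\''") == "\\'")
```

The first two sources are rejected and the last comparison holds, which matches both the motivation for the proposal and the objection about `R''''`.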
    1. Advertising

  3. On 9 Aug 2003 15:33:39 GMT, (Bengt Richter) wrote:

    >Why wouldn't quote-stuffing solve the problem, and let you treat \ as
    >an ordinary character? In a raw string, it's no good for preventing
    >end-of-quoting anyway, unless you want the literal \ in front of the quote
    >you are escaping.
    >
    >Quote-stuffing is a variation on the old quote-doubling, extended to
    >deal with triple quotes as well (which makes it a little like HDLC bit stuffing).
    >
    >IOW, treat \ as an ordinary character, and then if you don't want the
    >string to end, just stuff one quote character of the starting kind after
    >the otherwise terminating sequence. You could do this with single quoting
    >or triple quoting, where of course you'd need it less for triple quotes.
    >E.g., using uppercase R as a prefix for this kind of raw string syntax,
    >
    > R'\' # just fine
    > R'C:\' # one of the motivations
    > R'''' # dumb way to do "'"

    Really dumb ;-/ That makes an un-terminated triple quoted string
    starting with one quote. D'oh. The logic doesn't start until the beginning
    delimiter - single or triple - has been passed and established. So if you
    perversely wanted to use only single quotes to quote one single quote,
    you couldn't. Is there one you couldn't do at all? I don't think so, since
    you could always do single-quote doubling and choose the opposite quote of a leading
    quote in the data. E.g., R'"""''''''' would be a painful R'"""'+R"'''"
    Actually, that could be triple quoted as R"""""""'''""", but putting an ending '"'
    in that data would make a problem. Nope, R'''"""''''"''' would handle that.
    But what if we add another "'"? Then the data would be ["""'''"'] Still ok,
    looks like we can always start with a triple quote opposite to the end of the data:
    R"""""""'''"'""" would do it. Is there an impossible case I'm missing that would have
    to be split into two adjacent (thus concatenated) string representations?

    Is there a reasonable use case that is messed up as the price of getting R'\' ?

    Otherwise I guess it should be ok. Woke up too early and not enough ;-)


    > R""" <just about anything> ->[""""]<-makes 3 quotes, and we end with \"""
    > R""" ->[""""""""]<-two stuffing-extended triple quotes make 6 quotes."""
    >
    >The tokenizer would recognize a stuffed quote mark and just discard it if present,
    >otherwise recognize end of string.
    >
    >Just had this idea. Do I need more coffee? What did I forget?
    >
    >Regards,
    >Bengt Richter


    Regards,
    Bengt Richter
    Bengt Richter, Aug 9, 2003
    #3
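The question of whether some string cannot be written at all can be explored by brute force. The sketch below is an assumption-laden model, not the proposal's definitive semantics: `decode_stuffed` and `encode_stuffed` are hypothetical helpers, the `R` prefix is left out, and the encoder follows the heuristic from this post of opening with the triple quote opposite to the last character of the data.

```python
from itertools import product

SQ, DQ = "'", '"'


def decode_stuffed(literal):
    """Decode a hypothetical quote-stuffed literal; return (value, consumed)."""
    quote = literal[:1]
    if quote not in (SQ, DQ):
        raise ValueError("literal must start with a quote character")
    delim = quote * 3 if literal.startswith(quote * 3) else quote
    pos, value = len(delim), []
    while pos < len(literal):
        if literal.startswith(delim, pos):
            after = pos + len(delim)
            if not literal.startswith(quote, after):
                return "".join(value), after  # real terminator
            value.append(delim)  # stuffed: keep delimiter text, drop extra quote
            pos = after + 1
        else:
            value.append(literal[pos])
            pos += 1
    raise ValueError("unterminated literal")


def encode_stuffed(data):
    """Triple-quote `data` using the quote opposite to its last character."""
    quote = DQ if data.endswith(SQ) else SQ
    return quote * 3 + data.replace(quote * 3, quote * 4) + quote * 3


# The literals worked through in the post, minus the R prefix.
assert decode_stuffed(SQ + DQ * 3 + SQ * 7)[0] == DQ * 3 + SQ * 3
assert decode_stuffed(DQ * 7 + SQ * 3 + DQ * 3)[0] == DQ * 3 + SQ * 3
assert decode_stuffed(SQ * 3 + DQ * 3 + SQ * 4 + DQ + SQ * 3)[0] == DQ * 3 + SQ * 3 + DQ
assert decode_stuffed(DQ * 7 + SQ * 3 + DQ + SQ + DQ * 3)[0] == DQ * 3 + SQ * 3 + DQ + SQ

# Round-trip every string of up to 5 characters over a small alphabet.
for length in range(6):
    for chars in product(SQ + DQ + "a\\", repeat=length):
        data = "".join(chars)
        encoded = encode_stuffed(data)
        assert decode_stuffed(encoded) == (data, len(encoded)), data
print("all round trips succeeded")
```

Within this model no short string needs to be split into adjacent literals, which supports the guess in the post. It says nothing about the remaining objections, such as compatibility with existing r-strings or the cost in the tokenizer.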
