Block comments

Discussion in 'Python' started by MartinRinehart@gmail.com, Dec 11, 2007.

  1. Guest

    Tomorrow is block comment day. I want them to nest. I think the reason
    that they don't routinely nest is that it's a lot of trouble to code.
    Two questions:

    1) Given a start and end location (line position and char index) in an
    array of lines of text, how do you Pythonly extract the whole block
    comment? (Goal: not to have Bruno accusing me - correctly - of writing
    C in Python.)

    2) My tokenizer has a bunch of module-level constants including ones
    that define block comment starts/ends. Suppose I comment that code
    out. This is the situation:

    /* start of block comment
    ....
    BLOCK_COMMENT_END_CHARS = '*/'
    ....
    end of block comment */

    Is this the reason for """?

    (If this is a good test of tokenizer smarts, cpp and javac flunked.)
    , Dec 11, 2007
    #1

  2. MartinRinehart@gmail.com wrote:
    > Tomorrow is block comment day. I want them to nest. I think the reason
    > that they don't routinely nest is that it's a lot of trouble to code.


    Indeed.

    > Two questions:
    >
    > 1) Given a start and end location (line position and char index) in an
    > array of lines of text, how do you Pythonly extract the whole block
    > comment? (Goal: not to have Bruno accusing me - correctly - of writing
    > C in Python.)


    Is the array of lines the appropriate data structure here?
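For the extraction question quoted above, here is a minimal sketch using slicing and `str.join` rather than index loops. It is only an illustration under stated assumptions: `extract_block` is a hypothetical helper name, positions are zero-based `(line_index, char_index)` pairs, the end position is exclusive, and the lines carry no trailing newline characters.

```python
def extract_block(lines, start, end):
    """Return the text from start up to (not including) end.

    Assumption: start and end are zero-based (line_index, char_index)
    pairs, and the strings in `lines` have no trailing newlines.
    """
    (start_line, start_col), (end_line, end_col) = start, end
    if start_line == end_line:
        # The whole block sits on one line: a single slice is enough.
        return lines[start_line][start_col:end_col]
    first = lines[start_line][start_col:]    # tail of the first line
    middle = lines[start_line + 1:end_line]  # whole lines in between
    last = lines[end_line][:end_col]         # head of the last line
    return "\n".join([first, *middle, last])


lines = [
    "x = 1 /* start of block comment",
    "BLOCK_COMMENT_END_CHARS = '*/'",
    "end of block comment */ y = 2",
]
comment = extract_block(lines, (0, 6), (2, 23))
```

If the source were held as one long string instead, the same extraction would collapse to a single slice, which is one way to read the data-structure question.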

    > 2) My tokenizer has a bunch of module-level constants including ones
    > that define block comment starts/ends. Suppose I comment that code
    > out. This is the situation:
    >
    > /* start of block comment
    > ...
    > BLOCK_COMMENT_END_CHARS = '*/'
    > ...
    > end of block comment */
    >
    > Is this the reason for """?


    Triple-quoted strings are not comments; they are a way to build
    multiline string literals. They are commonly used for docstrings,
    for obvious reasons, but it's the position of the string literal
    that makes it a docstring, not the fact that it's triple-quoted.

    As for your example above, making it a legal construct implies that
    you should not treat the block start/end markers as comment markers
    when they are enclosed in string-literal markers.

    Now this doesn't solve the problem of nested block comments. Here, I
    guess the solution would be to only allow fully nested block comments -
    that is, the nested block *must* be opened *and* closed within the
    parent block. In which case it should not be harder to parse than any
    other nested construct.
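The two rules just described (ignore markers inside string literals, and require nested blocks to open and close inside their parent) can be sketched with a depth counter. This is an illustrative sketch, not the poster's tokenizer: `find_comment_end` and its parameters are hypothetical names, and it assumes the source is one long string with single- or double-quoted literals and backslash escapes.

```python
def find_comment_end(text, start, open_mark="/*", close_mark="*/", quotes="'\""):
    """Return the index just past the close marker that matches the
    block comment opening at `start`.

    Nested comments raise and lower a depth counter; markers inside
    quoted strings are skipped. Raises ValueError if nothing opens at
    `start` or the comment is never closed.
    """
    if not text.startswith(open_mark, start):
        raise ValueError("no block comment starts at this index")
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch in quotes:
            # Skip to the matching closing quote, honoring backslash escapes.
            i += 1
            while i < len(text) and text[i] != ch:
                i += 2 if text[i] == "\\" else 1
            i += 1
        elif text.startswith(open_mark, i):
            depth += 1
            i += len(open_mark)
        elif text.startswith(close_mark, i):
            depth -= 1
            i += len(close_mark)
            if depth == 0:
                return i
        else:
            i += 1
    raise ValueError("unterminated block comment")


source = "a /* outer /* inner */ END = '*/' still outer */ b"
end = find_comment_end(source, 2)
```

One trade-off to be aware of: because quote characters are honored inside comments, an apostrophe in ordinary comment prose (as in "don't") would be read as the start of a string literal. That is the price of letting commented-out code contain the end marker in a string.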

    While we're at it, you may not know it, but there are already a
    couple of Python packages for building tokenizers/parsers - could it
    be the case that you're guilty of ReinventingTheSquaredWheel(tm) ?-)

    My 2 cents...
    Bruno Desthuilliers, Dec 11, 2007
    #2

  3. Guest

    Bruno Desthuilliers wrote:
    > Is the array of lines the appropriate data structure here ?


    I've done tokenizers both as an array of lines and as a long string.
    The former has seemed easier when the language treats EOL as a
    statement separator.

    Re not letting string literals in code terminate blocks: I think it's
    the tokenizer-writer's job to be nice to the tokenizer's users, the
    first of whom will be me, and I'll definitely have string literals
    that enclose what would otherwise be a block end marker.

    > While we're at it, you may not know but there are already a couple
    > Python packages for building tokenizers/parsers


    The tokenizer in the Python library is pretty close to what I want,
    but it returns tuples, where I want an array of Token objects. It also
    reads the source a line at a time, which seems a bit out of date.
    Maybe two or three decades out of date.
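For what it's worth, the gap between tuples and Token objects can be bridged with a thin wrapper. The sketch below uses the standard library's `tokenize.generate_tokens` as it exists in current Python 3 (where it yields named tuples), not the 2007 version discussed here; the `Token` class and `tokens_from_source` are hypothetical names chosen for illustration.

```python
import io
import tokenize
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical Token object holding the fields a caller might want."""
    kind: str    # token type name, e.g. "NAME" or "OP"
    text: str    # the exact source text of the token
    line: int    # 1-based line number where the token starts
    column: int  # 0-based column where the token starts


def tokens_from_source(source):
    """Tokenize a whole source string and return a list of Token objects."""
    # generate_tokens wants a readline callable; StringIO provides one,
    # so the caller can hand over the complete source in one piece.
    readline = io.StringIO(source).readline
    return [
        Token(tokenize.tok_name[tok.type], tok.string, tok.start[0], tok.start[1])
        for tok in tokenize.generate_tokens(readline)
    ]


tokens = tokens_from_source("x = 1\n")
```

The library still consumes the source one line at a time internally; the wrapper only hides that behind a whole-string interface.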

    Actually, it takes about a day to write a reasonable tokenizer. (That
    is, if you are writing using a language that you know.) Since I know
    the problem thoroughly, it seemed like a good starting point for
    learning Python.

    There's a tokenizer I wrote in Java at
    http://www.MartinRinehart.com/src/language/Tokenizer.html
    (actually, that's an HTML page written by my "javasrc", parallel to
    Sun's javadoc, based on the Tokenizer's tokenizing of its own source).

    Have I got those quotes right?
    , Dec 11, 2007
    #3
  4. MartinRinehart@gmail.com wrote:
    (snip about tokenizers - not exactly my domain, sorry)
    >
    > Have I got those quotes right?


    Perfect !-)
    Bruno Desthuilliers, Dec 12, 2007
    #4
