Tokenizer inconsistency with respect to newlines in comments

George Sakkis

The tokenize.generate_tokens function seems to handle the newline after
a comment in a context-sensitive manner:

... # hello world
... x = (
... # hello world
... )
... '''

... print repr(t[1])
...
'\n'
'# hello world\n'
'x'
'='
'('
'\n'
'# hello world'
'\n'
')'
'\n'
''

Is there a reason that the newline is included in the first comment
but not in the second, or is it a bug?
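
For anyone who wants to reproduce this, here is a minimal sketch in Python 3 (the session above is Python 2). `SOURCE` and `token_strings` are illustrative names, and feeding the tokenizer through `io.StringIO(...).readline` is an assumption about how the original loop was set up. Newer Python versions may not produce the same comment tokens as the output above.

```python
import io
import tokenize

# Same shape as the snippet above: one comment at top level,
# one comment inside an open parenthesis.
SOURCE = '''
# hello world
x = (
# hello world
)
'''


def token_strings(source):
    """Return the string of every token produced for `source`."""
    readline = io.StringIO(source).readline
    return [tok.string for tok in tokenize.generate_tokens(readline)]


for text in token_strings(SOURCE):
    print(repr(text))
```

Comparing the two comment entries in the printed list shows whether the running interpreter still attaches the newline to the first one.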

George
 
Kay Schluehr

I guess it's just an artifact of handling line continuations within
expressions, where a different rule is applied. For compilation
purposes, both the newlines within expressions and the comments are
irrelevant. There are even two different token types produced for
newlines, namely NEWLINE and NL. NL and COMMENT are ignored; NEWLINE
is relevant for the parser.
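
To see the NEWLINE/NL distinction directly, a small sketch like the following prints the type name next to each token string. It is written for Python 3; `token_kinds` is an illustrative helper, not part of the tokenize module.

```python
import io
import tokenize


def token_kinds(source):
    """Return (type name, token string) pairs for `source`."""
    readline = io.StringIO(source).readline
    return [(tokenize.tok_name[tok.type], tok.string)
            for tok in tokenize.generate_tokens(readline)]


# Newlines inside the parentheses should show up as NL; the one that
# ends the logical line should show up as NEWLINE.
pairs = token_kinds("x = (\n1\n)\n")
for kind, text in pairs:
    print(f"{kind:10} {text!r}")
```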

If it were a bug, it would have to violate a functional requirement.
I can't see which one.

Kay
 
George Sakkis

Perhaps it's not a functional requirement, but it came up as a real
problem in a source colorizer I use. I count on newlines generating
token.NEWLINE or tokenize.NL tokens in order to produce <br> tags. It
took me some time and head scratching to find out why some comments
were joined together with the following line. Now I have to check
whether a comment ends in a newline and, if it does, output an extra
<br> tag. It works, but it's a kludge.
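
The workaround looks roughly like the sketch below. This is not the colorizer's actual code: `to_html` is a hypothetical, heavily simplified stand-in written for Python 3 that drops the whitespace between tokens and does no coloring at all. It only shows where the extra check goes.

```python
import html
import io
import tokenize


def to_html(source):
    """Escape token text and turn line ends into <br> tags (sketch only)."""
    parts = []
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        if tok.type in (tokenize.NEWLINE, tokenize.NL):
            parts.append("<br>")
        elif tok.type == tokenize.COMMENT:
            body = tok.string.rstrip("\r\n")
            parts.append(html.escape(body))
            # The kludge: if the comment token swallowed its newline,
            # no separate NL token follows, so emit the break here.
            if body != tok.string:
                parts.append("<br>")
        else:
            parts.append(html.escape(tok.string))
    return "".join(parts)


print(to_html("# a\nx = 1\n"))
```

On a tokenizer that always emits a separate NL token after a comment, the extra branch simply never fires, so the check is harmless there.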

George
 
Fredrik Lundh

well, the real kludge here is of course that you're writing your own
colorizer, when you can just go and grab Pygments:

http://pygments.org/

or, if you prefer something tiny and self-contained, something like the
colorizer module in this directory:

http://svn.effbot.org/public/stuff/sandbox/pythondoc/

(the element_colorizer module in the same directory gives you XHTML in
an ElementTree instead of raw HTML, if you want to postprocess things)

</F>
 
George Sakkis

First off, I didn't write it from scratch; I just tweaked a
single-module colorizer I had found online. Second, whether I or
someone else had to deal with it is irrelevant; the point is that
generate_tokens() is not consistent with respect to newlines after
comments.

George
 
