Too much code - slicing


Steven D'Aprano

On 09/18/2010 11:28 PM, Steven D'Aprano wrote: [...]
My wife can read scarily fast. It's really something to watch her reading
pages as fast as she can turn them, and a few years ago she read the
entire Harry Potter series (to date) in one afternoon, and could give
a blow-by-blow account of the plots, including a detailed critique of
the writing style and characters. But then, she feels that reading the
Potter series is a chore to be completed as fast as possible, rather
than a pleasure to be savored. She'll sometimes drag a new Terry
Pratchett or Stephen King novel out for as much as two days.
That's pretty impressive. I used to get somewhat close to that speed
when, years ago, I'd read a lot of trashy scifi. [...]
In other spots, I'd
be able to scan a few words at the top of the page, a few in the middle and
at the bottom and I'd know what's going on, generally.

I don't know about how other people speed-read, but I can assure you that
when my wife speed-reads, she's not just scanning a few words and
interpolating between them. She can give you a detailed summary of what
*actually* happened, not just a good guess. Including pointing out any
spelling or grammatical errors and clumsy writing. *Especially* the
spelling errors, they have about the same effect on her reading speed as
a tree trunk lying across a Formula 1 race track.
 

Steven D'Aprano

No, but the syntax should be invisible. When I read English, I don't
have to think about nouns and verbs and such unless something is very
badly written.

That's almost certainly because you've been listening to, speaking,
reading and writing English since you were a small child, and the syntax
and grammar of English is buried deep in your brain.

And you certainly do think about nouns and verbs, you just don't
*consciously* think about them. If I write:

"Susan blooged the mobblet."

you will probably recognise "bloog" as the verb and "mobblet" as the
noun, even though you've almost certainly never seen those words before
and have no idea what they mean. But if I write this:

"Susan is mobblet the blooged."

you'll probably give a double-take. The words don't look right for
English grammar and syntax.

I've been reading, writing and thinking in Python for well over a decade.
The syntax and grammar is almost entirely invisible to me too. No
surprise there -- they are relatively close to that of the human
languages I'm used to (English). But if I were a native Chinese or Arabic
speaker, I'd probably find Python much less "natural" and *would* need to
explicitly think about the syntax more.


[...]
I've never seen this. I've seen things highlight comments and keywords
and operators and constants and identifiers differently.

Exactly. Things are highlighted because of *what* they are, not because
of the syntax they use or because of the grammatical role they play.

In a Python expression like:

y = none or None

an editor might colour "None" green because it's a known keyword, but
"none" black because it's a variable. If you change the syntax:

y = None if [none][0] is None else {None: none}[None]

the colours remain the same. None is coloured green not because of
*where* it is in the syntax tree, but because of *what* it is. Calling
this "syntax highlighting" is misleading, or at least incomplete.
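A rough sketch of the point above, assuming Python 3 (where None is a true keyword) and a hypothetical helper named classify: a highlighter that works token by token gives each name the same label no matter where it sits in the expression.

```python
import io
import keyword
import tokenize


def classify(source):
    """Label every NAME token by what it is, ignoring its syntactic position."""
    pairs = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            kind = "keyword" if keyword.iskeyword(tok.string) else "name"
            pairs.append((tok.string, kind))
    return pairs


simple = classify("y = none or None\n")
nested = classify("y = None if [none][0] is None else {None: none}[None]\n")

# Every occurrence of a given spelling gets the same label in both expressions.
for text, kind in simple + nested:
    print(f"{text:5} -> {kind}")
```

Nothing here consults a parse tree, which is roughly why the colours do not change when the expression is rearranged.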

Eww. (I had not yet gotten to the point of finding out that whether
something was "built-in" or not substantially affected its semantics.)

In some languages, built-in functions truly are special, e.g. they are
reserved words. That's not the case for Python. Nevertheless, the editors
I've used treat built-ins as "pseudo-reserved words" and colourise them.
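As an illustration of the difference between reserved words and built-ins, here is a small sketch using the standard keyword and builtins modules. The helper names kind and shadow_len are made up for this example.

```python
import builtins
import keyword


def kind(name):
    """Roughly how an editor might bucket a name for colouring."""
    if keyword.iskeyword(name):
        return "reserved word"
    if hasattr(builtins, name):
        return "built-in (rebindable)"
    return "ordinary name"


def shadow_len():
    # Legal: 'len' is an ordinary name that happens to live in builtins,
    # so a local binding simply hides it inside this function.
    len = lambda obj: 42
    return len("abc")


print(kind("if"), kind("len"), kind("spam"), sep=" | ")
print(shadow_len())
```

Trying the same rebinding with a real keyword, such as assigning to if, is a SyntaxError, which is the sense in which built-ins are only "pseudo-reserved".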

Punctuation is very different from highlighting, IMHO. That said, I
find punctuation very effective at being small and discrete, clearly not
words, and easy to pick out. Color cues are not nearly as good at being
unobtrusive but automatically parsed.

Well that surely depends on the colour scheme you have. My editor is
fairly restrained -- it uses a handful of colours (although of course you
can customize it and go nuts), and I've made it even more subtle.

To my eyes, the feature of syntax highlighting that alone makes it
worthwhile, its killer feature, is that I can set comments and docstrings
to grey. When I'm scanning code, being able to slide my eyes over greyed-
out comments and docstrings and ignore them with essentially zero effort
is a huge help. That's the thing I most miss, more than anything else,
when using a dumb editor.

Yes. But syntax highlighting won't help you here -- at least, I've
never yet seen any editor that showed precedence relations or anything
similar in its coloring.

Just because nobody has done it yet doesn't mean that some sufficiently
intelligent software in the future couldn't do it :)
 

Seebs

I don't know about how other people speed-read, but I can assure you that
when my wife speed-reads, she's not just scanning a few words and
interpolating between them. She can give you a detailed summary of what
*actually* happened, not just a good guess. Including pointing out any
spelling or grammatical errors and clumsy writing. *Especially* the
spelling errors, they have about the same effect on her reading speed as
a tree trunk lying across a Formula 1 race track.

Yeah. I think it's because the entire trick is to have a nice smooth
pipeline, and the error-checking mechanism has to be pretty alert for that
to work -- you have to know if something went wrong. And a spelling error
in the text is initially indistinguishable from a reading error in the
eye...

-s
 

Seebs

That's almost certainly because you've been listening to, speaking,
reading and writing English since you were a small child, and the syntax
and grammar of English is buried deep in your brain.

Yes. But I've been programming long enough that I seem to get similar
results in most languages pretty quickly.

And you certainly do think about nouns and verbs, you just don't
*consciously* think about them.

Well, yes. But it's conscious think time that's the limiting resource
for the most part -- so if I can avoid things that require conscious
thought, that frees up more for thinking about the problem.

you will probably recognise "bloog" as the verb and "mobblet" as the
noun, even though you've almost certainly never seen those words before
and have no idea what they mean. But if I write this:

"Susan is mobblet the blooged."

you'll probably give a double-take. The words don't look right for
English grammar and syntax.

Well, actually, at that point I just assume you missed the capital
m on what is apparently a proper noun. :)

I've been reading, writing and thinking in Python for well over a decade.
The syntax and grammar is almost entirely invisible to me too. No
surprise there -- they are relatively close to that of the human
languages I'm used to (English). But if I were a native Chinese or Arabic
speaker, I'd probably find Python much less "natural" and *would* need to
explicitly think about the syntax more.

That's a fascinating question. I don't think that would be the case, though.
Or at least, if you've used more than a couple of programming languages that
much, I wouldn't expect it to be the case. I'm not a native speaker of
Chinese, but after a year in China, I stopped perceiving grammar and just
heard sentences. (Sadly, I've mostly since lost the vocabulary, leaving
me with the annoyance of a language I can think in grammatically but can't
express much of anything in.)

Exactly. Things are highlighted because of *what* they are, not because
of the syntax they use or because of the grammatical role they play.

Hmm, interesting point. e.g., a function name is likely to be highlighted
the same whether I'm calling it or referring to it as an object. (I'm
very new to Python, so I'm not 100% sure functions are a kind of an object,
but I seem to recall they were.)

I guess that's a point; "syntax coloring" is perhaps not the right word
either for what they do.

In a Python expression like:

y = none or None

an editor might colour "None" green because it's a known keyword, but
"none" black because it's a variable. If you change the syntax:

y = None if [none][0] is None else {None: none}[None]

the colours remain the same. None is coloured green not because of
*where* it is in the syntax tree, but because of *what* it is. Calling
this "syntax highlighting" is misleading, or at least incomplete.

This strikes me as correct. But it's not exactly semantics, either.
It's... I dunno what to call it.

In some languages, built-in functions truly are special, e.g. they are
reserved words. That's not the case for Python. Nevertheless, the editors
I've used treat built-ins as "pseudo-reserved words" and colourise them.

Interesting. I wonder why. I guess just because if you meant to name
a variable with one of those words, maybe you'd want the reminder.

Well that surely depends on the colour scheme you have.

Only partially. The big thing, I think, is that punctuation is separate
things next to words, not attributes of words. Come to think of it,
that may be why I sometimes like to see keywords and other times punctuation.
I like {} better than do/end, for the same reason I prefer parentheticals
in English to something like:

And this is a digression contrived end digression example.

I much prefer:

And this is a (contrived) example.

To my eyes, the feature of syntax highlighting that alone makes it
worthwhile, its killer feature, is that I can set comments and docstrings
to grey. When I'm scanning code, being able to slide my eyes over greyed-
out comments and docstrings and ignore them with essentially zero effort
is a huge help. That's the thing I most miss, more than anything else,
when using a dumb editor.

That makes some sense. In sh/python/Ruby/lua, I don't have any troubles
with it because the comment mechanism is fairly unambiguous. I'm fine in
C as long as people remember the * on the left hand side of long comments.
Omit that, and I get fussy. :)

Just because nobody has done it yet doesn't mean that some sufficiently
intelligent software in the future couldn't do it :)

True.

It raises a curious question. Imagine that you had the option of having
color highlighting to show precedence and/or grouping in complicated
expressions. Would that be better or worse than parentheses?

For instance, consider the classic:

x + y * z

In many programming languages, this is equivalent to:

x + (y * z)

But would it be clearer to just have the unpunctuated text, with the "y * z"
in, say, a slightly lighter or darker shade? I don't *think* so, but I'm
honestly not totally sure.
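For Python specifically, the grouping in question can be checked with the standard ast module. This is only a sketch, and the helper name tree is made up for the example.

```python
import ast


def tree(expr):
    """Return a textual dump of the parse tree for a single expression."""
    return ast.dump(ast.parse(expr, mode="eval"))


implicit = tree("x + y * z")
explicit = tree("x + (y * z)")
regrouped = tree("(x + y) * z")

# Redundant parentheses leave no trace in the tree; regrouping changes it.
print(implicit == explicit)
print(implicit == regrouped)
```

The parentheses are purely a notation for the reader and the parser, so a tool that shaded sub-expressions would be drawing the same tree in a different way.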

-s
 

Steven D'Aprano

I'm
very new to Python, so I'm not 100% sure functions are a kind of an
object, but I seem to recall they were.

Yes, functions are objects. *Everything* in Python is an object (apart
from statements, but they're not actually *things* in Python).

>>> def f():
...     return f.x
...
>>> f.x = -1
>>> f()
-1


For some interesting glimpse at how Python works, create a function f and
then look at dir(f).
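A short sketch of that suggestion, using a made-up function named greet: the function can be inspected, stored in a container, and bound to another name like any other object.

```python
def greet(name):
    """Return a greeting for name."""
    return "Hello, " + name


attributes = dir(greet)            # everything the function object exposes
type_name = type(greet).__name__   # the class of the function object

handlers = [greet]                 # a function can sit in a list...
alias = handlers[0]                # ...and be bound to another name

print(type_name, greet.__name__, greet.__doc__)
print("__call__" in attributes, "__code__" in attributes)
print(alias("world"))
```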
 

John Bokma

Seebs said:
Now that you explain it like this, that makes a fair bit of sense. I
often wonder whether reading slowly would be more pleasant. I have no
idea how to do it, so the question remains theoretical.

By practicing ;-). I have it worse with movies, but in my case, for
several reasons, it's really important (to me) that I watch the movie at
its normal pace and try to enjoy it at that speed.

Talking about reading: if you have any suggestions, feel free to email
me (since this is already way off topic). I read mostly in my second
language, English, and live in a country where English books are hard to
find, so browsing in bookshops is not much of an option :-(. And most of
my (online) friends don't read much, if at all.
 

John Bokma

Steven D'Aprano said:
spelling or grammatical errors and clumsy writing. *Especially* the
spelling errors, they have about the same effect on her reading speed as
a tree trunk lying across a Formula 1 race track.

Spelling errors are a disaster, somehow they stand out like they use
Comic Sans Bold and red ink. Most likely because they break the
pattern. I seem to find them more and more often in the books I read,
maybe because I use English (my second language) more and more.

As for speed reading, there are many levels to do this: one can call
scanning a page really fast left-right, moving as fast to the bottom as
possible speed-reading, or reading each and every sentence just as fast
as possible speed reading. The faster one goes, the more is lost.

The total # of pages in Harry Potter seems to be just over 4000 [1]. If
an afternoon is 4 hrs, this means 1000 pages an hour, or 17
pages/minute. One has to do skimming to read that fast.

With 250 words/page the reading speed would be over 4K words/minute,
which would make your wife a serious competitor for Anna Jones (4.7K
words/minute, 67% comprehension, see [2])
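The estimate above can be reproduced in a few lines. The figures are the post's own assumptions (about 4000 pages, a four-hour afternoon, 250 words per page), not measurements.

```python
total_pages = 4000       # approximate series length quoted in the post
hours = 4                # assumed length of "one afternoon"
words_per_page = 250     # assumed average

pages_per_hour = total_pages / hours
pages_per_minute = pages_per_hour / 60
words_per_minute = pages_per_minute * words_per_page

print(f"{pages_per_hour:.0f} pages/hour")
print(f"{pages_per_minute:.1f} pages/minute")
print(f"{words_per_minute:.0f} words/minute")
```

Without rounding to 17 pages per minute first, the rate comes out slightly lower than 17 x 250 but still above 4000 words per minute.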

In my native language I read just over 1 page a minute; if the pages
are not too dense I can sometimes do 2. In English I can often get close
to 1 page a minute, except with books that are quite dense (think
fantasy). So I guess around 300-350 wpm in Dutch, 250 wpm in English
(normal pace).

[1] http://wiki.answers.com/Q/What_is_the_total_number_of_pages_in_the_'Harry_Potter'_series
[2] http://en.wikipedia.org/wiki/Speed_reading#Claims_of_speed_readers
 

Dennis Lee Bieber

I don't know about how other people speed-read, but I can assure you that

I don't, but as I recall, traditional speed-readers are taught to
NOT scan the line with the eyes -- they are to read the entire line
while the eyes are focused on the center and to mentally parse out the
words from the image.
 

Dennis Lee Bieber

the colours remain the same. None is coloured green not because of
*where* it is in the syntax tree, but because of *what* it is. Calling
this "syntax highlighting" is misleading, or at least incomplete.

Except for strings and comments, most editors I've seen fall into
"reserved word" highlighting. Strings and comments are the closest to
syntax marking, as they require knowing the behavior of the ", ', and #
(in Python) -- otherwise

x = "some # of things "

might be flagged as

<assignment> x = <string> "some </string></assignment> <comment># of
things "</comment>
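Python's own tokenize module shows the correct reading of that line. This is a sketch with a made-up helper, token_kinds, that keeps only string and comment tokens.

```python
import io
import tokenize


def token_kinds(source):
    """Return (token type name, text) for each string or comment token."""
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type in (tokenize.STRING, tokenize.COMMENT)
    ]


inside = token_kinds('x = "some # of things "\n')    # '#' is part of the string
outside = token_kinds('x = "some"  # of things\n')   # '#' starts a comment

print(inside)
print(outside)
```

The tokenizer has to know that a quote opens a string before it can decide what a later # means, which is the small amount of syntax knowledge the post describes.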

I'd hate to consider a highlighting editor for classic FORTRAN which
did not have reserved words, nor was white space significant -- one
would need to implement a nearly full language parser just to properly
colorize:

D O1 0I = 3,14159
vs
DO 10 I = 3.14159

{For those not familiar with FORTRAN, the first is a DO loop, with
terminal statement labeled "10" and loop variable "I"; the second is an
assignment to a variable "DO10I"}
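To make the difficulty concrete, here is a deliberately naive sketch in Python. The function classify_fixed_form and its regular expression are assumptions for illustration only: it squeezes out the blanks, as fixed-form FORTRAN does, and then treats a comma after the equals sign as the sign of a DO loop.

```python
import re

# Hypothetical pattern: DO <label> <variable> = <start> , <rest>
DO_LOOP = re.compile(r"DO(\d+)([A-Z][A-Z0-9]*)=([^,]+),(.+)")


def classify_fixed_form(line):
    """Guess whether a blank-insensitive FORTRAN line is a DO loop."""
    compact = line.replace(" ", "").upper()
    match = DO_LOOP.fullmatch(compact)
    if match:
        label, variable = match.group(1), match.group(2)
        return ("do-loop", label, variable)
    target = compact.split("=", 1)[0]
    return ("assignment", target)


print(classify_fixed_form("D O1 0I = 3,14159"))
print(classify_fixed_form("DO 10 I = 3.14159"))
```

The heuristic breaks as soon as a comma appears inside parentheses on the right-hand side (an array reference, for instance), which is why something close to a full parser would be needed in practice.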
 

Antoon Pardon

Because that's what 'if' and 'else' mean.
My point is, I don't want the order of the clauses in if/else to change.
If it is sometimes "if<condition> <true-clause> else<false-clause>", then
it should *ALWAYS WITHOUT EXCEPTION* be condition first, then true clause,
then false clause. If it's sometimes "if condition true-clause else
false-clause", and sometimes "true-clause if condition else false-clause",
that's a source of extra complexity.

[snip]
Have you read PEP 308? There was a lot of discussion about it.

Interesting, in the historical section we see:

The original version of this PEP proposed the following syntax:

<expression1> if <condition> else <expression2>

The out-of-order arrangement was found to be too uncomfortable
for many of participants in the discussion; especially when
<expression1> is long, it's easy to miss the conditional while
skimming.

But apparently those objections were either unknown or disregarded when
the syntax was later adopted.

Not necessarily. Some of us have the impression that Guido deliberately
chose an ugly format for the ternary operator. Guido has always been
against a ternary operator but the requests kept coming. So eventually
he introduced one. But the impression is that he chose an ugly format
in the hope of discouraging people from using it.
 

Terry Reedy

On 09/19/2010 10:32 PM, John Bokma wrote:

I sometimes watch movies (or parts thereof) on 1.5x, especially if it
has a lot of 'filler' scenes. But only when my wife is not watching, as
she hates it.
 

Seebs

Not necessarily. Some of us have the impression that Guido deliberately
chose an ugly format for the ternary operator. Guido has always been
against a ternary operator but the requests kept coming. So eventually
he introduced one. But the impression is that he chose an ugly format
in the hope of discouraging people from using it.

If true, that is an *awesome* etymology for a hunk of a programming
language.

-s
 

Mark Lawrence

On 19/09/2010 22:32, Seebs wrote:
Because that's what 'if' and 'else' mean.
My point is, I don't want the order of the clauses in if/else to change.
If it is sometimes "if<condition> <true-clause> else<false-clause>", then
it should *ALWAYS WITHOUT EXCEPTION* be condition first, then true clause,
then false clause. If it's sometimes "if condition true-clause else
false-clause", and sometimes "true-clause if condition else false-clause",
that's a source of extra complexity.

[snip]
Have you read PEP 308? There was a lot of discussion about it.

Interesting, in the historical section we see:

The original version of this PEP proposed the following syntax:

<expression1> if<condition> else<expression2>

The out-of-order arrangement was found to be too uncomfortable
for many of participants in the discussion; especially when
<expression1> is long, it's easy to miss the conditional while
skimming.

But apparently those objections were either unknown or disregarded when
the syntax was later adopted.

Not necessarily. Some of us have the impression that Guido deliberately
chose an ugly format for the ternary operator. Guido has always been
against a ternary operator but the requests kept coming. So eventually
he introduced one. But the impression is that he chose an ugly format
in the hope of discouraging people from using it.

I very much like the format of the Python ternary operator, but I've
never actually used it myself :)

Cheers

Mark Lawrence.
 

Steven D'Aprano

Not necessarily. Some of us have the impression that Guido deliberately
chose an ugly format for the ternary operator.

If he did, then he must have changed his mind, because there is nothing
ugly about the ternary operator we ended up with.

Guido has always been
against a ternary operator but the requests kept coming. So eventually
he introduced one. But the impression is that he chose an ugly format in
the hope of discouraging people from using it.

That's sheer and unadulterated nonsense. The fact is that Guido changed
his mind about ternary if after discovering that the work-around

true-clause and condition or false-clause

is buggy -- it gives the wrong answer if true-clause happens to be a
false value like [], 0 or None. If I recall correctly, the bug bit Guido
himself.

The and-or hack, which was *very* common in Python code for many years
and many versions, follows the same pattern as ternary if:

true-clause if condition else false-clause

It astounds me how the Python community changed its collective mind from
admiration of the elegance and simplicity of the expression when it was a
buggy hack, to despising it when it became a bug-free language feature.

Go figure.
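The bug described above is easy to reproduce. A caveat on the sketch: the and-or idiom is usually written with the condition first, as condition and true_value or false_value, and that is the form assumed here. The function names and_or and ternary are made up for the comparison.

```python
def and_or(condition, true_value, false_value):
    """The old workaround: breaks when true_value is itself falsy."""
    return condition and true_value or false_value


def ternary(condition, true_value, false_value):
    """The conditional expression adopted via PEP 308."""
    return true_value if condition else false_value


# With a truthy "true" value both spellings agree.
print(and_or(True, "yes", "no"), ternary(True, "yes", "no"))

# With a falsy "true" value the workaround falls through to the other branch.
print(and_or(True, [], "fallback"), ternary(True, [], "fallback"))
print(and_or(True, 0, 99), ternary(True, 0, 99))
```

Wrapping both forms in functions evaluates both branches eagerly, so this shows only the result that is selected, not the short-circuit behaviour of either spelling.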
 

Gregory Ewing

AK said:
One definite advantage would be that if, say, it takes you 70 pages of a
given novel to figure out whether you like it enough to continue,

If there was that much doubt, I would give up long before
reaching the 70 page mark, regardless of reading speed.
If I'm not hooked by the first page, things are unlikely
to get better later.
 

Seebs

If he did, then he must have changed his mind, because there is nothing
ugly about the ternary operator we ended up with.

Empirically, looking at the commentary on the PEP, there is something
about it which a large number of participants found awkward or dislikeable.
I'm not sure I'd say "ugly", but I would say that the net result is
that it is likely more error-prone than an arrangement which put the
conditions and clauses in the order they're in for other conditionals.

It astounds me how the Python community changed its collective mind from
admiration of the elegance and simplicity of the expression when it was a
buggy hack, to despising it when it became a bug-free language feature.
Go figure.

Well, if I had to point at an explanation, I'd guess it's the inversion.
That, and things of the general form "x or y" are found in several other
scripting languages, so it's a more familiar idiom.

But I suspect some of it is just that, well, a number of people as of the
original PEP discussion on the conditional operator disliked the "new"
proposal of "x if y else z", and if they still dislike it, they'll find
the conditional operator unpleasant even if it's bug-free.

This may come as a total shock, but in modern scripting languages, people
are often substantially concerned with style, not just with whether or not
something works. :) I would probably tend to avoid the Python conditional
as it currently exists because I know that a fair number of people will
find it mildly confusing.

Not that they won't be able to parse it, but... It's like phrasing tests
negatively. It adds one chunk to the cognitive load of reading something.
It increases the likelihood of the user making mistakes -- and no amount
of polish and debugging of the engine can prevent users from making mistakes.
The implementation in the language engine may be bug-free, but I'd be less
optimistic about code which used the construct.

-s
 

John Bokma

Terry Reedy said:
I sometimes watch movies (or parts thereof) on 1.5x, especially if it
has a lot of 'filler' scenes. But only when my wife is not watching,
as she hates it.

Heh, my question was somewhat rhetorical. I watch movies to slow myself
down, so 1.5x is not an option. Same reason I read books slow (well, at
a normal pace). If I don't I end up not being able to sleep and things
go downhill from there. Also, I notice that if I want to run everything
on 1.5x (or more) that I become quite annoying to the rest of my family
and I want to enjoy them now, when I can.
 

Lie Ryan

Basically, think of what happens as I read each symbol:

x = x + 1 if condition else x - 1

Up through the '1', I have a perfectly ordinary assignment of a value.
Then, suddenly, it retroactively turns out that I have misunderstood
everything I've been reading. I am actually reading a conditional, and
the things I've been seeing which looked like they were definitely
part of the flow of evaluation may in fact be completely skipped.

Seems like you've got a much more complicated parser than I do. You can
read code from top-down, bottom-left; doing that would require keeping a
mental stack of the contexts of code all the time; that's something I
can't do.

What I normally do when reading code is to first scan the overall
structure, then zoom in to the interesting parts until I get to an
atomic structure (something I can understand in a quick glance). I'd
then agglutinate some of the atoms into bigger chunks that I don't read
in detail anymore.

Reading code left-right is too difficult for me since then, you'd have
to keep a stack that tracks how many logical parentheses (i.e. how deep
in the code structure) you've seen and their types.
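For what it's worth, the structure-first reading described here matches how Python itself parses the quoted line. The sketch below assumes Python 3.9 or later (for ast.unparse); the function name step is made up.

```python
import ast

stmt = ast.parse("x = x + 1 if condition else x - 1").body[0]
value = stmt.value

statement_kind = type(stmt).__name__    # the outermost structure
value_kind = type(value).__name__       # what is being assigned
test = ast.unparse(value.test)          # the condition in the middle
branches = (ast.unparse(value.body), ast.unparse(value.orelse))

print(statement_kind, value_kind, test, branches)


def step(x, condition):
    """The same line in context: only one branch is ever evaluated."""
    x = x + 1 if condition else x - 1
    return x


print(step(1, True), step(1, False))
```

Seen from the top down, the line is an assignment whose value is a conditional expression with two arithmetic branches. Read strictly left to right, the conditional only becomes visible after the first branch, which is the surprise the quoted text complains about.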
 
