[perl-python] Python documentation moronicities (continued)

Steve Holden · Apr 25, 2005

Robert said:
Xah said:

I have produced my doc.
( http://xahlee.org/perl-python/python_re-write/lib/module-re.html )

isn't there a hundred dollars due to me?

No.

[Steve Holden]

.... and I haven't received a single email yet. I should also point out
that you originally said

To which I replied

> I will personally pay you a hundred dollars if you can find enough
> time between now and this time next week - you should be able to find
> "a few hours" in 168 without unduly conveniencing yourself.

That offer was made on 4/12, and it's 4/25 today, so you're six days
late. Since I don't want to be thought of as a welsher, however, the
offer still stands despite the delay. I await five emails from regular
c.l.py posters confirming that they think your version is better than
the one in the documentation.

Having read it I'm not sure *why* you believe your version is better,
but I am nevertheless impressed that you did get around to it.

regards
Steve

Xah Lee · Apr 26, 2005

Dear Steve Holden,

the rewrite of the regex doc is instigated by your offer.

it is published and announced here on April 18th. If you deem it
proper, paypal me. It will be to your credit and easier to incorporate
into the main doc.

Xah
(e-mail address removed)
∑ http://xahlee.org/

Steve Holden · Apr 26, 2005

Xah said:
Dear Steve Holden,

the rewrite of the regex doc is instigated by your offer.

it is published and announced here on April 18th.

I'll have to take your word for that.

If you deem it
proper, paypal me. It will be to your credit and easier to incorporate
into the main doc.

I still await the specified five emails preferring your version to the
current documentation.

regards
Steve

Richie Hindle · Apr 26, 2005

[Xah]

I have produced my doc.
( http://xahlee.org/perl-python/python_re-write/lib/module-re.html )

isn't there a hundred dollars due to me?

I don't have the time to write a full review of your version, but for the
record I've compared it with the original and I don't think it's a
significant improvement (apart from the title - "String Pattern Matching"
is a better title than "Regular expression operations"). (And no, I'm not
sure I could do any better, but that's not the question.)

[Xah]

it is published and announced here on April 18th.
[Steve]
I'll have to take your word for that.

Xah is right - I have a copy here of his message of 18th April, saying "i
have rewrote the Python's re module documentation.".

Mike Meyer · Apr 26, 2005

Claudio Grondi said:
From my point of view, both the existing and
the proposed documentation assume some
knowledge about regular expressions as
such, so they don't really explain, being
limited to showing the syntax of usage.

It's not clear that this belongs in the Python documentation. The
smtplib documentation doesn't explain what an smtp server is, and why
you'd want to contact one. It just tells you how to go about doing
those things - in other words, the syntax of usage.

Possibly a link to a number of resources like <URL:
http://dmoz.org/Computers/Programming/Languages/Regular_Expressions/ >
should be added to the documentation?

In other words, I am missing the "Pythonic
approach" here.

I don't think there's an agreed-upon "Pythonic approach" to regular
expressions. Some people swear by them. Others swear at them.

<mike

Claudio Grondi · Apr 26, 2005

Conclusion:
---------------
I agree with Bill Mill saying
"I'd suggest that he [Xah Lee] actually make
an effort at improving the docs before
submitting them."
so I am still waiting for the final version
before deciding which doc is
better, believing that if Xah Lee puts
more work and serious effort into his
attempt it could result in a doc clearly
superior to the existing one.

Argumentation:
---------------------
Both documentations are different approaches
and it is currently hard for me to tell which one
is really better (the proposed is better structured,
the existing provides information missing
in the proposed).
In both, the existing and the proposed
documentation I am still missing a good
introductory part showing what regular
expressions are good for and
what are the limitations, i.e. when is it better
to use self-written code instead of regexs
(including examples of both, very simple
and more complex real-world problems).
I would also miss some of the information
removed from the docs in the proposed
documentation.
From my point of view, both the existing and
the proposed documentation assume some
knowledge about regular expressions as
such, so they don't really explain, being
limited to showing the syntax of usage.
In other words, I am missing the "Pythonic
approach" here.

Claudio
P.S. sorry for not including details in my comment,
but to show what I mean would require rewriting
the docs ...

Brian Quinlan · Apr 26, 2005

I think that there are some nice ideas in the new version e.g. "Regex
functions" is a nicer title than "Module contents", examples, caveats.

But there are some organizational problems and the actual writing is a
bit weak.

Cheers,
Brian

Peter Hansen · Apr 26, 2005

Steve said:
I still await the specified five emails preferring your version to the
current documentation.

So, for the record Steve, how many of those emails have you
received to date? (And how many from *anyone*, not just
regulars, proclaiming Xah's version better?)

-Peter

Ivan Van Laningham · Apr 26, 2005

Hi All--

Richie said:
Xah is right - I have a copy here of his message of 18th April, saying "i
have rewrote the Python's re module documentation.".

Which announcement alone I take as evidence sufficient unto itself. I
shall not be reading the "rewrote" documentation.

Metta,
Ivan
----------------------------------------------
Ivan Van Laningham
God N Locomotive Works
http://www.andi-holmes.com/
http://www.foretec.com/python/workshops/1998-11/proceedings.html
Army Signal Corps: Cu Chi, Class of '70
Author: Teach Yourself Python in 24 Hours

Steve Holden · Apr 26, 2005

Peter said:
So, for the record Steve, how many of those emails have you
received to date? (And how many from *anyone*, not just
regulars, proclaiming Xah's version better?)

-Peter

That would be none, Peter, as of right now.

regards
Steve

Xah Lee · May 5, 2005

I have now also started to rewrite the re-syntax page. At first i
thought that page need not be rewritten, since it's about regex and
not really involved with Python. But after another look, that page is
as incompetent as every other page of Python documentation.

The rewritten page is here:
http://xahlee.org/perl-python/python_re-write/lib/re-syntax.html

It's not complete, but is a start. The organization is largely taken
care of, except the last few paragraphs. The bottom half on capturing
and extension syntax i haven't started working on. In particular, they
need examples. The “repetitions” section also needs to be examined.

here are few notes on this whole rewriting ordeal.

-------------------

In the doc, examples are often given in Python command line interface
format, e.g.

>>> def f(n):
...     return n+1
...
>>> f(1)
2

instead of:

def f(n):
    return n+1
print f(1) # prints 2

the clean format should be used because it does not require familiarity
with Python command line, it is more readable, and the code can be
copied and run readily.

A significant portion of Python doc's readers, if not the majority, didn't
come to Python as beginning programers, and one way or another never
used or cared about the Python command line interface.

Suppose a non-Python programer is casually shown a page of Python doc.
She will get much more from the clean example than the version
cluttered with Python Command line interface irrelevancies.

Suppose now we have a experienced professional Python programer. Upon
reading the Python doc, she will also find examples in plain code much
more readable and familiar, than the version plastered with Python
Command line interface irrelevancies.

The only place where the Python command line look-and-feel is
appropriate is in the Python tutorial, and arguably only in the
beginning sections.

-----
Extra point: If the Python command line interface is actually a robust
application, like so-called IDE e.g. Mathematica front-end, then things
are very different. In reality, the Python command line interface is a
fucking toy whose max use is as a simplest calculator and double as a
chanting novelty for standard coding morons. In practice it isn't even
suitable as a trial'n'error pad for real-world programing.

Extra point: do not use the fucking stupid meaningless jargon
“interpreter”. 90% of its use in the doc should be deleted. They
should be replaced with "software", "program", "command line
interface", or "language" or others.

(I dare say that 50% of all uses of the word interpreter in computer
language contexts are inane. Fathering large amounts of
misunderstanding and confusion.)

-----
history of Python are littered all over the doc. e.g.
“Incompatibility note: in the original Python 1.5 release, maxsplit
was ignored. This has been fixed in later releases.”

99% of programers really don't need to give a flying **** about the
history of a language. Inevitably software including languages change
over time, however conservative one tries to be. So, move all these
changes into a "New and Incompatible changes" page at some appendix of
the lang spec. This way, people who are maintaining older code, can
find their info and in one coherent place. While, the general
programers are not forced to wade thru the details of fuckups or
whatnot of the past in every few paragraphs. (few exceptions can be
made, when the change is a major fuckup that all practicing Python
coders really must be informed regardless whether they maintain old
code.)

------

do not take a attitude like you have to stick to some artificial format
or order or "correctness" in the doc. Remember, the doc's prime goal is
to communicate to programers how a language functions, not how it is
implemented or how technically or computer scientifically speaking.

In writing a language documentation, there is a question of how to
organize it. This is a issue of design, and it takes thinking.

When a doc writer is faced with such a challenge, the easiest route is
a no-brainer: following the way the language is implemented. For
example, the doc will start with “data types” supported by the
language. This no-brainer stupidity is unfortunately how most language
docs are organized, and the Python doc is one of the worst.

One can see this phenomenon in the official doc of Python's RE module.
For example, it begins with Regex Syntax, then it follows with “Module
contents”, then Regex Objects, then Match Objects. And in each page,
the functions or methods are arranged in alphabetical order. This is
typical of the no-brainer organization following how the module is
implemented or certain “computer scientific logic”. It has only a remote
connection to how the module is used to perform a task.

In general, language docs should be organized by the tasks the language is
supposed to accomplish, then by each module or function's
functionalities.

For example, take the RE module doc and organize it by the purposes of the
module. To begin, we explain at the outset that this module is for
searching or replacing a string by a pattern. Then, we organize
with purpose and functionalities as a guide.

Since Python provides a set of functions and an Object-Oriented set, we
create a page for each set, with a clear indication of how they relate
to the string pattern search/replace task. Since Python returns the
result as a special Object, we again create a section MatchObject and
clearly tell the reader what that page is about in relation to the
task. And, we also put the regex syntax on its own page, but again make
it clear what this page means in relation to the task. And in each
page, we again organize them by the guide of tasks and functionalities
(for example, not alphabetically or by some machinery logic). In this way,
the whole RE module doc is oriented to programing, not how this module
happens to be classified according to some Python idiosyncrasies or
categorization by some forced “computer science” outlook.
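
As a rough illustration of the task-oriented grouping described above, here is a sketch in current Python 3 that touches each piece in the order a task would: searching with a module-level function, reading the match object, replacing, and then the compiled (object-oriented) form. The sample string and variable names are made up for illustration and do not come from either version of the doc.

```python
import re

text = "cat bat rat"

# Task 1: search a string for a pattern using a module-level function.
match = re.search(r"b\w+", text)

# The result is a match object; it reports what matched and where.
found = match.group(0)
position = match.span()

# Task 2: replace every occurrence of a pattern.
replaced = re.sub(r"[cbr]at", "hat", text)

# The object-oriented set: compile once, then call methods on the pattern object.
pattern = re.compile(r"b\w+")
same_found = pattern.search(text).group(0)

print(found, position, replaced, same_found)
```

The module-level call and the compiled-pattern call find the same match here; which form to document first is exactly the kind of organizational choice being argued about.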

The complete rewritten doc is here:
http://xahlee.org/perl-python/python_re-write/lib/module-re.html

-----

There were more issues and notes... but this will be it for today.

Xah
(e-mail address removed)
∑ http://xahlee.org/

Fredrik Lundh · May 5, 2005

Xah said:
I have now also started to rewrite the re-syntax page. At first i
thought that page need not be rewritten, since it's about regex and
not really involved with Python. But after another look, that page is
as incompetent as every other page of Python documentation.

The rewritten page is here:
http://xahlee.org/perl-python/python_re-write/lib/re-syntax.html

It's not complete

and it no longer describes how things work. study the inner workings
of the RE engine some more, and try again.

</F>

alex23 · May 6, 2005

Xah said:
99% of programers really don't need to give a flying **** about the
history of a language.

Ironically, I'm pretty confident that the same percentage of readers on
this group feel _exactly the same way_ about your 'improvements'.

-alex23

André Roberge · May 6, 2005

alex23 said:
Ironically, I'm pretty confident that the same percentage of readers on
this group feel _exactly the same way_ about your 'improvements'.

-alex23

I take it that when you use the expression "same percentage", you must
mean within a percent or so!
André

Bryan · May 6, 2005

Xah said:
Extra point: If the Python command line interface is actually a robust
application, like so-called IDE e.g. Mathematica front-end, then things
are very different. In reality, the Python command line interface is a
fucking toy whose max use is as a simplest calculator and double as a
chanting novelty for standard coding morons. In practice it isn't even
suitable as a trial'n'error pad for real-world programing.

i disagree with this 110%. i write python and jython code everyday at my
company and the python interpreter (or command line interface) is always running
on my computer whether it's from the command prompt, idle, pythonwin, pyshell,
etc.. using the interpreter while you are coding is an invaluable tool and
actually helps speed up software development which is opposite of what was
stated by xah lee. it allows complete freedom to experiment reducing the amount
of bugs that are in the real product. it's also useful to use the pywin modules
and experiment with the win32 api interactively, or use the jython interpreter
and experiment with some java api without any compilation step. i have never
found these interpreters to be anything but very robust and *IT IS SUITABLE* as
trial'n'error pad for real-world programming. the above comment can possibly
only be made by someone who doesn't actually use it for real world programming.

bryan

Steve Holden · May 6, 2005

Fredrik said:
Xah Lee wrote:

and it no longer describes how things work. study the inner workings
of the RE engine some more, and try again.

Though I realise I'm not one to gloat about other people's typos, I did
find that "When the LOCALE and UNICODE flags are apples as usual."
really appealed to my imagination. And I thought it was all ones and
zeroes ...

regards
Steve

Jeff Epler · May 6, 2005

To add to what others have said:

* Typos and lack of spell-checking, such as "occurances" vs "occurrences"

* Poor grammar, such as "Other characters that has special meaning
includes:"

* You dropped version-related notes like "New in version 2.4"

* You seem to love the use of <HR>s, while docs.python.org uses them
sparingly

* The category names you created, "Wildcards", "Repetition Qualifiers",
and so forth, don't help me understand regular expressions any better
than the original document

* Your document dropped some basic explanations of how regular
expressions work, without a replacement text:
Regular expressions can be concatenated to form new regular
expressions; if A and B are both regular expressions, then AB is
also a regular expression. In general, if a string p matches A and
another string q matches B, the string pq will match AB. [...] Thus,
complex expressions can easily be constructed from simpler primitive
expressions like the ones described here.
Instead, you start off with one unclear example ("a+" matching
"aaaahh!") and one misleading example (a regular expression that
matches some tiny subset of valid e-mail addresses)

* You write
Characters that have special meanings in regex do not have special
meanings when used inside []. For example, '[b+]' does not mean one
or more b; It just matches 'b' or '+'.
and then go on to explain that backslash still has special meaning; I
see that the original documentation has a similar problem, but this
just goes to show that you aren't improving the accuracy or clarity of
the documentation in most cases, just rewriting it to suit your own
style. Or maybe just as an excuse to write offensive things like "[a]
fucking toy whose max use is as a simplest calculator"
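
The two regex points in the list above (concatenation, and what stays special inside []) can be checked with a short sketch. This is written for current Python 3 (`re.fullmatch` needs 3.4 or later); the patterns and sample strings are chosen here for illustration and are not from either document.

```python
import re

# Concatenation: "abbb" matches A and "c" matches B, so "abbbc" matches AB.
A, B = r"ab+", r"c"
p, q = "abbb", "c"
assert re.fullmatch(A, p) and re.fullmatch(B, q)
concatenated_matches = re.fullmatch(A + B, p + q) is not None

# Inside [], '+' loses its repetition meaning: '[b+]' matches one 'b' or one '+'.
class_hits = re.findall(r"[b+]", "a+bb")

# Backslash keeps its special meaning inside []: '\d' still means "any digit".
digit_hits = re.findall(r"[\d+]", "1+x")

print(concatenated_matches, class_hits, digit_hits)
```

Joining pattern strings with `+` works for these simple patterns; a pattern containing a top-level `|` would need to be wrapped in a group first.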

I can't see anything to make me recommend this documentation over the
existing documentation.

Jeff

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQFCe8kWJd01MZaTXX0RAoJKAJ9SFnR2FZJ0zEZOyO3HWYEDLVQu4wCfQDz1
EkbIOwiWs/xg0Hn4EzmvVA4=
=Qv6B
-----END PGP SIGNATURE-----

Xah Lee · May 7, 2005

HTML Problems in Python Doc

I don't know what kind of system is used to generate the Python docs,
but it is quite unpleasant to work with manually, as there are
egregious errors and inconsistencies.

For example, on the “Module Contents” page (
http://python.org/doc/2.4.1/lib/node111.html ), the closing tags for
<dd> are never used, and all the tags are in lower case. However, on
the regex syntax page ( http://python.org/doc/2.4.1/lib/re-syntax.html
), the closing tags for <dd> are given, and all tags are in caps.

The doc's first lines declare a type of:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

yet in the files they use "/>" to close image tags, which is XHTML
syntax.

The doc litters <p> and never closes them, making it illegal as
XML/XHTML by breaking the minimal requirement of well-formedness.

Aside from correctness, the code is quite bloated, as is generally true
of generated HTML. For example, it is littered with: <tt id='l2h-853'
xml:id='l2h-853'> which isn't used in the style sheet, and i don't
think those ids can serve any purpose other than in style sheet.

Although the doc uses a huge style sheet and almost every tag comes
with a class or id attribute, it also profusely uses hard-coded
style tags like <b>, <big> and Netscape's <nobr>.

It also abuses tables that effectively do nothing. Here's a typical
line:
<table cellpadding="0" cellspacing="0"><tr valign="baseline">
<td><nobr><b><tt id='l2h-851' xml:id='l2h-851'
class="function">compile</tt></b>(</nobr></td>
<td><var>pattern</var><big>[</big><var>,
flags</var><big>]</big><var></var>)</td></tr></table>
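
One way to check claims like the unclosed <dd> and <p> tags is to tally opening and closing tags mechanically. The following is a minimal sketch using Python 3's standard `html.parser`; the `TagTally` class and the `sample` fragment are hypothetical, written here in the style described above rather than copied from the actual Python doc pages.

```python
from collections import Counter
from html.parser import HTMLParser


class TagTally(HTMLParser):
    """Count opening and closing tags seen in a fragment of HTML."""

    def __init__(self):
        super().__init__()
        self.opened = Counter()
        self.closed = Counter()

    def handle_starttag(self, tag, attrs):
        self.opened[tag] += 1

    def handle_endtag(self, tag):
        self.closed[tag] += 1


# Hypothetical fragment: <dd> and <p> are opened but never closed.
sample = (
    "<dl><dt>compile</dt><dd><p>Compile a pattern."
    "<dt>search</dt><dd><p>Scan a string.</dl>"
)

tally = TagTally()
tally.feed(sample)
tally.close()

# Tags that were opened more often than they were closed.
unclosed = {
    tag: tally.opened[tag] - tally.closed[tag]
    for tag in tally.opened
    if tally.opened[tag] > tally.closed[tag]
}
print(unclosed)
```

A tally like this only shows which end tags are missing; whether that matters depends on whether the page is being judged as HTML 4 or as XML/XHTML.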

If Python is supposed to be a quality language, then its
documentation's content and code seem to indicate otherwise.
-----------------------

This email is archived at:
http://xahlee.org/perl-python/re-write_notes.html

Xah
(e-mail address removed)
∑ http://xahlee.org/

☄

Xah Lee · May 7, 2005

erratum:

the correct URL is:
http://xahlee.org/perl-python/python_re-write/lib/module-re.html

Xah
(e-mail address removed)
∑ http://xahlee.org/

Skip Montanaro · May 7, 2005

Xah> I don't know what kind of system is used to generate the Python
Xah> docs, but it is quite unpleasant to work with manually, as there
Xah> are egregious errors and inconsistencies.

The main Python documentation is written in LaTeX. I believe most, if not
all, HTML is generated by latex2html. I suspect most of the HTML cruftiness
arises from latex2html.

Skip
