Docstrings and PEP 3147

Carl Banks

PEP 3147 got me thinking.

There is now a subdirectory to deposit as many *.pyc files as you want
without cluttering the source directory (never mind the default
case). Which means you can pretty much write files with impunity.

So I was wondering: what about a separate file just for docstrings?
__doc__ would become a descriptor that loads the docstring from the
file whenever it is referenced. The vast majority of the time
docstrings just take up memory and do nothing, so why not lazy load
those things?
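
As a rough illustration of the idea, here is a minimal sketch of a lazily loaded __doc__. It is only an approximation: plain function objects cannot be given a custom __doc__ descriptor from Python code, so the sketch wraps the function in a callable object instead. The names LazyDocFunction, externalize_doc and docstrings.json are hypothetical, and the JSON side file merely stands in for whatever file format a real implementation would use.

```python
import json
import os
import tempfile


class LazyDocFunction:
    # Callable wrapper; its __doc__ is a property (a descriptor) that
    # reads the text from a side file each time it is referenced.

    def __init__(self, func, doc_path):
        self._func = func
        self._doc_path = doc_path  # hypothetical per-module docstring file

    def __call__(self, *args, **kwargs):
        return self._func(*args, **kwargs)

    @property
    def __doc__(self):
        with open(self._doc_path, encoding="utf-8") as fh:
            return json.load(fh).get(self._func.__name__)


def externalize_doc(func, doc_path):
    """Move func's docstring into doc_path and return a lazy wrapper."""
    docs = {}
    if os.path.exists(doc_path):
        with open(doc_path, encoding="utf-8") as fh:
            docs = json.load(fh)
    docs[func.__name__] = func.__doc__
    with open(doc_path, "w", encoding="utf-8") as fh:
        json.dump(docs, fh)
    func.__doc__ = None  # drop the function's own reference to the string
    return LazyDocFunction(func, doc_path)


def spam(*args):
    "docs go here"
    return len(args)


doc_file = os.path.join(tempfile.mkdtemp(), "docstrings.json")
lazy_spam = externalize_doc(spam, doc_file)
```

Calling lazy_spam(...) never touches the side file; only reading lazy_spam.__doc__ does. In CPython the compiled code object may still hold its own copy of the string, so a real memory saving would most likely need support from the compiler and the import system.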

Yes I know you can use the -OO switch to omit docstrings--but who does
that anyway? I know I never use -O because I don't know if the people
who wrote the library code I'm using were careful enough not to
perform general checks with assert or to avoid relying on the
docstring's presence.
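
The concern can be made concrete with the built-in compile(), whose optimize argument mirrors the command-line switches (1 corresponds to -O, 2 to -OO). This is a small sketch; the function spam and the helper names are made up for the demonstration.

```python
SOURCE = '''
def spam(x):
    "docs go here"
    assert x > 0, "x must be positive"
    return x
'''


def load(optimize):
    # optimize=0 keeps everything, 1 drops asserts (like -O),
    # 2 drops asserts and docstrings (like -OO).
    namespace = {}
    code = compile(SOURCE, "<demo>", "exec", optimize=optimize)
    exec(code, namespace)
    return namespace["spam"]


def raises_assertion(func):
    # True if the assert statement inside func is still active.
    try:
        func(-1)
    except AssertionError:
        return True
    return False


plain, opt1, opt2 = load(0), load(1), load(2)

for name, func in (("default", plain), ("-O", opt1), ("-OO", opt2)):
    print(name, "doc:", func.__doc__, "assert active:", raises_assertion(func))
```

Library code that validates its arguments with assert, or that reads its own __doc__ at runtime, behaves differently at the higher levels, which is exactly the risk described above.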

Yeah, it's probably a minuscule optimization, but whatever, I'm just
throwing it out there.

Carl Banks
 
Steven D'Aprano

> PEP 3147 got me thinking.
>
> There is now a subdirectory to deposit as many *.pyc files as you want
> without cluttering the source directory (never mind the default case).
> Which means you can pretty much write files with impunity.
>
> So I was wondering: what about a separate file just for docstrings?

I'm not sure I understand what you mean by that. Do you mean that, when
writing code for a function, you would open a second file for the
docstring? If so, a big -INFINITY from me. The biggest advantage of
docstrings is that the documentation is *right there* with the function
when reading/writing code. If you're suggesting we should write them in
another file, no thank you.

If you mean a runtime optimization with no change to the source file,
then maybe, tell me more. What I *think* you mean is that the coder would
write:

    def spam(*args):
        "docs go here"
        pass

as normal, but when it is compiled and loaded into memory, the docstring
itself is *not* loaded until needed.

If so, then I think you'd need to demonstrate significant practical
benefit to make up for the complexity. I imagine Python-Dev will be very
dubious.

> __doc__ would become a descriptor that loads the docstring from the file
> whenever it is referenced.

"The file" being the source file, or a separate docstring file, or a
temporary file generated by the compiler, or... ?

> The vast majority of the time docstrings
> just take up memory and do nothing, so why not lazy load those things?

A guarded and provisional +0 on that. +1 if you can demonstrate real
performance (memory or speed) gains.
 
Gregory Ewing

Steven said:
> If you mean a runtime optimization with no change to the source file,
> then maybe, tell me more.

Note that you don't necessarily need a separate file for this.
It could just be a separate part of the same file.
 
Terry Reedy

> Note that you don't necessarily need a separate file for this.
> It could just be a separate part of the same file.

Which is to say, all the docstrings in a module *could* be placed at
the end and not normally read in by the interpreter until needed. I have
no idea what it does now, but I suspect not that. It might make module
loading a bit faster.
 
Carl Banks

> Which is to say, all the docstrings in a module *could* be placed at
> the end and not normally read in by the interpreter until needed.

I'm going to guess that they don't want that in *.pyc files. In PEP
3147 they proposed a fat-format file (so a blob for each version) and
it was not popular.

> I have no idea what it does now,

There's a short header, then the rest of the file is a single
marshaled blob.
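
That layout can be checked directly. The sketch below assumes CPython 3.7 or later, where the header is 16 bytes (older versions used 8 or 12); the file names and HEADER_SIZE are illustrative only.

```python
import importlib.util
import marshal
import os
import py_compile
import tempfile

HEADER_SIZE = 16  # assumption: CPython 3.7+; earlier versions differ

workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "demo.py")
with open(source_path, "w", encoding="utf-8") as fh:
    fh.write('def spam(*args):\n    "docs go here"\n    pass\n')

# Compile to an explicit .pyc path; optimize=0 keeps the docstring.
pyc_path = py_compile.compile(
    source_path,
    cfile=os.path.join(workdir, "demo.pyc"),
    doraise=True,
    optimize=0,
)

with open(pyc_path, "rb") as fh:
    data = fh.read()

header, payload = data[:HEADER_SIZE], data[HEADER_SIZE:]
magic_ok = header[:4] == importlib.util.MAGIC_NUMBER

# Everything after the header is one marshaled code object for the module.
code = marshal.loads(payload)

namespace = {}
exec(code, namespace)
print(magic_ok, type(code).__name__, namespace["spam"].__doc__)
```

Because the docstrings travel inside that single marshaled object, they are read along with everything else at import time; loading them separately would mean changing the file format.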

> but I suspect not that. It might make module
> loading a bit faster.

True but still probably a small optimization.


Carl Banks
 
Steven D'Aprano

> Note that you don't necessarily need a separate file for this. It could
> just be a separate part of the same file.

I would disagree with any attempt to move the docstring away from
immediately next to the function in the source file.

You can currently do that if you insist, after all __doc__ is just an
ordinary attribute:
    >>> def f():
    ...     pass
    ...
    >>> f.__doc__ = "Doc string"

I don't think we should do anything that *encourages* people to separate
the code and docstrings in the source file.

On the assumption that functions will continue to be written:

    def f():
        """Doc string"""
        pass


I like the proposal to make f.__doc__ a descriptor that lazily loads the
doc string from disk when needed, rather than keeping the string in
memory at all times. Whether this leads to enough memory savings to
justify it is an open question.
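
One way to start answering that question is to measure how much memory a module's docstrings occupy. The following is only a rough sketch: docstring_bytes is a hypothetical helper, it looks at the module itself, its top-level functions and classes, and plain methods defined on those classes, and it uses sys.getsizeof, which reports the size of each string object and nothing more.

```python
import inspect
import json
import sys


def docstring_bytes(module):
    """Roughly total the sys.getsizeof() of docstrings defined in module."""
    seen = set()
    total = 0

    def add(obj):
        nonlocal total
        doc = getattr(obj, "__doc__", None)
        # Count each distinct string object once.
        if isinstance(doc, str) and id(doc) not in seen:
            seen.add(id(doc))
            total += sys.getsizeof(doc)

    add(module)
    for _, member in inspect.getmembers(module):
        if not (inspect.isfunction(member) or inspect.isclass(member)):
            continue
        # Skip objects merely imported into the module from elsewhere.
        if getattr(member, "__module__", None) != module.__name__:
            continue
        add(member)
        if inspect.isclass(member):
            for attr in vars(member).values():
                if inspect.isfunction(attr):
                    add(attr)
    return total


size = docstring_bytes(json)
print("json docstrings: about", size, "bytes")
```

Running this over the modules an application actually imports would give a first estimate of the upper bound on what lazy loading could save.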
 
