CPAN vs. POD outside of .pm (.pl) files?

Ivan Shmakov

I see that CPAN automagically extracts the POD documentation out
of the .pm and .pl files and presents it as HTML.

However, now I decide to split the documentation off the .pm's.
How do I request CPAN to extract my documentation out of
stand-alone POD files instead? (and associate it with the
respective .pm's?)

TIA.
 
Ivan Shmakov

What do you mean by 'CPAN'? The CPAN shell doesn't normally do this.
Do you mean search.cpan.org?

Yes, I've meant http://search.cpan.org/ specifically, even
though I've inaccurately referenced the whole cpan.org
infrastructure.

I didn't mean cpan(1).
search.cpan.org already displays a list of all the .pod files in a
distribution under the 'Documentation' section. If a .pm file has no
Pod, and there is a .pod file next to it, it moves the .pod link up
into the 'Modules' section.

(Which makes me wonder where is it documented?)
See for example Net::SSLeay.
It's probably important to get the NAME section of the Pod right. I
don't exactly know how the search.cpan.org/perldoc?foo links it uses
for L<> work, but I suspect they're indexed based on the NAME
section.

ACK, thanks! Hopefully, such indexing won't insist on the use
of a HYPHEN-MINUS (U+002D) there, instead of the arguably more
appropriate EN DASH (U+2013).

(FWIW, http://search.cpan.org/perldoc?Net::SSLeay appears to
work. Yet it uses the conventional HYPHEN-MINUS.)
 
Ivan Shmakov

Ben Morrow said:
[...]
Yes, I've meant http://search.cpan.org/ specifically, even though
I've inaccurately referenced the whole cpan.org infrastructure.
I didn't mean cpan(1).
OK. I asked because I think it's possible to configure at least
cpanp to install HTML documentation, and IIRC ActiveState have or
used to patch their CPAN.pm to do the same.

That's interesting. Thanks.

[...]
I don't think it is. search.cpan.org is not part of the CPAN
infrastructure per se, it was just a useful website written by Graham
Barr which was given a domain under cpan.org.

Which seems to make it quite a "part," at least for a casual
user like me.
I believe the intention is that it should index things in the same
way as CPAN.pm and perldoc.

ACK, thanks.
I wouldn't muck about with the formatting of the NAME section.

FWIW, http://search.cpan.org/ seems to handle it just fine.
Consider, e. g.:

http://search.cpan.org/perldoc?Tree::Range::base
http://search.cpan.org/perldoc?Tree::Range::RB

Presumably, it just indexes the PODs by the filename.
pod2man in particular is quite picky about it, and there are other
tools which rely on the format being right.

As it seems, ExtUtils::MakeMaker assumes (SPACE, HYPHEN-MINUS,
SPACE) for the delimiter while handling ABSTRACT_FROM, and I'm
considering it a bug (yet to be filed.)
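
To make the delimiter issue concrete, here is a minimal sketch (in Python, purely for illustration) of a NAME-line parser that accepts only SPACE, HYPHEN-MINUS, SPACE between the title and the abstract. The helper name `split_name_line` and its regular expression are assumptions made for this example; this is not ExtUtils::MakeMaker's actual code.

```python
import re

# Hypothetical helper, not taken from ExtUtils::MakeMaker: split a Pod NAME
# line of the form "Title - abstract" on the ASCII " - " delimiter only.
_NAME_LINE = re.compile(r"^\s*(\S+)\s+-\s+(.*\S)\s*$")

def split_name_line(line):
    """Return (title, abstract), or None if the ASCII delimiter is absent."""
    match = _NAME_LINE.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2)

# A NAME line using HYPHEN-MINUS (U+002D) is split into its two parts.
print(split_name_line("Digest - Modules that calculate message digests"))
# The same line using EN DASH (U+2013) is not recognized by this parser.
print(split_name_line("Digest \u2013 Modules that calculate message digests"))
```

A tool written along these lines would silently find no abstract for a NAME line that uses an EN DASH, which is the kind of behaviour being discussed here.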

I've not seen any problem with pod2man(1) vs. NAME as of yet.
What should I take note of?

(It appears to assume that --quotes= is a string of two
/octets/, not two /characters,/ though.)
 
Ivan Shmakov

Ben Morrow said:
[...]
As it seems, ExtUtils::MakeMaker assumes (SPACE, HYPHEN-MINUS,
SPACE) for the delimiter while handling ABSTRACT_FROM, and I'm
considering it a bug (yet to be filed.)
It's not a bug. It's part of the syntax of a properly-formatted
perldoc.

Which I see no reason /not/ to extend.
Pod::Checker looks for a hyphen as well.
As I said, don't muck about with the formatting, there's no point.
Note that Pod::Man (at least) converts that hyphen into the roff
escape sequence for an endash (along with other instances of " - "),

Frankly, I consider the unconditional replacement of " - " to be
a hack by itself.

Why, I've seen a Usenet poster who'd use groff to format his
messages. Guess what he'd end up with when quoting code?
so if you don't get endashes in the output it's because your
formatter doesn't know how to produce them.

Apparently, the HTML formatter at http://search.cpan.org/
doesn't know how to produce EN DASHes, either.

--cut: http://search.cpan.org/perldoc?Digest --
NAME ^-

Digest - Modules that calculate message digests
--cut: http://search.cpan.org/perldoc?Digest --

Note the HYPHEN-MINUS propagated to the resulting HTML.

... Indeed, my first thought was to use DocBook or XHTML for the
documentation right from the start, so to completely avoid all
those 40 years of formatting mess. Somehow, however, I became
assured that persuading http://search.cpan.org/ to allow for
XHTML documentation would be next to impossible a task, which is
why I've ended up following the mainstream.

Not that I'm particularly happy with it.

(A reminder to myself: suggest updates to [1].)

[1] http://www.tldp.org/HOWTO/DocBook-Demystification-HOWTO/
'groff -man -Tps' at least will get them right.

Please note that -Tps produces not a document, but a program to
be executed (by a PostScript interpreter, in this case), which
has implications to both security and software freedom.

(Not unlike HTML "adorned" with JavaScript, Java, or
Adobe Flash, which became such a commonplace on the Web.)

Therefore, unless there's a very good reason to use PostScript,
my suggestion would be to always stick to PDF. (Or perhaps SVG,
as far as single-page vector graphics are concerned.)
Since the roff emitted by pod2man is normally ASCII-only, what's the
difference?

First of all, pod2man(1) supports --utf8. Then, even if
ASCII-only roff code is requested, pod2man(1) should try to
convert the --quotes= characters to the appropriate roff
escapes, just as it's claimed it does for non-ASCII sources:

[...] Many *roff implementations cannot handle non-ASCII
characters, so this means all non-ASCII characters are converted
either to a *roff escape sequence that tries to create a properly
accented character (at least for troff output) or to "X".
 
Ivan Shmakov

'I see no reason not to extend the syntax of HTML to allow Unicode
quotes as well as ASCII. They're *so* much prettier.' (cf. xthread.)

The HTML syntax /allows/ Unicode quotes. Inside the payload,
that is. (Which the text of the NAME section certainly is.)

The good thing about HTML is that it doesn't try to parse
anything outside of (roughly) the <tags /> and &entities;.

Ever.

[...]
Pod is, by design, a somewhat loosely-specified format, mostly or
(originally) entirely in ASCII, which relies on the formatter to make
things look pretty where that's necessary. The format has been
tightened up a little recently (it's no longer considered appropriate
for the formatter to turn random references like 'printf(3)' into
L<>, for instance), but this sort of intuition about punctuation is
entirely expected. Inconsistency is also, necessarily, expected.

... And so is unpredictability.
Apparently not. Apparently whoever wrote the relevant bit of code
didn't think it was terribly important.

But I do. So, assuming that my intent is to provide quality
documentation (both the contents and the form) for the users of
the software I develop, should I satisfy the NAME convention, at
the cost of having to host the /proper/ HTML renditions of the
documentation by myself? Or should I instead disregard the
convention -- used by the developer's tools I won't use myself
anyway, -- to ensure that certain well-known Web resource will
have the documentation rendered properly?
Ewww, yuck. Formats designed by and for pedants.

... Remind me not to ask you about TEI, then...
Heh. I can just picture the conversation... Not to mention that
command-line perldoc would no longer function, making your modules
unusable.

"Command-line perldoc"? What's it?
That is a relevant concern under some circumstances; this is not one
of them.

It's a valid concern whenever the code comes from a generally
untrusted source. Such as from a Web site its author put it to.
(Which is how the documentation for free software packages is
often distributed.)
(I have, somewhat reluctantly, moved to using PDF instead of
PostScript almost exclusively. I like PostScript: it's comfortingly
insane. (The same could be said about Perl.))

Still, I don't quite understand why one might want to use an
ad-hoc graphics language, when there're general-purpose ones,
with a number of graphics libraries to choose from? (And that
includes Perl, BTW.)

Pretty much the same applies to the ad-hoc formatter languages,
such as roff or TeX. Or to the usual hacks, like having the
"document conversion" chain run as follows:

document.foo -> (conversion) -> program -> (interpreter) -> document.bar
Don't be so ridiculous.

Well, looking at the license the software I use to produce
PostScript may attach to the pieces of the code which end up in
the resulting "document" isn't exactly the thing I'd like to
spend my time on.

The same applies to the license for the JavaScript code the Web
sites I visit employ. Which is one more reason to prefer Lynx.
In this particular case I had a rather good reason: my version of
groff doesn't have a -Tpdf device.

To me, it looks much more like a very good reason to update the
particular groff install.

[...]
That's new(ish), and not particularly well-supported.

The more's the effort, the better's the support. And
identifying (and reporting) bugs is part of such an effort.

(Unless the agreement would be to just drop POD altogether, and
move on to the better tools. Which I still hope for, even
understanding all the improbability of such a decision.)
The characters passed to --quotes aren't the characters as they will
appear in the output, they are roff escapes. (I don't really
understand roff, but the characters you pass are inserted directly
into a .ds line.)

Which means that it may require a more thorough code change to
fix the issue. (Thanks for the pointer, BTW.)
Remember, all this stuff comes from perl 5.000, before perl (or
groff, I expect) had any sort of Unicode support.

How could this justify having the bug remain unfixed?
 
Ivan Shmakov

Ben Morrow said:
[...]
Which means that it may require a more thorough code change to fix
the issue. (Thanks for the pointer, BTW.)
... no, it means that changing it would break backcompat, so it's
probably not going to happen.

How would it? Currently, the use of --quotes= with arguments
other than the two-octet ones and "none" results in an error.
Having pod2man interpret multi-octet sequences as two-character
ones looks like a "pure" extension.

Besides, that appears to contradict your own claim below that
"pod2man [is not being run] with --quotes for a very long time."
(Not to mention that the whole question of dealing with non-ASCII
command-line arguments is largely unsolved on non-Win32 systems.)

On POSIX systems, I'd expect non-ASCII command-line arguments to
be passed as octet sequences, in the encoding specified by the
LC_CTYPE category.

It has worked for me so far, BTW.
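
The octets-versus-characters distinction can be illustrated with a short sketch (in Python, only for illustration). It assumes a UTF-8 locale and a pair of quotes chosen for the example, U+201C and U+201D; `decode_quotes` is a made-up name, and this is not how pod2man actually processes its --quotes argument.

```python
def decode_quotes(raw, encoding="utf-8"):
    """Decode a raw command-line argument and require exactly two characters.

    Hypothetical helper for illustration: `raw` stands for the octet sequence
    a POSIX program receives in argv, and `encoding` stands for the encoding
    named by the LC_CTYPE locale category (assumed to be UTF-8 here).
    """
    text = raw.decode(encoding)
    if len(text) != 2:
        raise ValueError("expected exactly two characters, got %d" % len(text))
    return text[0], text[1]

# Left and right double quotation marks, as they would arrive in argv.
raw = "\u201c\u201d".encode("utf-8")
print(len(raw))            # number of octets in the argument
print(decode_quotes(raw))  # the two decoded characters
```

In UTF-8 this pair occupies six octets, so a check that counts octets rejects it, while a check that decodes first and then counts characters accepts it, and plain ASCII pairs keep working either way.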
Because no one in a position to change it thinks it's a bug, or thinks
it's worth fixing? We're talking about pod2man here; I doubt
anyone's run it with --quotes for a very long time.

Which appears to make the compatibility concerns irrelevant.
 
Ivan Shmakov

Ben Morrow said:
[...]
The HTML syntax /allows/ Unicode quotes. Inside the payload, that
is. (Which the text of the NAME section certainly is.)
No, it isn't. That's exactly my point. The content of the NAME
section is syntax, and it needs to look like
<title> - <abstract>
with an ASCII hyphen. You may not like this, but that's the way it
is.

I don't like, and it won't happen on my Pods.

It's free software, though. Anyone's free to take the Pods and
improve (or "improve") them as one sees fit.

[...]
The former. The assumption about NAME formatting is widespread, and
you don't know what types of systems people might be using your
module on or what sort of Pod-formatting tools they might have.
Portability is more important than typographical niceties.

Well, let's see if there'd be any actual bug reports...

[...]
Are you serious? Run 'perldoc Pod::Man' from your shell prompt.

$ perldoc Pod::Man
You need to install the perl-doc package to use this program.
$

So?

But if you mean that perldoc is just a fancy way to extract the
Pods out of the sources, convert them into executable roff code,
interpret them with roff, and produce a kind of "extended" ASCII
document, -- then I'd like to note that the intermediate roff
code is already present on my system, and M-x woman in Emacs
shows it without actually executing it with a roff interpreter.
More generally, providing HTML documentation means you *only* provide
HTML documentation.

It doesn't. Arguably, a profile of even the good old
HTML 4.0 Strict that matches the expressiveness of Pod would be
easier to convert into a variety of formats than Pod itself.

However, I understand that it's a common misconception.
Hopefully, I'd be able to prepare some counter-examples for the
SFD this September.
Pod can be converted to a great many formats (that's the point).

So can be DocBook.

[...]
Perl documentation distributed in any format other than Pod is
worthless, since perldoc can't find it.

? How does this relate to my suggestion to avoid PostScript?
Because my printer speaks PostScript? (Actually it doesn't, but
historically that was the reason for using it.)

Nowadays, the majority of printers speak neither PostScript nor
PDF. It's the host the printer's attached to that does. And
I'd argue that the host speaks PDF more often than PostScript.
Me either. What makes you think the Turing-completeness of the
language used makes any difference to that, though?

It doesn't. Yet I fail to understand the purpose the software
may embed some "hidden" non-code /creative/ work of its
developer into the resulting document.
I very much doubt any such licences are enforceable, in any case;

Perhaps; IANAL.
certainly not if you're just using the document as a document, and
not trying to pick it apart and use the bits to write your own
PostScript driver.

If /I/ release the document in question under, say, CC BY-SA,
how the recipient (licensee) is expected to know that some parts
of the document's own digital representation are in fact under
some other license, issued by a third party?

[...]
That would be the FreeBSD base system. The groff there is not going
to be updated, it's going to be replaced, because newer groffs are
GPLv3.

I was unaware that the FreeBSD developers are opposing GPLv3.
And why do they, BTW? (It was always my impression that FreeBSD
is more lax regarding the licenses than, say, Debian.)

Anyway, isn't it possible to install an (additional) groff
instance from the ports?
We're not going to do that. We *like* Pod.

Slightly tangential to the discussion is the question whether
the "total manpower" of /we/ is currently rising or diminishing?
Personally, when I'm writing technical documentation, I would write
in Pod for choice. I appreciate its lack of clutter.

... And also predictability, structure, the ease to run
structured searches against, etc.?

[...]
 
Ivan Shmakov

Ben Morrow said:
[...]
It doesn't. Yet I fail to understand the purpose the software may
embed some "hidden" non-code /creative/ work of its developer into
the resulting document.
Fonts? And they usually have quite restrictive licences, from a
'reusing in other documents' point of view.

Indeed. Though in practice, I fail to recall using any fonts
that fail to meet DFSG for my documents recently.

[...]
By applying common sense. Your copyright and therefore your licence
applies to what you created, that is, to the content of the document.

And how the recipient is intended to know that?

The same applies to the other kinds of works, though. But for
the images and such, I'd expect the copyrights to be properly
stated in the /visible/ part of the document. For the fonts, it
may seem a bit excessive. Still, for the embedded code parts,
-- I wouldn't expect it to happen at all.

[...]
 
Ivan Shmakov

Your perl install is broken;

"It's not a bug."
it has been mangled by over-zealous packagers looking to save a few
bytes. You need to find which magic package to install to give you
the whole thing.

Fortunately, with the error message quoted above, the packagers
already made that a trivial thing to do.

JFTR: the purpose of such splits is to avoid the installation of
the documentation on the hosts that merely /run/ existing code,
and aren't used for the actual development. The installs I use
for Perl development /ought/ to contain "perl-doc," but
I forgot to do it for this particular one, and I'm not quite
inclined to touch it in the foreseeable future. In the
meantime, I use http://search.cpan.org/perldoc? and
http://perldoc.perl.org/. (Besides, Lynx is almost as fancy as
the Emacs' own WoMan browser.)
OK, if that works for you. A lot of Perl programmers use perldoc,
and rely on it working properly, so providing documentation perldoc
can't find is not helpful.

Nowhere have I opposed the use of perldoc. What I contest is
the use of perldoc as the ultimate judge of the documentation
format.

To put it another way: if perldoc can't find the documentation,
-- it's clearly a bug. But it's not necessarily a bug in the
/documentation/ itself.

That being said, I'm (obviously) not familiar with perldoc.
Thus, I'm curious if there's any compelling reason for perldoc
/not/ to support, say, DocBook (or DITA, HTML, TEI, etc.)?

[...]
Why would I want to do that? The only thing I use groff for is to
read manpages.

ACK. Although I've found some other uses for groff on
occasions, as I've stated before, I don't usually use it for
reading the manpages, either.

[...]
If Perl is dying (which it isn't, by the way), it's not because we
don't use DocBook.

I didn't assert either of that. (Even though there're several
definitions of "dying," I guess I understand what you mean.)

My point is that I know of no reason for a programmer looking
for a new language to learn to choose Perl, and I'm not actually
seeing a lot of newcomers to join this group lately, either.
(As compared to, say, and
Why, should they get a Perl course on
Coursera, wouldn't it be rightful to call it "The glorious, and
overly long, history of the Perl programming language", or
something like that?)

The other point to note is that even though I'm using Perl for
almost a decade and a half now (on and off), I still can't make
head or tail of it at times. On the contrary, while I have put
virtually no effort to learn Python whatsoever, I seem to
understand the code written in it quite well.

So, there're two reasons for me to stick to Perl. First of all,
it has a rich set of (quality) libraries (although Go, and
perhaps Python, Racket, etc. may surpass it in the near future,
if not already have), which appear to cover my demands well.

The other reason is that Perl isn't of the "one size fits all"
type. Contrast it with Python ("one indentation fits all"), or
Racket ("one package format fits all"), or Go ("one
documentation format fits all")...

... Or is it?

[...]
 
Rainer Weikusat

Ben Morrow said:
Your perl install is broken; it has been mangled by over-zealous
packagers looking to save a few bytes. You need to find which magic
package to install to give you the whole thing.

"Please note that whatever precisely constitutes 'the whole thing'
might be subject to change with little or no advance warning based on
what 'certain developers' presently do or don't consider fashionable",
possibly based on outright irrational reasons (or intentionally
disingenuous mock arguments. It is not always possible to distinguish
which is which) such as

I find no small irony in a few posts asking whether there's
any reason to use Moose or Mouse or Moo when you can write
your own object accessors by hand (I wrote my own templating
system by hand too. No more.)

Side remark: It should be noted that the suspicion that what was
supposed to be the be-all and end-all of YARFPOO has already again
'logically' splintered into three different semi-compatible
implementations with mutually exclusive design goals is correct. More
to follow as time goes by and old non-solutions to non-problems are
abandoned because their non-maintainers get bored with them and people
discover more aspects in which all of the existing YARFPOOs are deficient
in this or that way and hence, set forth to - once and for all this
time! - nail the jello to the tree in the perfect way 'from scratch'
all over again. Structural similarities to daily soap opera
installments are unintentional but very likely not accidental.

I wonder if somebody ever really asked a so thoroughly stupid question
when considering the scope of the moose mouse that mooed versus
'writing accessors'. This looks suspiciously like a strawman. 'writing
accessors' is not a particularly sensible activity in its own right:
Objects should provide behaviour and not hierarchically structured
storage. If a particular 'behaviour' can sensibly be abstracted away
from a specific way of handling state information, it shouldn't be
tied to one: The implementation should be capable of working with all
kinds of objects providing a compatible interface and not be part of
any of them. Comparing 'writing accessors' to 'writing a template
system' is again totally bogus: The tasks are of vastly differing
technical complexity.

As a rule of thumb, people resort to sophisms when marketing their
opinions when they can't think of any better way to further their
causes. This may be because they themselves know that they are wrong
or - considering that 'web development' is what the guy who runs the
advertising agency dabbles in - because they're really marketing
specialists to whom all this 'programming stuff' is part of the cost
of displaying advertisements they'd rather (and totally rationally
in this case) want to get rid of.

The problem with this is that computers do more (and much more
sophisticated) things than "being your plastic pal who's fun to be
with" and what minimizes the per-case workload of the guy who decides
on the colour said 'plastic pal' should have today (and hence, helps
him to maximize his income for a given time period) might not be the
most sensible way to construct 24x7 autonomously operating software
system people rely on in order to get some (not inherently
computer-related) job done.
 
Ivan Shmakov

[...]
As a rule of thumb, people resort to sophisms when marketing their
opinions when they can't think of any better way to further their
causes. This may be because they themselves know that they are wrong
or - considering that 'web development' is what the guy who runs the
advertising agency dabbles in - because they're really marketing
specialists to whom all this 'programming stuff' is part of the cost
of displaying advertisements they'd rather (and totally rationally in
this case) want to get rid of.
The problem with this is that computers do more (and much more
sophisticated) things than "being your plastic pal who's fun to be
with" and what minimizes the per-case workload of the guy who decides
on the colour said 'plastic pal' should have today (and hence, helps
him to maximize his income for a given time period) might not be the
most sensible way to construct 24x7 autonomously operating software
system people rely on in order to get some (not inherently
computer-related) job done.

Even though I cannot quite parse these two paragraphs (is it
some kind of Perl code, BTW?), I seem to wholeheartedly agree
with this way of thought itself.
 
Rainer Weikusat

Ivan Shmakov said:
Rainer Weikusat writes:
[...]

As a rule of thumb, people resort to sophisms when marketing their
opinions when they can't think of any better way to further their
causes. This may be because they themselves know that they are wrong
or - considering that 'web development' is what the guy who runs the
advertising agency dabbles in - because they're really marketing
specialists to whom all this 'programming stuff' is part of the cost
of displaying advertisements they'd rather (and totally rationally in
this case) want to get rid of.
The problem with this is that computers do more (and much more
sophisticated) things than "being your plastic pal who's fun to be
with" and what minimizes the per-case workload of the guy who decides
on the colour said 'plastic pal' should have today (and hence, helps
him to maximize his income for a given time period) might not be the
most sensible way to construct 24x7 autonomously operating software
system people rely on in order to get some (not inherently
computer-related) job done.

Even though I cannot quite parse these two paragraphs (is it
some kind of Perl code, BTW?), I seem to wholeheartedly agree
with this way of thought itself.

Hmm ... I admit that I'm at least partially guilty of the same thing
(I was criticizing) :). The important meta-bit would be "Think for
yourself".
 
Rainer Weikusat

Rainer Weikusat said:
[...]

Hmm ... I admit that I'm at least partially guilty of the same thing
(I was criticizing) :).

OTOH, I can't help thinking that it is a very strange coincidence that
Moose is positively unusable for CGI programs, provided that what is
reported about the time needed to compile it is correct (or - for that
matter - for anything which isn't 'a long-running application'[*]) and
that outspoken proponents of it want to make it more difficult for
people to write CGI programs by removing CGI.pm from the Perl
distribution.

[*] Imagine that people actually write modularized and
object-oriented system configuration tools ...
 
Ivan Shmakov

[...]
OTOH, I can't help thinking that it is a very strange coincidence
that Moose is positively unusable for CGI programs, provided that
what is reported about the time needed to compile it is correct (or -
for that matter - for anything which isn't 'a long-running
application' [*]) and that outspoken proponents of it want to make it
more difficult for people to write CGI programs by removing CGI.pm
from the Perl distribution.

CGI.pm appears to mix up CGI support and HTML generation, which
is a thing I see no good reason to do. So, I tend to advocate
in favor of replacing it with CGI::Simple whenever possible.
(My Web pages are generated with XML::LibXML::toFH (), anyway.)

Moreover, thanks to Fast CGI (and FCGI.pm), it's possible to
serve multiple HTTP requests without restarting the CGI code.
And, it may also allow for more flexible privilege separation
(than mod_suexec, anyway.)

As for Moose, I've scanned through the documentation, but didn't
quite grasp its utility as of yet.
[*] Imagine that people actually write modularized and
object-oriented system configuration tools ...
 
Rainer Weikusat

Ivan Shmakov said:
Rainer Weikusat writes:
[...]

OTOH, I can't help thinking that it is a very strange coincidence
that Moose is positively unusable for CGI programs, provided that
what is reported about the time needed to compile it is correct (or -
for that matter - for anything which isn't 'a long-running
application' [*]) and that outspoken proponents of it want to make it
more difficult for people to write CGI programs by removing CGI.pm
from the Perl distribution.

CGI.pm appears to mix up CGI support and HTML generation, which
is a thing I see no good reason to do.

The 'good reason' would be that CGI programs usually both consume
input and produce output. While I'm not particularly fond of the HTML
generation support in CGI.pm, the basic idea of generating HTML using
a procedural interface has something going for it: In this way, it is
possible to use a 'high-level interface' whose parts represent the
logical structure of a form. This makes modifications of this logical
structure much easier than when having to deal with some kind of
proto-HTML markup language with an HTML-like syntax and some typically
fairly 'dumb' (insofar as programmability goes) support for value
interpolation. 'Circumstances' forced me to get some first-hand
experiences with JSF and RichFaces and the 'template pages' generated
in this way are generally huge, repetitive angle bracket swamps. I
understand that 'the typical programmer' never uses a loop (or -
heaven forbid - a subroutine) when he can just copy'n'paste identical
code fifteen times in a row (and thus minimize the amount of work
needed for each of the slightly different fifteen individual cases)
but the result of this is an unwieldy and rigid structure which
reminds me of a frozen rubbish heap (extended by throwing more stuff
onto it and waiting for it to freeze but never modified in any other
way as this would be prohibitively expensive).
So, I tend to advocate
in favor of replacing it with CGI::Simple whenever possible.
(My Web pages are generated with XML::LibXML::toFH (),
anyway.)

I've had a look at that. A missing feature I absolutely need jumped at
me immediately: Support for accessing uploaded file data without
creating a temporary file first. Neither CGI.pm nor CGI::Simple seem
to be 'maintained' in the sense that anybody bothers to deal with CPAN
bug reports, however, this here

https://rt.cpan.org/Public/Bug/Display.html?id=64160

is an absolute showstopper for me: I have to humour people with a
seriously high level of professional paranoia (yes, I do mean that) and
'open CVEs' are ratpoison in this respect. I can, of course, maintain
the code myself but I could as well just write it myself and the
result would very likely be less buggy and perform better for my use
cases.
Moreover, thanks to Fast CGI (and FCGI.pm), it's possible to
serve multiple HTTP requests without restarting the CGI code.

.... 'we can just write a long-running application instead' ... well,
yes. I've also done that in the past, although based on mod_perl (the
mod_perl I have even behaves like its documentation says it should
because I forced it to ...). But if I can get by with the more
'UNIX(*)-style' approach of using relatively small, independent
cooperating processes, I prefer to do that. Maybe because I'm old
enough that my first impression of this world wasn't the 'designed for
Windows 98' logo but young enough to feel no haste to chase whatever
happens to be modern now because it happens to be modern now (lest I
could be ... left behind !!1) but so be it.
And, it may also allow for more flexible privilege separation
(than mod_suexec, anyway.)

Being able to switch UIDs on UNIX(*) means 'running with elevated
privileges' and if I don't absolutely have to use some 'huge' piece of
software for that whose innards are essentially unknown to me, I
prefer to avoid that. And this means small, setuid-0 C programs which
don't perform any function except 'uid switching', usually, to a
hard-coded persona, and only executable by the user supposed to execute
them.
 
Ivan Shmakov

Is there any compelling reason for DocBook et al to not support
perldoc? I'd be happy if the Perl community magically switched to
DocBook, but as a practical matter the cost of converting everything
would be too high.
Yes.

A mixture of two different documentation formats could get very ugly
very quickly.

Why?

To note is that there already /is/ a mixture, assuming that one
uses an operating system which isn't written entirely in Perl.

So, we have pure-roff; roff generated from a variety of sources
(including both DocBook and Pod); AsciiDoc; Markdown; plain text
(mainly READMEs; in ASCII, UTF-8, and, occasionally, other
encodings); Texinfo; and what not. And all of that appears to
work reasonably well in practice.

... Which I've actually mentioned (kind of.) But then, take a
look at, say, [1, 2].

... But it may be a reason worth considering. We're currently
preparing for the local SFD event, and I guess we may invest
some time in writing a dozen or so of blog entries on free
software (and digital freedom in general.) Hopefully, I'd be
able to write something reasonably decent on Perl. Naturally,
CPAN would be the first feature to mention.

[1] http://pypi.python.org/pypi
[2] http://code.google.com/p/go-wiki/wiki/Projects
The most arcane part of Perl is the Regex syntax;

To me, the most arcane part of Perl is that it behaves as if it
actually /is parsed/ with a bunch of REs.

--cut: https://en.wikipedia.org/wiki/Perl --
[...] One consequence of this is that Perl is not a tidy language.
It includes many features, tolerates exceptions to its rules, and
employs heuristics to resolve syntactical ambiguities. [...]
--cut: https://en.wikipedia.org/wiki/Perl --

The "dualvar" scalars I've recently discovered also do not look
like a particularly clever concept.
Python and Ruby are in the same tradition. I'd really prefer
something more in the tradition of Icon, SNOBOL and Wylbur.

Could you please show some example, comparing the approaches?
"We hates it, precious, we hates the nasty thing." I'll take
semicolons and a prettyprinter, TYVM.

So will I.

And the same for Go, which allows for:

    foo = (42 +
           bar +
           hello);

but not (my preference):

    foo = (42
           + bar
           + hello);

(Precisely because a newline after a non-operator is taken for
an "implied semicolon".)
 
R

Rainer Weikusat

[...]
Fortunately, with the error message quoted above, the packagers
already made that a trivial thing to do.

JFTR: the purpose of such splits is to avoid the installation of
the documentation on the hosts that merely /run/ existing code,
and aren't used for the actual development.

In order to accomplish what? According to dpkg --print-avail, the size
of perl-doc is about 6.9M. Except in rare special cases, the
inconvenience of not having the documentation at hand in some 'strange
situation' by far outweighs the possible 'space saving' here, except
that some people presumably feel that 'documentation' is dead weight
because they would never read it, anyway[*].

[*] Nice little anecdote about that: A former colleague of mine used
to boast that 'only newbies read documentation' ("Nur Anfaenger lesen
Dokumentation"). Once upon a time, he and my boss went to China in
order to perform some demos there for some prospective 'large
customers'. By this time, the server part of the then-product was
usually installed on SuSE Linux systems because that was what said
former colleague always used. Consequently, he went to China with a
brand new 'free SuSE CD'. Nobody ever bothered to test this new
version together with our software and since no 'newbies' were
involved here, nobody bothered to read through the release notes for
incompatible changes, either. The end result of that was that I was
woken by an "It doesn't work and we don't know what to do" phone call
around 3am, had to go to the office and read the documentation for
him in order to determine what the problem was (MySQL default date
output format changed) and to change the software to be able to deal
with that (of course, this guy still makes more money than I do ...).

[...]
My point is that I know of no reason for a programmer looking
for a new language to learn to choose Perl,

It is a highly useful programming language whose 'crudely implemented
Lisp subset' is fairly complete -- you'll even get run-time modifiable
symbol tables and symbols (called globs) -- with support for automatic
management of all kinds of resources and more than decent
performance. E.g., I use OO-Perl to make real-time WWW content-filtering
decisions and the latency of that is in the order of at most about a
dozen 0.0001s --- that's something Java developers don't even dream
of (OTOH, it is presumably possible to force perl down to
JBoss/Hibernate/SEAM levels by adding enough 'CPAN frameworks' to it).
 
P

Peter J. Holzer

:) Push the right button and Ben sounds like Rainer.
I didn't mean just the docs; I don't know what else might have been
stripped out, and you really need all of it. I presume there is some
perl-complete package you can install which will pull in everything the
CPAN tarball would give you.

Maybe, maybe not. In 10+ years of using Debian (and even more of using
Redhat, which has a similar packaging philosophy) I've never missed
having a package or meta-package which included "everything the CPAN
tarball would give me". I have a working perl installation and when a
feature is missing, it is usually straightforward to find out which
package provides it and install that. I don't care whether a module is
part of the core or not.

hp
 
R

Rainer Weikusat

Ben Morrow said:
Quoth Rainer Weikusat said:
[FCGI]
And, it may also allow for more flexible privilege separation
(than mod_suexec, anyway.)

Being able to switch UIDs on UNIX(*) means 'running with elevated
privileges' and if I don't absolutely have to use some 'huge' piece of
software for that whose innards are essentially unknown to me, I
prefer to avoid that.

I'm not sure what you're saying here.

In the given context, that I wouldn't want to use mod_suexec for
anything as this would necessarily imply that 'the CGI executor' (or
at least some part of it) had to run with elevated privileges by
default and consequently, that mentioning it here is somewhat out of
place.
One of the major advantages of FCGI/HTTP proxying rather than plain
CGI is that it is straightforward to have the webserver and the
application itself running under separate uids, both started
(ultimately) from init, without having anything setuid or running as
root.

One of the disadvantages of separating 'a web application' into a 'web
server component' and 'an application server component' is that this
means that one more permanently running daemon must be configured to
run as an untrusted user and possibly even that it must be
specifically told that it shouldn't bind its network sockets to the
wildcard address (like JBoss) and possibly, it will even create
listening sockets bound to the wildcard address nevertheless (like
JBoss 5). Even if said application server doesn't kindly offer its own
'attack surface' (slight misuse of the term), it is still a bunch more
code reachable from the network.
This is impossible with CGI, since the application is invoked by the
webserver, so either it runs under the webserver's uid or the
webserver has to be able to switch uid.

If 'the application is invoked from the webserver', this means it will
- by default - run as an unprivileged user with no additional
cost. This also means that any access restrictions affecting this user
will - by default - affect the application as well (with no additional
cost).
 