Docbook on Windows - do you have a functional toolchain/workflow?

AK

I'm looking for pointers from someone who has a functional Docbook toolchain/workflow on Windows.

Docbook XML to HTML, PDF and MS-Help are the main formats I'm interested in. Based on links from
docbook.org, I have begun understanding and piecing together the various components required for a
Docbook toolchain. The biggest missing piece thus far is identifying a reasonably usable WYSIWYG XML
editor, that allows editing a Docbook XML source file in "Preview" mode using an associated, custom
CSS file. If direct Docbook XML source editing is the only feasible option, then this solution
doesn't work for me.

If anyone has an established/working Windows-based Docbook toolchain/workflow in place that is at
least somewhat along the lines of what I've described above, I'm hoping to learn from that as I try
to set up my own Docbook workflow/toolchain.

Any tips/suggestions along these lines are welcome.

Thanks,
- AK
 
AK

did you take a look at StyleVision/Authentic from Altova?

I did try out Altova's products sometime last year on a trial basis. I recall them being really
bloated software: very slow to do anything in, and quite expensive too. The free version, if I recall
correctly, didn't support Docbook.

On a quick look around the site right now, it appears that Authentic® 2006 is not only free, but
also mentions Docbook. However, it seems like the biggest caveat here is that to edit XML in a
"styled" view, you need to have StyleVision - a $500 product.

http://www.altova.com/matrix_a.html
http://www.altova.com/products/stylevision/xslt_stylesheet_designer.html

I would like to hear about anyone's experience using the Altova products in a Docbook workflow,
specifically, which Altova products were used (to get a sense of cost involved) and how well they
work together, in terms of WYSIWYG Docbook XML editing, and conversion to HTML and PDF.

Thanks,
- AK
 
Andy Dingley

AK said:
I'm looking for pointers from someone who has a functional Docbook toolchain/workflow on Windows.

Of course -- but is it cheating if I mention that it's running under
Cygwin? :cool:
 
AK

Andy said:
Of course -- but is it cheating if I mention that it's running under
Cygwin? :cool:

A Cygwin solution would be acceptable - it's got "win" in the name, close enough :)

Do you have a WYSIWYG Docbook XML editor working under Cygwin?

- AK
 
Andy Dingley

AK said:
Do you have a WYSIWYG Docbook XML editor working under Cygwin?

Not wizzywig, no. I don't much care for the things.

If I'm mucking around with DocBook, I've got my snout strongly into the
_structure_ of the document, not the final presentation of it. I care
deeply about the taggage baggage and the various annotations I've got
in there, far more so than I care what it looks like. I've heard of my
readership, I just don't care too much about how the text finally looks
- that's an editor and designer's problem later on.

What I do have is lots of pop-up vocabulary (in the taxonomic /
ontological sense) navigation tools. I'm only a couple of clicks away
from Linnaean species dictionaries, the Getty geographical thesaurus
and Javadocs of the major class libraries.

It's also quite impractical (modulo super-powered desktop computers) to
do WYSIWYG around something based on extensive XSLT translations of
DocBook. CSS would be OK, but then who uses CSS alone on DocBook?
 
AK

Andy said:
Not wizzywig, no. I don't much care for the things.
If I'm mucking around with DocBook, I've got my snout strongly into the
_structure_ of the document, not the final presentation of it. I care
deeply about the taggage baggage and the various annotations I've got
in there, far more so than I care what it looks like. I've heard of my
readership, I just don't care too much about how the text finally looks
- that's an editor and designer's problem later on.

I respect your need to work with the XML structure, and as a matter of fact,
I believe that level of control makes the process manageable in its own way.
OTOH, if there is no visual/wysiwyg editing capability,
it is very difficult to get a non-hardcore-techie to edit, format and basically provide content in a
Docbook format...
I am trying to determine the possibility of fulfilling *that* need.

It's also quite impractical (modulo super-powered desktop computers) to
do WYSIWYG around something based on extensive XSLT translations of
DocBook. CSS would be OK, but then who uses CSS alone on DocBook?

Why is it impractical?

- AK
 
Joe Kesselman

AK said:
Why is it impractical?

Depends on the XSLT. In the completely general case, this is true --
imagine, for example, a stylesheet which discarded a section of text
when certain conditions were met; how would you go back to edit it
afterward? Also, since XSLT has full random access to the document,
writing an incremental XSLT rendering system could be quite challenging.
Never mind the fact that you then have to re-render whatever the XSLT
produces -- it may not be HTML; it may be XSL-FO, or one of the
XML-to-TeX systems, or something else entirely.
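
To illustrate the point about discarded text, here is a minimal sketch in Python. It is not XSLT and not anyone's real stylesheet: the `render` function, the use of the `status="draft"` attribute as the discard condition, and the sample documents are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

def render(docbook_xml: str) -> str:
    """Toy one-way 'stylesheet': drop sections marked as drafts, emit HTML."""
    root = ET.fromstring(docbook_xml)
    html = []
    for section in root.findall("section"):
        if section.get("status") == "draft":  # the condition that discards text
            continue
        html.append("<h2>%s</h2>" % section.findtext("title"))
        for para in section.findall("para"):
            html.append("<p>%s</p>" % para.text)
    return "\n".join(html)

with_draft = (
    "<article>"
    "<section><title>Install</title><para>Run setup.</para></section>"
    "<section status='draft'><title>Internals</title><para>Secret.</para></section>"
    "</article>"
)
without_draft = (
    "<article>"
    "<section><title>Install</title><para>Run setup.</para></section>"
    "</article>"
)

# Two different sources, one identical rendering: an editor working only on
# the rendered view cannot tell which source it is supposed to be changing.
same_output = render(with_draft) == render(without_draft)
print(same_output)
```

Because the mapping from source to output is many-to-one, an edit made in the rendered view has no unique place to land in the source.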

Also, the whole point of docbook is that different renderings are going
to be used in different situations; the intent is that folks should
concentrate on editing the meaning of the document and let the
appearance be dealt with separately. WYSIWYG is precisely the wrong
metaphor.

The solution used by most folks is a compromise -- a relatively basic
editor (maybe with basic XML structural assists), plus a
render-on-demand mechanism (push a button to see what it will look like
in one of the many possible renderings, mostly for reassurance's sake).
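
A render-on-demand button can be as small as a wrapper around an XSLT processor. The sketch below assumes `xsltproc` is installed and on the PATH (for example via Cygwin) and that the DocBook XSL stylesheets are unpacked under `docbook-xsl/`. The function names, the file names and the `preview` directory are hypothetical.

```python
import subprocess
from pathlib import Path

def build_render_command(source, stylesheet, output):
    """Return the xsltproc command line for one rendering of one source file."""
    return ["xsltproc", "--output", str(output), str(stylesheet), str(source)]

def render_on_demand(source, stylesheet, out_dir="preview"):
    """Render `source` once and return the path of the preview file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(exist_ok=True)
    output = out_dir / (Path(source).stem + ".html")
    # check=True raises CalledProcessError if the transformation fails.
    subprocess.run(build_render_command(source, stylesheet, output), check=True)
    return output

# Only builds the command here; call render_on_demand() to actually run it.
cmd = build_render_command(
    "guide.xml", "docbook-xsl/html/docbook.xsl", "preview/guide.html"
)
print(" ".join(cmd))
```

An editor macro or toolbar button would call `render_on_demand` on the file being edited and open the result in a browser.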

If you're working in docbook, work in docbook. Edit text and structure,
and ONLY text and structure. Formatting is not the author's
responsibility in this environment and they should keep their paws off
it and trust the system to Do Something Reasonable.
 
Joe Kesselman

Actually, one possible solution might be for the folks doing the basic
editing to work in something more like OpenOffice Writer -- which isn't
as rich as Docbook, but which for that very reason does have a WYSIWYG
editor for its OpenDocument markup -- and then transcode from
OpenDocument to Docbook when you've reached the point where you're ready
to do the expert-level markup. (I know XSLT exists to render docbook
into odf; a quick websearch hasn't found the reverse conversion.)
 
AK

Joe said:
Depends on the XSLT. In the completely general case, this is true --

Ok, I get the XSLT-related impracticality.

appearance be dealt with separately. WYSIWYG is precisely the wrong
metaphor.

WYSIWYG is still what I'm hoping to get though... even if it means having a CSS-formatted view for
editing purposes that somehow(*) hides the Docbook structure from someone simply creating/typing in
content into this format.

(*) Thinking aloud - the somehow could be creating a Docbook XML file "template" with a pre-created
valid Docbook XML structure, as required for a particular purpose (say, documenting a specific
product/module). This Docbook structure/template could then be handed to someone for editing
purposes. It's at this point that I'm hoping a WYSIWYG tool would be available so that a person
simply tasked with adding/editing content can focus on that as opposed to dealing with Docbook tags
while editing the raw XML source (an XML tree view instead of raw XML in a regular notepad is not
an acceptable compromise).

If you're working in docbook, work in docbook. Edit text and structure,

I'm trying to separate this out a bit more. Ie, someone able and willing to work with the raw
Docbook XML can get the structure part done, then someone not willing or unable to work with raw XML
would use a WYSIWYG editor to add/edit the actual content.

and ONLY text and structure. Formatting is not the author's
responsibility in this environment and they should keep their paws off
it and trust the system to Do Something Reasonable.

I agree that the final/ultimate formatting can be a system or "backend" process where the look and
feel of the output format will depend on the processing method, etc.

Anyway, I have begun seeing the light at the end of the tunnel as far as getting various output
formats from a Docbook source (I've started creating an Ant-based process), but I still have to play
with the editing aspect more. IMHO, from what I see thus far, I don't think the overall effort of
using Docbook is justified and feasible right now. I'm hoping that I'll find a more user-friendly
editing mode though.

Thanks for all the comments so far. I invite and welcome further feedback from anyone who can share
their firsthand experience using a WYSIWYG tool to produce and edit Docbook content.

- AK
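
As a rough illustration of the "template" idea floated above, the following Python sketch generates a DocBook `<article>` skeleton with placeholder paragraphs that a content author could then fill in. The function name, the section headings and the "TODO" placeholder text are assumptions for the example, and nothing here validates against the DocBook DTD.

```python
import xml.etree.ElementTree as ET

def make_template(title, section_titles):
    """Build a DocBook <article> skeleton with one placeholder <para> per section."""
    article = ET.Element("article")
    ET.SubElement(article, "title").text = title
    for heading in section_titles:
        section = ET.SubElement(article, "section")
        ET.SubElement(section, "title").text = heading
        ET.SubElement(section, "para").text = "TODO: describe " + heading.lower()
    return article

template = make_template("Widget Module", ["Overview", "Installation", "Usage"])
ET.indent(template)  # pretty-printing helper, available from Python 3.9
print(ET.tostring(template, encoding="unicode"))
```

The person who knows DocBook decides the section list once; whoever writes the content only ever replaces the placeholder paragraphs.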
 
AK

Joe said:
Actually, one possible solution might be for the folks doing the basic
editing to work in something more like OpenOffice Writer -- which isn't

Actually, now that you mention it, I do recall reading somewhere that OpenOffice "supports" Docbook.
I'm not sure what that means, but it certainly merits looking into.

Thanks,
- AK
 
Andy Dingley

AK said:
OTOH, if there is no visual/wysiwyg editing capability,
it is very difficult to get a non-hardcore-techie to edit, format and basically provide content in a
Docbook format...

Last time I did that I wrote a stripped-down Word template for them.
They edited to a small number of pre-defined Word styles, they also had
a toolbar with lots of vocabulary selector pop-ups for setting metadata
properties. There was even a toolbar button to show videos (they were
writing shot-logging descriptions). "Save" was redirected to both a
Word file save, and an export-as-DocBook macro that scanned the Word
document character by character and wrote out the content.

Word's ugly, but people do know how to use much of it without
re-training.


I've since been working on a big ugly non-DocBook CMS for websites. The
basic problems are similar though. My users here were editing as
pseudo-HTML in a crude text editor. Primarily though they're _content_
editors, not designers. I was interested in catching their content with
barely more in it than paragraph breaks and application of some
pre-defined styles. One of the main features of this web site was
"re-homing" articles between sites -- authors simply couldn't know how
their article would appear, because it might appear in six different
places under six totally different "house styles" for each site.
 
AK

Andy said:
Last time I did that I wrote a stripped-down Word template for them.
They edited to a small number of pre-defined Word styles, they also had
a toolbar with lots of vocabulary selector pop-ups for setting metadata
properties. There was even a toolbar button to show videos (they were
writing shot-logging descriptions). "Save" was redirected to both a
Word file save, and an export-as-DocBook macro that scanned the Word
document character by character and wrote out the content.

Would you be able to share this Word template and the "export-as-DocBook macro" with me? If you can
also send a sample file or two created (.doc) and saved (.xml) with this solution, that would be a
great bonus. I'm very curious to see how it actually works, especially the resulting Docbook output.

Thanks,
- AK
 
Andy Dingley

AK said:
Would you be able to share this Word template and the "export-as-DocBook macro" with me?

I doubt it, I'm afraid - owing to my total disorganisation and the fact
that most of my home computing machinery is currently letting the magic
smoke out, I just don't think I'd be able to find it in less than
polynomial time. 8-(

It's pretty easy to write though. The key is that you _do_ have to scan
the Word document character-by-character if you expect to get useful
formatting information out of it. The "DOM" just isn't smart enough to
do it any other way. It's an easy bit of VBA to do, just a bit
counter-intuitive to start out by going in at quite such a low level.
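
The original macro was VBA and is not available, so the following is only a Python sketch of the character-by-character idea, under the assumption that the document has already been read out as a list of (character, style name) pairs. The style names and the `INLINE` mapping are hypothetical.

```python
from itertools import groupby
from xml.sax.saxutils import escape

# Hypothetical mapping from a character style name to a DocBook inline element.
INLINE = {"Emphasis": "emphasis", "Code": "literal", "FileName": "filename"}

def export_paragraph(chars):
    """chars: list of (character, style_name) pairs, in document order."""
    parts = []
    # groupby merges consecutive characters sharing a style into one run.
    for style, run in groupby(chars, key=lambda pair: pair[1]):
        text = escape("".join(ch for ch, _ in run))
        tag = INLINE.get(style)
        # Unmapped styles (e.g. "Normal") are written out as plain text.
        parts.append("<%s>%s</%s>" % (tag, text, tag) if tag else text)
    return "<para>" + "".join(parts) + "</para>"

chars = (
    [(c, "Normal") for c in "Run "]
    + [(c, "Code") for c in "setup.exe"]
    + [(c, "Normal") for c in " & wait."]
)
print(export_paragraph(chars))
# prints: <para>Run <literal>setup.exe</literal> &amp; wait.</para>
```

Scanning at this level means a style change in the middle of a word is still caught, at the price of touching every character once.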
 
AK

AK said:
toolchain. The biggest missing piece thus far is identifying a
reasonably usable WYSIWYG XML editor, that allows editing a Docbook XML

I found a thread exactly like this one at an openoffice.org forum and thought I'd reference it here:
http://xml.openoffice.org/servlets/ReadMsg?list=dev&msgNo=2831
Relevant piece:
"
> Is there a wysiwyg docbook editor out there that really hides docbook and really works?
I think that's Nirvana (sp?) IMHO.
I've never seen one that works.
Basically I never expect to round trip from any wysiwyg editor to xml and back again.
You might consider a 'one way' ticket?
Write using restricted styles in OOo, then transform into docbook?
"

Unless someone else has a better solution, the point about editing content using restricted styles
in OOo, then building a custom process to transform/generate Docbook format seems to be the best
Docbook workflow at the moment, in terms of the convenience of editing content and getting to
Docbook format from there.

- AK
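
A "one way ticket" from restricted styles to DocBook could look roughly like the Python sketch below. It assumes the word-processor document has already been reduced to a list of (paragraph style, text) pairs; the three style names and the function name are assumptions, and reading the actual OOo file is left out entirely.

```python
import xml.etree.ElementTree as ET

def styled_paragraphs_to_docbook(paragraphs):
    """paragraphs: list of (style_name, text). Unknown styles are rejected."""
    article = ET.Element("article")
    current = article  # element that receives new <para> children
    for style, text in paragraphs:
        if style == "Title":
            ET.SubElement(article, "title").text = text
        elif style == "Heading 1":
            current = ET.SubElement(article, "section")
            ET.SubElement(current, "title").text = text
        elif style == "Text body":
            ET.SubElement(current, "para").text = text
        else:
            # Refusing unknown styles is what keeps the style set restricted.
            raise ValueError("style not in the restricted set: %r" % style)
    return article

doc = styled_paragraphs_to_docbook([
    ("Title", "Widget Guide"),
    ("Heading 1", "Installation"),
    ("Text body", "Run the installer."),
])
print(ET.tostring(doc, encoding="unicode"))
```

Rejecting any style outside the agreed set, rather than guessing, gives authors immediate feedback when they stray into ad hoc formatting.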
 
Stefan Ram

AK said:
Unless someone else has a better solution, the point about
editing content using restricted styles in OOo, then building a
custom process to transform/generate Docbook format seems to be
the best Docbook workflow at the moment, in terms of the
convenience of editing content and getting to Docbook format
from there.

Here are two workflows I use:

The first one is similar to what you describe:

I write my articles using Microsoft® Word 2000.

I am using a specification for format style semantics called
"Techyle", partially described at

http://www.purl.org/stefan_ram/pub/techyle

Microsoft® Word 2000 has been customized to make
it easy to mark text, and the format styles have been designed
so that they can be recognized. This approach might be known as:

http://en.wikipedia.org/wiki/WYSIWYM

Then the document is exported to XML keeping all the
semantic markup and document properties. I have written
a VBA macro for this purpose:

http://www.purl.org/stefan_ram/pub/wrocco_en

The resulting XML then can be converted to any other common
format using XSLT or other known means.

~~

Another approach: Because XML is not so well suited
for manual editing, I have developed a variant of XML,
which I call "Unotal".

http://www.purl.org/stefan_ram/pub/unotal_en

Now, I was able to write the Unotal specification in
Unotal itself.

http://www.purl.org/stefan_ram/unotal/unotal.uno

It was converted to Text and HTML

http://www.purl.org/stefan_ram/ascii/unotal.txt
http://www.purl.org/stefan_ram/html/unotal.html

after it had been transformed to XML and then
processed using well-known XML tools.

So I use my custom markup language as a kind of
XML-frontend.

~~

The above process still uses a vocabulary from an RFC (RFC
2629) and some related tools, but I am also developing my own
vocabulary, intended to be my personal variant of
Docbook.

The idea might be that one can express one's thoughts first and
foremost in one's own custom syntax and vocabulary, because
this can be modified in whatever way seems natural.

Then, to use standard tools, this might be converted to
DocBook, TEI, LaTeX or so, and from there to text, HTML, PDF
and so on.

My language is called "Portatext", and a small preliminary,
implemented subset looks as follows:

< !portatext &document
id=722062
name=[Portatext]
title=[Portatext]
< &par [This page was written using Portatext and then ]
[converted to XHTML 1.1 using Java 1.5] >
< &section
heading = < &par [This is the heading of this section] >
< &script < &par [This is contents of this section.] >>>>

"implemented" means: I have written a Java-program to
parse this and convert it to HTML.

OK, maybe this is not anymore what the OP wanted to know ...
 
Peter Flynn

AK said:
I respect your need to work with the XML structure, and as a matter of
fact,
I believe that level of control makes the process manageable in its own way.
OTOH, if there is no visual/wysiwyg editing capability,
it is very difficult to get a non-hardcore-techie to edit, format and
basically provide content in a Docbook format...
I am trying to determine the possibility of fulfilling *that* need.

Right now there are *no* XML editors suitable for this purpose.
I presented some extensive research at Extreme Markup this summer
demonstrating this [http://epu.ucc.ie/articles/extreme06].

There are many very good near-WYSIWYG editors suitable for the
technically-minded willing to learn something about XML markup, but
none that you can throw at a writer or editor with the markup *entirely*
hidden and say "edit that". There are many reasons for this, some of
them technical, but the principal one seems to be that editor makers
simply don't believe there is a market for such a beast and are
unwilling to invest any time or money even in investigating whether or
not this is true, let alone how to make it happen.

In fact, in the long run, there will be an enormous market for an
editor that lets writers create documents in XML [and I do mean
properly-marked XML, not the presentation-only XML generated by
wordprocessors] without even being aware that it is XML they are
creating.

CSS would be OK, but then who uses CSS alone on DocBook?

No-one I know.

Why is it impractical?

It's not quite impractical, but it requires a lot of extra programming
to make such transformations (a) handle the large amount of information
that is necessarily missing when a document is only part-written; and
(b) work in real time when some of the apparently "simple" tasks of
formatting actually require a large amount of processing (cycles or
memory). It's *possible* but we're not quite there yet.

///Peter
 
