Make XML from text

Eivind

Hi,

I'm building an application that converts a Word document into XML.
The Word files will use style names such as Heading1, Heading2, etc.
Since the application has to work on both Macs and PCs, I cannot use
the Word 2003 XML features, so Java is the way to go. I'm a little
uncertain about how to "attack" this problem. We already have a Word
application that extracts all information from a Word document into a
text file. This file is tagged, but not in XML. I'm planning to extract
all text from Word into a Java application that produces the XML file.

Any thoughts or suggestions on how to build an XML file from this
tagged text file?

Eivind Løland-Andersen
 
Paul Davis

Eivind wrote:
"Any thoughts or suggestions on how to build an XML file from this
tagged text file?"

First you probably want to define some kind of schema for your XML
file. I would suggest going for the OpenDocument format:
http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office
This could help with future maintainability (and it already has a
published schema).

Just curious, what are you using to read the Word docs?
(I'm guessing POI or the libraries in OpenOffice.)
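
For the conversion step itself, here is a minimal sketch using only the JDK's built-in DOM and Transformer classes. The format of the tagged file is not given in the thread, so this assumes a hypothetical layout of one paragraph per line in the form "[StyleName] text". The element names (document, heading, para) and the class name TaggedTextToXml are placeholders too, not the OpenDocument schema; swap in whatever schema you settle on.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class TaggedTextToXml {
    // ASSUMPTION: each input line looks like "[StyleName] paragraph text".
    private static final Pattern LINE = Pattern.compile("^\\[(\\w+)\\]\\s*(.*)$");
    // Word style names such as Heading1, Heading2 carry the heading level.
    private static final Pattern HEADING = Pattern.compile("^Heading(\\d+)$");

    static String convert(List<String> lines) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("document"); // placeholder root name
        doc.appendChild(root);

        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (!m.matches()) {
                continue; // skip lines that carry no style tag
            }
            String style = m.group(1);
            String text = m.group(2);

            Matcher h = HEADING.matcher(style);
            Element el;
            if (h.matches()) {
                el = doc.createElement("heading");
                el.setAttribute("level", h.group(1));
            } else {
                el = doc.createElement("para");
                el.setAttribute("style", style);
            }
            // Text is stored as a DOM text node, so &, < and > are
            // escaped by the serializer rather than by hand.
            el.setTextContent(text);
            root.appendChild(el);
        }

        // Serialize the DOM tree to a string.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(convert(Arrays.asList(
                "[Heading1] Introduction",
                "[Normal] Fish & chips")));
    }
}
```

Building a DOM tree and letting the Transformer write it out, rather than concatenating strings, means escaping and well-formedness are handled by the library. If the real tagged files are large, a streaming writer (javax.xml.stream.XMLStreamWriter) might be a better fit.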
 
