Programmatic Alteration of Internal DTD Subset

Chris W

Hi All,

I have hundreds of small XML files of the form (extraneous stuff removed):

<?xml version="1.0"?>
<!DOCTYPE page PUBLIC "-//LOCAL//DTD PAGE 0.1//EN" "page.dtd">
<page>
<graphic boardno="entityname1" />
<graphic boardno="entityname2" />
</page>

that I would like to process into this form:

<?xml version="1.0"?>
<!DOCTYPE page [
<!ENTITY entityname1 SYSTEM "entityname1.gif" NDATA gif>
<!ENTITY entityname2 SYSTEM "entityname2.gif" NDATA gif>
<!NOTATION gif SYSTEM "image/gif">
]>
<page>
<graphic boardno="entityname1" />
<graphic boardno="entityname2" />
</page>

That is, I'd like to load each file, find all the boardno attributes,
insert an ENTITY declaration for each one, insert a NOTATION declaration,
and write the result to a file. The XML markup is unchanged; only the
internal DTD subset is altered. Finding the boardno attributes in a DOM is
trivial, but manipulating the internal DTD subset and getting it to a file
is eluding me.
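
To make the target transformation concrete, here is a minimal Python sketch of the steps described above. It uses only the standard library's xml.etree.ElementTree to find the boardno values, then emits the DOCTYPE as plain text, so it is a baseline rather than a real DOM-level manipulation of the internal subset. The function names (build_internal_subset, rewrite_page) and the assumption that every entity maps to "<name>.gif" are illustrative, taken from the example above rather than from any existing tool.

```python
import xml.etree.ElementTree as ET


def build_internal_subset(boardnos, ext="gif", mime="image/gif"):
    """Return ENTITY declarations for each name, followed by one NOTATION."""
    lines = [
        f'<!ENTITY {name} SYSTEM "{name}.{ext}" NDATA {ext}>'
        for name in boardnos
    ]
    lines.append(f'<!NOTATION {ext} SYSTEM "{mime}">')
    return "\n".join(lines)


def rewrite_page(xml_text):
    """Parse a page, collect boardno values, and re-emit it with an internal subset."""
    root = ET.fromstring(xml_text)

    # Collect boardno values in document order, skipping duplicates.
    seen = []
    for element in root.iter("graphic"):
        name = element.get("boardno")
        if name and name not in seen:
            seen.append(name)

    # ElementTree does not model the DOCTYPE, so it is written out by hand.
    body = ET.tostring(root, encoding="unicode")
    subset = build_internal_subset(seen)
    return (
        '<?xml version="1.0"?>\n'
        f"<!DOCTYPE {root.tag} [\n{subset}\n]>\n"
        f"{body}\n"
    )


if __name__ == "__main__":
    sample = (
        '<?xml version="1.0"?>\n'
        '<!DOCTYPE page PUBLIC "-//LOCAL//DTD PAGE 0.1//EN" "page.dtd">\n'
        "<page>\n"
        '<graphic boardno="entityname1" />\n'
        '<graphic boardno="entityname2" />\n'
        "</page>\n"
    )
    print(rewrite_page(sample))
```

Note that this drops the original PUBLIC identifier, as in the desired output above, and that ElementTree may normalize whitespace inside tags when it re-serializes the body.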

Apart from doing the DTD manipulation as a text file, are there any
suggested tool sets or approaches? Perl, Python, Java, whatever.

Regards,
Chris W
 
