Edit large xml files

billsahiker

What editor do you use for large (50MB+) XML files that checks
well-formedness?

I am looking for something that can quickly load, edit and save large
XML files and make sure they are well-formed. I have some that are over
100MB, some with 200K nodes. These are computer-generated files (they
come to us from large financial institutions), but many have content
errors: the markup seems to be correct, but the data is wrong, and the
errors are only in the first few lines of the file. We drop them into
SQL Server and look for the errors; we know pretty much what to look
for. But I have to hand the files over to IT, and it takes time before I
can access the file. Since the errors are only at the beginning of the
file, it would be nice to skip that process and just edit them with an
XML editor.

Has anyone successfully edited, with an xml editor, files this large?

Bill
 
Jürgen Kahrs

What editor do you use for large (50MB+) XML files that checks
well-formedness?

Are you sure you need an editor to check for well-formedness?
I use xmllint for checking:

xmllint --noout file.xml

Other readers recommend some other tools.
This is a FAQ here.
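
Where xmllint cannot be installed, roughly the same check can be sketched with Python's standard library. This is a minimal sketch, not a replacement for xmllint: the function name `check_well_formed` and the 1 MB chunk size are illustrative assumptions. It feeds the file to the expat parser in chunks, so memory use stays small even for files of several hundred MB.

```python
import xml.parsers.expat


def check_well_formed(path, chunk_size=1 << 20):
    """Return None if the file is well-formed, else (line, column, message).

    Hypothetical helper: the name and the default chunk size are
    illustrative, not taken from any tool mentioned in this thread.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        with open(path, "rb") as fh:
            while True:
                chunk = fh.read(chunk_size)
                if not chunk:
                    break
                # Feed one chunk; expat keeps only its own parsing state.
                parser.Parse(chunk, False)
            # Tell expat the input is complete so unclosed tags are reported.
            parser.Parse(b"", True)
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return err.lineno, err.offset, message
    return None
```

Calling `check_well_formed("file.xml")` returns `None` for a well-formed file, otherwise the line, column and message of the first error found. Like `xmllint --noout`, it checks well-formedness only, not validity.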
 
billsahiker

As I stated, "many have content errors." I need to correct those, and
when I save the file I want to make sure I did not mess up the markup
or do anything that would make it not well-formed.

Bill
 
Jürgen Kahrs

As I stated, "many have content errors." I need to correct those, and
when I save the file I want to make sure I did not mess up the markup
or do anything that would make it not well-formed.

OK, so I suggest this approach:

WHILE xmllint --noout file.xml reports errors
DO
    edit file.xml with vi or emacs or whatever ASCII editor
END

This works with files that contain several hundred MB of XML data.
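
The loop above could be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, the editor is taken from the EDITOR environment variable (falling back to vi), and it is assumed to accept a `+LINE` argument, as vi and emacs do. The stdlib expat parser stands in for xmllint here.

```python
import os
import subprocess
import xml.parsers.expat


def first_error(path):
    """Return the ExpatError for the first well-formedness problem, or None."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        with open(path, "rb") as fh:
            parser.ParseFile(fh)  # streams the file; does not load it whole
    except xml.parsers.expat.ExpatError as err:
        return err
    return None


def open_in_editor(path, line):
    # Assumption: the editor understands "+LINE file" (true for vi and emacs).
    editor = os.environ.get("EDITOR", "vi")
    subprocess.call([editor, f"+{line}", path])


def edit_until_well_formed(path, edit=open_in_editor, max_rounds=20):
    """Check, edit, and re-check until the file parses.

    Returns the number of edit rounds that were needed. `max_rounds` is an
    arbitrary safety limit so the loop cannot run forever.
    """
    rounds = 0
    while (err := first_error(path)) is not None:
        if rounds >= max_rounds:
            raise RuntimeError(f"still not well-formed after {rounds} edits: {err}")
        print(f"{path}: {err}")
        edit(path, err.lineno)  # jump straight to the reported line
        rounds += 1
    return rounds
```

Passing the reported line number to the editor saves searching a large file for the problem by hand.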
 
Hermann Peifer

I am looking for something that can quickly load, edit and save large
xml files and makes sure it is well-formed.
(...)
Has anyone successfully edited, with an xml editor, files this large?

I would use vi for editing and xmlwf (or xmllint) for checking
well-formedness.

Opening a 160M xml file, editing something in the first lines and saving
the file back takes me less than 1 minute (on an older laptop). xmlwf
needs 8 seconds for checking well-formedness of the same file.
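
When the errors always sit in the first few lines, the correction itself could also be scripted rather than done in an editor. A minimal sketch, assuming the file contains line breaks near the top and the bad value is a known byte string; `patch_head` and its parameters are hypothetical names chosen for illustration. Only the leading lines are inspected, and the rest of the file is copied through in large blocks.

```python
import os
import shutil
import tempfile


def patch_head(path, old, new, head_lines=50):
    """Replace `old` with `new` (both bytes) in the first `head_lines` lines.

    Returns the number of replacements made. Assumes the file has line
    breaks; a file written as one giant line would be read in one piece.
    """
    replaced = 0
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, "rb") as src, tempfile.NamedTemporaryFile(
        "wb", dir=directory, delete=False
    ) as dst:
        for _ in range(head_lines):
            line = src.readline()
            if not line:
                break  # file is shorter than head_lines
            replaced += line.count(old)
            dst.write(line.replace(old, new))
        # Copy the untouched remainder in 1 MB blocks.
        shutil.copyfileobj(src, dst, 1 << 20)
    shutil.copymode(path, dst.name)  # keep the original permission bits
    os.replace(dst.name, path)       # swap the patched copy into place
    return replaced
```

Writing to a temporary file in the same directory and renaming it at the end means the original stays intact if the copy is interrupted. A well-formedness check with xmlwf or xmllint afterwards is still advisable.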

Hermann
 
billsahiker

On Feb 18, 12:41 pm, Jürgen Kahrs wrote:
Thanks. That sounds like a good solution if I can get approval; policy
may not allow it. I am surprised that by now some vendor has not come
out with something like vi. Why don't they offer an editor with a
built-in XML parser? I would think there would be a big demand for it,
and especially for something that lets you edit XML without touching the
markup. There are lots of those, but none that I have found can handle
large files.

Bill
 
Andy Dingley

What editor do you use for large (50MB+) XML files that checks
well-formedness?

I don't. Files this size (IMHE) have to be perfect, or else I reject
them outright. XML just doesn't work well as a "document" this size.
It's OK if they're automatically generated and correct, but it's just
not a good idea to start treating them as hand-editable.

I certainly _don't_ edit supplied inbound data files like this. If it
works it's tiresome for me, if it goes wrong, it's my fault. This is
your supplier's problem for not being able to generate the things
correctly - get _them_ to fix it!

I also don't just check well-formedness, I check validity. If I'm
throwing this much data around regularly and no-one has yet thought to
define the schema for it, something is wrong.
 
billsahiker

Good points. Nonetheless, I would expect the marketplace to have a
virtual XML editor for files that are too big for the current crop of
editors, but not so big that they could not have been composed
manually (or semi-automatically). I see lots of XML files in that
category. If such a tool exists, I would like to know about it.

Bill
 
Peter Flynn

Jürgen Kahrs said:
OK, so I suggest this approach:

WHILE xmllint --noout file.xml reports errors
DO
    edit file.xml with vi or emacs or whatever ASCII editor
END

This works with files that contain several hundred MB of XML data.

If you use emacs with psgml-mode you can omit the loop.
C-c C-v will parse the document (for well-formedness or validity).

As will every other XML editor...what would worry me is that the OP
appears not to have been using an XML editor to start with...

///Peter
 
Peter Flynn

On Feb 18, 12:41 pm, Jürgen Kahrs wrote:
Thanks. That sounds like a good solution if I can get approval; policy
may not allow it.

If they're asking you to work with XML without an XML editor then you
need a new job, and they will shortly need a new company.

I am surprised that by now some vendor has not come
out with something like vi. Why don't they offer an editor with a
built-in XML parser?

All XML editors have a built-in parser.
That's what makes them XML editors.

I would think there would be a big demand for it, and
especially for something that lets you edit XML without touching the
markup. There are lots of those, but none that I have found can handle
large files.

Emacs. I used it today to edit a 450MB OOXML document.

///Peter
 
Joseph Kesselman

Peter said:
If they're asking you to work with XML without an XML editor then you
need a new job, and they will shortly need a new company.

Depends on the application. Large XML docs are often machine-generated,
and I find small ones (order of 10KB) are often just as easy to
manipulate with an ordinary text editor. Though I do use Emacs, which
provides some assistance.

They exist. On the other hand, if you want that kind of assistance with
editing XML you probably want much more than a simple XML parser; you'll
probably want something that can be directed by the schema, at least,
and that means fairly deep integration into the editor.
 
