[ANN] QuiXML 0.0.0

Sean O'Dell

Aug 27, 2003

#1

Announcing the first public release of QuiXML 0.0.0
===================================================
QuiXML Copyright (c) 2003 Sean O'Dell
Released under the GNU General Public License

I've been a big fan of Sean Russell's REXML library for a while now, but
an internal project in which I embedded the Ruby interpreter required
certain features in an XML library I couldn't quite get out of any one
of the existing APIs out there, so I wrote QuiXML.

I've been using the code for quite a while now, so it should be pretty
solid, but I could use testers, and I invite everyone to comment on the
usefulness of the library, the quality of the code and the readability
of the documentation.

What is QuiXML?
===============

QuiXML is an XML library for Ruby written in C, utilizing the expat
library for parsing XML string buffers. It's very fast.

It uses only Ruby native data structures to store its XML data
internally, so there is no "built-in" way to create XML trees; how they
are generated is completely open.

The library both parses and generates XML, automatically performing
pretty printing and encoding/decoding special characters (<, >, &, ',
and ").
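As a rough illustration of what encoding and decoding those five characters involves, here is a plain-Ruby sketch. QuiXML does this in C; the constant and method names below are made up for the example and are not part of its API.

```ruby
# Plain-Ruby sketch; names are hypothetical, not QuiXML's API.
XML_ESCAPES = {
  "&" => "&amp;",
  "<" => "&lt;",
  ">" => "&gt;",
  "'" => "&apos;",
  '"' => "&quot;"
}.freeze
XML_UNESCAPES = XML_ESCAPES.invert.freeze

# Replace each special character in a single pass, so an "&" produced by
# one substitution is never escaped a second time.
def xml_escape(text)
  text.gsub(/[&<>'"]/) { |ch| XML_ESCAPES[ch] }
end

# Reverse mapping, also in a single pass.
def xml_unescape(text)
  text.gsub(/&(?:amp|lt|gt|apos|quot);/) { |entity| XML_UNESCAPES[entity] }
end

escaped = xml_escape(%q{a < b && "c" > 'd'})
puts escaped
puts xml_unescape(escaped)
```

Doing both directions in one pass each is what keeps a round trip lossless.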

Transmutations to/from attribute string values and Ruby objects can
occur when reading/writing XML, allowing for a limited degree of object
marshaling.
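To give a feel for the idea (not the actual QuiXML interface, which is described in the API document), here is a plain-Ruby sketch of converting attribute strings to objects on read and back to strings on write. The TRANSMUTERS table, the attribute names and both method names are assumptions made up for the example.

```ruby
require "date"

# Hypothetical converters, one pair per attribute name.
# This is not QuiXML's API, only an illustration of the idea.
TRANSMUTERS = {
  "count"   => { read: ->(s) { Integer(s) },      write: ->(v) { v.to_s } },
  "created" => { read: ->(s) { Date.iso8601(s) }, write: ->(v) { v.iso8601 } }
}.freeze

# Turn attribute strings (as they appear in XML) into Ruby objects.
def read_attributes(raw)
  raw.each_with_object({}) do |(name, string), out|
    conv = TRANSMUTERS[name]
    out[name] = conv ? conv[:read].call(string) : string
  end
end

# Turn Ruby objects back into strings for writing.
def write_attributes(attrs)
  attrs.each_with_object({}) do |(name, value), out|
    conv = TRANSMUTERS[name]
    out[name] = conv ? conv[:write].call(value) : value.to_s
  end
end

attrs = read_attributes("count" => "42", "created" => "2003-08-27", "id" => "x1")
p attrs["count"] + 1      # an Integer, not a String
p attrs["created"].year   # a Date
p write_attributes(attrs) # strings again, ready to be written out
```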

Element path addressing makes it easy to find one or more nodes using
literal strings, regular expressions or any other object which supports
Ruby case-equality to match against XML node names and attributes.
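The matching idea can be sketched in a few lines of plain Ruby. The node layout (a hash with :name, :attributes, :content and :children) and the find_nodes helper are assumptions for illustration; see the API document for how QuiXML itself addresses elements.

```ruby
# Hypothetical node layout: { name:, attributes:, content:, children: }.
def node(name, attributes = {}, content = nil, children = [])
  { name: name, attributes: attributes, content: content, children: children }
end

# Walk one path step at a time, keeping the children whose name satisfies
# step === name. A step can be a String, a Regexp, or any other object
# that defines case-equality.
def find_nodes(root, *steps)
  steps.reduce([root]) do |current, step|
    current.flat_map { |n| n[:children].select { |child| step === child[:name] } }
  end
end

tree = node("config", {}, nil, [
  node("server", { "id" => "1" }, nil, [node("host", {}, "alpha")]),
  node("server", { "id" => "2" }, nil, [node("host", {}, "beta")]),
  node("client", {}, nil, [node("hostname", {}, "gamma")])
])

# Literal strings at every step.
p find_nodes(tree, "server", "host").map { |n| n[:content] }
# prints ["alpha", "beta"]

# Regular expressions at every step.
p find_nodes(tree, /er\z|ent\z/, /\Ahost/).map { |n| n[:content] }
# prints ["alpha", "beta", "gamma"]
```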

More Information
================

QuiXML's Home Page:
http://quixml.rubyforge.org/

Download QuiXML here:
http://rubyforge.org/download.php/67/quixml-0.0.0.tar.gz

The QuiXML API:
http://quixml.rubyforge.org/DOC.html

See the RAA entry for QuiXML:
http://raa.ruby-lang.org/list.rhtml?name=quixml

Final Thoughts
==============

My thanks to Matz and everyone in the Ruby community. I wish I could
contribute more. Enjoy!

Sean O'Dell

Joel VanderWerf

Aug 27, 2003

#2

Sean said:
Transmutations to/from attribute string values and Ruby objects can
occur when reading/writing XML, allowing for a limited degree of object
marshaling.

Looks nice.

Is there a way the transmutation block can get at the full path of the
node, to resolve different uses of attribute names in different contexts?
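To make the question concrete, here is a plain-Ruby sketch of what a path-aware block might look like. Everything in it (the node layout, the transmute! walker, the block arguments) is hypothetical and only shows one possible shape; it is not something QuiXML provides.

```ruby
# Hypothetical walker: yields the full element path (an array of names),
# the attribute name and the attribute's string value, and stores whatever
# the block returns.
def transmute!(node, path = [], &block)
  here = path + [node[:name]]
  node[:attributes] = node[:attributes].map { |key, value|
    [key, block.call(here, key, value)]
  }.to_h
  node[:children].each { |child| transmute!(child, here, &block) }
  node
end

tree = {
  name: "order", attributes: { "id" => "17" }, children: [
    { name: "customer", attributes: { "id" => "C-204" }, children: [] }
  ]
}

transmute!(tree) do |path, key, value|
  # The same attribute name means different things at different paths.
  if path == ["order"] && key == "id"
    Integer(value)   # order ids are numeric
  else
    value            # customer ids stay strings
  end
end

p tree[:attributes]["id"]
p tree[:children][0][:attributes]["id"]
```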

Sean O'Dell

Aug 27, 2003

#3

Joel said:
Looks nice.

Is there a way the transmutation block can get at the full path of the
node, to resolve different uses of attribute names in different contexts?

No, and I have no good ideas on how to do that. Code-wise, it wouldn't
be hard to program if "the idea" were there, but I'm unsure of the right
(pardon my corporish) paradigm for that. Ideas are welcome, though!

Sean O'Dell

Ben Schumacher

Aug 27, 2003

#4

Sean O'Dell said:

It uses only Ruby native data structures to store its XML data
internally, so there is no "built-in" way to create XML trees; how they
are generated is completely open.

Sean-

I'm curious about your decision to differentiate between contents and
child elements in the tree. While this probably makes the Tree structure
slightly simpler, you end up altering the XML out versus the XML in, so
your to_xml will never return the same XML that is passed into it.

For example,

qxml = QuiXML::Tree.new
qxml.parse("<a>a bb<c>c</c> \nc</a>")
puts qxml.to_xml

generates the output,

<a>a b
c
b
<c>c</c>
</a>

which obviously alters the structure of the XML. It's probably not a huge
issue for most projects, but I was just curious if you had considered this
in your design. One small feature request, however: do you think you could
add an option to disable the pretty-printing feature? Maybe make to_xml take
an optional boolean that defaults to true?
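To show what that request amounts to, here is a plain-Ruby sketch of a serializer whose second argument toggles pretty printing. The node layout and this to_xml signature are assumptions for illustration only, not QuiXML's actual method, and escaping of special characters is left out to keep it short.

```ruby
# Hypothetical node layout: { name:, content:, children: }.
# pretty defaults to true; pass false to get the XML on a single line.
def to_xml(node, pretty = true, depth = 0)
  indent   = pretty ? "  " * depth : ""
  newline  = pretty ? "\n" : ""
  children = node[:children] || []

  out = "#{indent}<#{node[:name]}>"
  out << node[:content].to_s
  unless children.empty?
    out << newline
    children.each { |child| out << to_xml(child, pretty, depth + 1) }
    out << indent
  end
  out << "</#{node[:name]}>#{newline}"
end

tree = { name: "a", content: "text", children: [{ name: "c", content: "c" }] }

puts to_xml(tree)         # indented, one element per line
puts to_xml(tree, false)  # prints <a>text<c>c</c></a>
```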

Anyway... just some comments. I'll probably toy with this more in the future.

Cheers,

bs.

Sean O'Dell

Aug 27, 2003

#5

Ben said:
Sean O'Dell said:

Sean-

I'm curious about your decision to differentiate between contents and
child elements in the tree. While this probably makes the Tree structure
slightly simpler, you end up altering the XML out versus the XML in, so
your to_xml will never return the same XML that is passed into it.

For example,

qxml = QuiXML::Tree.new
qxml.parse("<a>a bb<c>c</c> \nc</a>")
puts qxml.to_xml

generates the output,

<a>a b
c
b
<c>c</c>
</a>

which obviously alters the structure of the XML. It's probably not a huge
issue for most projects, but I was just curious if you had considered this
in your design. One small feature request, however: do you think you could
add an option to disable the pretty-printing feature? Maybe make to_xml take
an optional boolean that defaults to true?

Anyway... just some comments. I'll probably toy with this more in the future.

Thanks, and please do, I look forward to getting more comments.

My reason for treating content as a single value is that I've only
ever used XML for data, and never as a formatting stream, like HTML.
That's basically all there is to it, really. I used XML a lot for
configuration files, blocks of data (such as for client accounts) and
for passing commands/data between remote machines.

I'm open-minded about it though. Except for formatting streams, such as
HTML and source XML documents for XSLT to transform into something else
like PS or what-not, I couldn't think of a reason NOT to combine all of
an element's data into one content property, so I did that.

How would you like to see it changed? Give me an idea of what I could
do differently. Put each scrap of content data in a child element, and
mark it as data instead of a node?

It would be pretty easy to disable pretty printing, of course, and also
stripping whitespace when parsing. Very simple changes.

Sean O'Dell

Gavin Sinclair

Aug 27, 2003

#6

I've been a big fan of Sean Russell's REXML library for a while now, but
an internal project in which I embedded the Ruby interpreter required
certain features in an XML library I couldn't quite get out of any one
of the existing APIs out there, so I wrote QuiXML.

I'm curious, what features did you need that weren't already provided?

Gavin

Sean O'Dell

Aug 28, 2003

#7

Gavin said:
I'm curious, what features did you need that weren't already provided?

Gavin

Well, quite a few things I couldn't get in a single library.

One was marshaling (transmutation: not really marshaling); I was
reading/writing a lot of XML, and I wanted some of it to come and go as
Ruby objects. I had code that was performing that task for me
"as-needed" but it got to be a REAL pain after a while, and I wanted to
do something a little more out-of-the-way.

Another was native Ruby data types. I wanted to be able to generate an
XML data tree using only Hashes and Arrays, because the XML was going to
and from some other code and I didn't want to stop and write abstraction
layers. A happy medium for me was generating trees using only Ruby data
types, and then passing them off to QuiXML to be written, performing, of
course, marshaling as it went. Even though it's not a true abstraction
layer for the other code, having it all as native Ruby containers means
I can rip out QuiXML whenever I want; it's very easy to generate XML
output from a set of native containers with a known structure.

Also, I just *like* creating my trees as native Ruby data types, rather
than depending on the library itself to perform that task; it just feels
more flexible that way. I feel sort of "tied up" with the XML library
with commands like "add_element."

Another thing was XPath; it works well most of the time, but sometimes I
feel mired with it. Since my QuiXML trees often contain real Ruby
objects, rather than strings, it made more sense to produce an
addressing scheme that made use of the "case-equality" method that many
built-in objects have (String, Regexp, Range, Date, etc.). The way
QuiXML's element path addressing works, you can find elements using
regular expressions, numerical ranges or even look up specific dates or
ranges. I've just been a lot happier with it; it feels more "Ruby" to me.
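For anyone who hasn't leaned on case-equality much, here is a plain-Ruby sketch of matching attribute values that are already real objects against ranges of numbers and dates. The node layout and the select_nodes helper are made up for the example; they stand in for the idea and are not QuiXML's own addressing calls.

```ruby
require "date"

# Hypothetical nodes whose attributes were already converted to Ruby objects.
accounts = [
  { name: "account", attributes: { "balance" => 120,  "opened" => Date.new(2001, 5, 1) } },
  { name: "account", attributes: { "balance" => 9500, "opened" => Date.new(2003, 2, 14) } },
  { name: "archive", attributes: { "balance" => 0,    "opened" => Date.new(1999, 1, 1) } }
]

# Keep the nodes whose name, and every listed attribute, satisfy
# pattern === value.
def select_nodes(nodes, name_pattern, attribute_patterns = {})
  nodes.select do |n|
    name_pattern === n[:name] &&
      attribute_patterns.all? { |key, pattern| pattern === n[:attributes][key] }
  end
end

# A Regexp for the name and a numeric Range for an attribute.
rich = select_nodes(accounts, /\Aacc/, "balance" => 1000..10_000)
p rich.map { |n| n[:attributes]["balance"] }

# A literal name and a Range of dates.
year_2003 = Date.new(2003, 1, 1)..Date.new(2003, 12, 31)
recent = select_nodes(accounts, "account", "opened" => year_2003)
p recent.map { |n| n[:attributes]["opened"].to_s }
```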

Also, it had to be in C. I eat through a lot of XML, and using C with
the expat library just makes it blazingly fast.

Mmm... automatic encoding/decoding of special characters like <, >, ', "
and & too.

Ah, and last but not least: I just feel a lot more productive in it. =)
I released it because I thought perhaps like-minded folks might
appreciate it. =)

Sean O'Dell

Dan North

Aug 28, 2003

#8

Hi Sean.

It sounds to me as though you would get much better mileage from YAML
for what you are describing. As long as you don't need to send the XML
onto any other non-Ruby systems, it's a very lightweight way of
marshalling, storing hashes and arrays, all that stuff. I've only
recently come across YAML but the more I play with it, the more I'm
liking it.

You can't do xpath stuff with it, but then you don't need to - it loads
as first-class Ruby objects.
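A small example of what that looks like, using Ruby's bundled yaml library. The document contents are made up for illustration.

```ruby
require "yaml"

# The document contents here are invented for the example.
doc = YAML.load(<<~EOY)
  account:
    name: Example Client
    ports: [8080, 8081]
    active: true
EOY

p doc.class                    # prints Hash
p doc["account"]["ports"]      # prints [8080, 8081]
p doc["account"]["active"]     # prints true
```

The values come back as an ordinary Hash, Array, Integers and a boolean, with no query step in between.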

Cheers,
Dan

why the lucky stiff

Aug 28, 2003

#9

You can't do xpath stuff with it, but then you don't need to - it loads
as first-class Ruby objects.

If you load a YAML document as a parse tree, you can perform YPath queries
upon it.

names:
  - first: Dan
    last: North
  - first: Sean
    last: O'Dell

EOY
=> ["Dan", "Sean"]

Not as full featured as XPath yet, but coming along...

_why

Sean O'Dell

Aug 28, 2003

#10

Dan said:
Hi Sean.

It sounds to me as though you would get much better mileage from YAML
for what you are describing. As long as you don't need to send the XML
onto any other non-Ruby systems, it's a very lightweight way of
marshalling, storing hashes and arrays, all that stuff. I've only
recently come across YAML but the more I play with it, the more I'm
liking it.

You can't do xpath stuff with it, but then you don't need to - it loads
as first-class Ruby objects.

Cheers,
Dan

Gavin said:

On Wednesday, August 27, 2003, 5:50:09 PM, Sean wrote [snipped]:

I've been a big fan of Sean Russell's REXML library for a while now,
but an internal project in which I embedded the Ruby interpreter
required certain features in an XML library I couldn't quite get out
of any one of the existing APIs out there, so I wrote QuiXML.

I'm curious, what features did you need that weren't already provided?

Well, quite a few things I couldn't get in a single library.

Well, I had already found ways to do what I needed, but as I think my
list of features illustrates, it was a matter of style and parsing
speed. Basically, I wanted fast and elegant. Slapping more and more
components together to achieve what I needed to do wasn't what I was
after. Sometimes I slap things together, when a project needs to get
done quickly, and I'm thankful for the components that make it happen,
but sometimes it's more important that a project operate swiftly and the
code be succinct.

Also, full marshaling wasn't what I needed, either. It's overkill and
the resulting strings are tied to the API that understands the objects
they represent. Having native Ruby data trees makes it easier to design
a vanilla format that QuiXML can then read and write to and from XML.
The layout of the data trees, while QuiXML-ish, is very, very ordinary
and nothing but native Ruby data is used; you can pop the root node
right off a QuiXML tree and use it happily without ever loading QuiXML
again.

Sean O'Dell

Sean O'Dell

Aug 28, 2003

#11

Just letting anyone who might be interested in QuiXML know that there
are forums and a mailing list now at RubyForge. I have also released
version 0.1.0, which contains some fixes for problems that popped up
when my code moved from its original Windows/C++ Builder environment to
the Linux environment.

Forums:
http://rubyforge.org/forum/?group_id=63

Mailing List:
http://rubyforge.org/mailman/listinfo/quixml-users

Latest Version:
http://rubyforge.org/project/showfiles.php?group_id=63

Sean O'Dell

Ben Schumacher

Aug 28, 2003

#12

Sean O'Dell said:

My reason for treating content as a single value is that I've only
ever used XML for data, and never as a formatting stream, like HTML.
That's basically all there is to it, really. I used XML a lot for
configuration files, blocks of data (such as for client accounts) and
for passing commands/data between remote machines.

I'm open-minded about it though. Except for formatting streams, such as
HTML and source XML documents for XSLT to transform into something else
like PS or what-not, I couldn't think of a reason NOT to combine all of
an element's data into one content property, so I did that.

How would you like to see it changed? Give me an idea of what I could
do differently. Put each scrap of content data in a child element, and
mark it as data instead of a node?

I think this could be done relatively easily. The children member variable
is a simple array, correct? And it contains hashes that map to name,
content and children, correct? Why couldn't it, then, just contain a plain
Ruby string when an entry is CDATA, and a hash when it is a node?
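Roughly what I mean, as a plain-Ruby sketch. The layout and the to_xml_mixed helper are hypothetical (I'm guessing at the structure, not quoting QuiXML's internals), and escaping is left out to keep it short.

```ruby
# Hypothetical layout: each entry in :children is either a String
# (character data) or a Hash (an element node), kept in document order.
def to_xml_mixed(node)
  body = node[:children].map do |child|
    child.is_a?(String) ? child : to_xml_mixed(child)
  end.join
  "<#{node[:name]}>#{body}</#{node[:name]}>"
end

tree = {
  name: "a",
  children: [
    "a bb",
    { name: "c", children: ["c"] },
    " \nc"
  ]
}

source = "<a>a bb<c>c</c> \nc</a>"
puts to_xml_mixed(tree) == source   # prints true
```

Because text and elements stay interleaved in one array, the whitespace and ordering of the input survive the trip back out.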

It would be pretty easy to disable pretty printing, of course, and also
stripping whitespace when parsing. Very simple changes.

I don't really believe in stripping whitespace, necessarily, but I bet
some folks think it's a fine idea. My personal preference is to have the
ability to get out exactly what I put in -- meaning if I have some
whitespaces in places, then they probably belong there. For my use, it
makes more sense, but I understand that some folks will disagree on this
issue.

Cheers,

bs.
