Huge XML data needed

  • Thread starter Beda Christoph Hammerschmidt

Beda Christoph Hammerschmidt

I want to perform some performance measurements on an XML database. For
this reason I need some huge XML sample data. The data should not be
too structured, and a lot of reasonable queries should make sense.
Any idea where I can get this data?
 

Andy Dingley

Any idea where I can get this data?

Make it yourself. That way you can control the size and the
distribution of certain features. If this process is automated, then
you can easily run tests over and over with different parameters.
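
A minimal sketch of such a generator, using only Python's standard library. The element names (catalog, item, name, note, tag) and the parameters are made up for illustration and are not from any particular benchmark; adjust them to match the kind of queries you want to test.

```python
import random
import xml.etree.ElementTree as ET

def generate_catalog(n_items, max_children=5, sort_items=False, seed=0):
    """Build a synthetic, loosely structured XML tree (all names are hypothetical)."""
    rng = random.Random(seed)              # fixed seed makes test runs repeatable
    ids = list(range(n_items))
    if not sort_items:
        rng.shuffle(ids)                   # emit items in unsorted order
    root = ET.Element("catalog")
    for i in ids:
        item = ET.SubElement(root, "item", id=str(i))
        # Vary the number and kind of children so the data is not too regular.
        for _ in range(rng.randint(0, max_children)):
            child = ET.SubElement(item, rng.choice(["name", "note", "tag"]))
            length = rng.randint(1, 40)
            child.text = "".join(rng.choice("abcdefgh ") for _ in range(length))
    return ET.ElementTree(root)
```

Calling generate_catalog(1000000).write("sample.xml", encoding="utf-8") would then give a file whose size is controlled by n_items and max_children.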

It's often useful (but rarely done) to test, not just that "it works",
but to test for sensitivity to different sorts of load. Does
performance change with many small items, or with few large items?
Does sorted/unsorted input data make a difference?
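
As a rough sketch of that kind of sensitivity test: the snippet below builds two documents with the same total amount of text but different shapes, and times a parse plus one simple query on each. Python's ElementTree is only a stand-in here for whatever XML database is being measured, and the document layout is an assumption.

```python
import time
import xml.etree.ElementTree as ET

def build_doc(n_items, text_len):
    """Return an XML string with n_items <item> elements of text_len characters each."""
    root = ET.Element("catalog")
    for i in range(n_items):
        ET.SubElement(root, "item", id=str(i)).text = "x" * text_len
    return ET.tostring(root, encoding="unicode")

def time_query(doc):
    """Time parsing plus one simple query; returns (seconds, matching item count)."""
    start = time.perf_counter()
    root = ET.fromstring(doc)
    hits = len(root.findall("item"))
    return time.perf_counter() - start, hits

# Same total payload (1,000,000 text characters), two different shapes.
shapes = {"many small": (10_000, 100), "few large": (10, 100_000)}
results = {name: time_query(build_doc(n, size)) for name, (n, size) in shapes.items()}
for name, (seconds, hits) in results.items():
    print(f"{name}: {hits} items in {seconds:.4f}s")
```

The absolute times mean little; the point is to compare the two shapes under the same system and repeat the run several times.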

Another source of "real world" data in a large corporate is to connect
to something like an LDAP server and use that. I've also done much of
my own testing with lists of endangered species from the WCMC. You may
also find the W3C site useful, particularly the RDF test cases (not
large, but they do demonstrate many obscure conditions).
 

Arto Viitanen

Beda> I want to perform some performance measurements on an XML database. For
Beda> this reason I need some huge XML sample data. The data should not be
Beda> too structured and a lot of reasonable queries should make sense. Any
Beda> idea where I can get this data?

You might use some RSS feeds. RSS is a format used by several news servers to
distribute news. So by definition there is not much structure, but you can
make reasonable queries, like what happened (some terrorist act), what was the
score (some soccer game), etc.
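
To give an idea of what such a query could look like, here is a small sketch in Python. The feed content is invented for the example (a real feed would be fetched from a news server), and the category values are an assumption about how the feed labels its items.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up RSS 2.0 document standing in for a real news feed.
RSS = """<rss version="2.0"><channel>
  <title>Example News</title>
  <item><title>Home side wins 2-1</title><category>sport</category></item>
  <item><title>Election results announced</title><category>politics</category></item>
  <item><title>Cup final ends 0-0</title><category>sport</category></item>
</channel></rss>"""

channel = ET.fromstring(RSS).find("channel")

# "What was the score?": pick the titles of all items filed under sport.
sport_titles = [item.findtext("title")
                for item in channel.findall("item")
                if item.findtext("category") == "sport"]
print(sport_titles)
```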
 

Toivo Lainevool

Beda Christoph Hammerschmidt said:
I want to perform some performance measurements on an XML database. For
this reason I need some huge XML sample data. The data should not be
too structured and a lot of reasonable queries should make sense.
Any idea where I can get this data?

I'm not sure what you mean by "huge", but there is a good amount of
data that might be interesting to query at:
http://www.ibiblio.org/xml/examples/shakespeare/

Toivo Lainevool
http://www.XMLPatterns.com - Develop effective DTDs and XML Schema
documents for your XML using structural design patterns.
 

Johannes Koch

Arto said:
Beda> I want to perform some performance measurements on an XML database. For
Beda> this reason I need some huge XML sample data. The data should not be
Beda> too structured and a lot of reasonable queries should make sense. Any
Beda> idea where I can get this data?

You might get some RSS feed.

But RSS - by definition - is not "huge XML data".
 

Arto Viitanen

Beda> this reason I need some huge XML sample data. The data should not be
Beda> too structured and a lot of reasonable queries should make sense. Any
Beda> idea where I can get this data?
Johannes> But RSS - by definition - is not "huge XML data".

But I got two out of three: it is not too structured and there can be
reasonable queries!
 

Johannes Koch

Arto said:
Beda> this reason I need some huge XML sample data. The data should not be
Beda> too structured and a lot of reasonable queries should make sense. Any
Beda> idea where I can get this data?

Johannes> But RSS - by definition - is not "huge XML data".

But I got two out of three: it is not too structured and there can be
reasonable queries!

That's right :)
 

Akmal B. Chaudhri

Why don't you generate them?

Good idea. There are 5 major XML DB Benchmark efforts. Some include data
generators. See:

http://www.rpbourret.com/xml/XMLDBLinks.htm#Benchmarks

Ron Bourret has a link to a benchmark page that I used to maintain, but I
no longer have time to keep it up to date.
Use a free-db like MySQL...

Some benchmarks and performance issues are also covered in the book I
helped edit:

A.B. Chaudhri, A. Rashid and R. Zicari (eds.) (2003) XML data management:
native XML and XML-enabled database systems (Reading, Massachusetts:
Addison-Wesley)

http://www.awprofessional.com/titles/0201844524/

HTH

akmal
 

Patrick TJ McPhee

% Today's RSS feed bug was this
% http://www.littlefluffy.com/index.php?a=rss
%
% <description>
% A more aptly named game you are not likely to find. [...] a great
% game for drug users &lt;em>and&lt;/em> kids.
% </description>

So, what's wrong with it? That <em> should appear as mark-up, or that
you think > shouldn't be there?
 

Andy Dingley

% game for drug users &lt;em>and&lt;/em> kids.
So, what's wrong with it? That <em> should appear as mark-up, or that
you think > shouldn't be there?

There's no valid encoding for HTML in RSS where the opening character
of a tag is escaped, but not the closing character.
 

Patrick TJ McPhee

% On Sun, 4 Apr 2004 03:00:54 +0200 (MEST), (e-mail address removed) (Patrick
% TJ McPhee) wrote:
%
% >% game for drug users &lt;em>and&lt;/em> kids.
%
% >So, what's wrong with it? That <em> should appear as mark-up, or that
% >you think > shouldn't be there?
%
% There's no valid encoding for HTML in RSS where the opening character
% of a tag is escaped, but not the closing character.

OK, perhaps it's not valid RSS, but it's valid XML.
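
A quick way to check the XML side of this, sketched with Python's standard library. The fragment is shortened from the feed quoted above; whether a given RSS reader accepts the half-escaped form is a separate question that this does not settle.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Shortened fragment from the feed: '<' is escaped as &lt;, '>' is left bare.
fragment = ("<description>a great game for drug users "
            "&lt;em>and&lt;/em> kids.</description>")

# A bare '>' is allowed in XML character data, so this parses without error.
text = ET.fromstring(fragment).text
print(text)                      # a great game for drug users <em>and</em> kids.

# A typical escaping routine escapes both characters, which is the more usual form.
print(escape("<em>and</em>"))    # &lt;em&gt;and&lt;/em&gt;
```

Both forms decode to the same character data once an XML parser has read them.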
 
