NoSQL Movement?

Xah Lee · Mar 3, 2010

recently i wrote a blog article on The NoSQL Movement
at http://xahlee.org/comp/nosql.html

i'd like to post it somewhere public to solicit opinions, but in
20 min or so of looking, i couldn't find a proper newsgroup or private list
where my somewhat anti-NoSQL Movement article would fit.

So, i thought i'd post here to solicit some opinions from the programmer
community i know.

Here's the plain text version

-----------------------------
The NoSQL Movement

Xah Lee, 2010-01-26

In the past few years, there's a new fashionable line of thinking against
relational databases, now blessed with a rhyming term: NoSQL.
Basically, it considers that the relational database is outdated, and not
“horizontally” scalable. I'm quite dubious of these claims.

According to the Wikipedia Scalability article, vertical scalability means
adding more resources to a single node, such as more cpu or memory. (You
can easily do this by running your db server on a more powerful
machine.) “Horizontal scalability” means adding more machines.
(And indeed, this is not simple with sql databases, but again, it is
the same situation with any software, not just databases. To add more
machines to run one single piece of software, the software must have some
sort of grid computing infrastructure built in. This is not a problem of
the software per se, it is just the way things are. It is not a
problem of databases.)

I'm quite old fashioned when it comes to computer technology. In order
to convince me of some revolutionary new-fangled technology, i must
see improvement based on math foundation. I am an expert in SQL, and
believe that the relational database is pretty much the gist of databases
with respect to math. Sure, a tight definition of the relations in your
data may not be necessary for many applications that simply need to
store, retrieve, and modify data without much concern about the
relations among them. But still, relational database
technology does that too. You just don't worry about normalizing when you
design your table schema.

The NoSQL movement is really a scaling movement: it is about adding more
machines, about so-called “cloud computing” and services with
simple interfaces. (Like so many fashionable movements in the
computing industry, it is often not well defined.) It is not really
about anti-relational designs in your data. It's more about adding
features for practical needs, such as providing easy-to-use APIs (so
your users don't have to know SQL or schemas), the ability to add more
nodes, commercial interface services to your database, and
parallel systems that access your data. Of course, these needs have all
been met by the big old relational database companies such as Oracle over
the years, as they constantly adapt to the industry's changing needs and
cheaper computing power. If you need any relations in your data, you
can't escape the relational database model. That is just the cold truth of
math.

Important data, such as that used in bank transactions, has relations.
You have to have tight relational definitions and assurance of data
integrity.

Here's a second hand quote from Microsoft's Technical Fellow David
Campbell. Source

I've been doing this database stuff for over 20 years and I
remember hearing that the object databases were going to wipe out
the SQL databases. And then a little less than 10 years ago the
XML databases were going to wipe out.... We actually ... you
know... people inside Microsoft, [have said] 'let's stop working
on SQL Server, let's go build a native XML store because in five
years it's all going....'

LOL. That's exactly my thought.

Though, i'd have to have some hands on experience with one of those
new database services to see what it's all about.

--------------------
Amazon S3 and Dynamo

Look at Structured storage. That seems to be what these nosql
databases are. Most are just a key-value pair structure, or just
storage of documents with no relations. I don't see how this differs
from a sql database using one single table as schema.

Amazon's S3 is another storage service, which uses Amazon's
Dynamo (storage system), indicated by Wikipedia to be one of those
NoSQL dbs. Looking at the S3 and Dynamo articles, it appears the db is
just a distributed hash table system with an added http access
interface. So, basically, little or no relations. Again, i don't see
how this is different from, say, MySQL with one single table of 2
columns, plus distributed infrastructure. (A distributed database
is often an integrated feature of commercial dbs, e.g. the Wikipedia Oracle
database article cites Oracle Real Application Clusters.)
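As a rough illustration of the "one single table of 2 columns" comparison, here is a minimal sketch of a key-value store on top of a SQL table. It uses Python's built-in sqlite3 rather than MySQL, and the class name `KeyValueTable`, the table name `kv`, and the column names are all made up for the example. It covers only the single-machine part of the comparison; the distributed infrastructure is not sketched here.

```python
import sqlite3


class KeyValueTable:
    """A key-value store backed by a single two-column SQL table."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v BLOB)"
        )

    def put(self, key, value):
        # INSERT OR REPLACE overwrites an existing key, like a hash table.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
            )

    def get(self, key, default=None):
        row = self.conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else default

    def delete(self, key):
        with self.conn:
            self.conn.execute("DELETE FROM kv WHERE k = ?", (key,))


store = KeyValueTable()
store.put("photo-1", b"old bytes")
store.put("photo-1", b"new bytes")  # second put replaces the first
```

The interface is just put/get/delete by key; nothing here uses joins or any relation between rows.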

Here's an interesting quote on S3:

Bucket names and keys are chosen so that objects are addressable
using HTTP URLs:

* http://s3.amazonaws.com/bucket/key
* http://bucket.s3.amazonaws.com/key
* http://bucket/key (where bucket is a DNS CNAME record
pointing to bucket.s3.amazonaws.com)

Because objects are accessible by unmodified HTTP clients, S3 can
be used to replace significant existing (static) web hosting
infrastructure.

So this means, for example, i can store all my images in S3, and in my
html document, the inline images are just normal img tags with normal
urls. This applies to any other type of file too: pdf, audio, even
html. So, S3 becomes the web host server as well as the file system.
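To make the addressing scheme in the quote concrete, here is a small sketch that builds the two amazonaws.com URL forms and an img tag for a bucket and key. The helper names `s3_urls` and `img_tag` and the bucket name `mybucket` are hypothetical, and the sketch only formats strings; it does not talk to S3 or check that the object exists or is publicly readable.

```python
from html import escape
from urllib.parse import quote


def s3_urls(bucket, key):
    """Return the path-style and host-style URLs from the quote above."""
    safe_key = quote(key)  # percent-encodes unsafe characters, keeps "/"
    return [
        f"http://s3.amazonaws.com/{bucket}/{safe_key}",
        f"http://{bucket}.s3.amazonaws.com/{safe_key}",
    ]


def img_tag(bucket, key, alt=""):
    """Build a plain img tag pointing at the host-style URL."""
    url = s3_urls(bucket, key)[1]
    return f'<img src="{url}" alt="{escape(alt)}">'


tag = img_tag("mybucket", "images/cat 1.png", alt="cat")
```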

Here's Amazon's instruction on how to use it as image server. Seems
quite simple: How to use Amazon S3 for hosting web pages and media
files? Source

--------------------
Google BigTable

Another is Google's BigTable. I can't make much comment. To make a
sensible comment, one must have some experience of actually
implementing a database. For example, a file system is a sort of
database. Suppose i created a scheme that lets me access my data as
files in NTFS, distributed over hundreds of PCs that communicate
thru http by running Apache. This will let me access my files. To insert
or delete data, one can have cgi scripts on each machine. Would this be
considered a fantastic new NoNoSQL?

---------------------

comments can also be posted to
http://xahlee.blogspot.com/2010/01/nosql-movement.html

Thanks.

Xah
∑ http://xahlee.org/

☄

MRAB · Mar 3, 2010

Xah said:
recently i wrote a blog article on The NoSQL Movement
at http://xahlee.org/comp/nosql.html

i'd like to post it somewhere public to solicit opinions, but in the
20 min or so, i couldn't find a proper newsgroup, nor private list
that my somewhat anti-NoSQL Movement article is fitting.

So, i thought i'd post here to solicit some opinions from the programmer
community i know.

[snip]
Couldn't find a relevant newsgroup, so decided to inflict it on a number
of others...

ccc31807 · Mar 3, 2010

recently i wrote a blog article on The NoSQL Movement
at http://xahlee.org/comp/nosql.html

i'd like to post it somewhere public to solicit opinions, but in the
20 min or so, i couldn't find a proper newsgroup, nor private list
that my somewhat anti-NoSQL Movement article is fitting.

I only read the first two paragraphs of your article, so I can't
respond to it.

I've halfway followed the NoSQL movement. My day job is a database
manager and I do SQL databases for a living, as well as Perl. I see a
lot of abuse of relational databases in the Real World, as well as a
lot of abuse of non-SQL alternatives, e.g., (mis)using Excel as a
database. The big, enterprise database we have at work is built on IBM
UniQuery, which is a non-SQL flat file database product, so I've had a
lot of experience with big non-SQL database work.

I've also developed a marked preference for plain text databases. For
a lot of applications they are simpler, easier, and better. I've also
had some experience with XML databases, and find that they are ideal
for applications with 'ragged' data.

As with anything else, you need to match the tool to the job. Yes, I
feel that relational database technology has been much used, and much
abused. However, one of my favorite applications is Postgres, and I
think it's absolutely unbeatable where you have to store data and
perform a large number of queries.

Finally, with regard to Structured Query Language itself, I find that
it's well suited to its purpose. I hand write a lot of SQL statements
for various purposes, and while, as with any language, you sometimes find
it exceedingly difficult to express concepts that you can think, it
allows the expression of most of what you want to say.

CC.

toby · Mar 3, 2010

I only read the first two paragraphs of your article, so I can't
respond to it.

I've halfway followed the NoSQL movement. My day job is a database
manager and I do SQL databases for a living, as well as Perl. I see a
lot of abuse of relational databases in the Real World, as well as a
lot of abuse of non-SQL alternatives, e.g., (mis)using Excel as a
database. The big, enterprise database we have at work is built on IBM
UniQuery, which is a non-SQL flat file database product, so I've had a
lot of experience with big non-SQL database work.

I've also developed a marked preference for plain text databases. For
a lot of applications they are simpler, easier, and better. I've also
had some experience with XML databases, and find that they are ideal
for applications with 'ragged' data.

As with anything else, you need to match the tool to the job. Yes, I
feel that relational database technology has been much used, and much
abused. However, one of my favorite applications is Postgres, and I
think it's absolutely unbeatable

It is beatable outside of its sweet spot, like any system. NoSQL is not
so much about "beating" relational databases, as simply a blanket term
for useful non-relational technologies. There's not much point in
reading Xah beyond the heading of his manifesto, as it is no more
relevant to be "anti-NoSQL" than to be "anti-integers" because they
don't store fractions.

where you have to store data and

"relational data"

perform a large number of queries.

Why does the number matter?

Jonathan Gardner · Mar 3, 2010

As with anything else, you need to match the tool to the job. Yes, I
feel that relational database technology has been much used, and much
abused. However, one of my favorite applications is Postgres, and I
think it's absolutely unbeatable where you have to store data and
perform a large number of queries.

Let me elaborate on this point for those who haven't experienced this
for themselves.

When you are starting a new project and you don't have a definitive
picture of what the data is going to look like or how it is going to
be queried, SQL databases (like PostgreSQL) will help you quickly
formalize and understand what your data needs to do. In this role,
these databases are invaluable. I can see no comparable tool in the
wild, especially not OODBMS.

As you grow in scale, you may eventually reach a point where the
database can't keep up with you. Either you need to partition the data
across machines or you need more specialized and optimized query
plans. When you reach that point, there are a number of options that
don't include an SQL database. I would expect your project to move
those parts of the data away from an SQL database and towards a more
specific solution.

I see it as a sign of maturity with sufficiently scaled software that
they no longer use an SQL database to manage their data. At some point
in the project's lifetime, the data is understood well enough that the
general nature of the SQL database is unnecessary.

Avid Fan · Mar 3, 2010

Jonathan said:
I see it as a sign of maturity with sufficiently scaled software that
they no longer use an SQL database to manage their data. At some point
in the project's lifetime, the data is understood well enough that the
general nature of the SQL database is unnecessary.

I am really struggling to understand this concept.

Is it the normalised table structure that is in question or the query
language?

Could you give some sort of example of where SQL would not be the way to
go? The only things I can think of are simple flat file databases.

Philip Semanchuk · Mar 3, 2010

I am really struggling to understand this concept.

Is it the normalised table structure that is in question or the
query language?

Could you give some sort of example of where SQL would not be the
way to go? The only things I can think of are simple flat file
databases.

Well, Zope is backed by an object database rather than a relational one.

Jack Diederich · Mar 3, 2010

[snip]

Xah Lee is a longstanding usenet troll. Don't feed the trolls.

mk · Mar 4, 2010

Jonathan said:
When you are starting a new project and you don't have a definitive
picture of what the data is going to look like or how it is going to
be queried, SQL databases (like PostgreSQL) will help you quickly
formalize and understand what your data needs to do. In this role,
these databases are invaluable. I can see no comparable tool in the
wild, especially not OODBMS.

FWIW, I talked to my thesis supervisor about the subject, and he
claimed that there are quite a number of papers on OODBMS that point to
fundamental problems with constructing capable query languages for
OODBMS. Sadly, I have not had time to get & read those sources.

Regards,
mk

ccc31807 · Mar 4, 2010

"relational data"

Data is neither relational nor unrelational. Data is data.
Relationships are an artifact, something we impose on the data.
Relations are for human convenience, not something inherent in the
data itself.

Why does the number matter?

Have you ever had to make a large number of queries to an XML
database? In some ways, an XML database is the counterpart to a
relational database in that the data descriptions constitute the
relations. However, since the search is over the XML elements, and you
can't construct indices for XML databases in the same way you can
with relational databases, a large search can take much longer than
you might expect.

CC.

George Neuner · Mar 4, 2010

No, relations are data. "Data is data" says nothing. Data is
information. Actually, all data are relations: relating /values/ to
/properties/ of /entities/. Relations as understood by the "relational
model" are nothing else but the assumption that properties and entities are
first class values of the data system and that they can also be related.

Well ... sort of. Information is not data but rather the
understanding of something represented by the data. The term
"information overload" is counter-intuitive ... it really means an
excess of data for which there is little understanding.

Similarly, at the level to which you are referring, a relation is not
data but simply a theoretical construct. At this level testable
properties or instances of the relation are data, but the relation
itself is not. The relation may be data at a higher level.

George

ccc31807 · Mar 4, 2010

No, relations are data.

This depends on your definition of 'data.' I would say that
relationships are information gleaned from the data.

"Data is data" says nothing. Data is
information.

To me, data and information are not the same thing, and in particular,
data is NOT information. To me, information consists of the sifting,
sorting, filtering, and rearrangement of data that can be useful in
completing some task. As an illustration, consider some very large
collection of ones and zeros -- the information it contains depends on
whether it's viewed as a JPEG, an EXE, XML, WAV, or some other sort of
format by an information processing device. Whichever way it's processed,
the 'data' (the ones and zeros) stay the same, and do not constitute
'information' in their raw state.

Actually, all data are relations: relating /values/ to
/properties/ of /entities/. Relations as understood by the "relational
model" are nothing else but the assumption that properties and entities are
first class values of the data system and that they can also be related.

Well, this sort of illustrates my point. The 'values' of 'properties'
relating to specific 'entities' depend on how one processes the data,
which can be processed various ways. For example, 01000001 can either
be viewed as the decimal number 65 or the alpha character 'A', but the
decision as to how to view this value isn't inherent in the data
itself, but only an artifact of our use of the data to turn it into
information.
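A short sketch of the same point in Python: the bit pattern 01000001 is one byte, and whether it reads as 65 or as 'A' depends only on how we choose to interpret it. The four-byte example is an extra illustration of my own, not part of the argument above; it reads the same bytes once as an unsigned integer and once as an IEEE 754 float.

```python
import struct

raw = bytes([0b01000001])          # one byte, bit pattern 01000001

as_number = raw[0]                 # interpreted as an unsigned integer
as_text = raw.decode("ascii")      # interpreted as an ASCII character

# The same idea with four bytes (little-endian):
four = b"\x00\x00\x80\x3f"
as_uint32 = struct.unpack("<I", four)[0]   # 32-bit unsigned integer view
as_float32 = struct.unpack("<f", four)[0]  # 32-bit float view
```

In both cases the stored bytes never change; only the decoding step does.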

CC.

Tim X · Mar 4, 2010

ccc31807 said:
Data is neither relational nor unrelational. Data is data.
Relationships are an artifact, something we impose on the data.
Relations are for human convenience, not something inherent in the
data itself.

Have you ever had to make a large number of queries to an XML
database? In some ways, an XML database is the counterpart to a
relational database in that the data descriptions constitute the
relations. However, since the search is over the XML elements, and you
can't construct indices for XML databases in the same way you can
with relational databases, a large search can take much longer than
you might expect.

Most XML databases are just a re-vamp of hierarchical databases, which are
one of the two common formats that came before relational databases.
Hierarchical, network and relational databases all have their uses.

Some 'xml' databases, like eXist-db, have some pretty powerful indexing
technologies. While they are different to relational db indexing because
they are based around hierarchies rather than relations, they do provide
the ability to do fast queries in the same way that indexes in
relational databases allow fast queries over relations. Both solutions
can do fast queries, they are just optimised for different types of
queries. Likewise, other database technologies that tend to fall into
this category, such as CouchDB and MongoDB, are aimed at applications and
problems that aren't suitable for the relational db model and are better
suited to the types of applications they have been designed for.
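To illustrate the difference between scanning XML elements on every query and querying through an index, here is a toy sketch using Python's standard xml.etree.ElementTree. The document, the element names, and the helper functions are invented for the example, and a plain dictionary stands in for the index; it is not how eXist-db or any other XML database implements indexing.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DOC = """
<library>
  <book id="b1"><author>Codd</author><title>Relational Model</title></book>
  <book id="b2"><author>Date</author><title>Database Systems</title></book>
  <book id="b3"><author>Codd</author><title>Further Normalization</title></book>
</library>
"""
root = ET.fromstring(DOC)


def scan_by_author(root, author):
    """Unindexed lookup: visits every <book> element on each query."""
    return [b.get("id") for b in root.iter("book")
            if b.findtext("author") == author]


def build_author_index(root):
    """One pass over the tree; later lookups are dictionary hits."""
    index = defaultdict(list)
    for b in root.iter("book"):
        index[b.findtext("author")].append(b.get("id"))
    return index


author_index = build_author_index(root)
```

Both approaches return the same answer; the scan costs a full tree walk per query, while the index pays that cost once up front and must be kept in step with later changes to the document.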

As usual, Xah's rantings are of little substance here. Yes, he is right that
'nosql' is essentially just another buzzword like 'web 2.0', but so
what? This is an industry that loves its buzzwords. Often it's just
marketing hype or some convenient placeholder for a vague 'concept' some
journalist, consultant or blogger wants to wank on about.

You cannot hate or love 'nosql' without defining exactly what you mean
by the term. Xah starts by acknowledging the term is ill defined
and then goes on to say how he doesn't like it because it lacks the
mathematical precision of the relational algebra that underpins the
relational model. It seems somewhat ironic to put forward an argument
focusing on the importance of precision when you fail to be precise
regarding the thing you're arguing against.

His point is further weakened by the failure to realise that SQL and the
relational model and relational algebra are different things. Not having
SQL doesn't automatically mean you cannot have a relational model or
operations that are based on relational algebra. SQL is just the
convenient query language and while it has succeeded where other
languages have not, it's just one way of interacting with a relational
database. As a language SQL isn't even 'pure' in that it has operations
that don't fit with the relational algebra that he claims is so
important and includes facilities that are really business convenience
operations that actually corrupt the mathematical model and purity that
is the basis of his poorly formed argument. He also overlooks the fact
that none of the successful relational databases have remained true to
either the relational model or the underlying theory. All of the major
RDBMSs have corrupted things for marketing, performance or maintenance
reasons. Only a very few vendors have stayed true to the relational
model and none of them have achieved much in the way of market share. I
wonder why?

All Xah is doing is being the net's equivalent of radio's 'shock jock'. He
searches for some topical issue, identifies a stance that he feels will
elicit the greatest number of emotional responses and lobs it into the
crowd. He rarely hangs around to debate his claims. When he does, he
tends to just yell and shout and more often than not, uses personal
attack to defend his statements rather than arguing the topic. His
analysis is usually shallow and based on populism. If
someone disagrees, they are a moron or a fool and if they agree, they
are a genius just like him.

Just like true radio shock jocks, some will love him and some will hate
him. The only things we can be certain about are that reaction is
a much higher motivator for his posts than conviction, that there is
probably an inverse relationship between IQ and support for his
arguments, and that his opinion probably has the same longevity as the
term nosql.

Now if we can just get back to debating important topics like why
medical reform is the start of communism, how single mothers are
leeching off tax payers, the positive aspects of slavery, why blacks are
all criminals, how governments are evil, the holocaust conspiracy, why
all muslims are terrorists, the benefits of global warming, the bad
science corrupting our children's innocence and stop wasting time
debating this technology stuff and please people, listen to your hearts
and follow your gut - don't let your intellect or so called facts get in
the way, trust your emotions.

Tim

John Nagle · Mar 5, 2010

Xah said:
recently i wrote a blog article on The NoSQL Movement
at http://xahlee.org/comp/nosql.html

i'd like to post it somewhere public to solicit opinions, but in the
20 min or so, i couldn't find a proper newsgroup, nor private list
that my somewhat anti-NoSQL Movement article is fitting.

Too much rant, not enough information.

There is an argument against using full relational databases for
some large-scale applications, ones where the database is spread over
many machines. If the database can be organized so that each transaction
only needs to talk to one database machine, the locking problems become
much simpler. That's what BigTable is really about.

For many web applications, each user has more or less their own data,
and most database activity is related to a single user. Such
applications can easily be scaled up with a system that doesn't
have inter-user links. There can still be inter-user references,
but without a consistency guarantee. They may lead to dead data,
like Unix/Linux symbolic links. This is a mechanism adequate
for most "social networking" sites.
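A minimal sketch of that layout, under made-up names: each user's record lives on exactly one shard chosen by hashing the user id, and references to other users are just stored ids with no consistency guarantee, so they can dangle after a delete, much like a symbolic link. The shards here are in-memory dictionaries standing in for separate database machines.

```python
import zlib


class ShardedUserStore:
    """Each user's data lives on exactly one shard, chosen by user id."""

    def __init__(self, n_shards):
        self.shards = [{} for _ in range(n_shards)]

    def _shard_for(self, user_id):
        # crc32 is stable across runs, unlike Python's built-in str hash.
        index = zlib.crc32(user_id.encode("utf-8")) % len(self.shards)
        return self.shards[index]

    def put_user(self, user_id, profile, friends=()):
        # Friend ids are stored as plain references; nothing checks them.
        self._shard_for(user_id)[user_id] = {
            "profile": profile,
            "friends": list(friends),
        }

    def get_user(self, user_id):
        return self._shard_for(user_id).get(user_id)

    def delete_user(self, user_id):
        # Only this user's shard is touched; references elsewhere remain.
        self._shard_for(user_id).pop(user_id, None)

    def live_friends(self, user_id):
        """Resolve friend references, skipping any that now dangle."""
        record = self.get_user(user_id)
        if record is None:
            return []
        return [f for f in record["friends"] if self.get_user(f) is not None]


users = ShardedUserStore(n_shards=4)
users.put_user("alice", {"name": "Alice"}, friends=["bob", "carol"])
users.put_user("bob", {"name": "Bob"})
users.put_user("carol", {"name": "Carol"})
users.delete_user("bob")  # alice's record still lists "bob"
```

Every write touches a single shard, so no cross-machine locking is needed; the price is that readers must tolerate dead references, as `live_friends` does.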

There are also some "consistent-eventually" systems, where a query
can see old data. For non-critical applications, those can be
very useful. This isn't a SQL/NoSQL thing; MySQL asynchronous
replication is a "consistent-eventually" system. Wikipedia uses
that for the "special" pages which require database lookups.
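The "query can see old data" behaviour can be sketched with a toy primary and replica. This is only an in-memory model with invented class names, not how MySQL replication is implemented: writes land on the primary and are queued, and the replica applies them some time later.

```python
from collections import deque


class Primary:
    def __init__(self):
        self.data = {}
        self.log = deque()  # writes not yet shipped to the replica

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class Replica:
    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)

    def catch_up(self, primary):
        # Apply queued writes in order; until this runs, reads are stale.
        while primary.log:
            key, value = primary.log.popleft()
            self.data[key] = value


primary, replica = Primary(), Replica()
primary.write("page", "v1")
replica.catch_up(primary)
primary.write("page", "v2")
stale = replica.read("page")   # replica has not seen the second write yet
replica.catch_up(primary)
fresh = replica.read("page")
```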

If you allow general joins across any tables, you have to have all
the very elaborate interlocking mechanisms of a distributed database.
The serious database systems (MySQL Cluster and Oracle, for example)
do offer that, but there are usually
substantial complexity penalties, and the databases have to be carefully
organized to avoid excessive cross-machine locking. If you don't need
general joins, a system which doesn't support them is far simpler.

John Nagle

Bruno Desthuilliers · Mar 5, 2010

Philip Semanchuk wrote:

Well, Zope is backed by an object database rather than a relational one.

And it ended up being a *major* PITA on all Zope projects I've worked on...

mk · Mar 5, 2010

Bruno said:
And it ended up being a *major* PITA on all Zope projects I've worked on...

Care to write a few sentences on nature of problems with zodb? I was
flirting with the thought of using it on some project.

Regards,
mk

floaiza · Mar 7, 2010

I don't think there is any doubt about the value of relational
databases, particularly on the Internet. The issue in my mind is how
to leverage all the information that resides in the "deep web" using
strictly the relational database paradigm.

Because that paradigm imposes a tight and rigid coupling between
semantics and syntax, when you attempt to efficiently "merge" or
"federate" data from disparate sources you can find yourself spending
a lot of time and money building mappings and maintaining translators.

That's why approaches that try to separate syntax from the semantics
are now becoming so popular, but, again, as others have said, it is
not a matter of replacing one with the other, but of figuring out how
best to exploit what each technology offers.

I base my remarks on some initial explorations I have made on the use
of RDF Triple Stores, which, by the way, use RDBMSs to persist the
triples, but which offer a really high degree of flexibility WRT the
merging and federating of data from different semantic spaces.
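As a toy illustration of why triples merge easily, here is a sketch where each source is a set of (subject, predicate, object) tuples and merging is a set union. The data, the identifiers such as `emp:42`, and the helper names are all hypothetical; real triple stores use URIs and a query language such as SPARQL, and the two sources still have to agree on shared identifiers (here `dept:7`) for the merge to be useful.

```python
def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return {
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


# Two hypothetical sources with different vocabularies.
hr = {
    ("emp:42", "name", "Ada"),
    ("emp:42", "worksIn", "dept:7"),
}
facilities = {
    ("dept:7", "locatedIn", "Building C"),
    ("dept:9", "locatedIn", "Building A"),
}

# Merging is a plain set union of the two sources.
merged = hr | facilities


def building_of(triples, person_name):
    """Follow name -> worksIn -> locatedIn across the given triples."""
    for emp, _, _ in match(triples, p="name", o=person_name):
        for _, _, dept in match(triples, s=emp, p="worksIn"):
            for _, _, building in match(triples, s=dept, p="locatedIn"):
                return building
    return None
```

The question "which building does Ada work in?" cannot be answered from either source alone, but it can be answered from the union without first designing a combined schema.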

The way I hope things will move forward is that eventually it will
become inexpensive and easy to "expose" as RDF triples all the
relevant data that now sits in special-purpose databases.

(just an opinion)

Francisco

Xah Lee · Mar 8, 2010

many people mentioned scalability... though i think it is fruitful to
talk about at what size NoSQL databases offer better
scalability than SQL databases.

For example, consider: if you are among the world's top 100 users of
databases in terms of database size, such as Google, then it may be
that the off-the-shelf tools are limiting. But how many users
really have such a massive amount of data?

note that google's need for databases today isn't just for a search engine.
The db size for google search is probably larger than that of all the other
search engine companies combined. Plus, there's youtube (vid
hosting), gmail, google code (source code hosting), google blog, orkut
(social networking), picasa (photo hosting), etc. Each ranks
within the top 5 or so among its respective competitors in terms of number
of accounts... so, google's data size is probably number one among the
world's users of databases, probably double or triple that of the second
largest. At that point, it seems logical
that they need their own db, relational or not.

Xah
∑ http://xahlee.org/

☄

Xah Lee · Mar 9, 2010

You've totally missed the point. It isn't the size of the data you have
today that matters, it's the size of data you could have in several years'
time.

so, you're saying that in several years, we'd all become the world's top 100
database users in terms of size, like Google?

Xah
∑ http://xahlee.org/

☄

Bruno Desthuilliers · Mar 9, 2010

mk wrote:

Care to write a few sentences on nature of problems with zodb? I was
flirting with the thought of using it on some project.

Would require more than a few sentences. But mostly, it's about the very
nature of the Zodb : it's a giant graph of Python objects.

So :
1/ your "data" are _very_ tightly dependent on the language and
applicative code
2/ you have to hand-write each and every graph traversal
3/ accessing a given object usually loads quite a few others in memory
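To give a feel for point 2, here is a sketch of what a hand-written traversal looks like when the data is a graph of ordinary Python objects. It deliberately uses plain classes and does not call the ZODB API; the `Folder` and `Document` classes and the helper function are invented for the example.

```python
class Folder:
    def __init__(self, name):
        self.name = name
        self.children = []


class Document:
    def __init__(self, title, author):
        self.title = title
        self.author = author


def find_titles_by_author(node, author):
    """Depth-first walk of the object graph; there is no query language."""
    found = []
    stack = [node]
    while stack:
        current = stack.pop()
        if isinstance(current, Document):
            if current.author == author:
                found.append(current.title)
        else:
            # Push children reversed so they are visited in listed order.
            stack.extend(reversed(current.children))
    return found


root = Folder("root")
docs = Folder("docs")
docs.children = [Document("spec", "mk"), Document("notes", "bruno")]
root.children = [docs, Document("readme", "bruno")]
```

In SQL the same question would be a one-line WHERE clause; here every new kind of question needs its own walk, and the walk visits each object along the way, which also relates to point 3.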

I once thought the Zodb was cool.
