Issues with unique object IDs in persistence

Lew · May 16, 2009

Seamus said:
Didn't your mother tell you not to believe everything you read on the
web?

Are you trying to claim that Derby takes a different footprint? Evidence?

There's no call for personal attacks here, especially when you just

Personal attack? I attacked the reasoning, not the person. The intentional
twisting of what I said was ridiculous. You knew if you had read the earlier
posts that I was using the term "embedded" the same way that Derby does in
their documentation.

admonished a bunch of comp.lang.lispers for exactly the same behavior.

"Admonished"? Say, rather, "goaded".

You said, and I quote, "an embedded application". That pretty
unambiguously means the application runs in a dedicated appliance like a
set-top box, to most programmers.

Nonsense.

It pretty ambiguously means what it means in the context of the conversation,
in which I had already used the term to mean "embedded database application
within a Java application".

That would be an "embedded database" in an application that may, or may
not, itself be embedded, rather than an "embedded application".

Pardon me. Please understand that I assumed you would follow the context and
not twist it. I shall avoid that mistake. Rest assured I intended "embedded"
in the sense we'd already been using the term in this conversation, and not in
a new way.

You've asserted, not discussed, this implausible claim. An actual
argument to support it would be far more interesting than another random
personal attack.

Projection. You asserted that the application would get "screwed up" and
would require "an uninstall/reinstall loop". Reading the Derby documentation,
to which I referred earlier, you will see that that isn't required with a
Derby-embedded application.

So tell me: Why do you think the DB would be bulletproof, uncorruptable
even by bugs in the client code?

I never said that I did think that. I said it was likely to be more stable
and easier to keep free of bugs than a custom disk-based solution. Don't put
words in my mouth. Your straw-man arguments are too transparent.

What sort of association? If he just meant objects referencing other
objects would go together, Java serialization does that already.

Surely you read the post. He wants to "assign each object a unique ID of some
kind". You responded with advice about hash codes and a long, complex idea
for that association involving a custom disk solution.

No, I'm calmly stating common-place and well-known facts about databases.

Except that none of that has to do with Derby.

Apparently there's no actual database in this DBMS, then. How interesting.

Did you read the Derby documentation yet?

The risks and headaches involved in repartitioning a hard disk. Haven't
you been paying attention?

Derby doesn't require that you repartition a disk. What are you on about?

Yes, it was; please try to be a bit more focused the next time you post.
Heavier on the rational argumentation and lighter on the personal jabs,
particularly, if you please.

No personal jab. I commented on the speech, not the person.

Until something goes wrong.

Unless you honestly intend to make the outlandish claim that nothing
will ever, ever go wrong.

I've stated that that's not what I'm saying. Why would you read that into my
statements? I only said that the risk was lower than with a custom disk
solution, not that it was zero.

But try telling that to anyone who's ever had to muck about with the
Windows registry, Firefox profiles, or pretty much anything else of that
nature, based on instructions off some web site or read to them over the
phone by tech support.

It would be the application developer who fixes the problem with a
Derby-embedded system, not the user.

No, I said that if an application's target audience is sysadmins, it can
get away with having a much more cryptic user interface and
harder-to-fix problems that need more technical monkeying to correct
than if its target audience includes Uncle Bob and Aunt Mathilda.

And it would be the developer who fixes the Derby-embedded application, not a
sysadmin and not the user, same as with the disk-based solution.

Sure it did. You are the one who didn't address the point. When you
aren't changing the subject to ease of implementation or my lack of
specific knowledge about Derby, you're claiming the application
magically takes care of everything.

I never said any such thing. I said the application takes care of things
rather than the user. Stop straw-manning my arguments. It's intellectually
dishonest.

If that were possible, wouldn't Microsoft have made Windows able to
magically take care of the registry so that users never had to deal with
it, ever?

They could have and should have.

"Nothing will go wrong", or "It will all work out somehow -- trust me",
when asked how the user is to fix things when they go wrong, is not
addressing the point. It is a cop-out.

Good thing I didn't say that, then.

Database: easier implementation, harder end-user servicing if it gets
corrupted or otherwise screwed up.
Normal disk files: harder implementation, easier end-user servicing.

That's backwards.

Your only response to that, besides lying by saying I never mentioned
it, has been the incredibly dubious assertion that there will never be a
need for end-user servicing. Maybe if it's a "hello, world" program. In
which case a database, however "embedded", is way overkill.

Again with putting words in my mouth.

Sorry -- don't have the time tonight to download fifty megs or so of
whosits and whatsits. Too many plates to keep spinning. Maybe tomorrow.

The typical cop-out of the one whose point has been disproven. "You didn't
give evidence. Oh, wait, you gave evidence, but I refuse to look at it."

Making the programmer's job easier reduces the risk of the programmer making
the user's life harder.

If we accept your claim, there'd be fewer bugs, but harder for the user
to recover from. It all boils down to how many fewer, and how much

I think it would be easier to prevent bugs with a well thought-out, thoroughly
tested and robust system like Derby than with a brand-new complex custom
disk-based solution.

harder, and which ends up outweighing which, doesn't it? Which probably

Nope.

depends on the particular application, its nature and its user-base's
technical sophistication particularly. Which was my contention all along.

You bet on the solution you think will work better, and I'll bet on the one I
think will work better. Were I to implement a system that requires
association of persistent data, I'd use a database, and that's how I've gone
when faced with that decision. YMMV.

That was my opinion from the outset. Are you now telling me you've been
violently *agreeing* with me the entire time?

Had you not been busy putting words in my mouth in order to "disprove" them,
you'd've seen that I've been talking about risks and probabilities all along,
not the absolutes that you've attributed to me. I speak of the type of
decision I've faced and made myself in my programming career. I don't think I
can make my points any stronger - the Derby documentation speaks for itself;
if you aren't too lazy to review it you will see that. Other embedded
databases exist; I'm not particularly partisan to Derby other than to note its
convenience as part of the standard Java distribution and good reputation over
its years of existence. Feel free to try a different solution in your
practice. I saw from your post upthread that you have the technical knowledge
to carry it off. I think doing it the other way increases risk to your
customers and makes your programming job harder. You disagree. That's fair.

Lew · May 16, 2009

Lew said:
It pretty ambiguously means what it means in the context of the

I meant "unambiguously".

Seamus MacRae · May 16, 2009

Lew said:
Are you trying to claim that Derby takes a different footprint?

No, just that your word for it and some web site's isn't proof that it
doesn't.

Personal attack? I attacked the reasoning, not the person.

Attacking my reasoning might as well be attacking me, in this place
where my reasoning is all that's visible of me.

The intentional twisting of what I said was ridiculous.

If such had occurred, perhaps it would have been.

You knew if you had read the earlier posts that I was using the term
"embedded" the same way that Derby does in their documentation.

No; previously you'd talked about an embedded database IN an
application. Then you mentioned an embedded application. Different beast
entirely. Perhaps you meant to say something else, but I can only go by
what you actually did say.

"Admonished"? Say, rather, "goaded".

I'll say whatever I bloody well please. We have free speech here.

Nonsense.

http://www.answers.com/topic/embedded-application
"An application that permanently resides in an industrial or consumer
device ..."
http://en.wikipedia.org/wiki/Embedded_system
"... Physically, embedded systems range from portable devices such as
digital watches ... to large stationary installations like traffic
lights ..."
http://www.onesmartclick.com/rtos/embedded-system-application.html
"... in factory equipment and home electronics ..."
http://msdn.microsoft.com/en-us/windowsmobile/default.aspx
"... mobile devices ..."
http://www.rabbit.com/products/Embedded_PLC_App_Kit/
"... factory assembly lines ..."
http://www.ranosofttechnologies.com/application_embedded.htm
"... An embedded design, is an electronic design that contains an
embedded micro-controller ... There are many different CPU
architectures used in embedded designs. This in contrast to the
desktop computer market ..."
http://java.sun.com/javase/embedded/
"... autonomous vehicle ..."
http://blogs.msdn.com/embedded/archive/2009/05/08/application-development-for-embedded-devices.aspx
"... Very often such devices need to be integrated into backend
infrastructure ..."
http://encyclopedia2.thefreedictionary.com/embedded+application
"... An application that permanently resides in an industrial or
consumer device ..."
http://www.ibm.com/developerworks/linux/library/l-embl.html
"... wrist watch, hand-held devices (PDAs and cell phones), Internet
appliances, thin clients, firewalls, industrial robotics, telephony
infrastructure equipment ..."

Well, it looks like ten out of ten of the top ten Google hits for
"embedded application" (unquoted, even) are also spouting nonsense,
then, including such well-known fonts of nonsense as IBM, Microsoft, and
Sun Microsystems. (The Wikipedia page might have been vandalized,
however.)

Pardon me.

You're excused.

Rest assured I intended "embedded" in the sense we'd already been
using the term in this conversation, and not in a new way.

Then why did you use it in a new way? There's a big difference between
saying a database is embedded IN an application and saying that the
application itself is embedded.

It's as if we had been discussing a lion killing the king, and after a
while of discussing the killer lion, suddenly you mentioned the killer
king. Whom did he kill? is a natural question to ask if this suddenly
comes up. Certainly there's a distinction between the meanings of killeR
king and killeD king though; likewise between embeddING application and
embeddED application.

You asserted that the application would get "screwed up"

Every day, I assert that the sun will rise, the sun will set, and
Windows will get screwed up. Thus far, I have yet to be disappointed.
Thunderbird got screwed up the other day, somehow. Firefox got screwed
up the week before.

Software gets screwed up. To pretend that it magically won't is to stick
one's head in the sand. And eventually guarantees a most unpleasant
surprise.

Reading the Derby documentation, to which I referred earlier, you
will see that that isn't required with a Derby-embedded application.

End-users of the embedding application won't be using, or interested in,
the Derby documentation, and therefore won't have a clue how to reset,
fix, or whatever the Derby database, short of uninstalling and
reinstalling its host.

Remember, my concern here is with end-user servicing, not what the
application developer can do at design time with the database's docs and
API. When someday some user eventually has a problem, the developer is
highly unlikely to be physically present and have the time to give the
problem his own personal attention, after all. Nor can the developer
anticipate every possible problem at design time and include code to
avoid/recover from it. (If THAT were feasible, software wouldn't tend to
ship with bugs to begin with!)

I never said that I did think that.

No, but you certainly did think it, since otherwise you could not have
just dismissed my concern that it would get corrupted and the end-user
would have no way of coping with it.

Surely you read the post. He wants to "assign each object a unique ID
of some kind".

That's not associations between objects; that's associations of objects
to IDs, and I described exactly how that could be done.

You responded with advice about hash codes and a long,
complex idea for that association involving a custom disk solution.

Because to assign each object a unique ID involves tracking the
already-assigned-an-ID objects, and a hash table is an efficient way of
doing so.
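
A minimal Java sketch of that bookkeeping, for illustration only. The class name ObjectIdRegistry and the method idFor are hypothetical and do not come from any post in this thread. It assumes IDs only need to be unique within one running JVM and that identity, not equals(), decides whether two references are "the same object".

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical helper: hands out one long ID per distinct object instance. */
final class ObjectIdRegistry {
    // Identity-based map: objects that are equals() but distinct get separate IDs.
    private final Map<Object, Long> ids = new IdentityHashMap<>();
    private long nextId = 1L;

    /** Returns the ID already assigned to obj, or assigns the next free one. */
    synchronized long idFor(Object obj) {
        Long existing = ids.get(obj);
        if (existing != null) {
            return existing;
        }
        long id = nextId++;
        ids.put(obj, id);
        return id;
    }

    /** Number of objects that currently have an ID. */
    synchronized int size() {
        return ids.size();
    }
}
```

Note that this map holds strong references, so registered objects are never garbage collected while the registry is alive; whether that matters depends on the application.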

Except that none of that has to do with Derby.

If Derby isn't a database, then facts about databases might not have
anything to do with Derby, I suppose. But I thought you said it was one?

Did you read the Derby documentation yet?

I said, in my previous post, that I didn't have the time to download any
large files and muck about with them "today", and maybe I'd look at them
"tomorrow".

Perhaps I should have been clearer, though, that by "tomorrow" I did not
mean "by 2:00 AM tomorrow", and that by "maybe" I didn't mean to imply
any guarantees.

Derby doesn't require that you repartition a disk.

Oh, I'm sorry. I was continuing to base what I wrote on the belief that
Derby was a DBMS and not some other type of software instead.

No personal jab.

I've seen plenty already.

I've stated that that's not what I'm saying. Why would you read that
into my statements? I only said that the risk was lower than with a
custom disk solution, not that it was zero.

My concern was what a user will be able to do about it when that risk
ever does materialize. Which it will, eventually, if that risk is not
zero. You indicated that the user wouldn't have to be able to do
anything, implying thereby that (in your belief) the risk was actually
zero. Then I disputed that, since it's an outrageous supposition.

It would be the application developer who fixes the problem with a
Derby-embedded system, not the user.

So the user would have to go begging the developer for personal help and
attention anytime the application scrogged itself.

I hope that developer has a lot of spare time, especially if the thing
ever becomes popular and/or some version is released with a
frequently-triggered database-scrogging bug in it.

Mozilla's tendency to occasionally hose its profiles is a nasty nasty
thing. It's hard for the typical user to even find their profile to
delete it. Harder still to follow the instructions on their web site for
preserving and then restoring the bookmarks and saved passwords.

If it had been implemented to use a database, which we assume generously
would have somehow taken the form of a normal Windows file named
profiles.dat, it would be worse. Much worse.

1. Deleting profiles.dat would obviously nuke ALL the profiles in it,
not just one, though on a typical home-computer installation there'd
be just one profile in a profiles.dat in each user application-data
directory.
2. Extracting the bookmarks and saved passwords first would require
querying the database manually somehow, and saving the results.
Probably far beyond the capabilities of Uncle Bob, though moving the
key3.db, signons, and bookmarks files in the actual a-bunch-of-
separate-normal-disk-files implementation is not (quite).
3. Ditto putting the preserved bookmarks and saved passwords into the
replacement profile after it had been generated.

And it would be the developer who fixes the Derby-embedded application,

Ah, the application developer that makes housecalls. I wish I knew one
of those!

I never said any such thing. I said the application takes care of
things rather than the user.

OK, so you didn't use the word "magically". So sue me.

It's intellectually dishonest.

Projecting?

They could have and should have.

Yeah, right.

How?

Good thing I didn't say that, then.

No, you just implied it. Repeatedly. Only now you've replaced that with
"No problem -- the developer will personally assist you whenever you
need it". In my experience, that never goes with any kind of software
except the kind you only ever can legally get with a five-figure
site-license and maintenance contract.

My suggestion that there are some types of software for which a database
is inappropriate stands.

That's backwards.

That's exactly what you've been saying and implying. If it's backwards,
you got it that way. Though to clarify, the ease of implementation above
is for the *application* developer, who either codes the disk files
solution or uses a database API, *not* the developer that *develops* the
DBMS used with that API.

Again with putting words in my mouth.

How so? Many times I said "but how will the user fix things?" and your
response was basically that they won't have to. Which means nothing will
ever go wrong, barring that miraculous one-in-a-billion
developer-that-makes-housecalls.

the typical cop-out of the one whose point has been disproven. "You
didn't give evidence. Oh, wait, you gave evidence, but I refuse to look
at it."

Since when is "maybe tomorrow" a refusal to do anything?

Making the programmer's job easier reduces the risk of the programmer
making the user's life harder.

That all depends.

I think it would be easier to prevent bugs with a well thought-out,
thoroughly tested and robust system like Derby than with a brand-new
complex custom disk-based solution.

Hence your claim that there'd be fewer bugs in that case. I maintain
that what bugs remained would tend to be harder for the user to recover
from.

On the custom side, there's arguably more bugs, but I maintain they'd be
easier for the user to recover from.

So the question boils down to how many fewer/more bugs, versus how much
harder/easier to recover from.

As I said.

You bet on the solution you think will work better, and I'll bet on the
one I think will work better. Were I to implement a system that
requires association of persistent data, I'd use a database, and that's
how I've gone when faced with that decision. YMMV.

Association of persistent data, not just persisting the data (and giving
each persisted object a single "long"-valued ID).

That was my opinion from the outset. Are you now telling me you've
been violently *agreeing* with me the entire time?

Had you not been busy putting words in my mouth [rest snipped]

I have done nothing of the sort.

Lew · May 16, 2009

Seamus said:
No, just that your word for it and some web site's isn't proof that it
doesn't.

You are a troll.

Plonk.

Arne Vajhøj · May 18, 2009

Seamus said:
(In particular, 2^64 objects won't fit in RAM in
present-day or near-future computers, so you'll get OOME if you try to
store that many in a HashMap.)

Java uses virtual memory not RAM.

And the current implementation of HashMap cannot store more
than 2^31 objects because it uses arrays.

Arne

Arne Vajhøj · May 18, 2009

Seamus said:
Heavyweight also involves such factors as code and data size,
configuration headache-inducingness, and complications to deployment.
For example, if the project is a desktop application, can you ask your
users to install a database server? Can they be expected to know how to
fix it if something gets corrupted that persists across reboots?

Derby can be embedded in the app, so it does not need to be installed.
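
For illustration, embedded mode roughly amounts to opening a JDBC connection inside the application's own process. The sketch below assumes derby.jar is on the application's classpath and a JDBC 4 capable JVM that loads the driver automatically. The class name, method names, and the database name "appdata" are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical helper showing how an application might open an embedded Derby database. */
final class EmbeddedDerbyDemo {
    /** Builds an embedded-mode JDBC URL; "create=true" asks Derby to create the database if absent. */
    static String embeddedUrl(String dbName) {
        return "jdbc:derby:" + dbName + ";create=true";
    }

    /**
     * Opens the database in-process, with no separate server.
     * Throws SQLException if no Derby driver is found on the classpath.
     */
    static Connection open(String dbName) throws SQLException {
        return DriverManager.getConnection(embeddedUrl(dbName));
    }
}
```

A caller would use it as `try (Connection c = EmbeddedDerbyDemo.open("appdata")) { ... }`; with a relative name like this, Derby keeps the database as a directory of ordinary files under its working (system) directory.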

Most users will not be able to fix any type of corruption of binary
data structures on disk.

But the more widely used the persistence software is, the smaller
the risk of that type of problem.

Arne

Arne Vajhøj · May 18, 2009

Seamus said:
That has its own problems, namely, the user using several such
applications ends up with several copies of the DBMS chewing up disk
space (not just several databases, which was not avoidable, but several
database SERVERS too).

It is a 500 KB jar file. You can have 10 or 100 of these without
noticing it.

The database will in effect have no user-serviceable parts inside. If
anything gets wacko in it, the typical user's only realistic recourse
will be to uninstall and reinstall the affected application. And then
they lose whatever the database is used to store.

Until something goes wrong.

That is approx. the same for all binary formats.

Arne

Arne Vajhøj · May 18, 2009

Seamus said:
A two-megabyte DBMS? That'll be the day.

That is the day.

You can run the embedded Derby with just the client jar which is
less than 500 KB.

The standalone server is a bit over 2 MB.

Well, except that there are a few differences between the two that you
neglected to address, owing to the differing storage formats.

My idea was basically to use serialized Java objects, probably in
individual files. Likely a problem could be solved, if not by fixing,
then by deleting a particular such file and the app recreating it.

A database problem could be solved using the same technique - deleting
the directory and recreating the data.

A database, on the other hand, typically takes the form of a B-tree
represented who-knows-how and living on its own dedicated disk
partition. It won't be mountable as NTFS or VFAT or whatever, and
probably won't even be visible in Explorer. The installer has to do the
semi-dangerous job of repartitioning the customer's hard drive -- hope
they keep backups.

????

That is not how Derby works.

Indeed there are very few databases that work like this (today).

Arne

Lew · May 18, 2009

A database problem could be solved using the same technique - deleting
the directory and recreating the data.

Seamus MacRae ranted:
Arne Vajhøj nobly attempted:

????

That is not how Derby works.

Indeed there are very few databases that work like this (today).

We tried this information on "Seamus" MacPaul already. He just refused the
evidence and the reasoning:

... your word for it and some web site's isn't proof that it doesn't
[take a larger footprint].

Never mind that the web site was Derby's itself. Apparently he doesn't let
the facts get in the way of his preconceptions.

I applaud the attempt nonetheless.

Arne Vajhøj · May 18, 2009

Seamus said:
Didn't your mother tell you not to believe everything you read on the
web?

You can easily download Derby and verify yourself.

Now that I think about it: if the user is up to date with Java, then
the effective footprint of Derby is 0, because Derby comes with the
JDK.

Well, not in so many words, but it's an implication that follows
naturally from the predictable sequence of events:

1. Programmer uses Derby.
2. Program winds up containing a bug.
3. At a customer deployment, program triggers bug.
4. Database gets b0rked.
5. Customer finds program stopped working properly.
6. Customer finds quitting and restarting it doesn't fix it.
7. Customer calls support...

I'm not sure which of the above you'd argue is implausible. 1 is your
own advice. 2 is pretty much inevitable, like it or not, as is 3 given
that 2 occurred. 4 is dependent on the nature of the bug, but it doesn't
seem implausible. 5 follows from 2. 6 follows from 5, 4, and the
database being nonvolatile. 7 is inevitable given 5 and 6.

Nonsense.

An embedded database can run fine without problems.

Probably the most widely used examples are Firefox and
Thunderbird, which use an embedded SQLite database.

A double digit number of millions of users. And it
just seems to work.

Apparently there's no actual database in this DBMS, then. How interesting.

Apparently there is.

The entire category of embedded databases is characterized by
the fact that there is no installation.

And raw partitions are largely the way of the 1980's.

The risks and headaches involved in repartitioning a hard disk. Haven't
you been paying attention?

I think you have not been paying attention.

Partitioning not needed.

Arne

Jarrick Chagma · May 19, 2009

Arne Vajhøj said:
You can easily download Derby and verify yourself.

Now that I think about it: if the user is up to date with Java, then
the effective footprint of Derby is 0, because Derby comes with the
JDK.

Nonsense.

An embedded database can run fine without problems.

Probably the most widely used examples are Firefox and
Thunderbird, which use an embedded SQLite database.

A double digit number of millions of users. And it
just seems to work.

Apparently there is.

The entire category of embedded databases is characterized by
the fact that there is no installation.

And raw partitions are largely the way of the 1980's.

I think you have not been paying attention.

Partitioning not needed.

I just thought I'd add that we have been using Derby in a production system
for about 3 years and have not had one problem with it. It performs
surprisingly well and the end user doesn't even know it exists. It has
required zero administration.

Seamus MacRae · May 19, 2009

Lew said:
You are a troll.

Excuse me?

Seamus MacRae · May 19, 2009

I said:
How does this change in any way if you replace "Programmer uses Derby"
with "Programmer writes their own storage solution"?

What the customer has to be talked through doing changes.

Disk-files storage means something like "delete the x.foo file and
restart the program".

Database storage means something more like "quit the program, start the
dbhack.exe that came with it, type DELETE * FROM FOOBAR, hit enter ..."

This is utter nonsense. There's no reason for a DBMS to require a
dedicated partition.

Sure there is: to avoid fragmentation and the overhead of going through
a normal filesystem instead of raw access to the disk drive when you're
going to be using your own, typically B-tree based lookup technology anyway.

http://books.google.ca/books?id=0S8...3v3jAw&sa=X&oi=book_result&ct=result&resnum=7
makes reference to dedicated database partitions, naming Oracle and Sun
in the bargain.

Seamus MacRae · May 19, 2009

Arne said:
You can easily download Derby and verify yourself.

"Easily" depending on the speed of your network connection, reliability
of same, available disk space on your machine, and so forth.

Nonsense.

No, not nonsense.

An embedded database can run fine without problems.

Yes, but will it run fine forever, never ever having any problems?
Besides, it was not the database itself I was concerned about, but the
application. Applications tend to contain bugs, and sometimes these tend
to mess up persistent state of various kinds.

Probably the most widely used examples are Firefox and
Thunderbird, which use an embedded SQLite database.

The last time I checked, Thunderbird uses separate small disk files, and
sometimes the user even needs to delete one of these to fix a problem.

A double digit number of millions of users. And it
just seems to work.

Actually, I have had both Firefox and Thunderbird get b0rked in ways
that required a complete reinstall because there was no easy alternative
to fixing it. That meant losing bookmarks, saved passwords, and the like
and having to reinstall addons.

With Thunderbird, I'm not sure what causes it, but eventually some
newsgroups won't refresh, showing thousands of new articles in the list
but none when actually opened, until reinstall. Which loses your
subscribed groups, passwords, and read/unread info. Unsubscribing and
resubscribing just the affected group doesn't fix it.

With Firefox, installing and especially uninstalling or disabling
add-ons is the commonest trigger, and problems take the form of add-ons
being disabled with the options for them being "uninstall" and "disable"
without "enable" and neither of those options working. The add-on is
then unusable until Firefox is nuked and reinstalled.

This is with mostly separate disk files storing various things, rather
than with everything in a single monolithic foo.db file, or, worse, on
its own non-NTFS disk partition. Sometimes you can fix a wonky newsgroup
in Thunderbird by deleting a few particular files, and sometimes you can
move some files out and then, after reinstallation, back in to recover
passwords and bookmarks. Those "sometimes"es would have been "never"s if
they had followed Lew's advice when developing Mozilla.

Apparently there is.

I have already posted a link corroborating my statement that a database
typically involves a separate disk partition.

The entire category of embedded databases is characterized by
the fact that there is no installation.

Whether there is separate installation or not is not relevant to the
question of disk partitioning.

And raw partitions are largely the way of the 1980's.

The book link I gave to Lew is from 1998 and describes some ways to
optimize the use of such partitions.

I think you have not been paying attention.

Think what you want, but thinking untrue things like that will not
accelerate your understanding of the subject matter.

Seamus MacRae · May 19, 2009

Jarrick said:
I just thought I'd add that we have been using Derby in a production
system for about 3 years and have not had one problem with it. It
performs surprisingly well and the end user doesn't even know it
exists. It has required zero administration.

All well and good, until the host application cocks something up and the
user has no recourse but to uninstall and reinstall it to uncock it up,
and furthermore has no way of exporting some of the data (say, saved
passwords) and preserving it through the reinstall, then importing it again.

Seamus MacRae · May 19, 2009

Arne said:
That is the day.

You can run the embedded Derby with just the client jar which is
less than 500 KB.

Running a database client by itself seems rather pointless, however,
since in that state it will only be useful for looking up keys and
getting back "unable to connect to DB server" error messages.

If your point is that the server can be located on the network, that
would seem to be at odds with the usual notion of an "embedded" database.

A database problem could be solved using the same technique

At best only at the granularity of deleting and recreating the whole
database, assuming a user unwilling or not competent to get his hands
dirty with SQL.

For instance, suppose the application stores saved passwords in
password.dat and window positions in appname.ini. At some point the
window positions get b0rked and the app comes up entirely offscreen and
therefore unusable (at least without wizardry). User right clicks
taskbar button, clicks "close", then finds and deletes appname.ini.
Application restarts behaving correctly again and the saved passwords
are still there.
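
The recover-by-deleting behavior described above depends on the application treating a missing settings file as "use defaults". A minimal Java sketch of that idea follows, with hypothetical names throughout (WindowSettings, window.x, window.y) and assuming the settings live in a java.util.Properties text file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Hypothetical loader: a missing or unreadable settings file falls back to defaults. */
final class WindowSettings {
    /** Loads window settings from file, layering them over built-in defaults. */
    static Properties load(Path file) {
        Properties defaults = new Properties();
        defaults.setProperty("window.x", "100");
        defaults.setProperty("window.y", "100");
        if (!Files.exists(file)) {
            // The user deleted the file (or it never existed): start from defaults.
            return defaults;
        }
        // Keys absent from the file are answered from the defaults.
        Properties loaded = new Properties(defaults);
        try (InputStream in = Files.newInputStream(file)) {
            loaded.load(in);
            return loaded;
        } catch (IOException | IllegalArgumentException e) {
            // Unreadable or malformed file: ignore it entirely.
            return defaults;
        }
    }
}
```

Because the passwords would live in a different file, deleting the settings file resets only the window positions. The same fallback could in principle be written against a database table; the sketch only shows the file-based variant under discussion.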

Now suppose this hypothetical application was redesigned, according to
Lew's suggestion, to replace individual disk files with an embedded
database. Now instead of separate password.dat and appname.ini files
there's just an appname.db file the size of a large asteroid or small
moon. When the window positions get b0rked, at best the user can delete
appname.db and the application will recreate it. With no more saved
passwords.

Getting the (hypothetically intact) saved passwords out of appname.db
and saved in some other way, then back into the replacement appname.db
afterward, would require manually querying the database and pasting the
answers somewhere, then using more manual queries to put the set-aside
data back in after the reset.

Using the Windows registry is nearly as bad, since it takes a bit of
know-how to get regedt32.exe to save a bunch of keys out as a .reg
patch, then read them back in again after a reinstall (or whatever), not
to mention to know which registry keys to save in the first place.

Separate, individual disk files are best for simplifying user recovery.
Lew has noted that they are not necessarily best for simplifying program
implementation, or in some other ways.

This leads to my original conclusion: there are tradeoffs here, and
Lew's approach is not a one-size-fits-all best-possible one, contrary to
his (and your) implications.

????

I said, "A database typically takes the form of a B-tree represented
who-knows-how and living on its own dedicated disk partition. It won't
be mountable as NTFS or VFAT or whatever, and probably won't even be
visible in Explorer. The installer has to do the semi-dangerous job of
repartitioning the customer's hard drive -- hope they keep backups."

Seamus MacRae · May 19, 2009

Lew said:
Seamus MacRae ranted:

There is no call for this sort of emotional, issue-clouding edit.
"Seamus MacRae wrote:" will do fine.

Arne Vajhøj nobly attempted:

If you're going to creatively edit attributions, "Arne Vajhøj had wax in
his ears:" would have been a better fit to the situation. Certainly it
would have been more amusing.

We tried this information on "Seamus" MacPaul already. He just refused
the evidence and the reasoning

What evidence and what reasoning? I provided a citation to support my
claim. You've provided what, only bald assertions to support yours? Oh,
yes, and now you've lately added a sprinkling of argumentum ad hominem.
Very impressive, that, I believe you already!

Apparently he doesn't let the facts get in the way of his preconceptions.

I can speak for myself, thank you very much. I'll ask you nicely, once,
to kindly refrain from talking about me in the third person right in
front of me.

Seamus MacRae · May 19, 2009

Arne said:
It is a 500 KB jar file. You can have 10 or 100 of these without
noticing it.

That depends on who "you" is and on the size of "you"'s computer's free
disk space, I should think. Interesting also how 2MB has magically
become 500KB practically overnight.

That is approx. the same for all binary formats.

Remember the granularity, though: separate disk files can at least be
separately deleted. Deleting just part of a database requires running a
query on that database. Ordinary home-computer users can generally
manage to delete files in Explorer, but will be at a loss when it comes
to figuring out to type DELETE FROM FOO, or figuring out where to put
it. If there even is anywhere to put it. If the application doesn't come
with a dbhack.exe or similar tool that allows the user to directly
command the database, there will be no place to put it. (The application
itself might theoretically include in its user-interface commands to
reset parts of the database, or that let the user enter SQL queries
manually and submit them to it, but these will be of no help in cases
where the corruption to be fixed has rendered the application
unstartable or otherwise unusable.)

Seamus MacRae · May 19, 2009

Arne said:
Derby can be embedded in the app, so it does not need to be installed.

Not separately, perhaps, but it still gets installed.

Most users will not be able to fix any type of corruption of binary
data structures on disk.

That's why the granularity of those becomes important, so they can
delete one thing without deleting everything else.

But the more widely used the persistence software is, the smaller
the risk of that type of problem.

The smaller the risk of the persistence API and its implementation
causing problems. However, corruption can result from errors in the host
application too. If it writes invalid values into the database which,
when read back, make the application squawk and die or render it
unusable, the non-database-expert user has no recourse but to delete the
whole database.

Seamus MacRae · May 19, 2009

Arne said:
Java uses virtual memory not RAM.

They won't fit in virtual memory in present-day or near-future computers
either, even leaving aside that using the HashMap heavily would probably
result in paging the whole thing into RAM.

And the current implementation of HashMap cannot store more
than 2^31 objects

The current implementation could be changed more easily, and sooner,
than typical hardware memory and disk capacity and speed will change.
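
A rough sketch of the numbers behind both limits, assuming 8 bytes per object reference (typical of a 64-bit JVM without compressed pointers; that figure is an assumption, not something stated in this thread). The class name MapSizeLimit and its method names are hypothetical. It shows that Map.size() returns an int, so the largest count a map can report is Integer.MAX_VALUE, and estimates the memory needed just for the references to 2^31 entries, before counting the entries or the stored objects themselves:

```java
import java.util.HashMap;
import java.util.Map;

public class MapSizeLimit {
    // Assumed cost of one reference in bytes; labeled as an assumption above.
    static final long ASSUMED_BYTES_PER_REFERENCE = 8L;

    // Map.size() is declared to return int, so this is the largest reportable count.
    static long maxReportableSize() {
        return Integer.MAX_VALUE; // 2^31 - 1
    }

    // Lower bound on memory for the references alone, ignoring entry and object overhead.
    static long minBytesForReferences(long entries) {
        return entries * ASSUMED_BYTES_PER_REFERENCE;
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        int size = map.size(); // compiles only because size() returns int
        System.out.println("empty map size: " + size);
        System.out.println("max reportable size: " + maxReportableSize());

        long entries = 1L << 31; // 2^31 entries
        long gib = minBytesForReferences(entries) / (1L << 30);
        System.out.println("references alone for 2^31 entries: " + gib + " GiB");
    }
}
```

Under that assumption the references alone come to 16 GiB, so the int-sized count and the memory footprint are separate constraints, and relaxing one does not relax the other.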
