Thomas said:
Running out of memory is not a crime. NetBeans likes to do it
frequently. If I do run out of memory, I don't want my work to be
trashed because the programmer was clueless. If I'm in a hurry, I might
even want to carry on. Particularly if it only OOMEd because I tried to
load a core into a text editor. There is no excuse for not behaving
gracefully.
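A rough sketch of what "behaving gracefully" could look like in an
editor, for what it's worth. Everything here is hypothetical
(GracefulOpenSketch, Loader, tryOpen are names made up for
illustration): the new content is built off to the side, and the
current document is only replaced once the load has succeeded, so an
OOME during the load leaves existing work alone.

```java
final class GracefulOpenSketch {
    /** Hypothetical hook that produces the new document's contents. */
    interface Loader {
        char[] load();
    }

    // The work we do not want trashed if a load fails.
    private char[] document = new char[0];

    /**
     * Tries to replace the current document. Returns false, leaving the
     * current document as it was, if the load runs out of memory.
     */
    boolean tryOpen(Loader loader) {
        try {
            char[] loaded = loader.load();
            document = loaded; // swap only after the load fully succeeded
            return true;
        } catch (OutOfMemoryError e) {
            // Whatever the failed load allocated is now unreachable and
            // can be collected; the old document is still in place.
            return false;
        }
    }

    int length() {
        return document.length;
    }
}
```

Whether carrying on after an OOME is actually safe depends on where it
was thrown; this only covers the case where the big allocation is the
thing that failed.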
Even if you tried to load a core into a text editor, it wouldn't die in
the BufferedInputStream constructor, because that constructor doesn't
try to allocate a buffer the size of the whole file. It would die
growing the StringBuffer underlying the edit box, or it would read the
file size and die allocating a huge char array up front.
But in my experience, OOMEs are almost always thrown while allocating
the straw that breaks the camel's back, rather than something with the
mass of a small moon. Certainly, failure to allocate a 2KB buffer in
the BufferedInputStream constructor is going to be an example of the
former, and if that's the case, it was already in a "destined to blow
up in your face in the next five minutes" state before it reached that
line of code.
Of Java programs, rather than Java itself. Often down to not taking care
over things like exception-safety.
Not my code, which is chock full of try ... finally.
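For anyone following along, the try ... finally pattern in question
looks roughly like this. It's a minimal sketch, not code from the app
under discussion: FinallyCleanupSketch, TrackedStream and countBytes
are hypothetical names, and the failMidway flag only exists to simulate
an error partway through the read.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class FinallyCleanupSketch {
    /** Hypothetical stream that records whether close() was called. */
    static final class TrackedStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackedStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Counts the bytes in the stream; can fail midway to simulate an error. */
    static int countBytes(InputStream in, boolean failMidway) throws IOException {
        try {
            int count = 0;
            while (in.read() != -1) {
                count++;
                if (failMidway && count == 2) {
                    throw new IOException("simulated failure");
                }
            }
            return count;
        } finally {
            // Runs on normal return and on exception alike.
            in.close();
        }
    }
}
```

The stream gets closed whether countBytes returns normally or throws.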
Even IdentityHashMap, which doesn't use Entry in its implementation
(it's a probing hash map, using pairs of array slots for key and value),
doesn't churn Entries.
Still churns keys and values.
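To make the "pairs of array slots" layout concrete, here is a toy
version. This is an illustrative sketch only, not the JDK's
IdentityHashMap source: ProbingIdentityMapSketch is a hypothetical
class, it never resizes, has no remove, and rejects null keys. The
point is that a put writes two slots of one Object[] and allocates no
Entry object, though the keys and values themselves are still ordinary
objects.

```java
final class ProbingIdentityMapSketch {
    // Layout: table[2*i] holds a key, table[2*i + 1] holds its value.
    private final Object[] table;
    private final int capacity;

    ProbingIdentityMapSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        this.table = new Object[2 * capacity];
    }

    /** Returns the key slot holding this key, or the first empty one, or -1 if full. */
    private int find(Object key) {
        int start = (System.identityHashCode(key) & 0x7fffffff) % capacity;
        for (int n = 0; n < capacity; n++) {
            int k = 2 * ((start + n) % capacity); // linear probing
            if (table[k] == key || table[k] == null) {
                return k;
            }
        }
        return -1;
    }

    Object put(Object key, Object value) {
        if (key == null) {
            throw new NullPointerException("null keys not supported in this sketch");
        }
        int k = find(key);
        if (k < 0) {
            throw new IllegalStateException("table full; this sketch never resizes");
        }
        Object old = (table[k] == key) ? table[k + 1] : null;
        table[k] = key;       // no Entry object is allocated,
        table[k + 1] = value; // just two array slots written
        return old;
    }

    Object get(Object key) {
        if (key == null) {
            return null;
        }
        int k = find(key);
        return (k >= 0 && table[k] == key) ? table[k + 1] : null;
    }
}
```

Keys are compared with ==, so two distinct String objects with the same
characters count as different keys.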
Well, you can do three things with a Web page. Parse it on the fly as
it streams in, outputting a baroque data structure to render. Result: a
baroque data structure with enough overhead to be twice the size of the
String. Or, read it into a huge String (by way of a huge StringBuffer).
Or, read it, parse it on the fly, and discard the bulk of what you get.
Possibly by building it into a bitmap of some sort, which will be
bigger than the String and a lot less useful, or into a vector graphics
representation, in which case see the first option (baroque data
structure) only without the neat separation of content from
presentation.
Only the String or the data structure allows the result to be
text-searched. Only those are likely to be sensible unless you render a
thumbnail of the page for some kind of archival purpose or are
stripping it for a single important element (the favicon.ico link, say)
or mining it for a particular thing (a price in a predictable place on
the page -- you can parse on the fly and discard nearly everything for
that kind of use, but then Incredibill will come after you with
Crawlwall, handcuffs, and a rubber phallus, slam you face-first into
the former, and then <censored>).
Which you use depends on what you're doing. If you're just downloading
the page or precaching it or something you may as well use a compact
String until it's needed. If you're rendering it, you may want the
whole javax.swing.text.html nine yards, or perhaps <insert the name of
the XML Package of the Week(tm) here>.
And so you should be more careful about behaving gracefully.
Such as? One particular app I'm thinking of does relatively little disk
I/O and doesn't use the construct that bothered you so much for any of
that. It has cleanup finally clauses that may actually constitute the
majority of the LOC, after of course ignoring blank lines, comments,
and javadoc. These do, in fact, clean up any open FooStreams.
The persistent, growable data body involves mostly HashFoos with quite
a lot of entry churn but no long-term growth trend (most are programmed
to discard older entries when they exceed a certain size -- this
produces better behaved caches than SoftReference on current JVMs IME).
Debugging doesn't show any accumulation of deadwood either (in the form
of stale threads left in infinite loops or whatever other form it might
take). So, unless the JVM is doing something really dumb, like not
cleaning up non-strongly-reachable circular data structures ...
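For reference, a size-capped cache of the "discard older entries past a
certain size" kind can be sketched as below. This is one common way to
do it and an assumption on my part, not the app's actual code:
BoundedCache is a hypothetical name, and it leans on LinkedHashMap's
removeEldestEntry hook, with eviction in insertion order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // accessOrder = false: iteration (and eviction) follow insertion order.
        super(16, 0.75f, false);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by put() after each insertion; returning true drops
        // the oldest entry, so the map never stays above maxEntries.
        return size() > maxEntries;
    }
}
```

With a cap of 2, putting three distinct keys leaves only the last two
in the map. Passing true for accessOrder instead would give LRU-style
eviction.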