How does one get the title out of a JPEG using ImageIO?


Ben Phillips

How does one get the title out of a JPEG using ImageIO?

IIOMetadata looks like it's where image metadata winds up, and I have
determined how to get two IIOMetadata objects for a JPEG image (the
stream metadata and the image metadata), but the javadocs for
IIOMetadata seem to have been written by Lewis Carroll -- I've gone down
the rabbit hole here.

There is nowhere in the docs, or in the Java Tutorial, or apparently on
the Web at all where a simple tutorial with instructive examples exists
that indicates how to get, say, "Title" (or "Date Picture Taken" or
whatnot) information from an IIOMetadata. Perusing what docs there are
suggests no method more efficient than traversing some kind of a tree,
looking for something with say "title" in it somewhere, and praying. :/

Google searches I tried include:
java imageio "extract title metadata" -> no results at all(!)
java imageio "title metadata" -> three useless results

This seems to be an under-documented part of the Java APIs, and
under-explained by the Java Tutorial.

Does anyone here know a simple way to get title and other information
for an image? Something like a "properties" object or a Map, with
documentation listing the most commonly present keys, would have been
nice...
 

Mark Space

Ben said:
How does one get the title out of a JPEG using ImageIO?

IIOMetadata looks like it's where image metadata winds up, and I have
determined how to get two IIOMetadata objects for a JPEG image (the
stream metadata and the image metadata), but the javadocs for
IIOMetadata seem to have been written by Lewis Carroll -- I've gone down
the rabbit hole here.

Dunno if this will help -- I just found this myself. It appears to dump
all properties associated with an image file though, so you can inspect
the image yourself and see whether it has a "title" property that you want.

<http://johnbokma.com/java/obtaining-image-metadata.html>
 

John B. Matthews

Ben Phillips said:
How does one get the title out of a JPEG using ImageIO?

IIOMetadata looks like it's where image metadata winds up, and I have
determined how to get two IIOMetadata objects for a JPEG image (the
stream metadata and the image metadata), but the javadocs for
IIOMetadata seem to have been written by Lewis Carroll -- I've gone down
the rabbit hole here.

There is nowhere in the docs, or in the Java Tutorial, or apparently on
the Web at all where a simple tutorial with instructive examples exists
that indicates how to get, say, "Title" (or "Date Picture Taken" or
whatnot) information from an IIOMetadata. Perusing what docs there are
suggests no method more efficient than traversing some kind of a tree,
looking for something with say "title" in it somewhere, and praying. :/

Google searches I tried include:
java imageio "extract title metadata" -> no results at all(!)
java imageio "title metadata" -> three useless results

This seems to be an under-documented part of the Java APIs, and
under-explained by the Java Tutorial.

Does anyone here know a simple way to get title and other information
for an image? Something like a "properties" object or a Map, with
documentation listing the most commonly present keys, would have been
nice...

Here's the standard metadata:

<http://java.sun.com/javase/6/docs/api/javax/imageio/metadata/doc-files/s
tandard_metadata.html>

And, for example, the JPEG metadata:

<http://java.sun.com/javase/6/docs/api/javax/imageio/metadata/doc-files/j
peg_metadata.html>

But you may want the custom metadata, such as EXIF:

<http://www.drewnoakes.com/code/exif/>
 

Mark Space

John said:
Here's the standard metadata:

<http://java.sun.com/javase/6/docs/api/javax/imageio/metadata/doc-files/s
tandard_metadata.html>

And, for example, the JPEG metadata:

<http://java.sun.com/javase/6/docs/api/javax/imageio/metadata/doc-files/j
peg_metadata.html>

John, those links are broken for me. I think a newline got inserted in
the middle of each one. Here's a link for the OP that points, I think, to
the parent of those pages; scroll down to find links to the various doc
files. The links are in the last paragraph.

<http://java.sun.com/javase/6/docs/api/javax/imageio/package-summary.html>


Also, for the OP, I tried that link I posted earlier. I tested it with
an image I had lying around on disk that I knew had a "comment"
attribute. The javax.imageio stuff found that comment just fine, so it's
working that far at least.

Here's the shorter one. The comment is "Created with GIMP" in the last
entry.

Format name: javax_imageio_1.0
<javax_imageio_1.0>
  <Chroma>
    <ColorSpaceType name="YCbCr"/>
    <NumChannels value="3"/>
  </Chroma>
  <Compression>
    <CompressionTypeName value="JPEG"/>
    <Lossless value="false"/>
    <NumProgressiveScans value="1"/>
  </Compression>
  <Dimension>
    <PixelAspectRatio value="1.0"/>
    <ImageOrientation value="normal"/>
    <HorizontalPixelSize value="0.35277778"/>
    <VerticalPixelSize value="0.35277778"/>
  </Dimension>
  <Text>
    <TextEntry keyword="comment" value="Created with GIMP"/>
  </Text>
</javax_imageio_1.0>
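
For anyone who wants the Text entries from a dump like this as a Map rather than eyeballing the tree, here is a minimal sketch. The class and method names (TextEntries, fromTree, fromFile) are made up for illustration; the ImageIO calls are the standard ones. It only sees what the reader plugin puts into the standard javax_imageio_1.0 tree, so it will not find EXIF fields on its own.

```java
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.metadata.IIOMetadataFormatImpl;
import javax.imageio.stream.ImageInputStream;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of ImageIO.
class TextEntries {

    /** Collects keyword/value pairs from every TextEntry element in the tree. */
    static Map<String, String> fromTree(Node root) {
        Map<String, String> out = new LinkedHashMap<>();
        if (!(root instanceof Element)) {
            return out;
        }
        NodeList entries = ((Element) root).getElementsByTagName("TextEntry");
        for (int i = 0; i < entries.getLength(); i++) {
            Element entry = (Element) entries.item(i);
            // Entries sharing a keyword overwrite each other in this sketch.
            out.put(entry.getAttribute("keyword"), entry.getAttribute("value"));
        }
        return out;
    }

    /** Reads the first image's metadata and returns its standard-format Text entries. */
    static Map<String, String> fromFile(File file) throws IOException {
        try (ImageInputStream in = ImageIO.createImageInputStream(file)) {
            if (in == null) {
                throw new IOException("Cannot open " + file);
            }
            Iterator<ImageReader> readers = ImageIO.getImageReaders(in);
            if (!readers.hasNext()) {
                throw new IOException("No ImageIO reader for " + file);
            }
            ImageReader reader = readers.next();
            try {
                reader.setInput(in, true);
                IIOMetadata meta = reader.getImageMetadata(0);
                if (meta == null || !meta.isStandardMetadataFormatSupported()) {
                    return new LinkedHashMap<>();
                }
                return fromTree(meta.getAsTree(
                        IIOMetadataFormatImpl.standardMetadataFormatName));
            } finally {
                reader.dispose();
            }
        }
    }
}
```

With the GIMP image above, fromFile should give a one-entry map with the key "comment", assuming the reader reports the comment the same way.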
 

John B. Matthews

Mark Space said:
John B. Matthews wrote: [...]
John, those links are broken for me. I think a newline got inserted in
the middle of each one. Here's a link for the OP that points, I think, to
the parent of those pages; scroll down to find links to the various doc
files. The links are in the last paragraph.

<http://java.sun.com/javase/6/docs/api/javax/imageio/package-summary.html>

Ah, that's better. Thanks!

I also found this EXIF viewer written in Java:

<http://sourceforge.net/projects/jexifviewer>
 

Ben Phillips

Mark said:
Dunno if this will help -- I just found this myself. It appears to dump
all properties associated with an image file though, so you can inspect
the image yourself and see whether it has a "title" property that you want.

<http://johnbokma.com/java/obtaining-image-metadata.html>

Gah.

So I went and actually implemented a tree-traversal and also a flattener
(creates a key-value mapping of strings and seems to work well).
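
The flattener itself isn't posted here, so the following is only a guess at what such a thing could look like: a recursive walk that turns every attribute into a "path@attribute" key. The class name MetadataFlattener and the key format are invented for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.w3c.dom.Attr;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical flattener; the key format "parent/child@attr" is an assumption.
class MetadataFlattener {

    /** Flattens a metadata tree into path -> attribute value pairs. */
    static Map<String, String> flatten(Node root) {
        Map<String, String> out = new LinkedHashMap<>();
        walk(root, "", out);
        return out;
    }

    private static void walk(Node node, String prefix, Map<String, String> out) {
        String path = prefix.isEmpty()
                ? node.getNodeName()
                : prefix + "/" + node.getNodeName();
        NamedNodeMap attrs = node.getAttributes();
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                Attr attr = (Attr) attrs.item(i);
                // Repeated sibling elements (e.g. several TextEntry nodes)
                // produce the same key here, and the last one wins.
                out.put(path + "@" + attr.getName(), attr.getValue());
            }
        }
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            walk(child, path, out);
        }
    }
}
```

Passing it the result of IIOMetadata.getAsTree(formatName) gives one flat map per metadata format.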

It turns out that on a particular image for which Windows Explorer shows
Date Picture Taken, Camera Model, and similar metadata, the IIOMetadata
object contains none of this -- just a lot of stuff about the encoding,
e.g. ColorSpaceType: YCbCr and Lossless: false and ImageOrientation: normal.

Where the heck is EXIF metadata and the like going, if not into IIOMetadata?

On a side note, I've got a quandary with displaying image thumbnails
from disk directories in a JList.

Approach #1: pregenerate thumbnails and store them. List model gets
populated with references to objects of some sort that in turn reference
a thumbnail among other things.

Upside: List control will be responsive.

Downsides: Will have to "think" for a while whenever the directory
changes, and will scale very poorly to large enough directories, both
"thinking" for ages when switching to one AND gobbling up RAM with
thumbnails. A 64x64 RGB thumbnail at 3 bytes per pixel is 12288 bytes
plus some overhead, so hundreds of them come to several megs of
thumbnail image data total. Touching a directory with tens of thousands
of images would immediately crash the JVM with OOME on any likely
configuration, because then the thumbnail data bloats up into several
hundred megs.


Approach #2: load thumbnails on the fly as needed for the list to display

Upside: Scaling problems go away

Downside: will have to implement a fairly hairy custom cell renderer for
JList. This cell renderer will have to do I/O and contend with the
possibility of IOException. It will probably have to junk exceptions and
present some placeholder for images that won't load. It will also be
slow. I doubt I can vouch for the JList's performance and responsiveness
in this case.


Approach #3: somehow have the list scroll immediately, but cell
thumbnails and some metadata in the label text render only as the data
becomes available

Upside: It might solve both scaling and responsiveness problems

Downside: an even hairier custom cell renderer, plus concurrency issues.
I'm not even 100% sure how to do this. Maybe spawn a SwingWorker to load
the image and related data and then render a blank or a "loading..."
indicator icon and just the file name text, then schedule the list
control to repaint itself after some number of milliseconds. If the cell
renderer is called for a cell already being loaded in the background it
checks to see if it's done and renders properly if it is, and if it's
not done loading it repaints with the same "loading..." behavior as
before but does not spawn a second SwingWorker.

This means messing with concurrency. My current test app does everything
on the EDT, except for a single "SwingUtilities.invokeAndWait(...)" in
main(), so I haven't had to worry about thread safety so far.
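
One way the "load in the background, show a placeholder meanwhile, never start a second load" part of Approach #3 might be structured is sketched below. It uses CompletableFuture rather than SwingWorker, and a completion callback rather than a timed repaint; both are substitutions made for this sketch, and AsyncThumbnailCache is a made-up name. Nothing here touches Swing directly.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical helper: at most one background load per key.
class AsyncThumbnailCache<K, V> {
    private final Map<K, CompletableFuture<V>> loads = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // runs on the executor, may be slow
    private final Executor executor;       // e.g. a small thread pool
    private final Consumer<K> onLoaded;    // called when a load finishes or fails

    AsyncThumbnailCache(Function<K, V> loader, Executor executor, Consumer<K> onLoaded) {
        this.loader = loader;
        this.executor = executor;
        this.onLoaded = onLoaded;
    }

    /**
     * Returns the loaded value if it is ready. Otherwise starts a load
     * (only on the first request for this key) and returns the placeholder.
     * A failed load keeps returning the placeholder.
     */
    V getOrPlaceholder(K key, V placeholder) {
        CompletableFuture<V> future = loads.computeIfAbsent(key, k -> {
            CompletableFuture<V> f =
                    CompletableFuture.supplyAsync(() -> loader.apply(k), executor);
            f.whenComplete((value, error) -> onLoaded.accept(k));
            return f;
        });
        if (future.isDone() && !future.isCompletedExceptionally()) {
            return future.join();
        }
        return placeholder;
    }
}
```

In a JList setup, the cell renderer would call getOrPlaceholder with a "loading..." icon, and onLoaded would be something like key -> SwingUtilities.invokeLater(list::repaint), so the repaint still happens on the EDT. The map is unbounded in this sketch, so it would need an eviction policy for big directories.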


I don't suppose there is an Approach #4 for this type of thing? Or any
known API for accessing Title, Picture Taken On, and similar EXIF-type
metadata in images?


And one gotcha I discovered when testing stuff (using Approach #1): I
got OOME loading a smallish directory of maybe a few dozen images. I
found on memory profiling that some very large byte and int arrays were
sitting around and not getting collected. It seems that
Image.getScaledInstance is producing thumbnails that hang onto an
internal reference to the full size image. If a few dozen full size
images "leaking" causes OOME, a few thousand thumbnails will cause OOME,
so Approach #1 doesn't look long-term viable to me.

As it is, it complicated my thumbnail generation -- now I call
getScaledInstance and then draw the result onto a fresh BufferedImage
(via its createGraphics) to make a thumbnail that doesn't hang onto a
reference to the full-size image. That lets the full-size image go away
until it's needed again, whereupon it's fetched from disk again.
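
A sketch of the same idea, for reference. Note that it is a variation, not the code described above: it skips getScaledInstance and lets Graphics2D.drawImage do the scaling straight into a new BufferedImage, which likewise leaves the thumbnail with no reference to the source. The class name Thumbnails and the maxSize parameter are assumptions.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper, not taken from the test app discussed in this thread.
class Thumbnails {

    /**
     * Returns a copy of source scaled to fit within maxSize x maxSize,
     * keeping the aspect ratio. Images already small enough are copied
     * at their original size rather than enlarged.
     */
    static BufferedImage create(BufferedImage source, int maxSize) {
        int w = source.getWidth();
        int h = source.getHeight();
        double scale = Math.min(1.0, (double) maxSize / Math.max(w, h));
        int tw = Math.max(1, (int) Math.round(w * scale));
        int th = Math.max(1, (int) Math.round(h * scale));

        // The new image owns its own pixel data, so the full-size source
        // can be garbage collected once the caller drops it.
        BufferedImage thumb = new BufferedImage(tw, th, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = thumb.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(source, 0, 0, tw, th, null);
        } finally {
            g.dispose();
        }
        return thumb;
    }
}
```

Bilinear scaling in a single step can look rough for very large reductions; that trade-off is not addressed here.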

Approach #4 (just occurred to me): maintain a thumbnail database on disk
and reference that from the cell renderer. Still does disk I/O but only
on a teensy file, and no image rescaling on the EDT.

Upside: Seems to solve most performance and all memory-use issues.

Downside: JList might still be slightly slow, particularly scrolling
down (scrolling back up could use an in-memory MRU cache of some sort),
and the need to maintain some kind of on-disk database of thumbnails,
cope with it if it gets scrogged somehow, and so forth. Cell renderer
still needs to do I/O, too.
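
The in-memory cache mentioned for scrolling back up can be quite small in code. Here is a minimal sketch built on LinkedHashMap in access order, which evicts the least recently used entry once a fixed capacity is exceeded. LruCache is a made-up name and the capacity is whatever the caller picks; this is one possible shape, not a recommendation of specific numbers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache. Not thread-safe: use it from one thread
// (for example only from the EDT) or wrap it with synchronization.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: iteration order runs from least to most
        // recently accessed, so the "eldest" entry is the LRU one.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

The cell renderer would check the cache first and fall back to the on-disk thumbnail only on a miss.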

Approach #5: detect large directories and put A-Z spinner in UI and only
put in JList items beginning with the currently-selected letter.

Upside: Reduces scaling issues even if we keep thumbnails in RAM.
Simplifies cell renderer and the like.

Downside: Complicates UI. A directory might have thousands of images
beginning with A, or images beginning with non-alphabetic characters. A
spinner with a full ASCII table would be unwieldy. A non-spinner control
would eat screen real-estate. Viewing "virtual subdirectories" of this
sort may be unintuitive, reducing usability of the UI. Etc.

Anyone got any ideas for Approach #6? Or recommendations among 1 thru 5? :)

Also, anyone got any ideas for accessing EXIF data? Google turns up some
non-free third party libraries for JPEG EXIF data, but nothing about
accessing this stuff with free software (GPL/LGPL libraries) or only
built-in Java classes.

The best hint I've got so far is that the IIOMetadata "unknown: 225" I'm
seeing for the JPEGs with EXIF data here has something to do with it.
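
For what it's worth, 225 is 0xE1, the JPEG APP1 marker, which is the segment EXIF data is stored in. The sketch below pulls the raw bytes of such segments out of a metadata tree. It assumes the tree is the JPEG plugin's native format (javax_imageio_jpeg_image_1.0) and that unrecognized segments show up as "unknown" elements with a MarkerTag attribute and a byte[] user object, as the JPEG metadata format page describes. ExifSegments is a made-up name, and decoding the bytes (they are TIFF-structured) is not attempted.

```java
import java.util.ArrayList;
import java.util.List;
import javax.imageio.metadata.IIOMetadataNode;
import org.w3c.dom.Node;

// Hypothetical helper for locating raw APP1 payloads in a native JPEG tree.
class ExifSegments {
    static final int APP1 = 0xE1; // 225 decimal

    /** Returns the byte[] payload of every "unknown" node tagged as APP1. */
    static List<byte[]> find(Node root) {
        List<byte[]> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(Node node, List<byte[]> out) {
        if (node instanceof IIOMetadataNode && "unknown".equals(node.getNodeName())) {
            IIOMetadataNode n = (IIOMetadataNode) node;
            Object payload = n.getUserObject();
            if (String.valueOf(APP1).equals(n.getAttribute("MarkerTag"))
                    && payload instanceof byte[]) {
                out.add((byte[]) payload);
            }
        }
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            collect(child, out);
        }
    }

    /** True if the data begins with the conventional "Exif\0\0" identifier. */
    static boolean looksLikeExif(byte[] data) {
        return data.length >= 6
                && data[0] == 'E' && data[1] == 'x' && data[2] == 'i' && data[3] == 'f'
                && data[4] == 0 && data[5] == 0;
    }
}
```

The tree to pass in would come from IIOMetadata.getAsTree("javax_imageio_jpeg_image_1.0"). Whether the payload handed back by the plugin still carries the "Exif" identifier at the front is something to check on a real file, which is why looksLikeExif is a separate test.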
 

Ben Phillips

Ben said:
Whoa. Found something:

http://incubator.apache.org/sanselan/site/index.html

Anyone have any warnings/praise/comments?

Never mind.

http://easyproblemsolutions.blogspot.com/2008/06/java-and-jpeg-metadata.html

Installing the JAI TIFF reader adds TIFF support and also results in EXIF
metadata appearing for JPEGs.

Downside: just installing it also seems to have slowed down the GUI in
my test app somehow. I rigged a JSplitPane with a JList on the left and
a JLabel on the right, both in JScrollPanes, with the latter used to
show the image and an HTMLized version of the metadata. Just scrolling
in the latter scroll pane, with just a single JLabel, and not doing
anything that triggers image I/O, is slow, and the larger the directory
the slower it is. This is very strange since scrolling the JLabel around
doesn't have any connection with directory size. The JVM process size
doesn't seem bigger than before either, so it doesn't seem like JAI is
somehow screwing things up by leaking either.

Nonetheless, I am going to proceed. I'm going ahead with Approach #4, as
well; I tested the thing on a directory full of thumbnails (small enough
to make my own code's thumbnail generator not do any scaling) and the
generation time was quick enough (tens of thumbnails a second) to make
me think that a JList loading thumbnails from disk on the fly in its
cell renderer won't be unacceptably slow.

Now to learn more about implementing custom cell renderers. Can a JLabel
be given a few pixels' border at the left before the label icon, when
the text is on the right?
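
On the border question: a JLabel accepts any Border, so an empty border with only a left inset should do it. A possible sketch for a list cell renderer follows. PaddedIconRenderer and leftPad are invented names. DefaultListCellRenderer resets its own border on every call, so the padding is added after the super call and wrapped around the existing border instead of replacing it.

```java
import java.awt.Component;
import javax.swing.BorderFactory;
import javax.swing.DefaultListCellRenderer;
import javax.swing.JLabel;
import javax.swing.JList;
import javax.swing.SwingConstants;

// Hypothetical renderer: adds leftPad pixels of space before the icon.
class PaddedIconRenderer extends DefaultListCellRenderer {
    private static final long serialVersionUID = 1L;
    private final int leftPad;

    PaddedIconRenderer(int leftPad) {
        this.leftPad = leftPad;
    }

    @Override
    public Component getListCellRendererComponent(JList<?> list, Object value,
            int index, boolean isSelected, boolean cellHasFocus) {
        JLabel label = (JLabel) super.getListCellRendererComponent(
                list, value, index, isSelected, cellHasFocus);
        // Keep the look-and-feel's focus border on the outside and put the
        // extra left padding inside it.
        label.setBorder(BorderFactory.createCompoundBorder(
                label.getBorder(),
                BorderFactory.createEmptyBorder(0, leftPad, 0, 0)));
        // Text to the right of the icon; call label.setIcon(...) here with
        // whatever thumbnail belongs to this cell.
        label.setHorizontalTextPosition(SwingConstants.RIGHT);
        return label;
    }
}
```

It would be installed with list.setCellRenderer(new PaddedIconRenderer(6)). The gap between icon and text is a separate setting, setIconTextGap.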
 

Ben Phillips

John said:
I didn't see that one, but this one was interesting and it's GPL:

<http://sourceforge.net/projects/jexifviewer>

Oops. That sounds like an application, not a library, and the web site
confirms:

"Intended Audience : End Users/Desktop"

I'm in search of a library, but apparently Sun's own JAI TIFF plugin for
ImageIO does the job, in combination with IIOMetadata, without even
needing to add any imports or directly use any additional classes.
 
