Bhaskar said:
I want to read a file's properties under Windows in my Java app. Under
Windows you can right-click a file and select Properties, then the Summary
tab. There you can see properties such as Title, Subject, Author, etc. for
every file. How can I read those three properties using Java? I tried
various search terms on the internet, but had no luck.
[...]
I used POI (free from Apache) to read the summary info from DOC and XLS
files. That means this summary info is stored in the file itself.
But I ran into a problem when I wanted to read this summary info from PDF
and ZIP files. I opened some PDF and ZIP files in a hex editor but could not
find the summary info.
You can see that it's not /usually/ stored in the file itself by creating an
empty file and setting its properties and summary. That works OK, but the file
size is still zero.
AFAIK, the only correct way to read this data is via MS's COM interfaces. See:
http://msdn.microsoft.com/library/en-us/stg/stg/ipropertysetstorage.asp
What looks to me like a good explanation:
http://www.howtodothings.com/showarticle.asp?article=447
MS also have a sample ActiveX control (with source apparently) that may help:
http://support.microsoft.com/support/kb/articles/Q224/3/51.asp
You would have to use a COM<->Java bridge (of which there are several
available), or program it yourself in C (via JNI) to make use of the above
possibilities. Another way of getting access to the information would be to
write a small app in C which simply takes a filename as a parameter and writes
the file's summary info to stdout -- you could then call that from Java.
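A rough sketch of the "call a small helper from Java" idea, in case it helps.
The helper name (summaryinfo.exe), the class and method names, and the
assumption that the helper prints UTF-8 text are all made up for illustration;
the helper itself would still have to be written in C as described above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SummaryInfoHelper {

    // Runs an external command and returns everything it printed, line by line.
    // Assumption: the command writes text encoded as UTF-8.
    static List<String> runAndCapture(List<String> command)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge stderr into stdout so one reader drains both
        Process p = pb.start();

        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                lines.add(line);
            }
        }

        int exit = p.waitFor();
        if (exit != 0) {
            throw new IOException("helper exited with code " + exit);
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: SummaryInfoHelper <helper-exe> <file>");
            return;
        }
        // args[0] would be the hypothetical helper, e.g. summaryinfo.exe
        for (String line : runAndCapture(Arrays.asList(args[0], args[1]))) {
            System.out.println(line);
        }
    }
}
```

How the helper formats its output (one "Name=Value" pair per line, say) is
entirely up to whoever writes it; the Java side above only collects the lines.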
BTW, the HowToDoThings article mentions that Office applications store the
summary information in-file rather than in the "normal" additional stream(s).
I came across a rather odd way of getting at the data in pure Java. Open a
file called:
realFileName + ":\5SummaryInformation"
(where the \5 is the single character with value 5, not a backslash character
followed by the digit '5'), and you can read the summary data stream. Of
course you will then have to parse the resulting information, and it's in a
private format (at least, I've not heard of it being published), so I can't
really recommend this method as either safe or easy ;-)
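For what it's worth, a minimal sketch of that trick. The class and method
names are my own invention, and it assumes the file lives on a Windows/NTFS
volume where such a stream actually exists; anywhere else the open will just
fail. It only returns the raw bytes and makes no attempt to parse them.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SummaryStreamReader {

    // Builds the name of the summary stream: file name, a colon, the single
    // character with value 5, then "SummaryInformation".
    static String streamName(String realFileName) {
        return realFileName + ":" + '\u0005' + "SummaryInformation";
    }

    // Reads the raw, unparsed bytes of the summary stream.
    // Throws IOException if the stream cannot be opened (for example, not on
    // NTFS, or no summary has been set for the file).
    static byte[] readSummaryStream(String realFileName) throws IOException {
        try (InputStream in = new FileInputStream(streamName(realFileName))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("usage: SummaryStreamReader <file>");
            return;
        }
        try {
            byte[] data = readSummaryStream(args[0]);
            System.out.println("read " + data.length + " bytes of summary data");
        } catch (IOException e) {
            System.out.println("could not read summary stream: " + e.getMessage());
        }
    }
}
```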
-- chris