Jar: protocol question

Rhino

I've been using the Jar: protocol a bit in the last few days and I'd like to
know if this is a valid use of that protocol:

jar:file:!/Images/foo.gif

Basically, I'm trying to describe the location of a GIF that a program
should be able to find in one of the various jars that are on the classpath
used by the program.

Since 'this.getClass().getResource()' will search EVERY jar in the classpath
for the desired file, it shouldn't be necessary to specify the jar name.
Therefore, it seems to me that this should be valid notation for indicating
that the jar name isn't necessary in this case: the bang ('!') in the name
following the 'file:' suggests to me that the default jar(s), namely all of
the jars found in the classpath, will be searched for an Images directory
and a file named foo.gif within that directory.

Does that seem reasonable? If not, can anyone suggest a better notation to
use for my situation?

I can't find any discussion of this "special case" in the articles I've seen
about the Jar: protocol.


--
Rhino
---
rhino1 AT sympatico DOT ca
"There are two ways of constructing a software design. One way is to make it
so simple that there are obviously no deficiencies. And the other way is to
make it so complicated that there are no obvious deficiencies." - C.A.R.
Hoare
 
Ben_

If you need to load the resource from the classpath, then
this.getClass().getResource() will do the trick. What you need to know to
load it is "Images/foo.gif" and the method returns the location of the
resource (wherever it is found first).
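
One detail worth spelling out: Class.getResource resolves a name without a leading slash relative to the package of the class, while a leading slash means "from the classpath root". ClassLoader.getResource always takes root-relative names without a leading slash. A minimal sketch of the difference follows; the class name ResourceLookup is made up for illustration, and java/lang/Object.class is used only because it exists in every JVM, standing in for Images/foo.gif.

```java
import java.net.URL;

final class ResourceLookup {

    /** Absolute lookup through a Class: the leading slash means "classpath root". */
    static URL viaClass(Class<?> c, String nameFromRoot) {
        return c.getResource("/" + nameFromRoot);
    }

    /** Lookup through a ClassLoader: names are always root-relative, no leading slash. */
    static URL viaLoader(String nameFromRoot) {
        return ClassLoader.getSystemClassLoader().getResource(nameFromRoot);
    }

    public static void main(String[] args) {
        String name = "java/lang/Object.class";
        System.out.println(viaClass(ResourceLookup.class, name));
        System.out.println(viaLoader(name));
        // Without a leading slash the name is resolved against the package of
        // the class, so this also looks for java/lang/Object.class:
        System.out.println(String.class.getResource("Object.class"));
        // A missing resource yields null rather than an exception.
        System.out.println(viaLoader("Images/no-such-file.gif"));
    }
}
```

So a config entry of "Images/foo.gif" passed to this.getClass().getResource() is looked up under the package of that class; "/Images/foo.gif" is what addresses the root of a jar.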

If you know in advance there is a risk of collision (the resource exists
multiple times on the classpath and you don't want to load it from the first
location found), then you had better change the location to make it unique...

Or, I don't see the point of your question... :)
 
Rhino

Ben_ said:
If you need to load the resource from the classpath, then
this.getClass().getResource() will do the trick. What you need to know to
load it is "Images/foo.gif" and the method returns the location of the
resource (wherever it is found first).

If you know in advance there is a risk of collision (the resource exists
multiple times on the classpath and you don't want to load it from the first
location found), then you better change the location to make it unique...

Or, I don't see the point of your question... :)

Yeah, sorry, I probably didn't ask the clearest question in the world....

I understand how this.getClass().getResource() works. My question is really
one of notation more than anything. I think ;-)

One of the articles I saw on the jar: protocol - I can't find it again now,
unfortunately! - contained examples like this:
- jar:file:d:\\myJars\\test.jar!/images/foo.gif (refers to a specific file,
foo.gif, in the path 'images' within a jar called 'test.jar' in the
directory d:\myJars on the local filesystem)
- jar:http://xyz.com/photos/pictures.jar!/pix/baz.jpg (refers to a specific
file, baz.jpg, in the path 'pix' within a jar called 'pictures.jar' in the
directory 'photos' on the website at http://xyz.com)

I'm looking for a notation that I can use in my programs to refer to files
that are within jars that are visible to the program by virtue of being on
the program's classpath. In that case, the name of the jar isn't necessary
since all jars in the classpath will be scanned for the file. That's why I
was thinking that 'jar:file:!/images/foo.png' would be good since it
imitates the first example and implies that the file is in the file system
but that we don't care what jar contains the file.

When I parse a String like that last example, I merely have to start at the
position immediately following the bang and use the rest of the String to
build my URL. Then, the file is found or not found as the case may be.

Does this seem like a reasonable notation to use or is there actually an
"official" way of saying the same thing? The documentation I've seen has no
information on how to denote a file in a jar when you don't care what jar
file contains it.

Rhino
 
Ben_

Let's say you want to store an identifier for the resource in a config file.

Then why not simply store "Images/foo.gif" and pass that String directly to
this.getClass().getResource() ?

It will search the classpath and load the resource from the file system or
from a jar (or from a URL, if it's a URLClassLoader).

So I still don't fully understand why you insist that it has to be found in
a jar file. It's exactly the point of ClassLoaders to keep you ignorant of
where the resource is, provided it is on the classpath.
 
Rhino

Ben_ said:
Let's say you want to store an identifier for the resource in a config file.

Then why not simply store "Images/foo.gif" and pass that String directly to
this.getClass().getResource() ?

It will lookup the classpath and load the resource from the file system or
from a jar (or from a URL, if it's a URLClassLoader).

So, I still not fully understand why you really insist on the fact that it
has to be found in a jar file ? It's exactly the point of ClassLoaders to
make you ignorant of where the resource is, provided it is on the classpath.

If I only stored "Images/foo.gif" in a config file, someone looking at the
code would have to *INFER* that it was in a jar and was going to be obtained
via this.getClass().getResource(). But what if I wanted to be able to
specify a file that wasn't necessarily in the classpath? The file could be
anywhere, even online or elsewhere in the file system.

"Images/foo.gif" is not going to be found by the program if it is *not* in a
jar on the classpath. In fact, I wouldn't be able to use
this.getClass().getResource() to find it; if it was in a standalone file in
the file system, I'd do this:

File myFile = new File(filename);
// toURI().toURL() escapes characters such as spaces, which File.toURL() does not
URL fileURL = myFile.toURI().toURL();

It seems to me that if you clearly and unambiguously notate the origins of a
file, including the protocol name and the jar name (where applicable!) as
well as the path(s) and the specific file name, anyone maintaining the code
is going to be very clear on exactly where that file is. The only wrinkle is
that I'm not aware of a standard way of saying that a file is in a jar on
the classpath and we don't need to know the jar name. That's why I'm
proposing the notation that I mentioned.

However, for all I know, there is already an existing convention to get the
same idea across which differs from mine. If so, I'd like to use the already
established notation, whatever it is, rather than muddying the waters by
using a different notation.

In a sense, I suppose this is basically about making the code both
self-documenting and flexible. Rather than relying on someone to write and
maintain comments in your imaginary config file saying that the file is
found in a jar on the classpath, I'd like to be able to point to any file in
the file system or online (or in a jar that is in the filesystem or online
or in the classpath) and be confident that the program would use the appropriate
methods to find the file based on the way the file location was notated.

Am I making sense yet? :)

I'm glad we're having this dialog, it is helping me clarify in my own mind
what I'm trying to accomplish and why it seems useful.

Rhino
 
Thomas Fritsch

Rhino said:
I've been using the Jar: protocol a bit in the last few days and I'd like
to know if this is a valid use of that protocol:

jar:file:!/Images/foo.gif

AFAIK the JAR-URL-specification requires the part between "jar:" and "!" to
be a valid URL, i.e. the URL of the jar file. In your case this part is
"file:" which surely is invalid. Hence, in my opinion it would be a misuse
of the "jar:" protocol.

Basically, I'm trying to describe the location of a GIF that a program
should be able to find in one of the various jars that are on the
classpath
used by the program.

Since 'this.getClass().getResource()' will search EVERY jar in the
classpath
for the desired file, it shouldn't be necessary to specify the jar name.
Therefore, it seems to me that this should be valid notation for
indicating
that the jar name isn't necessary in this case: the bang ('!') in the name
following the 'file:' suggests to me that the default jar(s), namely all
of
the jars found in the classpath, will be searched for an Images directory
and a file named foo.gif within that directory.

Does that seem reasonable? If not, can anyone suggest a better notation to
use for my situation?

Yes, sure. It is a reasonable thing to want.
But I would suggest not using the "jar:" protocol, and instead inventing a
new protocol (the name "classpath:" comes to mind). Example URLs might then be:
classpath:/Images/foo.gif
classpath:/javax/swing/plaf/metal/icons/Error.gif

You can push this approach even one step further. If you would write a small
Handler for this new protocol, then you could use your new URLs exactly like
any other URL. For example:
URL url = new URL("classpath:/Images/foo.gif");
InputStream stream = url.openStream();
More on protocol handlers can be found at
http://java.sun.com/j2se/1.4.2/docs/api/java/net/URL.html#constructor_detail
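
A minimal sketch of what such a handler could look like, under stated assumptions: the class name ClasspathHandler and the helper toUrl are hypothetical, the "classpath:" scheme is not part of the JDK, and the handler is attached per URL through the URL(URL, String, URLStreamHandler) constructor (deprecated on recent JDKs but still available) rather than registered globally.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

/** Hypothetical handler for "classpath:" URLs; not part of the JDK. */
final class ClasspathHandler extends URLStreamHandler {

    private final ClassLoader loader;

    ClasspathHandler(ClassLoader loader) {
        this.loader = loader;
    }

    @Override
    protected URLConnection openConnection(URL u) throws IOException {
        // "classpath:/Images/foo.gif" has the path "/Images/foo.gif";
        // ClassLoader.getResource expects the name without the leading slash.
        String name = u.getPath();
        if (name.startsWith("/")) {
            name = name.substring(1);
        }
        URL resolved = loader.getResource(name);
        if (resolved == null) {
            throw new FileNotFoundException("Not found on classpath: " + name);
        }
        // Delegate to whatever real URL (file:, jar:, ...) the class loader found.
        return resolved.openConnection();
    }

    /** Builds a classpath: URL bound to this handler, with no global registration. */
    static URL toUrl(String spec) {
        try {
            return new URL(null, spec,
                    new ClasspathHandler(ClassLoader.getSystemClassLoader()));
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(spec, e);
        }
    }
}
```

With this in place, ClasspathHandler.toUrl("classpath:/Images/foo.gif").openStream() behaves like any other URL stream, and fails with a FileNotFoundException when no jar or directory on the classpath contains the entry.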
 
Roedy Green

Since 'this.getClass().getResource()' will search EVERY jar in the classpath
for the desired file, it shouldn't be necessary to specify the jar name.
Therefore, it seems to me that this should be valid notation for indicating
that the jar name isn't necessary in this case: the bang ('!') in the name
following the 'file:' suggests to me that the default jar(s), namely all of
the jars found in the classpath, will be searched for an Images directory
and a file named foo.gif within that directory.

getResource does a search and gives you a direct url to the Jar where
it found the resource.

If you want to speed that search, use the class-path and index
feature of jar.exe to build a multi-jar index for direct lookup.
 
Roedy Green

I'm looking for a notation that I can use in my programs to refer to files
that are within jars that are visible to the program by virtue of being on
the program's classpath. In that case, the name of the jar isn't necessary
since all jars in the classpath will be scanned for the file. That's why I
was thinking that 'jar:file:!/images/foo.png' would be good since it
imitates the first example and implies that the file is in the file system
but that we don't care what jar contains the file.

getResource does searching and gives you the FIRST resource match on
the classpath.

It produces the ! syntax, which is treated thereafter like any other
URL.

You want a method something like this off the top of my head:

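/**
 * @param c class used for the resource lookup
 * @param s either a resource name or a URL to the resource,
 *          coded either as "myimage.jpg"
 *          or as "jar:file:///C|/bar/baz.jar!/com/foo/myimage.jpg"
 * @return URL to the resource, or null if a resource name is not found
 * @throws MalformedURLException if s looks like a URL but cannot be parsed
 */
static URL getURLForResource( Class c, String s ) throws MalformedURLException
{
    if ( s.startsWith( "jar:" )
         || s.startsWith( "file:" )
         || s.startsWith( "http:" ) )
        return new URL( s );
    else
        return c.getResource( s );
}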
 
Roedy Green

If I only stored "Images/foo.gif" in config file, someone looking at the
code would have to *INFER* that it was in a jar and was going to be obtained
via this.getClass().getResource().

Experienced Java programmers all know what a resource is, and are used
to adjusting file locations external to the program to have them still
considered resources in different contexts, e.g. debugging, running
locally, running on a server, running as an Applet.

If you start monkeying with that, you are going to cause 100 times as
much confusion as you imagine you are avoiding.
 
Rhino

Thomas Fritsch said:
AFAIK the JAR-URL-specification requires the part between "jar:" and "!" to
be a valid URL, i.e. the URL of the jar file. In your case this part is
"file:" which surely is invalid. Hence, in my opinion it would be a mis-use
of the "jar:" protocol.

Yes, sure. It is a reasonable thing to want.
But I would suggest not to use the "jar:" protocol, and instead prefer to
invent a new
protocol (the name "classpath:" comes to mind). Example URLs might then be:
classpath:/Images/foo.gif
classpath:/javax/swing/plaf/metal/icons/Error.gif

You can push this approach even one step further. If you would write a small
Handler for this new protocol, then you could use your new URLs exactly like
any other URL. For example:
URL url = new URL("classpath:/Images/foo.gif");
InputStream stream = url.openStream();
More on protocol handlers can be found at
http://java.sun.com/j2se/1.4.2/docs/api/java/net/URL.html#constructor_detail

I think I like your proposal! It seemed reasonable to me to continue to use
the 'jar:file:', since it seems to be acceptable to use
'jar:file:c:\\myJars\\big.jar!/Images/foo.gif' to designate a specific entry
within a jar on the filesystem. After all, a jar on the classpath is also on
the filesystem, by definition. But using a new protocol like your proposed
'classpath:' is clearer than trying to use 'file:' within 'jar:'.

The downside is that no 'classpath:' protocol is known to anyone but you and
me, at least as far as I know. That means it may raise more questions than
it solves if anyone else sees my code. Of course, I could approach whatever
body creates RFCs and suggest the creation of a 'classpath:' protocol but
that strikes me as a process that would drag on for years. Still, that is
probably the best solution from a design point of view....

Rhino
 
Rhino

Roedy Green said:
getResource does searching and gives you the FIRST resource match on
the classpath.

It produces the !syntax which is treated thereafter like any other
URL.

You want a method something like this off the top of my head:

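/**
 * @param c class used for the resource lookup
 * @param s either a resource name or a URL to the resource,
 *          coded either as "myimage.jpg"
 *          or as "jar:file:///C|/bar/baz.jar!/com/foo/myimage.jpg"
 * @return URL to the resource, or null if a resource name is not found
 * @throws MalformedURLException if s looks like a URL but cannot be parsed
 */
static URL getURLForResource( Class c, String s ) throws MalformedURLException
{
    if ( s.startsWith( "jar:" )
         || s.startsWith( "file:" )
         || s.startsWith( "http:" ) )
        return new URL( s );
    else
        return c.getResource( s );
}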

You might be on to something there! Hmm, let me think that over a bit....

Rhino
 
Rhino

Roedy Green said:
Experienced Java programmers all know what a resource is, and are used
to adjusting file locations external to the program to have them still
considered resources in different contexts, e.g. debugging, running
locally, running on a server, running as an Applet.

If you start monkeying with that, you are going to cause 100 times as
much confusion as you imagine you are avoiding.

Causing confusion is the last thing I want. I'm looking to find a way of
describing resources so that it is easy for my code to find those resources
and easy for maintainers of code to understand what I am doing. I'm just
struggling to find a good way to indicate that a given file is in a jar in
the filesystem so I'm looking for suggestions on how to notate the file name
to indicate that.

Rhino
 
Roedy Green

I'm just
struggling to find a good way to indicate that a given file is in a jar in
the filesystem so I'm looking for suggestions on how to notate the file name
to indicate that.

using getResource or getResourceAsStream is a good clue. Perhaps you
might want to collect such strings using the term RESOURCE in their
names.

The big problem I have found is remembering to include resources in
jars. You get no hint of trouble till you actually load the resource.

Cramfull offers a solution to that.

See http://mindprod.com/jgloss/cramfull.html
 
Chris Smith

Rhino said:
I think I like your proposal! It seemed reasonable to me to continue to use
the 'jar:file:', since it seems to be acceptable to use
'jar:file:c:\\myJars\\big.jar!/Images/foo.gif' to designate a specific entry
within a jar on the filesystem. After all, a jar on the classpath is also on
the filesystem, by definition. But using a new protocol like your proposed
'classpath:' is clearer than trying to use 'file:' within 'jar:'.

The downside is that no 'classpath:' protocol is known to anyone but you and
me, at least as far as i know. That means it may raise more questions than
it solves if anyone else sees my code.

Believe me, seeing a malformed "jar:" URL would raise at LEAST as many
questions as seeing a new protocol handler. Besides, your new "URL"
could not be used properly with the java.net.URL class. So your choice
is between something that's unique, or something that's fundamentally
broken. I'll take unique any day.

Of course, I could approach whatever
body creates RFCs and suggest the creation of a 'classpath:' protocol but
that strikes me as a process that would drag on for years.

It's also quite unlikely to succeed. You need to realize that the Java
concept of a URL is very different from the W3C's concept of a URL.
Things like the "jar:" scheme are Java URLs. Though they comply with
the syntax of the W3C's URLs, the W3C doesn't standardize any scheme
called "jar".

So if you did suggest this, use the Bug Parade at java.sun.com. The
best way to proceed would be to implement a protocol handler (a
java.net.URLStreamHandler subclass) as described above, and then propose
it as an RFE in Bug Parade, and attach your
existing code. And yes, it would definitely drag on for years; but at
least you've got your own code to use in the interim.

--
www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
 
Thomas Fritsch

Rhino said:
I think I like your proposal! It seemed reasonable to me to continue to
use
the 'jar:file:', since it seems to be acceptable to use
'jar:file:c:\\myJars\\big.jar!/Images/foo.gif' to designate a specific
entry
within a jar on the filesystem. After all, a jar on the classpath is also
on
the filesystem, by definition. But using a new protocol like your proposed
'classpath:' is clearer than trying to use 'file:' within 'jar:'.

The downside is that no 'classpath:' protocol is known to anyone but you
and
me, at least as far as i know. That means it may raise more questions than
it solves if anyone else sees my code. Of course, I could approach
whatever
body creates RFCs and suggest the creation of a 'classpath:' protocol but
that strikes me as a process that would drag on for years. Still, that is
probably the best solution from a design point of view....

I agree, going through the official RFC process would be lengthy. But I
think that this would not be needed for your goal. From your original post I
understand your goal as: Use specialized URLs internal to your application,
but *not* for inter-operating with *other* applications.

The "classpath" protocol is not a real protocol in the same sense as the
networking protocols ("http", "ftp", ...) are. Actually it is little more
than a parsing rule for interpreting URL strings ("classpath:resourceName").
Hence, the term /protocol/ might be very misleading here.
BTW: As far as I know, even Sun's "jar" protocol has never been officially
registered by an RFC, probably for the same reason as above.

You might be surprised how simply the "classpath" protocol handler can be
implemented (~12 lines of code). See
http://gate.ac.uk/gate/doc/java2html/gate/util/protocols/classpath/Handler.java.html
for an inspiration.
Making Java aware of the new protocol handler is simple, too:
Add class "your.package.classpath.Handler" to your app, and start your app
with "-Djava.protocol.handler.pkgs=your.package".
Java will then automagically find the Handler when needed.
 
Roedy Green

So if you did suggest this, use the Bug Parade at java.sun.com. The
best way to proceed would be to implement a ProtocolHandler as described
above,

Would anyone care to compose a paragraph outlining what you have to do
to add a new protocol handler? I would like to add this to the java
glossary.

What interface(s) do you have to implement?
What do you do to get it registered on the list so that your code will
get called when someone does this?

URL url = new URL( "weird://www.billabong.com:80/songs/lyrics.txt" );
URLConnection urlc = url.openConnection();

I know this is possible because a team I worked on added a number of
custom protocol handlers to deal with various stock market ticker
streams.
 
Rhino

Thomas Fritsch said:
I agree, going through the official RFC process would be lengthy. But I
think that this would not be needed for your goal. From your original post I
understand your goal as: Use specialized URLs internal to your application,
but *not* for inter-operating with *other* applications.

Yes, I think this is the heart of the issue. I really only _need_ a
convention/notation/protocol for use within my own programs; however, I'm
_hoping_ to find something that other people would understand fairly
intuitively if they tried to maintain my code. The ideal would be to come up
with a standardized way of saying "the file is in some jar of the classpath
but we don't need its name" that the whole industry would understand and
accept.

Even better, an overall convention on describing the position of ANY piece
of information would be wonderful. It would be really neat to see a concise,
descriptive way of noting the location of a piece of data, even if it was in a
database or on a network share or a floppy disk in someone's house. Then,
just hand that location to a blackbox method that will return the data in
the file, assuming the "data source" is actually available (e.g. the floppy
disk is actually in someone's drive) and there are no security roadblocks
(e.g. there are no file permissions or access issues for that particular
piece of data).

The "classpath" protocol is not a real protocol in the same sense like the
networking protocols ("http", "ftp", ...) are. Actually it is little more
than a parsing rule how to interpret URL-strings ("classpath:resourceName").
Hence, the term /protocol/ might be very misleading here.
BTW: As far as I know, even Sun's "jar" protocol has never been officially
registered by an RFC, probably for the same reason as above.

Yes, I agree, "protocol" may not be the right word for this; but if there is
an existing word that is more suitable, I haven't thought of what it is.
Maybe this needs a new noun - maybe a "Fritsch" or a "Rhino" - but let's put
that aside for now :)

You might be surprised how simple the "classpath" protocol handler can be
implemented (~12 lines of code). See
http://gate.ac.uk/gate/doc/java2html/gate/util/protocols/classpath/Handler.java.html
for an inspiration.
Making Java aware of the new protocol handler is simple, too:
Add class "your.package.classpath.Handler" to your app, and start your app
with "-Djava.protocol.handler.pkgs=your.package".
Java will then automagically find the Handler when needed.

Damn! I'm starting to like this idea a lot! This approach would make it
pretty easy for others to adopt this new "protocol" too. Eventually, it
could become part of the API itself so that it wouldn't have to be loaded
separately.

I'm going to mull this over a bit; I may just decide to take this further.
But I need to think about what my own application needs first before I get
too caught up in this :)

Thanks for your very interesting suggestions!

Rhino
 
Roedy Green


In a nutshell, here is how it works:

In URLs, you see built-in protocols like http:, https:, file:, ftp: and
jar:. These describe the rules by which data are extracted over such links.
(Strings such as jdbc: and rmi: look similar but are parsed by their own
APIs rather than by java.net.URL.) It is possible to define your own
protocols, e.g. to extract stock ticker information or to handle
encryption or compression. To do that you implement a custom
java.net.URLStreamHandler class and a java.net.URLConnection.

You then name your new URLStreamHandler class
com.mydomain.protocol.xxxx.Handler, where xxxx is the name of your new
protocol.

Then you must hook them into the official list of supported protocols
so that new URL will recognise your new protocol rather than throwing
a MalformedURLException. You do this by adding your implementing
package name prefix e.g. com.mydomain.protocol to the
java.protocol.handler.pkgs system property, e.g.

// Registering a new protocol handler whose class is
// com.mydomain.protocol.xxxx.Handler
System.setProperty( "java.protocol.handler.pkgs",
                    "com.mydomain.protocol" );

You don't hook your protocol name itself in anywhere. Java finds it
via the package/class naming convention.

Normally there is no such java.protocol.handler.pkgs property because
there are no custom protocols. If you have more than one package
prefix, use | to separate the names, not commas.
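
The registration and naming rules above can be sketched as a small helper. This is only an illustration: the class name HandlerPackages and its methods are hypothetical, and candidateHandlers merely mirrors the prefix + "." + protocol + ".Handler" convention rather than calling anything in the JDK that performs the lookup.

```java
import java.util.ArrayList;
import java.util.List;

final class HandlerPackages {

    static final String PROPERTY = "java.protocol.handler.pkgs";

    /** Appends a package prefix to the property, keeping prefixes already set. */
    static String register(String prefix) {
        String current = System.getProperty(PROPERTY);
        String updated = (current == null || current.isEmpty())
                ? prefix
                : current + "|" + prefix;   // prefixes are separated by '|'
        System.setProperty(PROPERTY, updated);
        return updated;
    }

    /** Lists the class names the naming convention implies for a protocol. */
    static List<String> candidateHandlers(String protocol) {
        List<String> names = new ArrayList<>();
        String prefixes = System.getProperty(PROPERTY, "");
        for (String prefix : prefixes.split("\\|")) {
            if (!prefix.isEmpty()) {
                names.add(prefix.trim() + "." + protocol + ".Handler");
            }
        }
        return names;
    }
}
```

Calling HandlerPackages.register("com.mydomain.protocol") before the first URL with the new protocol is created is the safe ordering; appending rather than overwriting avoids dropping prefixes that another library may have set.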
 
