Enum enlightenment

Discussion in 'Java' started by Roedy Green, Jul 8, 2005.

  1. Roedy Green

    Roedy Green Guest

    I wrote a simple enum-using class and decompiled it. Now all sorts of
    things about enum make sense.

    To follow along, paste the two listings into separate documents and
    view them side by side in your IDE.


    Here is the original code -- an enum to track the various flavours of
    Windows:

    package com.mindprod.htmlmacros;

    import java.util.EnumSet;
    import java.util.Set;

    /**
    * enum of possible Windows OSes. May be used freely for any purpose
    * but military.
    * @author Roedy Green copyright 2005 Canadian Mind Products
    */
    public enum WindowsOS {

    WIN95( "W95", "Windows 95"),
    WIN98( "W98", "Windows 98"),
    WINME( "Me", "Windows Me"),
    WINNT( "NT", "Windows NT" ),
    WIN2K( "W2K", "Windows 2000" ),
    WINXP( "XP", "Windows XP" ),
    WIN2K3("W2K3","Windows 2003");

    private String shortName;

    private String longName;

    private static boolean DEBUGGING = true;

    /**
    * Enum constant constructor that captures two extra facts about
    * the enum.
    * @param shortName short name for the OS e.g. "Me"
    * @param longName long name of the OS e.g. "Windows XP"
    */
    WindowsOS ( String shortName, String longName )
    {
    this.shortName = shortName;
    this.longName = longName;
    }

    /**
    * @return short name
    */
    public String getShortName ()
    {
    return this.shortName;
    }

    /**
    * @return long name
    */
    public String getLongName ()
    {
    return this.longName;
    }

    /**
    * Static method to construct a string mentioning multiple OSes,
    * separated by slashes.
    * @param choices EnumSet of just the OSes you want included
    * @return a String of the form "Windows W95/W98/Me"
    */
    public static String OSes( EnumSet<WindowsOS> choices )
    {
    StringBuilder sb = new StringBuilder( 40 );
    for ( WindowsOS o : choices )
    {
    sb.append( '/' );
    sb.append( o.shortName );
    }
    if ( sb.length() == 0 )
    {
    return "";
    }
    else
    {
    // chop the leading / and prepend "Windows "
    return "Windows " + sb.toString().substring( 1 );
    }
    }

    /**
    * test harness
    *
    * @param args not used
    */
    public static void main ( String[] args )
    {
    if ( DEBUGGING )
    {
    // You don't use a constructor to create EnumSet objects.
    EnumSet<WindowsOS> justThese = EnumSet.of( WIN2K, WINXP,
    WINME );

    // prints "Windows Me/W2K/XP"
    // note they come out in proper order.
    System.out.println( WindowsOS.OSes ( justThese ) );
    }
    }
    }



    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


    Here is the decompiled code, showing what actually makes it into byte
    code.


    package com.mindprod.htmlmacros;

    import java.io.PrintStream;
    import java.util.EnumSet;
    import java.util.Iterator;

    public final class WindowsOS extends Enum
    {

    public static final WindowsOS[] values()
    {
    return (WindowsOS[])$VALUES.clone();
    }

    public static WindowsOS valueOf(String s)
    {
    return
    (WindowsOS)Enum.valueOf(com/mindprod/htmlmacros/WindowsOS, s);
    }

    private WindowsOS(String s, int i, String s1, String s2)
    {
    super(s, i);
    shortName = s1;
    longName = s2;
    }

    public String getShortName()
    {
    return shortName;
    }

    public String getLongName()
    {
    return longName;
    }

    public static String OSes(EnumSet enumset)
    {
    StringBuilder stringbuilder = new StringBuilder(40);
    WindowsOS windowsos;
    for(Iterator iterator = enumset.iterator();
    iterator.hasNext(); stringbuilder.append(windowsos.shortName))
    {
    windowsos = (WindowsOS)iterator.next();
    stringbuilder.append('/');
    }

    if(stringbuilder.length() == 0)
    return "";
    else
    return (new StringBuilder()).append("Windows ")
    .append(stringbuilder.toString().substring(1)).toString();
    }

    public static void main(String args[])
    {
    if(DEBUGGING)
    {
    EnumSet enumset = EnumSet.of(WIN2K, WINXP, WINME);
    System.out.println(OSes(enumset));
    }
    }

    public static final WindowsOS WIN95;
    public static final WindowsOS WIN98;
    public static final WindowsOS WINME;
    public static final WindowsOS WINNT;
    public static final WindowsOS WIN2K;
    public static final WindowsOS WINXP;
    public static final WindowsOS WIN2K3;
    private String shortName;
    private String longName;
    private static boolean DEBUGGING = true;
    private static final WindowsOS $VALUES[];

    static
    {
    WIN95 = new WindowsOS("WIN95", 0, "W95", "Windows 95");
    WIN98 = new WindowsOS("WIN98", 1, "W98", "Windows 98");
    WINME = new WindowsOS("WINME", 2, "Me", "Windows Me");
    WINNT = new WindowsOS("WINNT", 3, "NT", "Windows NT");
    WIN2K = new WindowsOS("WIN2K", 4, "W2K", "Windows 2000");
    WINXP = new WindowsOS("WINXP", 5, "XP", "Windows XP");
    WIN2K3 = new WindowsOS("WIN2K3", 6, "W2K3", "Windows 2003");
    $VALUES = (new WindowsOS[] {
    WIN95, WIN98, WINME, WINNT, WIN2K, WINXP, WIN2K3
    });
    }
    }



    Note how Java generates some methods for you in the same class!

    It composes a values method and a valueOf method that does not need a
    Class parameter.
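
    A minimal sketch of the two generated methods in use. The enum
    Flavour and the class GeneratedMethodsDemo are hypothetical stand-ins,
    not part of the original WindowsOS code. The one-argument valueOf is
    compared with the inherited Enum.valueOf(Class, String), which is the
    call the decompiled listing forwards to.

```java
// Hypothetical stand-in for WindowsOS; the constants are illustrative only.
enum Flavour { WIN95, WIN98, WINME }

class GeneratedMethodsDemo {
    /** Looks up a constant through the compiler-generated one-argument valueOf. */
    static Flavour viaGenerated(String name) {
        return Flavour.valueOf(name);
    }

    /** Same lookup through the two-argument form in java.lang.Enum, which needs the Class. */
    static Flavour viaEnumClass(String name) {
        return Enum.valueOf(Flavour.class, name);
    }

    public static void main(String[] args) {
        // values() hands back a fresh copy of the constants in declaration order.
        for (Flavour f : Flavour.values()) {
            System.out.println(f);
        }
        // Both lookups return the very same constant object.
        System.out.println(viaGenerated("WINME") == viaEnumClass("WINME"));
    }
}
```

    Both lookups throw IllegalArgumentException for a name that matches no
    constant, so callers handling user input may want to catch that.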

    It makes your constructor private.

    It adds two extra secret parameters to your constructor, the enum name
    and the ordinal. This means the enum constants don't have to count
    themselves or register themselves. That is all done at compile time.
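
    The two hidden constructor arguments end up readable through the
    inherited name() and ordinal() methods. A small sketch, using a
    hypothetical enum Release rather than the original class:

```java
// Hypothetical enum; only the constant names and their order matter here.
enum Release { WIN95, WIN98, WINME }

class NameOrdinalDemo {
    /** Formats the compiler-supplied ordinal and name as "ordinal:name". */
    static String describe(Release r) {
        return r.ordinal() + ":" + r.name();
    }

    public static void main(String[] args) {
        // Prints 0:WIN95, 1:WIN98 and 2:WINME, one per line.
        for (Release r : Release.values()) {
            System.out.println(describe(r));
        }
    }
}
```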

    It creates static finals for each enum constant and the code to
    initialise them using your constructors.

    It creates a constant array of enum objects, one of each flavour,
    called $VALUES[] to use in the values method. It can also be used by
    the name method to convert

    In this case no enum constant had any of its own fields or methods.

    Note the true enum class is hard coded in all over the place. This is
    no object-type erasure crap.

    The $VALUES array could have been used by methods like
    first, last, count, ordinalToEnum, but I have not found any trace of
    such methods. You can't get at $VALUES without patching byte code
    since that is not a legal Java identifier. So I guess every time you
    want that information you need to do a values() to clone the array just
    to find out how long it is, or to index it in a read-only way to
    convert ordinal back to enum.
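
    One way to avoid the repeated clone is to take a single private copy
    of values() and index that. This is only a sketch: the enum Os and the
    methods count and fromOrdinal are hypothetical names, not something
    the compiler generates.

```java
// Hypothetical enum used only for illustration.
enum Os {
    WIN95, WIN98, WINME;

    // One private copy made during class initialisation; it is never handed out,
    // so callers cannot modify it.
    private static final Os[] CACHE = values();

    /** Number of constants, without cloning the array on each call. */
    static int count() {
        return CACHE.length;
    }

    /** Converts an ordinal back to its constant; throws if the ordinal is out of range. */
    static Os fromOrdinal(int ordinal) {
        return CACHE[ordinal];
    }
}

class OrdinalLookupDemo {
    public static void main(String[] args) {
        System.out.println(Os.count());
        System.out.println(Os.fromOrdinal(1));
    }
}
```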

    Note that the generic EnumSet handles all enums. There is no
    corresponding customised code generated for the EnumSet.

    It is not obvious from this code, but the bit masks used in EnumSet
    computations are not built into the enum constants. They are generated
    from the ordinal number as needed on the fly with shifting and
    masking.
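
    The idea of deriving a mask from the ordinal can be sketched as
    follows. This is not the JDK's EnumSet source, just an illustration of
    the shifting and masking; the enum Flag and the helper names are
    assumptions made for the example, and a single long only covers enums
    with at most 64 constants.

```java
// Hypothetical enum; any enum with at most 64 constants would do.
enum Flag { A, B, C, D }

class OrdinalMaskDemo {
    /** Bit mask for one constant, computed from its ordinal when needed. */
    static long maskOf(Flag f) {
        return 1L << f.ordinal();
    }

    /** Returns the set with f added. */
    static long add(long set, Flag f) {
        return set | maskOf(f);
    }

    /** Tests membership by masking. */
    static boolean contains(long set, Flag f) {
        return (set & maskOf(f)) != 0;
    }

    public static void main(String[] args) {
        long set = add(add(0L, Flag.A), Flag.D);
        System.out.println(Long.toBinaryString(set));
        System.out.println(contains(set, Flag.D));
        System.out.println(contains(set, Flag.B));
    }
}
```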

    It is also not obvious from this code, but EnumSet.of figures out the
    class of the enums by looking up the class of the first parameter.
    There is NOT an EnumSet class generated for each Enum class.
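
    A short sketch of both points, using a hypothetical enum Win with the
    same constant names as the listing. EnumSet.of gets its element type
    from the constants passed in, while EnumSet.noneOf has no element to
    inspect and so has to be given the class. The helper emptyLike is an
    assumption for illustration; it recovers the class from a sample
    constant with getDeclaringClass().

```java
import java.util.EnumSet;

// Hypothetical enum mirroring the constant names of the original listing.
enum Win { WIN95, WIN98, WINME, WINNT, WIN2K, WINXP, WIN2K3 }

class EnumSetDemo {
    /** Builds an empty set of the same enum type as the sample constant. */
    static <E extends Enum<E>> EnumSet<E> emptyLike(E sample) {
        return EnumSet.noneOf(sample.getDeclaringClass());
    }

    public static void main(String[] args) {
        // Argument order does not matter; iteration follows ordinal order.
        EnumSet<Win> justThese = EnumSet.of(Win.WIN2K, Win.WINXP, Win.WINME);
        System.out.println(justThese); // prints [WINME, WIN2K, WINXP]

        EnumSet<Win> empty = emptyLike(Win.WINXP);
        empty.add(Win.WIN95);
        System.out.println(empty); // prints [WIN95]
    }
}
```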

    --
    Bush crime family lost/embezzled $3 trillion from Pentagon.
    Complicit Bush-friendly media keeps mum. Rumsfeld confesses on video.
    http://www.infowars.com/articles/us/mckinney_grills_rumsfeld.htm

    Canadian Mind Products, Roedy Green.
    See http://mindprod.com/iraq.html photos of Bush's war crimes
     
    Roedy Green, Jul 8, 2005
    #1

  2. Roedy Green

    Roedy Green Guest

    Here is what happens when you give your enum constants their own
    private methods and variables:


    enum
    .....
    WIN2K( "W2K", "Windows 2000" )
    {
    private int p;
    int cost ()
    {
    return 200;
    }
    } ,
    WINXP( "XP", "Windows XP" )
    {
    private int q;
    int cost ()
    {
    return 300;
    }
    } ,
    WIN2K3("W2K3","Windows 2003");
    ....

    this generates:


    WINNT = new WindowsOS("WINNT", 3, "NT", "Windows NT");
    WIN2K = new WindowsOS("WIN2K", 4, "W2K", "Windows 2000") {

    int cost()
    {
    return 200;
    }

    private int p;

    };
    WINXP = new WindowsOS("WINXP", 5, "XP", "Windows XP") {

    int cost()
    {
    return 300;
    }

    private int q;

    };
    WIN2K3 = new WindowsOS("WIN2K3", 6, "W2K3", "Windows 2003");


    In other words, each of those little enum constants becomes its own
    little anonymous inner class.
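
    The extra class can be observed at run time. In this sketch the enum
    Product and the helper hasOwnClass are hypothetical. Unlike the
    fragment above, cost() is also declared in the enum itself so that
    code outside the constant bodies can call it.

```java
// Hypothetical enum: one plain constant and one with its own class body.
enum Product {
    BASIC,
    PRO {
        @Override
        int cost() {
            return 300;
        }
    };

    /** Default cost; constants with a body may override it. */
    int cost() {
        return 100;
    }
}

class ConstantBodyDemo {
    /** True when the constant is an instance of a subclass generated for its body. */
    static boolean hasOwnClass(Product p) {
        return p.getClass() != p.getDeclaringClass();
    }

    public static void main(String[] args) {
        for (Product p : Product.values()) {
            System.out.println(p + " cost=" + p.cost() + " ownClass=" + hasOwnClass(p));
        }
    }
}
```

    This is also why getDeclaringClass() is the safer call when you need
    the enum type: getClass() returns the generated subclass for any
    constant that has a body.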

     
    Roedy Green, Jul 8, 2005
    #2

  3. Roedy Green wrote:
    > I wrote a simple enum-using class and decompiled it. <snip>


    Tell me Roedy, how do you decompile a .class file? javap gives me an overview of
    the methods in the class, not the code within the methods. The switches I tried
    (-c, -h, -l, -p, -s, -v) did not give me a 'machine formatted' version of my
    .java files.
     
    Martijn Mulder, Jul 8, 2005
    #3
  4. Roedy Green

    Paul Guest

    Roedy Green wrote:
    >
    > The $VALUES array could have been used by methods like
    > first, last, count, ordinalToEnum, but I have not found any trace of
    > such methods. You can't get at $VALUES without patching byte code
    > since that is not a legal Java identifier. So I guess every time you
    > want that information you need to do a values() to clone the array just
    > to find out how long it is, or to index it in a read-only way to
    > convert ordinal back to enum.
    >


    In Java, you can use a dollar sign as part of a legal Java identifier. I
    think the $VALUES array isn't created until some second pass after the
    compiler has validated your actual .java file but before it translates
    it into the implementation behind the idiom.

    Maybe "legal java identifier" isn't what you meant, but that the symbol
    is undefined.

    public enum TestDollar
    {
    ONE, TWO;

    private int $dollarvar;

    public int get() { return $dollarvar; }
    public void set(int d) { $dollarvar = d; }

    public void voidfunc()
    {
    TestDollar[] vals = $VALUES;
    // compiler says 'cannot find symbol' for $VALUES
    }
    }

    --Paul
     
    Paul, Jul 8, 2005
    #4
  5. Roedy Green

    Roedy Green Guest

    On Fri, 08 Jul 2005 13:17:38 -0500, Paul <> wrote or
    quoted :

    >Maybe "legal java identifier" isn't what you meant, but that the symbol
    >is undefined.


    I scanned my text books and the web and could not get a definitive
    answer on just what chars are allowed in identifiers:
    1. in JVM byte code.
    2. in java source.

    I wanted not just to know what the current compiler lets you have, but
    what the language standard guarantees.

    I suppose it can be tested by experiment. Is eacute OK? Chinese
    characters? Math symbols? The \u notation is pretty ugly. I'd need a
    Unicode text editor to do the proper experiments.

    My personal rule has been to use nothing but A-Z a-z 0-9 and _, with
    _ only in the middle of constant names.

    A similar question is just how long can an Identifier be? Natural
    limits due to bit sizes for field lengths are 31, 255 and 32,767. I
    suppose that could be an implementation detail.


     
    Roedy Green, Jul 9, 2005
    #5
  6. Roedy Green

    Roedy Green Guest

    On Fri, 8 Jul 2005 17:14:39 +0200, "Martijn Mulder" <i@m> wrote or
    quoted :

    >Tell me Roedy, how do you decompile a .class file? javap gives me an overview of
    >the methods in the class, not the code within the methods. The switches I tried
    >(-c, -h, -l, -p, -s, -v) did not give me a 'machine formatted' version of my
    >.java files.


    see http://mindprod.com/jgloss/decompiler.html
    and http://mindprod.com/jgloss/disassembler.html

     
    Roedy Green, Jul 9, 2005
    #6
  7. Roedy Green

    Tim Tyler Guest

    Roedy Green <> wrote or quoted:

    > A similar question is just how long can an Identifier be? Natural
    > limits due to bit sizes for field lengths are 31, 255 and 32,767. I
    > suppose that could be an implementation detail.


    ``The length of field and method names, field and method descriptors, and
    other constant string values is limited to 65535 characters by the
    16-bit unsigned length item of the CONSTANT_Utf8_info structure
    (§4.4.7). Note that the limit is on the number of bytes in the encoding
    and not on the number of encoded characters. UTF-8 encodes some
    characters using two or three bytes. Thus, strings incorporating
    multibyte characters are further constrained.''

    - http://java.sun.com/docs/books/vmspec/2nd-edition/html/ClassFile.doc.html#88659
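
    The byte-versus-character distinction in that passage can be tried
    out with DataOutputStream.writeUTF, which also stores a 16-bit byte
    count in front of modified UTF-8 data. This sketch uses it as a
    stand-in for the class file format, not as the class file writer
    itself; the class Utf8LimitDemo and its helpers are hypothetical
    names.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class Utf8LimitDemo {
    /** True if s can be encoded behind a 16-bit byte count (at most 65535 bytes). */
    static boolean fits(String s) {
        DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        try {
            out.writeUTF(s);
            return true;
        } catch (UTFDataFormatException tooLong) {
            // writeUTF refuses strings whose encoded form exceeds 65535 bytes.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a string of n copies of c. */
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(fits(repeat('a', 65535)));      // one byte per char
        System.out.println(fits(repeat('a', 65536)));
        System.out.println(fits(repeat('\u00e9', 32767))); // two bytes per char
        System.out.println(fits(repeat('\u00e9', 32768)));
    }
}
```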
    --
    __________
    |im |yler http://timtyler.org/ Remove lock to reply.
     
    Tim Tyler, Jul 9, 2005
    #7
  8. Roedy Green wrote:
    > On Fri, 08 Jul 2005 13:17:38 -0500, Paul <> wrote or
    > quoted :
    >
    >
    >>Maybe "legal java identifier" isn't what you meant, but that the symbol
    >>is undefined.

    >
    >
    > I scanned my text books and the web and could not get a definitive
    > answer on just what chars are allowed in identifiers:
    > 1. in JVM byte code.
    > 2. in java source.
    >
    > I wanted not just to know what the current compiler lets you have, but
    > what the language standard guarantees.


    Well, did you try reading it?

    http://java.sun.com/docs/books/jls/second_edition/html/j.title.doc.html

    >
    > I suppose it can be tested by experiment. is eacute ok? Chinese
    > characters? math symbols? the \u notation is pretty ugly. I'd need a
    > unicode text editor to do the proper experiments.
    >
    > my personal rule has been to use nothing but A-Z a-z 0-9 and _ but
    > only the middle of constant names.
    >
    > A similar question is just how long can an Identifier be? Natural
    > limits due to bit sizes for field lengths are 31, 255 and 32,767. I
    > suppose that could be an implementation detail.
    >
    >


    HTH,
    Ray

    --
    XML is the programmer's duct tape.
     
    Raymond DeCampo, Jul 9, 2005
    #8
  9. Paul Bilnoski, Jul 9, 2005
    #9
  10. Roedy Green

    Roedy Green Guest

    On Sat, 09 Jul 2005 15:34:34 GMT, Raymond DeCampo
    <> wrote or quoted :

    >Well, did you try reading it?
    >
    >http://java.sun.com/docs/books/jls/second_edition/html/j.title.doc.html


    The relevant section is:

    http://java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html#40625

    That's Patricia Shanahan's job. I detest reading such lawyerly
    documents that try their hardest to hide the plain meaning.

    A straightforward reading of the standard would say you CAN put - in
    your identifier names, but I know you can't.

    The example he gives of a legal identifier violates the first Java
    letter rule.

    Perhaps a lawyer can make sense of what they are trying to say. For
    mortals, a list of acceptable and unacceptable identifiers with reasons
    says more than pages of BNF or explanation.

    If the standard were literally true, Java foolishly refused to reserve
    even the Unicode mathematical operators for future use.

     
    Roedy Green, Jul 10, 2005
    #10
  11. Roedy Green wrote:
    > On Sat, 09 Jul 2005 15:34:34 GMT, Raymond DeCampo
    > <> wrote or quoted :
    >
    >
    >>Well, did you try reading it?
    >>
    >>http://java.sun.com/docs/books/jls/second_edition/html/j.title.doc.html

    >
    >
    > The relevant section is:
    >
    > http://java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html#40625
    >
    > That's Patricia Shanahan's job. I detest reading such lawyerly
    > documents that try their hardest to hide the plain meaning.
    >
    > A straight forward reading of the standard would say you CAN put - in
    > your identifier names, but I know you can't.


    I don't see where you would get this from the standard.

    >
    > The example he gives of a Legal identifier violates the first java
    > letter rule.


    I don't know which example you mean. They all seem fine to me.

    >
    > Perhaps a lawyer can make sense of what they are trying to say. For
    > mortals, a list of acceptable and unacceptable identifiers with reasons
    > says more than pages of BNF or explanation.
    >
    > If the standard was literally true Java foolishly refused to reserve
    > even the Unicode mathematical operators for future use.
    >


    I don't know where you are reading that into it.

    Actually, after posting the link, I went in and read the above section
    on my own. I was pretty disappointed that the real "specification" for
    what characters may be included was punted on by saying it depends on
    the results of java.lang.Character.isJavaIdentifierStart() and
    java.lang.Character.isJavaIdentifierPart().
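
    Those two methods make the experiment easy to run without a Unicode
    editor. A minimal sketch, assuming a hypothetical helper
    looksLikeIdentifier; it checks characters only and does not reject
    keywords or reserved literals such as "class" or "true".

```java
class IdentifierCheck {
    /**
     * True if s is non-empty, its first code point may start an identifier,
     * and every later code point may appear inside one.
     */
    static boolean looksLikeIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Step by code point so characters outside the BMP are handled too.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        // Candidates: a dollar name, e-acute, a hyphen, a leading digit, a math operator.
        String[] samples = { "$VALUES", "caf\u00e9", "a-b", "2fast", "\u2211" };
        for (String s : samples) {
            System.out.println(s + " -> " + looksLikeIdentifier(s));
        }
    }
}
```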

    Delving into the documentation led me on a relatively uninteresting
    excursion into Unicode land.

    Ray
    --
    XML is the programmer's duct tape.
     
    Raymond DeCampo, Jul 11, 2005
    #11
  12. Roedy Green

    Dale King Guest

    Raymond DeCampo wrote:
    > Roedy Green wrote:
    >
    >
    >> Perhaps a lawyer can make sense of what they are trying to say. For
    >> mortals, a list of acceptable and unacceptable identifiers with reasons
    >> says more than pages of BNF or explanation.
    >>
    >> If the standard was literally true Java foolishly refused to reserve
    >> even the Unicode mathematical operators for future use.
    >>

    >
    > I don't know where you are reading that into it.
    >
    > Actually, after posting the link, I went in and read the above section
    > on my own. I was pretty disappointed that the real "specification" for
    > what characters may be included was punted on by saying it depends on
    > the results of java.lang.Character.isJavaIdentifierStart() and
    > java.lang.Character.isJavaIdentifierPart().
    >
    > Delving into the documentation led me on a relatively uninteresting
    > excursion into Unicode land.


    I think the reason they don't give you the definitive list is that the
    list is not necessarily static. As characters get added to Unicode they
    can get added to the list of acceptable letters for Java identifiers.
    They don't want to update the language spec as Unicode support expands
    in Java.

    How would they specify it anyway? It would take pages to list all the
    characters.

    The rules are pretty broad. Almost anything that is a letter or digit
    in Unicode is acceptable.

    The one area that Sun fails in this regard is the support for encodings
    to actually use this full Unicode set. They don't support the use of
    byte order marks at the start of a Java source file to indicate UTF-8,
    UTF-16BE, UTF-32, etc. Even Windows notepad supports that, but not Sun.
    All they give you is the -encoding option which is not good enough.

    --
    Dale King
     
    Dale King, Jul 16, 2005
    #12
  13. Roedy Green

    Dale King Guest

    Tim Tyler wrote:
    > Roedy Green <> wrote or quoted:
    >
    >
    >>A similar question is just how long can an Identifier be? Natural
    >>limits due to bit sizes for field lengths are 31, 255 and 32,767. I
    >>suppose that could be an implementation detail.

    >
    >
    > ``The length of field and method names, field and method descriptors, and
    > other constant string values is limited to 65535 characters by the
    > 16-bit unsigned length item of the CONSTANT_Utf8_info structure
    > (§4.4.7). Note that the limit is on the number of bytes in the encoding
    > and not on the number of encoded characters. UTF-8 encodes some
    > characters using two or three bytes. Thus, strings incorporating
    > multibyte characters are further constrained.''
    >
    > - http://java.sun.com/docs/books/vmspec/2nd-edition/html/ClassFile.doc.html#88659


    Early on, 1.5 was supposed to include support for removing some of the
    class file size limitations (particularly only 64K for a method body),
    but somehow it didn't make the final cut.

    It's still being worked on under JSR202.
    --
    Dale King
     
    Dale King, Jul 16, 2005
    #13