Philosophical question on collection reference types

Discussion in 'Java' started by Patricia Shanahan, Mar 1, 2007.

  1. I habitually use the interface types for java.util collection
    references:

    List<SomeClass> myList = new ArrayList<SomeClass>();

    However, in writing some code I started questioning this. The code uses
    frequent indexed access to lists with thousands of elements. I wrote it
    with the intention of using ArrayList, for fast random access.

    Would it perhaps be more appropriate to declare:

    ArrayList<SomeClass> myList = new ArrayList<SomeClass>();

    to indicate that the code is designed with an ArrayList in mind, and may
    need to be rewritten if the List class changes? The code would function,
    given sufficient patience, with any List implementation, but I would
    never design it that way if I were going to use a LinkedList.

    Patricia
    Patricia Shanahan, Mar 1, 2007
    #1
    1. Advertising

  2. Patricia Shanahan

    Chris Uppal Guest

    Patricia Shanahan wrote:

    > I habitually use the interface types for java.util collection
    > references:
    >
    > List<SomeClass> myList = new ArrayList<SomeClass>();
    >
    > However, in writing some code I started questioning this. The code uses
    > frequent indexed access to lists with thousands of elements. I wrote it
    > with the intention of using ArrayList, for fast random access.


    It's an interesting question, and in this case following the established dogma
    does work against making the code as communicative as possible. It is an
    example of hiding /too much/ information.

    Perhaps you could define:

    interface IndexedList
    extends
    java.util.List,
    java.util.RandomAccess
    {
    }

    and then write your code to require/promise an IndexedList instead of a List.

    If that doesn't work for you, then I think you have to choose which is the more
    likely of two possible futures.

    1) If you think you are likely to want a better List implementation than
    ArrayList (e.g. one that keeps its elements in fixed-size chunks rather than in
    one huge contiguous array), then use List. On this subject, note that the
    objects returned by Arrays.toList() and Collections.unmodifiableList() are
    Lists but are not ArrayLists (nor IndexedLists come to that).

    2) If you think you (or anyone) is likely to supply, or assume, an
    inappropriate implementation of List, such as LinkedList, then use ArrayList
    explicitly.

    At a guess, and only judging from what you've said about your work, (1) seems
    more likely to me than (2). So, if my guess is right, I'd use List. And, if I
    wanted to clarify the implicit assumptions my code was making, I would add an
    explanatory comment. Or even add:
    assert (list instanceof RandomAccess);

    -- chris
    Chris Uppal, Mar 1, 2007
    #2
    1. Advertising

  3. Patricia Shanahan

    Mark Space Guest

    Patricia Shanahan wrote:
    > I habitually use the interface types for java.util collection
    > references:
    >
    > List<SomeClass> myList = new ArrayList<SomeClass>();
    >
    > However, in writing some code I started questioning this. The code uses


    The first thing that occurs to me is that trying too hard to anticipate
    future requirements is likely to result in Gold Plating -- complexity
    that your application doesn't need and won't ever use.

    OTOH, if you think it's likely that your big data object may need to be
    implemented differently in the future, using an abstract type like List
    seems to make sense. List is "free" (somebody has already done the work
    to define it) and works with a variety of existing tools. Implementing
    your own List based on, for example, a Strategy Pattern and using memory
    mapped files, sounds easier than trying to do the same with an ArrayList.

    My two cents.
    Mark Space, Mar 1, 2007
    #3
  4. Mark Space wrote:
    > Patricia Shanahan wrote:
    >> I habitually use the interface types for java.util collection
    >> references:
    >>
    >> List<SomeClass> myList = new ArrayList<SomeClass>();
    >>
    >> However, in writing some code I started questioning this. The code uses

    >
    > The first thing that occurs to me is that trying too hard to anticipate
    > future requirements is likely to result in Gold Plating -- complexity
    > that your application doesn't need and won't ever use.


    I don't see the complexity difference between the two alternatives,
    declaring myList as a List or as an ArrayList.

    >
    > OTOH, if you think it's likely that your big data object may need to be
    > implemented differently in the future, using an abstract type like List
    > seems to make sense. List is "free" (somebody has already done the work
    > to define it) and works with a variety of existing tools. Implementing
    > your own List based on, for example, a Strategy Pattern and using memory
    > mapped files, sounds easier than trying to do the same with an ArrayList.


    I think it is VERY unlikely that I'll ever use anything other than
    ArrayList. ArrayList is fast enough for what I'm doing. Indeed, that is
    the point of my question - since it is an ArrayList, and is very
    unlikely to ever be anything other than an ArrayList, without major
    changes to the code that uses it, why not call it an ArrayList in the
    reference type declaration?

    This is primarily a question about programming as human-human
    communication, to either another programmer or my future self. I can
    change the declared type, in either direction, with very little
    programming effort.

    Patricia
    Patricia Shanahan, Mar 1, 2007
    #4
  5. Patricia Shanahan

    Daniel Pitts Guest

    On Mar 1, 2:51 pm, Patricia Shanahan <> wrote:
    > Mark Space wrote:
    > > Patricia Shanahan wrote:
    > >> I habitually use the interface types for java.util collection
    > >> references:

    >
    > >> List<SomeClass> myList = new ArrayList<SomeClass>();

    >
    > >> However, in writing some code I started questioning this. The code uses

    >
    > > The first thing that occurs to me is that trying too hard to anticipate
    > > future requirements is likely to result in Gold Plating -- complexity
    > > that your application doesn't need and won't ever use.

    >
    > I don't see the complexity difference between the two alternatives,
    > declaring myList as a List or as an ArrayList.
    >
    >
    >
    > > OTOH, if you think it's likely that your big data object may need to be
    > > implemented differently in the future, using an abstract type like List
    > > seems to make sense. List is "free" (somebody has already done the work
    > > to define it) and works with a variety of existing tools. Implementing
    > > your own List based on, for example, a Strategy Pattern and using memory
    > > mapped files, sounds easier than trying to do the same with an ArrayList.

    >
    > I think it is VERY unlikely that I'll ever use anything other than
    > ArrayList. ArrayList is fast enough for what I'm doing. Indeed, that is
    > the point of my question - since it is an ArrayList, and is very
    > unlikely to ever be anything other than an ArrayList, without major
    > changes to the code that uses it, why not call it an ArrayList in the
    > reference type declaration?
    >
    > This is primarily a question about programming as human-human
    > communication, to either another programmer or my future self. I can
    > change the declared type, in either direction, with very little
    > programming effort.
    >
    > Patricia


    Returning a List lets you keep the implementation abstract. If you
    document the fact that "the List returned will be a RandomAccess
    list", users of your class can feel safe using random access without
    worrying about it being an ArrayList, Arrays.toList(), or some other
    tbd list.

    I think its overkill to add implementation complexity to the names of
    things. You run the risk of not knowing where to stop:

    public interface ListWithConstantTimeLookup<T> {
    T getInConstantTime(int offset);
    void putInConstantTime(T o, int offset);
    }

    Alternatively, users who need random access but can't assume its
    available can do this:

    List<Foo> list = getAllFoo();
    if (!(list instanceof RandomAccess)) {
    list = Arrays.asList(list.toArray(new Foo[0]);
    // or list = new ArrayList(list);
    }

    or,
    List<Foo> list = getAllFoo();
    assert list instanceof RandomAccess;

    Although, I have to say that in most cases I'm able to deal with lists
    through iteration rather than by lookup. Often times I'll use
    Collection as the return type or parameter type, if it really doesn't
    matter whats going on.
    Daniel Pitts, Mar 2, 2007
    #5
  6. Daniel Pitts wrote:
    > On Mar 1, 2:51 pm, Patricia Shanahan <> wrote:

    ....
    >> This is primarily a question about programming as human-human
    >> communication, to either another programmer or my future self. I can
    >> change the declared type, in either direction, with very little
    >> programming effort.
    >>
    >> Patricia

    >
    > Returning a List lets you keep the implementation abstract. If you
    > document the fact that "the List returned will be a RandomAccess
    > list", users of your class can feel safe using random access without
    > worrying about it being an ArrayList, Arrays.toList(), or some other
    > tbd list.


    In this case the List is not being returned, but used locally to
    implement an algorithm that makes random accesses.

    > Although, I have to say that in most cases I'm able to deal with lists
    > through iteration rather than by lookup. Often times I'll use
    > Collection as the return type or parameter type, if it really doesn't
    > matter whats going on.


    I tend to use List or Set most of the time. Usually, there is some
    possibility that I'll change my mind about the implementing class
    without rewriting the code that uses it. This time, that is very unlikely.

    Patricia
    Patricia Shanahan, Mar 2, 2007
    #6
  7. Patricia Shanahan

    Daniel Pitts Guest

    On Mar 1, 8:36 pm, Patricia Shanahan <> wrote:
    > Daniel Pitts wrote:
    > > On Mar 1, 2:51 pm, Patricia Shanahan <> wrote:

    > ...
    > >> This is primarily a question about programming as human-human
    > >> communication, to either another programmer or my future self. I can
    > >> change the declared type, in either direction, with very little
    > >> programming effort.

    >
    > >> Patricia

    >
    > > Returning a List lets you keep the implementation abstract. If you
    > > document the fact that "the List returned will be a RandomAccess
    > > list", users of your class can feel safe using random access without
    > > worrying about it being an ArrayList, Arrays.toList(), or some other
    > > tbd list.

    >
    > In this case the List is not being returned, but used locally to
    > implement an algorithm that makes random accesses.

    If its a locally used collection, I can't think of an argument for
    either abstract or concrete, except for consistency sake. "She always
    returns List, but declares ArrayList inside her methods. That's
    confusing." I've seen people do this too (return List, but declare
    ArrayList for local variables)

    More to the point, my point was more about finding an alternative
    algorithm which can utilize iteration over random access. I realize
    that this isn't always feasible or even always possible, but for the
    majority of things that I come across, it is often possible.
    >
    > > Although, I have to say that in most cases I'm able to deal with lists
    > > through iteration rather than by lookup. Often times I'll use
    > > Collection as the return type or parameter type, if it really doesn't
    > > matter whats going on.

    >
    > I tend to use List or Set most of the time. Usually, there is some
    > possibility that I'll change my mind about the implementing class
    > without rewriting the code that uses it. This time, that is very unlikely.
    >
    > Patricia


    Well, even so, since the use is local, you can probably easily change
    the type anyway as long as the API remains the same. You wouldn't
    even need to worry about "interface" or "class". It quacks like a duck
    and walks like a duck. This is one argument for weakly-typed
    languages. An interface simply becomes what is defined by an object.
    It could be constructed, restructured, or even retrofitted at
    runtime.

    I guess my point is that really the only factor that I can see that
    really tips the scales is the "consistency" factor. Using the most
    generic/abstract type possible in most situations (where it DOES
    decouple classes), but in one case using a concrete type just because
    it makes something seem more clear to you*, versus being consistent
    all the way through. Humans like consistency.

    * I've been learning lately that even the best designed code can seem
    opaque to someone, and the worst design code can make perfect sense to
    someone. Its dangerous to assume that someone (even yourself) will
    understand your "perfect" design sometime in the future. :)

    Anyway, good luck,
    Daniel.
    Daniel Pitts, Mar 2, 2007
    #7
  8. Patricia Shanahan

    Piotr Kobzda Guest

    Patricia Shanahan wrote:

    > In this case the List is not being returned, but used locally to
    > implement an algorithm that makes random accesses.


    You might express that using generic method with a type parameter in
    form of intersection type.

    <E, L extends List<E> & RandomAccess> void processList(L list) {
    // implement your algorithm here ...
    }

    Intersection type may be used as well for method's return type, or even
    for local (or filed of generic class) declaration.


    However, there are some restrictions imposed by intersection types.

    For example, in processList() implementation you can not do that:

    L l = new ArrayList<E>();

    Compiler complains with: "Type mismatch: cannot convert from
    ArrayList<E> to L".

    That's confusing, especially in front of compiler's acceptance (without
    any warning) of:

    processList(new ArrayList<String>());


    I presume the reason is (as usually is such a cases) type erasure --
    what I wont even try to ensure...

    Simple workaround for that is to use unchecked cast:

    L l = (L) new ArrayList<E>();



    > I tend to use List or Set most of the time. Usually, there is some
    > possibility that I'll change my mind about the implementing class
    > without rewriting the code that uses it. This time, that is very unlikely.


    Than try with intersection type. (Honestly, I've never tried that, so
    I'm far from recommending it).


    piotr
    Piotr Kobzda, Mar 2, 2007
    #8
  9. Patricia Shanahan

    Lew Guest

    Chris Uppal wrote:
    > Patricia Shanahan wrote:
    >
    >> I habitually use the interface types for java.util collection
    >> references:
    >>
    >> List<SomeClass> myList = new ArrayList<SomeClass>();
    >>
    >> However, in writing some code I started questioning this. The code uses
    >> frequent indexed access to lists with thousands of elements. I wrote it
    >> with the intention of using ArrayList, for fast random access.

    >
    > It's an interesting question, and in this case following the established dogma
    > does work against making the code as communicative as possible. It is an
    > example of hiding /too much/ information.


    As I understand it, the established dogma is to use the widest type that does
    what you want. List is too wide for this, so use ArrayList.

    Like many other ideas, the oversimplification seems to stick in people's minds
    that one should "always use the interface". The real principle is to use the
    widest type that fits.

    -- Lew
    Lew, Mar 3, 2007
    #9
  10. Patricia Shanahan

    Lew Guest

    ArrayList promises to be Serializable, List does not.

    There are times when you should specify the implementation.

    Use the type that makes the promises you need. Don't ask for generality when
    you want specificity.

    -- Lew
    Lew, Mar 3, 2007
    #10
  11. Patricia Shanahan

    Chris Uppal Guest

    Lew wrote:

    > As I understand it, the established dogma is to use the widest type that
    > does what you want. List is too wide for this, so use ArrayList.


    Note of terminological difference: what you're calling wide is what I call the
    narrowest type -- the one which conveys the least information.


    > Like many other ideas, the oversimplification seems to stick in people's
    > minds that one should "always use the interface". The real principle is
    > to use the widest type that fits.


    In this case, though, Patricia /can/ use List, since it supports all the
    necessary operations. ArrayList would be overspecifying (or might be). The
    information that is, or isn't, being communicated isn't really type-related[*]
    at all, but is a quality-of-service aspect. It isn't enough just to be able to
    do <something>, but the object must be able to do <something> with certain
    performance guarantees.

    -- chris

    [*] "type" here being construed in the sense of Java's type system. It's
    possible to imagine a type system where asymptotic performance bounds were part
    of methods' type signatures.
    Chris Uppal, Mar 3, 2007
    #11
  12. Patricia Shanahan

    Lew Guest

    Chris Uppal wrote:
    > Note of terminological difference: what you're calling wide is what I call the
    > narrowest type -- the one which conveys the least information.


    So noted, and I was not being precise. My sense of "wide" means "encompasses
    the largest number of subtypes". By rights I should have said "highest"
    instead of "widest", thus conveying the usual depiction of top-rooted
    inheritance hierarchies and conforming to the notion of "upcasting".

    Good call.

    -- Lew
    Lew, Mar 3, 2007
    #12
  13. Patricia Shanahan

    Piotr Kobzda Guest

    Chris Uppal wrote:
    > Lew wrote:


    >> Like many other ideas, the oversimplification seems to stick in people's
    >> minds that one should "always use the interface". The real principle is
    >> to use the widest type that fits.


    Or simply -- use the least specialized type that fits.


    > In this case, though, Patricia /can/ use List, since it supports all the
    > necessary operations. ArrayList would be overspecifying (or might be). The
    > information that is, or isn't, being communicated isn't really type-related[*]
    > at all, but is a quality-of-service aspect. It isn't enough just to be able to
    > do <something>, but the object must be able to do <something> with certain
    > performance guarantees.


    > [*] "type" here being construed in the sense of Java's type system. It's
    > possible to imagine a type system where asymptotic performance bounds were part
    > of methods' type signatures.


    Agreed. However, I see no problem with type system defined like that.
    A lot of (less or more useful) "typologies" coexists in Java's type
    system (are they all, for example Serializable semantics, really
    type-related?), why then not to add that another one (i.e. performance
    aspect) too? Especially when that matters, and may become a source of
    not only poor quality of service, but even of denial of service (e.g.
    system not responding in "reasonable" time...).

    That type can be simply defined (the IndexedList from your previous post
    is an example). But it holds obvious disadvantage, classes like
    ArralList must additionally implement it to become useful.

    We can also define that type "virtually" using type intersection --
    which I can't point out important disadvantages for.


    What's then the argument against using intersection type here?


    BTW -- I've just explored a bit deeper my advice from previous post, and
    it results in conclusion, that intersection type can be used even
    simpler in that case, e.g. that way:

    <L extends List<?> & RandomAccess> void processList(L list)

    or, of course, when a concrete list elements type is know, that way:

    <L extends List<String> & RandomAccess> void processList(L list)


    Nevertheless, my previous parameterization with two type variables, may
    still appear useful when we need additional constrains on list's element
    types, or when exact return type of list is unknown, and we don't want
    to lose any information about it from method's return type, e.g.:

    <E, L extends List<E> & RandomAccess> L filter(L list) {
    // ...
    return list;
    }

    Although, usage of such a method is probably slightly limited (instead
    of method-chaining and the like), it's /at least/ worth to be noted as
    an alternative declaration.

    Inconvenience here is when we want to have, for example, a generic
    factory method like that:

    <E, L extends List<E> & RandomAccess> L createRandomAccessList() {
    return (L) new ArrayList<E>(); // unchecked cast
    }

    Which, despite of that ugly unchecked cast, seems to be useful at first,
    and it even works as expected, but holds a small problem... We can not
    directly pass its result as an input for another method which it should
    be compatible with -- the following won't compile:

    processList(createRandomAccessList());

    That's because type inference takes place here, so without any
    additional advice compiler fails with that.

    Hopeless, while Java generics is implemented using type erasure (and
    capture conversion, and so on...), we (I) can't even workaround it, i.e.
    to make it working without explicitly provided type argument...


    But let's beck to the original problem...

    Don't you think, that in Patricia's particular case, /possibly/ neither
    the use of List nor ArrayList is better than using intersection type of
    List and RandomAccess?


    piotr
    Piotr Kobzda, Mar 5, 2007
    #13
  14. Patricia Shanahan

    Piotr Kobzda Guest

    Piotr Kobzda wrote:

    > Inconvenience here is when we want to have, for example, a generic
    > factory method like that:
    >
    > <E, L extends List<E> & RandomAccess> L createRandomAccessList() {
    > return (L) new ArrayList<E>(); // unchecked cast
    > }
    >
    > Which, despite of that ugly unchecked cast, seems to be useful at first,
    > and it even works as expected, but holds a small problem...


    Well. It's not a small problem here (and it's not /that/ problem in
    fact), it's rather bigger one... Here is a mistake in my treatment of
    second type parameter, not a generics implementation nor a compiler
    fault. The ArrayList<E> is not valid substitute for all possible
    intersection types of List<E> and RandomAccess, thus the code above is
    simply incorrect.

    > We can not
    > directly pass its result as an input for another method which it should
    > be compatible with -- the following won't compile:
    >
    > processList(createRandomAccessList());
    >
    > That's because type inference takes place here, so without any
    > additional advice compiler fails with that.
    >
    > Hopeless, while Java generics is implemented using type erasure (and
    > capture conversion, and so on...), we (I) can't even workaround it, i.e.
    > to make it working without explicitly provided type argument...


    However, it can be done correctly using factory object implementing the
    following-like interface:

    interface RandomAccessListFactory<E, L extends List<E> &
    RandomAccess> {
    L newList();
    }

    We can then create its implementation like that:

    static <E> RandomAccessListFactory<E, ? extends List<E>>
    defaultListFactory() {
    return new RandomAccessListFactory<E, ArrayList<E>>() {
    public ArrayList<E> newList() {
    return new ArrayList<E>();
    }
    };
    }

    and then use it, that way:

    processList(defaultListFactory().newList());

    The compiler nicely infers all types as needed. (Notice no need for
    explicit declaration of ArrayList beside the factory scope).


    The above is wide extension, and possibly very far from, the point of
    the original question. It simply shows, that working with intersection
    types is not as straight forward as working with the regular Java types
    (more, it still appears to me as a little abuse of the intersection
    types), but it seams to be possible.

    Of course, I'm still far from recommending it -- just wondering, how to
    efficient work with that, and if it all is worth the effort?...


    piotr
    Piotr Kobzda, Mar 6, 2007
    #14
  15. Patricia Shanahan

    Chris Uppal Guest

    Piotr Kobzda wrote:

    > That type can be simply defined (the IndexedList from your previous post
    > is an example). But it holds obvious disadvantage, classes like
    > ArralList must additionally implement it to become useful.
    >
    > We can also define that type "virtually" using type intersection --
    > which I can't point out important disadvantages for.
    >
    >
    > What's then the argument against using intersection type here?


    Right now, I have no real argument against it -- the reason I didn't mention it
    myself is that it simply didn't occur to me that you could express the dual
    requirement in that way.

    Besides the fact (unfortunate but true) that I tend to forget that intersection
    types are available at all, I don't feel confident enough in my "feel" for what
    knock-on effects a given declaration will have (especially a relatively
    "advanced" one), to be able to adocate using or avoiding that approach. I
    would be a good deal happier if intersection types could be declared as
    "top-level" constuctions -- the way they are resticted to use in type bounds
    makes me uneasy, but I don't yet know whether that unease is justified.

    I want think about this some more, and do some experiments -- it's a good
    chance to learn a bit...

    -- chris
    Chris Uppal, Mar 6, 2007
    #15
  16. Patricia Shanahan

    Daniel Pitts Guest

    On Mar 6, 9:24 am, "Chris Uppal" <-
    THIS.org> wrote:
    > Piotr Kobzda wrote:
    > > That type can be simply defined (the IndexedList from your previous post
    > > is an example). But it holds obvious disadvantage, classes like
    > > ArralList must additionally implement it to become useful.

    >
    > > We can also define that type "virtually" using type intersection --
    > > which I can't point out important disadvantages for.

    >
    > > What's then the argument against using intersection type here?

    >
    > Right now, I have no real argument against it -- the reason I didn't mention it
    > myself is that it simply didn't occur to me that you could express the dual
    > requirement in that way.
    >
    > Besides the fact (unfortunate but true) that I tend to forget that intersection
    > types are available at all, I don't feel confident enough in my "feel" for what
    > knock-on effects a given declaration will have (especially a relatively
    > "advanced" one), to be able to adocate using or avoiding that approach. I
    > would be a good deal happier if intersection types could be declared as
    > "top-level" constuctions -- the way they are resticted to use in type bounds
    > makes me uneasy, but I don't yet know whether that unease is justified.
    >
    > I want think about this some more, and do some experiments -- it's a good
    > chance to learn a bit...
    >
    > -- chris


    It seems that it would be a useful Java syntax to allow type
    intersections in regular variable declarations.

    public <E> void myFunction(List<E>&RandomAccess randomAccessList)
    {
    }

    It would be especially useful for this type of use:
    public void myFunction() {
    Runnable&Comparable<?> myRunner = getNextRunner();
    myRunner.run();
    return myRunner.compareTo(getNextRunner());
    }

    Note that I use methods in both the Runnable interface and Comparable
    interface.

    I'm not sure I like the syntax, but it could be worse :).

    My examples may not seem useful, but imagine the power with custom
    interfaces, rather than built in ones.
    Daniel Pitts, Mar 6, 2007
    #16
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Dave Rahardja
    Replies:
    11
    Views:
    504
    Risto Lankinen
    Aug 7, 2003
  2. Giannis Papadopoulos

    void vs void* (philosophical question)

    Giannis Papadopoulos, Jun 30, 2006, in forum: C Programming
    Replies:
    18
    Views:
    502
    Dave Thompson
    Jul 10, 2006
  3. Øyvind Isaksen
    Replies:
    1
    Views:
    936
    Øyvind Isaksen
    May 18, 2007
  4. neel
    Replies:
    2
    Views:
    236
    Pavel
    Jun 14, 2009
  5. Phil Bradby

    Philosophical question about C

    Phil Bradby, Jan 20, 2010, in forum: C Programming
    Replies:
    28
    Views:
    750
    Nick Keighley
    Feb 2, 2010
Loading...

Share This Page