Size of an arraylist in bytes

Discussion in 'Java' started by sara, Nov 20, 2011.

  1. sara

    sara Guest

    Hi All,

    I create an ArrayList<Integer> tmp and add some integers to it.
    Afterward, I measure the size of tmp in bytes (by converting tmp to
    a byte array). Assume the result is byte[] C. However, when I update an
    element of tmp and measure the size of tmp in bytes again, the result
    differs from the length of C!
    Why is this the case?

    Best
    Sara
     
    sara, Nov 20, 2011
    #1

  2. sara

    markspace Guest

    On 11/20/2011 1:01 PM, sara wrote:
    > Hi All,
    >
    > I create an Arraylist<Integer> tmp and add some integers to it.
    > Afterward, I measure the size of tmp in bytes (by converting tmp to
    > bytes array). Assume the result is byte[] C. However, when I update an
    > element of tmp, and measure size of tmp in bytes again, the result is
    > different than C!
    > Why this is the case?



    We'd have to see some code to give you a good answer, but basically you
    can't measure the memory size of Java objects. They change over time,
    in ways that C or C++ can't or doesn't, and there's not much to do that
    can rectify that.
     
    markspace, Nov 20, 2011
    #2

  3. sara

    sara Guest

    Here is the code:

    ArrayList<Integer> tmp = new ArrayList<Integer>();
    tmp.add(-1);
    tmp.add(-1);
    System.out.println(DiGraph.GetBytes(tmp).length);
    tmp.set(0, 10);
    System.out.println(DiGraph.GetBytes(tmp).length);


    public static byte[] GetBytes(Object v) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos;
        try {
            oos = new ObjectOutputStream(bos);
            oos.writeObject(v);
            oos.flush();
            oos.close();
            bos.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
        byte[] data = bos.toByteArray();
        return data;
    }

    The problem is that I need to write multiple ArrayLists to disk and
    later update their elements. I store the starting location and size
    of each ArrayList so that I can refer to it later. If the serialized
    size of a list changes, the stored offsets no longer line up! Could
    you please help?
    On Nov 20, 1:05 pm, markspace <-@.> wrote:
    > On 11/20/2011 1:01 PM, sara wrote:
    >
    > > Hi All,

    >
    > > I create an Arraylist<Integer>  tmp and add some integers to it.
    > > Afterward, I measure the size of tmp in bytes (by converting tmp to
    > > bytes array). Assume the result is byte[] C. However, when I update an
    > > element of tmp, and measure size of tmp in bytes again, the result is
    > > different than C!
    > > Why this is the case?

    >
    > We'd have to see some code to give you a good answer, but basically you
    > can't measure the memory size of Java objects.  They change over time,
    > in ways that C or C++ can't or doesn't, and there's not much to do that
    > can rectify that.
     
    sara, Nov 20, 2011
    #3
  4. sara

    Eric Sosman Guest

    On 11/20/2011 4:01 PM, sara wrote:
    > Hi All,
    >
    > I create an Arraylist<Integer> tmp and add some integers to it.
    > Afterward, I measure the size of tmp in bytes (by converting tmp to
    > bytes array). Assume the result is byte[] C. However, when I update an
    > element of tmp, and measure size of tmp in bytes again, the result is
    > different than C!
    > Why this is the case?


    See markspace's response. Another possible point of confusion:
    The ArrayList does not actually contain objects, but references to
    those objects -- that's why the same object instance can be in three
    ArrayLists, two Sets, and a Map simultaneously. In fact, the same
    Integer object could appear forty-two times in a single ArrayList:

    List<Integer> list = new ArrayList<Integer>();
    Integer number = Integer.valueOf(42);
    for (int i = 0; i < 42; ++i)
        list.add(number);

    If you're coming from a C background, a rough analogy is that
    the ArrayList holds "pointers" to the objects it holds, not copies
    of those objects.

    --
    Eric Sosman
     
    Eric Sosman, Nov 20, 2011
    #4
  5. sara

    sara Guest

    On Nov 20, 1:30 pm, Eric Sosman <> wrote:
    > On 11/20/2011 4:01 PM, sara wrote:
    >
    > > Hi All,

    >
    > > I create an Arraylist<Integer>  tmp and add some integers to it.
    > > Afterward, I measure the size of tmp in bytes (by converting tmp to
    > > bytes array). Assume the result is byte[] C. However, when I update an
    > > element of tmp, and measure size of tmp in bytes again, the result is
    > > different than C!
    > > Why this is the case?

    >
    >      See markspace's response.  Another possible point of confusion:
    > The ArrayList does not actually contain objects, but references to
    > those objects -- that's why the same object instance can be in three
    > ArrayLists, two Sets, and a Map simultaneously.  In fact, the same
    > Integer object could appear forty-two times in a single ArrayList:
    >
    >         List<Integer> list = new ArrayList<Integer>();
    >         Integer number = Integer.valueOf(42);
    >         for (int i = 0; i < 42; ++i)
    >             list.add(number);
    >
    >      If you're coming from a C background, a rough analogy is that
    > the ArrayList holds "pointers" to the objects it holds, not copies
    > of those objects.
    >
    > --
    > Eric Sosman
    >


    But do you have any answer to my second question?
     
    sara, Nov 20, 2011
    #5
  6. sara

    Stefan Ram Guest

    Eric Sosman <> writes:
    > If you're coming from a C background, a rough analogy is that
    >the ArrayList holds "pointers" to the objects it holds, not copies
    >of those objects.


    An ArrayList /does/ hold pointers (in the sense of Java),
    this is not just »a rough analogy«:

    »(...) reference values (...) are pointers«

    JLS3, 4.3.1.

    http://java.sun.com/docs/books/jls/third_edition/html/typesValues.html#4.3.1
     
    Stefan Ram, Nov 20, 2011
    #6
  7. sara

    Andreas Leitgeb Guest

    sara <> wrote:
    > Here is the code:
    > ArrayList<Integer> tmp=new ArrayList<Integer>();
    > tmp.add(-1);
    > tmp.add(-1);
    > System.out.println(DiGraph.GetBytes(tmp).length);
    > tmp.set(0, 10);
    > System.out.println(DiGraph.GetBytes(tmp).length);
    >
    > public static byte[] GetBytes(Object v) {
    > ByteArrayOutputStream bos = new ByteArrayOutputStream();
    > ObjectOutputStream oos;
    > try {
    > oos = new ObjectOutputStream(bos);
    > oos.writeObject(v);


    The serialized size of an ArrayList<Integer> depends on more than
    just the number of Integer elements in the list. There is the
    capacity, which may be larger than the size, but what really spoils
    it for you is the Integer objects, which get serialized along with
    the list. If both elements refer to the same Integer object, that
    object gets saved only once; if you change the value of one element,
    two different Integer objects get serialized along with the list,
    and thus you get more bytes.

    If you need fixed-size records for your lists (assuming a fixed
    size()), you might have better luck with arrays of primitives.

    If you had:
    int[] tmp = new int[2]; tmp[0] = -1; tmp[1] = -1;
    and dumped that array onto oos, then changed tmp[0] = 0;
    it's very likely you'd see the same number of bytes
    dumped afterwards.
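
A minimal sketch of the effect described above. The class name `SerializedSizeDemo` and the helper `serializedLength` are made up for illustration (the helper stands in for the `GetBytes(...).length` calls in the original post). Exact byte counts depend on the JDK, so the sketch only compares lengths before and after the update.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;

class SerializedSizeDemo {

    // Hypothetical helper: serializes v with Java serialization and
    // returns the number of bytes that were produced.
    static int serializedLength(Object v) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(v);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray().length;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<Integer>();
        list.add(-1);
        list.add(-1);    // autoboxing reuses the same cached Integer for -1
        int listBefore = serializedLength(list);
        list.set(0, 10); // the list now refers to two different Integer objects
        int listAfter = serializedLength(list);
        System.out.println("ArrayList: " + listBefore + " -> " + listAfter);

        int[] array = { -1, -1 };
        int arrayBefore = serializedLength(array);
        array[0] = 10;   // primitives are written by value, always 4 bytes each
        int arrayAfter = serializedLength(array);
        System.out.println("int[]:     " + arrayBefore + " -> " + arrayAfter);
    }
}
```

The ArrayList line should show a larger second number, while the int[] line should show the same number twice.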

    > oos.flush();
    > oos.close();
    > bos.close();
    > } catch (IOException e) {
    > e.printStackTrace();
    > }
    > byte[] data = bos.toByteArray();
    > return data;
    > }
    >
    > The problem is I need to write multiple arraylists on disk and later
    > on I update the elements of them. I store the starting location of
    > arraylists and their size such that later I can refer to them. If the
    > size of objects change then it messes up! Could you please help?
     
    Andreas Leitgeb, Nov 20, 2011
    #7
  8. sara

    Eric Sosman Guest

    On 11/20/2011 4:44 PM, Stefan Ram wrote:
    > Eric Sosman<> writes:
    >> If you're coming from a C background, a rough analogy is that
    >> the ArrayList holds "pointers" to the objects it holds, not copies
    >> of those objects.

    >
    > An ArrayList /does/ hold pointers (in the sense of Java),
    > this is not just »a rough analogy«:
    >
    > »(...) reference values (...) are pointers«


    They're "pointers" in Java's terms, but Java is considerably
    more restrictive about what you can do with a "pointer" than C is.
    You cannot, for example, print the value of a Java reference; you
    can do so in C. You cannot convert a Java reference to or from an
    integer; C allows it (with traps for the unwary). Java references
    obey a type hierarchy; C's types (and hence the pointers to them)
    are unrelated. And so on, and so on: Little niggly differences.
    Since Java's references support (and prohibit) a different set of
    operations than C's pointers do, I maintain they're as similar as
    dogs and wolves, and as different.

    Put it this way: If I had told sara "An ArrayList contains
    C-style pointers to the objects it holds," would I have been
    telling the truth?

    --
    Eric Sosman
     
    Eric Sosman, Nov 20, 2011
    #8
  9. sara

    markspace Guest

    On 11/20/2011 1:11 PM, sara wrote:

    > The problem is I need to write multiple arraylists on disk and later
    > on I update the elements of them. I store the starting location of
    > arraylists and their size such that later I can refer to them. If the
    > size of objects change then it messes up! Could you please help?



    Yes, this is the problem. You have to use something different from an
    ArrayList, because the ArrayList does change size.

    Look into plain arrays, IntBuffer, DataInputStream and DataOutputStream.
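
As a rough illustration of the DataOutputStream suggestion: the sketch below writes each int as exactly 4 bytes, so the encoded length depends only on the number of elements and never on their values. The class and method names (`FixedSizeInts`, `encode`, `decode`) are invented for this example and are not from the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class FixedSizeInts {

    // Encodes the values as 4 bytes each (big-endian), with no header.
    static byte[] encode(int[] values) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bos)) {
            for (int v : values) {
                out.writeInt(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Decodes a byte array that was produced by encode().
    static int[] decode(byte[] data) {
        int[] values = new int[data.length / Integer.BYTES];
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readInt();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        int[] tmp = { -1, -1 };
        System.out.println(encode(tmp).length); // 8
        tmp[0] = 10;
        System.out.println(encode(tmp).length); // 8
    }
}
```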

    It would also help now if we knew why you want to store multiple
    ArrayLists on disk. What is it you are trying to do?
     
    markspace, Nov 20, 2011
    #9
  10. sara

    Arne Vajhøj Guest

    On 11/20/2011 5:04 PM, Patricia Shanahan wrote:
    > On 11/20/2011 1:58 PM, Eric Sosman wrote:
    > ...
    >> Put it this way: If I had told sara "An ArrayList contains
    >> C-style pointers to the objects it holds," would I have been
    >> telling the truth?
    >>

    >
    > No, but if you had said "An ArrayList contains pointers to the objects
    > it holds." you would have been telling the exact truth.


    Yes.

    > The baggage that C added to pointers was an unfortunate aberration, not
    > something that should ever be considered to be the default definition of
    > "pointer".


    C/C++ pointers have certainly caused a lot of problems over the
    years.

    But the languages would not have been the same without them. And
    I even doubt that they would have been as popular.

    C and C++ were not chosen for lack of alternatives without
    "do anything you want" pointers; such alternatives did exist.

    Arne
     
    Arne Vajhøj, Nov 20, 2011
    #10
  11. sara

    Eric Sosman Guest

    On 11/20/2011 4:35 PM, sara wrote:
    >[...]
    > But do you have any answer to my second question?


    Only that you're going about it wrong. As Andreas Leitgeb points
    out, serializing an object is a different proposition than serializing
    a bunch of "raw" values: It saves enough information to reconstruct an
    "image" of the original object, with the same structure.

    What do I mean by "structure?" Something like this:

    Integer x = new Integer(42);
    Integer y = new Integer(42);

    Here we have two distinct Integer instances, each with the value 42.

    ArrayList<Integer> one = new ArrayList<Integer>();
    one.add(x);
    one.add(x);

    The first ArrayList holds one of the Integer instances, twice, and
    has nothing to do with the other.

    ArrayList<Integer> two = new ArrayList<Integer>();
    two.add(x);
    two.add(y);

    The second ArrayList holds both Integer instances, once each.

    If you serialize `one' and read it back again, you'll get an
    ArrayList with two references to the same Integer. Reading it back
    will produce one Integer, not two. There will be two objects in
    the serialized stream: One ArrayList and one Integer, plus enough
    additional information to reassemble them. (Actually, there will
    probably be additional objects: The ArrayList owns an array, which
    is an object in its own right, and perhaps there might be others.
    But there'll be two "visible" objects in the stream.)

    If you serialize `two' and read it back, you'll get an ArrayList
    with two references to two distinct Integers: Three "visible" objects
    in all.
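
The difference between `one` and `two` can be checked with a small round trip. This is only a sketch: the class `SharedReferenceDemo` and its helper methods are hypothetical names, and `new Integer(42)` is kept from the example above because it is a simple way to get two distinct instances, even though newer JDKs mark that constructor as deprecated.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

class SharedReferenceDemo {

    // `one` from the post: the same Integer instance added twice.
    @SuppressWarnings({"deprecation", "removal"})
    static ArrayList<Integer> buildOne() {
        Integer x = new Integer(42); // deliberately bypasses the Integer cache
        ArrayList<Integer> one = new ArrayList<Integer>();
        one.add(x);
        one.add(x);
        return one;
    }

    // `two` from the post: two distinct Integer instances with equal values.
    @SuppressWarnings({"deprecation", "removal"})
    static ArrayList<Integer> buildTwo() {
        Integer x = new Integer(42);
        Integer y = new Integer(42);
        ArrayList<Integer> two = new ArrayList<Integer>();
        two.add(x);
        two.add(y);
        return two;
    }

    static byte[] serialize(Object v) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(v);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bos.toByteArray();
    }

    @SuppressWarnings("unchecked")
    static ArrayList<Integer> deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (ArrayList<Integer>) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if both elements are the very same object (reference comparison).
    static boolean sameInstance(ArrayList<Integer> list) {
        return list.get(0) == list.get(1);
    }

    public static void main(String[] args) {
        byte[] bytesOne = serialize(buildOne());
        byte[] bytesTwo = serialize(buildTwo());
        System.out.println(sameInstance(deserialize(bytesOne))); // true
        System.out.println(sameInstance(deserialize(bytesTwo))); // false
        System.out.println(bytesOne.length + " vs " + bytesTwo.length);
    }
}
```

The last line should print a smaller number for `one` than for `two`, since `two` carries a second Integer object in the stream.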

    It's all right to serialize an object graph and store it on disk.
    It is *not* all right to try to update the serialization in place,
    nor to modify the object and expect a re-serialization to have the
    same size. If you need in-place operations or same-size guarantees,
    you'll need to invent a different external representation for your data.
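
One possible shape for such an external representation, sketched under assumptions: each list is stored as a record consisting of an element count followed by the elements as 4-byte ints, and the caller keeps the record's start offset. `IntRecordStore` and its methods are invented for this illustration, and an in-memory ByteBuffer stands in for the file.

```java
import java.nio.ByteBuffer;

class IntRecordStore {

    private final ByteBuffer buffer;

    IntRecordStore(int capacityBytes) {
        buffer = ByteBuffer.allocate(capacityBytes);
    }

    // Appends a record (count, then elements) and returns its start offset.
    int append(int[] values) {
        int offset = buffer.position();
        buffer.putInt(values.length);
        for (int v : values) {
            buffer.putInt(v);
        }
        return offset;
    }

    // Number of elements in the record that starts at recordOffset.
    int size(int recordOffset) {
        return buffer.getInt(recordOffset);
    }

    int get(int recordOffset, int index) {
        return buffer.getInt(elementOffset(recordOffset, index));
    }

    // Overwrites one element in place; record size and position are unchanged.
    void set(int recordOffset, int index, int value) {
        buffer.putInt(elementOffset(recordOffset, index), value);
    }

    // Total number of bytes occupied by all records so far.
    int bytesUsed() {
        return buffer.position();
    }

    private int elementOffset(int recordOffset, int index) {
        if (index < 0 || index >= size(recordOffset)) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return recordOffset + Integer.BYTES * (1 + index);
    }

    public static void main(String[] args) {
        IntRecordStore store = new IntRecordStore(64);
        int first = store.append(new int[] { -1, -1 });
        int second = store.append(new int[] { 7, 8, 9 });
        store.set(first, 0, 10);                   // in-place update
        System.out.println(store.get(first, 0));   // 10
        System.out.println(store.get(second, 2));  // 9
        System.out.println(store.bytesUsed());     // 28
    }
}
```

The same layout could be applied to a real file, for example with RandomAccessFile and its seek and writeInt methods, since every element sits at a predictable offset.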

    --
    Eric Sosman
     
    Eric Sosman, Nov 20, 2011
    #11
  12. sara

    Stefan Ram Guest

    Arne Vajhøj <> writes:
    >C/C++ pointers has certainly caused a lot of problems over the
    >years.


    C serves as a »portable, abstract machine language«, so the
    C pointers are inherited machine addresses from machine
    languages, where one can freely add machine addresses and
    numbers. But, after all, C already adds some type safety and
    abstraction. So, C still makes sense as the first layer on
    top of the bare metal. And C cannot be blamed for someone
    choosing C where it is not appropriate.

    »=head2 What language is Parrot written in?

    C.

    =head2 For the love of God, man, why?!?!?!?

    Because it's the best we've got.«

    http://www.davidcole.net/msie/notes/ipl/perl/jul13/parrot/parrot-0.0.4/docs/faq.pod

    »Here's the thing: C is everywhere. Recently Tim Bray
    made basically the same point; all the major operating
    systems, all the high-level language runtimes, all the
    databases, and all major productivity applications are
    written in C.«

    http://girtby.net/archives/2008/08/23/in-defence-of-c/
     
    Stefan Ram, Nov 20, 2011
    #12
  13. sara

    Arne Vajhøj Guest

    On 11/20/2011 4:11 PM, sara wrote:
    > Here is the code:
    >
    > ArrayList<Integer> tmp=new ArrayList<Integer>();
    > tmp.add(-1);
    > tmp.add(-1);
    > System.out.println(DiGraph.GetBytes(tmp).length);
    > tmp.set(0, 10);
    > System.out.println(DiGraph.GetBytes(tmp).length);
    >
    >
    > public static byte[] GetBytes(Object v) {
    > ByteArrayOutputStream bos = new ByteArrayOutputStream();
    > ObjectOutputStream oos;
    > try {
    > oos = new ObjectOutputStream(bos);
    > oos.writeObject(v);
    > oos.flush();
    > oos.close();
    > bos.close();
    > } catch (IOException e) {
    > e.printStackTrace();
    > }
    > byte[] data = bos.toByteArray();
    > return data;
    > }


    That code measures the size of the serialized ArrayList.

    It does not reflect how much memory the list takes up.

    And you should not use serialization for persistent
    storage.

    Arne
     
    Arne Vajhøj, Nov 20, 2011
    #13
  14. sara

    Stefan Ram Guest

    Patricia Shanahan <> writes:
    >My main concern with C's pointers is that they were called "pointers",
    >not "addresses". They behave far more like assembly language addresses
    >than like something more abstract, whose only job is to point.


    Actually, they /are/ called »addresses«:

    »An object exists, has a constant address, retains its
    last-stored value throughout its lifetime.«

    ISO/IEC 9899:1999 (E), 6.2.4p2

    »The unary & operator returns the address of its operand.«

    ISO/IEC 9899:1999 (E), 6.5.3.2p3

    »it is permitted to take the address of a library function«

    ISO/IEC 9899:1999 (E), 7.1.4p1

    The language cannot be blamed for persons calling addresses
    »pointers«.

    However, ISO/IEC 9899:1999 (E) also /does/ contain the word
    »pointer«, but a »pointer« is /an object/ that contains an
    address value.

    At least some programmers read this from:

    »A pointer type describes an object whose value provides
    a reference to an entity of the referenced type.«

    ISO/IEC 9899:1999 (E), 6.2.5, #20

    However, nowhere does ISO/IEC 9899:1999 (E) give an explicit
    definition of »pointer«, and the usage of this document with
    regard to the word »pointer« is not always consistent.

    But the first three quotations should give you enough rights to
    speak of »addresses« of objects and functions in a C context.
     
    Stefan Ram, Nov 20, 2011
    #14
  15. sara

    Lew Guest

    On Sunday, November 20, 2011 1:11:00 PM UTC-8, sara wrote:
    > Here is the code:
    >
    > ArrayList<Integer> tmp=new ArrayList<Integer>();


    *DO NOT USE TAB CHARACTERS TO INDENT USENET CODE LISTINGS!*

    > tmp.add(-1);
    > tmp.add(-1);
    > System.out.println(DiGraph.GetBytes(tmp).length);
    > tmp.set(0, 10);
    > System.out.println(DiGraph.GetBytes(tmp).length);
    >
    >
    > public static byte[] GetBytes(Object v) {
    > ByteArrayOutputStream bos = new ByteArrayOutputStream();
    > ObjectOutputStream oos;
    > try {
    > oos = new ObjectOutputStream(bos);
    > oos.writeObject(v);
    > oos.flush();
    > oos.close();
    > bos.close();
    > } catch (IOException e) {
    > e.printStackTrace();
    > }
    > byte[] data = bos.toByteArray();
    > return data;
    > }
    >
    > The problem is I need to write multiple arraylists on disk and later


    The problem is that the code you posted won't compile.

    > on I update the elements of them. I store the starting location of
    > arraylists and their size such that later I can refer to them. If the
    > size of objects change then it messes up! Could you please help?


    Java changes the sizes of things in surprising ways, and makes no promises about the size of an 'ArrayList' in the way you're asking.

    What do you really want to do?

    > On Nov 20, 1:05 pm, markspace <-@.> wrote:


    *DO NOT TOP-POST!*

    --
    Lew
     
    Lew, Nov 21, 2011
    #15
  16. sara

    Lew Guest

    Eric Sosman wrote:
    > Stefan Ram wrote:
    >> Eric Sosman writes:
    >>> If you're coming from a C background, a rough analogy is that
    >>> the ArrayList holds "pointers" to the objects it holds, not copies
    >>> of those objects.

    >>
    >> An ArrayList /does/ hold pointers (in the sense of Java),
    >> this is not just »a rough analogy«:
    >>
    >> »(...) reference values (...) are pointers«

    >
    > They're "pointers" in Java's terms, but Java is considerably


    They're "pointers" in programming terms, not just Java's.

    > more restrictive about what you can do with a "pointer" than C is.


    So?

    > You cannot, for example, print the value of a Java reference; you
    > can do so in C. You cannot convert a Java reference to or from an
    > integer; C allows it (with traps for the unwary). Java references
    > obey a type hierarchy; C's types (and hence the pointers to them)
    > are unrelated. And so on, and so on: Little niggly differences.
    > Since Java's references support (and prohibit) a different set of
    > operations than C's pointers do, I maintain they're as similar as
    > dogs and wolves, and as different.


    Dogs and wolves are the same species. They can interbreed.

    Java pointers *are* pointers - and that's all they are. They don't pretend to do arithmetic on themselves. That does not make them less a pointer.

    The essence of pointers is that they point. The implicit 'const' on them (in C terms) doesn't change that a jot.

    > Put it this way: If I had told sara "An ArrayList contains
    > C-style pointers to the objects it holds," would I have been
    > telling the truth?


    Why would you say such a bone-headed thing, and what difference does it make? A pointer is a pointer still, if it but points, though you cannot increment it.

    No one is claiming that they're "C-style" pointers, so we'll throw that red herring back in the water.

    --
    Lew
     
    Lew, Nov 21, 2011
    #16
  17. sara

    Roedy Green Guest

    On Sun, 20 Nov 2011 13:01:44 -0800 (PST), sara <>
    wrote, quoted or indirectly quoted someone who said :

    >I create an Arraylist<Integer> tmp and add some integers to it.
    >Afterward, I measure the size of tmp in bytes (by converting tmp to
    >bytes array). Assume the result is byte[] C. However, when I update an
    >element of tmp, and measure size of tmp in bytes again, the result is
    >different than C!
    >Why this is the case?


    What code did you use to convert to byte[]?

    An ArrayList consists of a base ArrayList object, an array-of-pointers
    object, and one object for each integer. If the integers are small,
    equal values share one canonical Integer object: e.g. two 1s in the
    list will point to the same Integer.

    Each object (including all the Integers) has perhaps 8 to 16 bytes of
    overhead. So it is fairly complicated to figure out how much RAM this
    thing uses. It is not like a C array where you just multiply 4 x slots.

    An int[] is much simpler.
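
The "canonical Integer" point can be seen with a reference comparison. A small sketch, with `CanonicalIntegerDemo` and `firstTwoShareObject` as made-up names: autoboxing small values (the language guarantees this for -128 to 127) yields the same object each time, while for larger values the outcome depends on the JVM and its settings.

```java
import java.util.ArrayList;
import java.util.List;

class CanonicalIntegerDemo {

    // True if the first two elements are the very same object.
    // Uses ==, which compares references here, not equals().
    static boolean firstTwoShareObject(List<Integer> list) {
        return list.get(0) == list.get(1);
    }

    public static void main(String[] args) {
        List<Integer> small = new ArrayList<Integer>();
        small.add(1);
        small.add(1);
        System.out.println(firstTwoShareObject(small)); // true: 1 is always cached

        List<Integer> large = new ArrayList<Integer>();
        large.add(100000);
        large.add(100000);
        // Outside the guaranteed cache range the result is JVM-dependent,
        // so no expected output is given for this line.
        System.out.println(firstTwoShareObject(large));
    }
}
```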

    --
    Roedy Green Canadian Mind Products
    http://mindprod.com
    I can't come to bed just yet. Somebody is wrong on the Internet.
     
    Roedy Green, Nov 21, 2011
    #17
  18. sara

    Stefan Ram Guest

    Newsgroups: comp.lang.java.programmer,comp.lang.c
    Followup-To: comp.lang.c

    Patricia Shanahan <> writes:
    >My main concern with C's pointers is that they were called "pointers",
    >not "addresses". They behave far more like assembly language addresses
    >than like something more abstract, whose only job is to point.


    An important difference between address arithmetics
    and pointer arithmetics can be seen here:

    #include <stdio.h>

    int addressdiff( void const * const b, void const * const a )
    { return ( char const * const )b - ( char const * const )a; }

    int main( void )
    { char address[ 2 ];
      int pointer[ 2 ];
      printf( "%d\n", addressdiff( address + 1, address ));
      printf( "%d\n", addressdiff( pointer + 1, pointer )); }

    1
    4

     
    Stefan Ram, Nov 21, 2011
    #18
  19. sara

    Arne Vajhøj Guest

    On 11/21/2011 1:25 AM, Roedy Green wrote:
    > On Sun, 20 Nov 2011 13:01:44 -0800 (PST), sara<>
    > wrote, quoted or indirectly quoted someone who said :
    >
    >> I create an Arraylist<Integer> tmp and add some integers to it.
    >> Afterward, I measure the size of tmp in bytes (by converting tmp to
    >> bytes array). Assume the result is byte[] C. However, when I update an
    >> element of tmp, and measure size of tmp in bytes again, the result is
    >> different than C!
    >> Why this is the case?

    >
    > What code did you use to convert to byte[]?


    The code was posted in a followup.

    Arne
     
    Arne Vajhøj, Nov 26, 2011
    #19
  20. sara

    Arne Vajhøj Guest

    On 11/20/2011 5:48 PM, Patricia Shanahan wrote:
    > On 11/20/2011 2:42 PM, Stefan Ram wrote:
    >> Arne Vajhøj<> writes:
    >>> C/C++ pointers has certainly caused a lot of problems over the
    >>> years.

    >>
    >> C serves as a »portable, abstract machine language«, so the
    >> C pointers are inherited machine addresses from machine
    >> languages, where one can freely add machine addresses and
    >> numbers. But, after all, C already adds some type safety and
    >> abstraction. So, C still makes sense as the first layer on
    >> top of the bare metal. And C cannot be blamed for someone
    >> choosing C where it is not appropriate.

    >
    > My main concern with C's pointers is that they were called "pointers",
    > not "addresses". They behave far more like assembly language addresses
    > than like something more abstract, whose only job is to point.


    Since C does not have both constructs, then it is pure terminology.

    Arne
     
    Arne Vajhøj, Nov 26, 2011
    #20