What's the best/fastest way to access image data

Discussion in 'Java' started by G.W. Lucas, Nov 2, 2009.

  1. G.W. Lucas

    G.W. Lucas Guest

    I have an application that performs some specialized image-processing
    which is simple, but not supported by the JAI or other Java API. I
    pull in data from an existing image, process it, and store it back in
    a new image.

    For my application, I am using the Java BufferedImage class for images
    that are about 2 megapixels in size. The processing is requiring about
    250 milliseconds, which isn't bad, though my application will be
    processing a LOT of images and is user-interactive, so I'd like to
    trim that if I can.

    Anyway, I added some more instrumentation to my time measurements and
    realized that of that 250 milliseconds, 200 or so was happening in the
    BufferedImage.getRGB() method that I was using to extract the raw data
    from the image:

    long time0 = System.currentTimeMillis();
    int w = image.getWidth();
    int h = image.getHeight();
    int n = w * h;
    int[] rgb = image.getRGB(0, 0, w, h, new int[n], 0, w);
    long time1 = System.currentTimeMillis();
    long accessTime = time1-time0;

    I found the fact that the access took so much longer than my own
    processing kind of a surprise... Ordinarily, when I see an unexpected
    result like this, it's usually an indication that I'm doing the wrong
    thing or using the wrong tool.

    So, I was wondering if I might be using the wrong approach in pulling
    out the raw data. Perhaps BufferedImage isn't even the right class to
    use? I've read the API document on the many Java image classes and
    find the nuances of "which one to use when" to be rather non-obvious.

    Could anyone point me in the direction of the best way to do this? Is
    there a web page that provides information about the design of the
    image classes that might clarify this issue?

    Thanks for your help.

    Gary
    G.W. Lucas, Nov 2, 2009
    #1

  2. G.W. Lucas

    markspace Guest

    G.W. Lucas wrote:

    > Anyway, I added some more instrumentation to my time measurements and
    > realized that of that 250 milliseconds, 200 or so was happening in the
    > BufferedImage.getRGB() method that I was using to extract the raw data
    > int[] rgb = image.getRGB(0, 0, w, h, new int[n], 0, w);



    I'm not sure, but several of the Java image APIs are asynchronous. They
    return immediately and load in the background. You're getting the whole
    image here so it's possible that read() is waiting on IO to complete, I
    suppose.

    Can you produce a complete example? I realize that you can't really
    send an image, but if you could at least post the source that reproduces
    this we could take a look at it.
    markspace, Nov 2, 2009
    #2

  3. G.W. Lucas

    G.W. Lucas Guest

    On Nov 2, 12:37 pm, markspace <> wrote:

    > [snip] Can you produce a complete example?



    Sure. The code follows. The actual application is generating content
    based on data analysis performed in background threads (behind the
    scenes) and storing it in BufferedImage objects. There are a number of
    different kinds of analysis routines running (some of them quite
    complicated). At run time, several images may get overlaid on top of
    each other to make one composite. Thus, the images are declared
    TYPE_INT_ARGB when the application creates them. So far, I have been
    favorably impressed by Java's performance.

    Anyway, you can see where I don't think that asynchronous behavior is
    the issue, though I could certainly be wrong.

    This snippet of code simulates the conditions under which I am
    observing the performance issues. Running this on my computer, I
    observed 200 millisecond access time. I set the command-line memory
    option to -Xmx512m to ensure plenty of memory for the test.

    Hope this helps.

    Gary



    import java.awt.image.BufferedImage;

    /**
     * Times BufferedImage.getRGB() on a freshly created 2000 x 1500 image.
     */
    public class TimeTest {

        public static void main(String[] args) {
            TimeTest test = new TimeTest();
            for (int i = 0; i < 5; i++) {
                test.run();
            }
        }

        public void run() {
            BufferedImage image = new BufferedImage(
                    2000,
                    1500,
                    BufferedImage.TYPE_INT_ARGB);

            // Diagnostic graphics operations to exercise the
            // idea that some work had been done before we tried
            // extracting the contents of the image. In testing,
            // doesn't seem to make much difference.
            //Graphics2D g2d = image.createGraphics();
            //g2d.setColor(Color.ORANGE);
            //g2d.drawRect(0, 0, 500, 600);
            //g2d.dispose();
            //try {
            //    Thread.sleep(1000);
            //} catch (InterruptedException ex) {
            //}

            long time0 = System.currentTimeMillis();
            int w = image.getWidth();
            int h = image.getHeight();
            int n = w * h;
            // Copies every pixel of the image into a new int[] of n elements.
            int[] rgb = image.getRGB(0, 0, w, h, new int[n], 0, w);
            long time1 = System.currentTimeMillis();
            long accessTime = time1 - time0;
            System.out.println("Access time " + accessTime);
            System.out.flush();
        }
    }






    G.W. Lucas, Nov 2, 2009
    #3
  4. G.W. Lucas

    markspace Guest

    G.W. Lucas wrote:
    > This snippet of code simulates the conditions under which I am
    > observing the performance issues. Running this on my computer, I
    > observed 200 millisecond access time. I set the command-line memory
    > option to -Xmx512m to ensure plenty of memory for the test.


    > int[] rgb = image.getRGB(0, 0, w, h, new int[n], 0, w);



    This line of code calls getRGB(), which basically does this (cut and
    paste from the source code itself):

    for (int y = startY; y < startY+h; y++, yoff+=scansize) {
        off = yoff;
        for (int x = startX; x < startX+w; x++) {
            rgbArray[off++] = colorModel.getRGB(
                    raster.getDataElements(x,y,data));
        }
    }

    Copying data like this is never going to be fast. You want the internal
    buffer itself, probably, not a copy, so your operations on the data will
    be fast.

    So I think the trick is to use a different constructor, so that you
    already have access to the internal argb array. There's no way to just
    get a pointer to it in the BufferedImage API, that I can see.

    Now: are you loading data from disk? Or are you creating the data
    wholesale inside your program, as your example seems to imply? The
    answer is different depending on what you are doing.
    markspace, Nov 2, 2009
    #4
  5. In article <hcnd3b$fpk$-september.org>,
    markspace <> wrote:

    > G.W. Lucas wrote:
    > > This snippet of code simulates the conditions under which I am
    > > observing the performance issues. Running this on my computer, I
    > > observed 200 millisecond access time. I set the command-line memory
    > > option to -Xmx512m to ensure plenty of memory for the test.

    >
    > > int[] rgb = image.getRGB(0, 0, w, h, new int[n], 0, w);

    >
    >
    > This line of code calls getRBG(), which basically does this (cut and
    > paste from the source code itself):
    >
    > for (int y = startY; y < startY+h; y++, yoff+=scansize) {
    > off = yoff;
    > for (int x = startX; x < startX+w; x++) {
    > rgbArray[off++] = colorModel.getRGB(
    > raster.getDataElements(x,y,data));
    > }
    > }
    >
    > Copying data like this is never going to be fast. You want the
    > internal buffer itself, probably, not a copy, so your operations on
    > the data will be fast.
    >
    > So I think the trick is to use a different constructor, so that you
    > already have access to the internal argb array. There's no way to
    > just get a pointer to it in the BufferedImage API, that I can see.
    >
    > Now: are you loading data from disk? Or are you creating the data
    > wholesale inside your program, as you example seems to imply? The
    > answer is different depending on what you are doing.


    It should be possible to operate on the Raster directly:

    <http://sites.google.com/site/drjohnbmatthews/raster>

    Of course, that still leaves 2000 x 1500 pixels to work on.
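
    A minimal sketch of operating on the raster's data directly, assuming
    the image stores one packed int per pixel (as TYPE_INT_ARGB does) and
    was created with the plain constructor, so that pixel (x, y) sits at
    index y * width + x. The class and method names here are illustrative
    only. It is also worth checking whether taking the array this way
    affects Java 2D's hardware acceleration of the image on your JVM.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferInt;

public class RasterAccess {

    /** Returns the int[] that backs an INT-type image; no pixels are copied. */
    public static int[] backingArray(BufferedImage image) {
        // Assumption: the image uses an int data buffer (TYPE_INT_ARGB,
        // TYPE_INT_RGB, ...); for other image types this cast fails.
        return ((DataBufferInt) image.getRaster().getDataBuffer()).getData();
    }

    /** Swaps the red and blue channels in place, leaving alpha and green alone. */
    public static void swapRedBlue(BufferedImage image) {
        int[] pixels = backingArray(image);
        for (int i = 0; i < pixels.length; i++) {
            int p = pixels[i];
            pixels[i] = (p & 0xFF00FF00)
                    | ((p & 0x00FF0000) >>> 16)
                    | ((p & 0x000000FF) << 16);
        }
    }

    public static void main(String[] args) {
        BufferedImage image =
                new BufferedImage(2000, 1500, BufferedImage.TYPE_INT_ARGB);
        image.setRGB(0, 0, 0xFF102030);

        long start = System.nanoTime();
        swapRedBlue(image);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("In-place pass: " + elapsedMs + " ms, pixel(0,0) = "
                + Integer.toHexString(image.getRGB(0, 0)));
    }
}
```

    The loop still touches all 3,000,000 pixels, but it touches each one
    once and never allocates a second array.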

    --
    John B. Matthews
    trashgod at gmail dot com
    <http://sites.google.com/site/drjohnbmatthews>
    John B. Matthews, Nov 2, 2009
    #5
  6. G.W. Lucas

    markspace Guest

    John B. Matthews wrote:

    >
    > It should be possible to operate on the Raster directly:
    >
    > <http://sites.google.com/site/drjohnbmatthews/raster>
    >
    > Of course, that still leaves leaves 2000 x 1500 pixels work on.



    I came up with the code below as a direct replacement for the OP's
    example. However, I'm not really sure if this is what he wants or not.
    It does run 10x faster than his example.


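    import java.awt.image.BufferedImage;
    import java.awt.image.DataBuffer;
    import java.awt.image.DataBufferInt;
    import java.awt.image.DirectColorModel;
    import java.awt.image.Raster;
    import java.awt.image.SinglePixelPackedSampleModel;
    import java.awt.image.WritableRaster;

    public class Main
    {
        public static void main( String[] args )
        {
            for( int i = 0; i < 5; i++ ) {
                long startTime = System.nanoTime();

                // One int per pixel; the masks select R, G, B and A.
                SinglePixelPackedSampleModel neoSPPSM =
                        new SinglePixelPackedSampleModel(
                        DataBuffer.TYPE_INT, 2000, 1500,
                        new int[]{0xFF0000, 0xFF00, 0xFF, 0xFF000000} );
                // The caller keeps this array; the image writes into it.
                int[] rawARGB = new int[2000 * 1500];
                DataBufferInt neoDBI =
                        new DataBufferInt( rawARGB, 2000 * 1500 );

                WritableRaster neoWR =
                        Raster.createWritableRaster( neoSPPSM, neoDBI, null );
                DirectColorModel dcm = new DirectColorModel(
                        32, 0xFF0000, 0xFF00, 0xFF, 0xFF000000 );
                BufferedImage image2 =
                        new BufferedImage( dcm, neoWR, false, null );
                long endTime = System.nanoTime();
                System.out.println( "Loop time (" + image2.hashCode() +
                        "): " + (endTime - startTime) / 1000000 );
            }
        }
    }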
    markspace, Nov 2, 2009
    #6
  7. G.W. Lucas

    G.W. Lucas Guest

    Thank you. That's truly impressive. Of course, now I have about six
    new things to learn :)

    Looking at your earlier post, I can see where invoking getRGB 3
    million times might result in sub-optimal performance. I must say that
    I'm a little surprised to see something like that in a core API. But
    it explains a lot.

    I modified my test program based on your example, drawing a few
    primitives to the BufferedImage after it was created and inspecting
    the contents of the rawARGB array to see if it changed. It worked
    like a charm. And, as you say, the speed improvement is easily a
    factor of 10.
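
    A rough sketch of that kind of check, not my actual test program: the
    class name, the helper wrap() and the small image size are made up
    for illustration, and the raster set-up is copied from markspace's
    example.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;
import java.awt.image.DataBufferInt;
import java.awt.image.DirectColorModel;
import java.awt.image.Raster;
import java.awt.image.SinglePixelPackedSampleModel;
import java.awt.image.WritableRaster;

public class SharedBufferCheck {

    /** Wraps a caller-owned int[] in an ARGB BufferedImage without copying it. */
    public static BufferedImage wrap(int[] rawARGB, int w, int h) {
        int[] masks = {0xFF0000, 0xFF00, 0xFF, 0xFF000000};
        SinglePixelPackedSampleModel sm =
                new SinglePixelPackedSampleModel(DataBuffer.TYPE_INT, w, h, masks);
        DataBufferInt db = new DataBufferInt(rawARGB, w * h);
        WritableRaster raster = Raster.createWritableRaster(sm, db, null);
        DirectColorModel cm =
                new DirectColorModel(32, 0xFF0000, 0xFF00, 0xFF, 0xFF000000);
        return new BufferedImage(cm, raster, false, null);
    }

    public static void main(String[] args) {
        int w = 200;
        int h = 150;
        int[] rawARGB = new int[w * h];
        BufferedImage image = wrap(rawARGB, w, h);

        // Draw a primitive through the normal Graphics2D path.
        Graphics2D g2d = image.createGraphics();
        g2d.setColor(Color.ORANGE);
        g2d.fillRect(10, 10, 50, 60);
        g2d.dispose();

        // The array was never read back from the image, yet it holds the drawing.
        System.out.println("inside rect:  "
                + Integer.toHexString(rawARGB[20 * w + 20]));
        System.out.println("outside rect: "
                + Integer.toHexString(rawARGB[0]));
    }
}
```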

    I do have a question. From your code, it looks like the trick is to
    supply the BufferedImage constructor with the memory that you want it
    to write to so that you don't have to ask for it later on. That makes
    sense. The thing I was wondering about is if the resulting
    BufferedImage will have the same performance as the ones which I am
    currently creating with the less-advanced constructor. The real core
    function of my application is in the rendering of graphics (using
    Graphics2D) to produce the raw images. The rendering involves a lot of
    graphics primitives, so performance is critical. The image processing
    that I am doing is really just an extra.

    Naturally, I'm going to do my homework (read up on the API elements
    you used and also do a lot of testing) before I commit to an approach,
    but I was wondering whether I should be alert to any special tricks or
    techniques as I do so.

    Thanks again for your well-informed and insightful suggestion!

    Gary

    G.W. Lucas, Nov 3, 2009
    #7
  8. G.W. Lucas

    markspace Guest

    G.W. Lucas wrote:
    > The thing I was wondering about is if the resulting
    > BufferedImage will have the same performance as the ones which I am
    > currently creating with the less-advanced constructor.



    I honestly don't know. I believe it will have the same performance,
    because I looked at the constructors for BufferedImage and related
    classes, and basically just did the same thing as they do.

    However, I haven't tested this yet, so you're going to be the first. If
    you don't see any speed drop right away, then I'm going to guess
    there won't be any, because as I said I'm just doing the same thing that
    Sun's API does.

    Please do report back on your findings if you can. I'm interested if
    this technique is general and will work for other folks.
    markspace, Nov 3, 2009
    #8
  9. G.W. Lucas

    G.W. Lucas Guest

    On Nov 2, 8:02 pm, markspace <> wrote:
    > G.W. Lucas wrote:
    > > The thing I was wondering about is if the resulting
    > > BufferedImage will have the same performance as the ones which I am
    > > currently creating with the less-advanced constructor. [snip]

    >
    > Please do report back on your findings if you can.  I'm interested if
    > this technique is general and will work for other folks.


    Fortunately, I've always been interested in performance considerations
    (who isn't?), so I've got plenty of timing instrumentation
    already in place for my application. I was
    able to run the program a dozen or so times alternating between
    the different constructors. Although the sample set was pretty small,
    I'm pretty sure there is no statistically significant difference
    in the time required to build images (for a while there, I actually
    thought the WritableRaster constructor might have an edge, but
    that was just a result of a noisy test environment).
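
    For anyone who wants to repeat the comparison, here is a simplified
    sketch of alternating between the two constructors. It is not the
    instrumentation from my application: the class name, the rectangle
    workload and the round count are placeholders, and wall-clock timing
    with System.nanoTime() is noisy (JIT warm-up, garbage collection), so
    treat the numbers as a rough indication only.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;
import java.awt.image.DataBufferInt;
import java.awt.image.DirectColorModel;
import java.awt.image.Raster;
import java.awt.image.SinglePixelPackedSampleModel;
import java.awt.image.WritableRaster;

public class ConstructorTiming {

    /** The simple constructor: the image allocates its own buffer. */
    public static BufferedImage plain(int w, int h) {
        return new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
    }

    /** The WritableRaster constructor: the buffer is supplied by the caller. */
    public static BufferedImage wrapped(int w, int h) {
        int[] masks = {0xFF0000, 0xFF00, 0xFF, 0xFF000000};
        WritableRaster raster = Raster.createWritableRaster(
                new SinglePixelPackedSampleModel(DataBuffer.TYPE_INT, w, h, masks),
                new DataBufferInt(new int[w * h], w * h),
                null);
        return new BufferedImage(
                new DirectColorModel(32, 0xFF0000, 0xFF00, 0xFF, 0xFF000000),
                raster, false, null);
    }

    /** Draws a batch of rectangles and returns the elapsed time in nanoseconds. */
    public static long renderNanos(BufferedImage image, int rects) {
        int w = image.getWidth();
        int h = image.getHeight();
        long start = System.nanoTime();
        Graphics2D g = image.createGraphics();
        g.setColor(Color.ORANGE);
        for (int i = 0; i < rects; i++) {
            g.drawRect(i % w, i % h, w / 4, h / 4);
        }
        g.dispose();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Alternate between the two kinds of image so that warm-up and
        // background noise affect both roughly equally.
        for (int round = 0; round < 12; round++) {
            long a = renderNanos(plain(2000, 1500), 10_000);
            long b = renderNanos(wrapped(2000, 1500), 10_000);
            System.out.println("round " + round + ": plain " + a / 1_000_000
                    + " ms, wrapped " + b / 1_000_000 + " ms");
        }
    }
}
```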

    Thanks again for all your help.

    g.
    G.W. Lucas, Nov 3, 2009
    #9
  10. In article
    <>,
    "G.W. Lucas" <> wrote:

    > On Nov 2, 8:02 pm, markspace <> wrote:
    > > G.W. Lucas wrote:
    > > > The thing I was wondering about is if the resulting BufferedImage
    > > > will have the same performance as the ones which I am currently
    > > > creating with the less-advanced constructor. [snip]

    > >
    > > Please do report back on your findings if you can.  I'm interested
    > > if this technique is general and will work for other folks.

    >
    > Fortunately, I've always been interested in performance
    > considerations (who isn't?), so I've got plenty of timing
    > instrumentation already in place for my application. I was able to
    > run the program a dozen or so times alternating between the different
    > constructors. Although the sample set was pretty small, I'm pretty
    > sure there are no statistically significant difference in the time
    > required to build images (for a while there, I actually thought the
    > WritableRaster constructor might have an edge, but that was just a
    > result of a noisy test environment).
    >
    >Thanks again for all your help.


    Interesting; thank you for reporting your results. I missed the import
    of markspace's suggestion: it was predicated on the reasonable
    assumption that array access would be faster than method invocation. My
    experience with mixing WritableRaster and Graphics2D operations is that
    the latter tend to dominate and the former are fast enough. Still, it's
    interesting to see how to construct a BufferedImage with one's own data
    buffer.

    --
    John B. Matthews
    trashgod at gmail dot com
    <http://sites.google.com/site/drjohnbmatthews>
    John B. Matthews, Nov 3, 2009
    #10
  11. G.W. Lucas

    markspace Guest

    John B. Matthews wrote:
    > In article
    > <>,
    > "G.W. Lucas" <> wrote:
    >
    >> Although the sample set was pretty small, I'm pretty
    >> sure there are no statistically significant difference in the time
    >> required to build images



    I'd also like to thank you for reporting these results. It's good to
    know that there weren't some other gotchas waiting for you further down
    the road.

    > Interesting; thank you for reporting your results. I missed the import
    > of markspace's suggestion: it was predicated on the reasonable
    > assumption that array access would be faster than method invocation.



    Not quite. I was basing my prediction on the idea that buffer copies
    should be avoided. In other words, the method call I focused on didn't
    just use setters and getters, it copied the entire 3,000,000-word pixel
    buffer before handing the copy to the caller. This also means that the
    3,000,000-word buffer gets allocated twice: once when the BufferedImage
    is created, and once again when the OP had to allocate a second buffer.

    Both of these operations are avoided in the code I posted. The buffer
    is allocated once, and never copied.
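
    To make the copy concrete, here is a small sketch (the class and
    method names are made up for illustration). The array returned by
    getRGB() is detached from the image, so an edit to it is invisible
    until a second full copy writes it back with setRGB():

```java
import java.awt.image.BufferedImage;

public class CopyVersusShare {

    /** Copies every pixel into a newly allocated int[], as the OP's code does. */
    public static int[] copyOut(BufferedImage image) {
        int w = image.getWidth();
        int h = image.getHeight();
        return image.getRGB(0, 0, w, h, new int[w * h], 0, w);
    }

    /** Copies an edited array back into the image: a second pass over all pixels. */
    public static void copyIn(BufferedImage image, int[] rgb) {
        int w = image.getWidth();
        int h = image.getHeight();
        image.setRGB(0, 0, w, h, rgb, 0, w);
    }

    public static void main(String[] args) {
        BufferedImage image = new BufferedImage(4, 4, BufferedImage.TYPE_INT_ARGB);

        int[] rgb = copyOut(image);
        rgb[0] = 0xFFFF0000; // edit the copy only

        System.out.println("before copyIn: "
                + Integer.toHexString(image.getRGB(0, 0)));
        copyIn(image, rgb);
        System.out.println("after copyIn:  "
                + Integer.toHexString(image.getRGB(0, 0)));
    }
}
```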

    Modern CPUs impose a high penalty for large numbers of consecutive reads
    and writes. In typical algorithm analysis, all reads and writes are
    assumed to have the same cost. However, this doesn't work for long
    strings of consecutive reads and writes, because they can't be cached,
    and therefore don't benefit from locality of access the way that other
    reads and writes do.

    In other words, most memory accesses have a lower amortized access time,
    due to locality and the CPU cache. Memory that is accessed precisely once
    doesn't benefit from this amortized time, and has to pay the full cost
    of a cache miss, main-memory access, and then the eventual main-memory
    write. A long string of such accesses is particularly painful.

    If that's all too much to remember, then just remember that "buffer
    copies are bad" and go with that.

    > My
    > experience with mixing WritableRaster and Graphics2D operations is that
    > the latter tend to dominate and the former are fast enough. Still, 'it's
    > interesting to see how to construct a BufferedImage with one's own data
    > buffer.



    It would be interesting to compare the method calls in a writable raster
    with direct buffer access, like the OP was doing. I suspect they are
    similar and the performance hit using method calls vs. direct access
    isn't as large as most folks would believe. However, the OP wanted a
    raw array, so that's what I gave him.
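
    A sketch of how that comparison might be set up, assuming a
    TYPE_INT_ARGB image and a trivial per-pixel operation (inverting the
    color channels). The class and method names are invented for the
    example, and I haven't measured it, so no results are claimed here.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferInt;
import java.awt.image.WritableRaster;

public class RasterVersusArray {

    /** Inverts RGB through WritableRaster method calls, one row at a time. */
    public static void invertViaRaster(BufferedImage image) {
        WritableRaster raster = image.getRaster();
        int w = raster.getWidth();
        int[] row = new int[w];
        for (int y = 0; y < raster.getHeight(); y++) {
            // For an int-packed raster each data element is one whole pixel.
            raster.getDataElements(0, y, w, 1, row);
            for (int x = 0; x < w; x++) {
                row[x] ^= 0x00FFFFFF;
            }
            raster.setDataElements(0, y, w, 1, row);
        }
    }

    /** Inverts RGB by writing straight into the backing int[]. */
    public static void invertViaArray(BufferedImage image) {
        int[] pixels =
                ((DataBufferInt) image.getRaster().getDataBuffer()).getData();
        for (int i = 0; i < pixels.length; i++) {
            pixels[i] ^= 0x00FFFFFF;
        }
    }

    public static void main(String[] args) {
        BufferedImage image =
                new BufferedImage(2000, 1500, BufferedImage.TYPE_INT_ARGB);
        // Repeat a few times so the later rounds run JIT-compiled code.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            invertViaRaster(image);
            long t1 = System.nanoTime();
            invertViaArray(image);
            long t2 = System.nanoTime();
            System.out.println("raster calls: " + (t1 - t0) / 1_000_000
                    + " ms, direct array: " + (t2 - t1) / 1_000_000 + " ms");
        }
    }
}
```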
    markspace, Nov 3, 2009
    #11
  12. G.W. Lucas

    Daniel Pitts Guest

    G.W. Lucas wrote:
    > On Nov 2, 8:02 pm, markspace <> wrote:
    >> G.W. Lucas wrote:
    >>> The thing I was wondering about is if the resulting
    >>> BufferedImage will have the same performance as the ones which I am
    >>> currently creating with the less-advanced constructor. [snip]

    >> Please do report back on your findings if you can. I'm interested if
    >> this technique is general and will work for other folks.

    >
    > Fortunately, I've always been interested in performance considerations
    > (who isn't?), so I've got plenty of timing instrumentation
    > already in place for my application. I was
    > able to run the program a dozen or so times alternating between
    > the different constructors. Although the sample set was pretty small,
    > I'm pretty sure there are no statistically significant difference
    > in the time required to build images (for a while there, I actually
    > thought the WritableRaster constructor might have an edge, but
    > that was just a result of a noisy test environment).
    >
    > Thanks again for all your help.
    >
    > g.

    I hope you *also* use a profiler :)
    Daniel Pitts, Nov 3, 2009
    #12
  13. G.W. Lucas

    Guest

    On Nov 2, 2:55 pm, "G.W. Lucas" <> wrote:
    > I have an application that performs some specialized image-processing
    > which is simple, but not supported by the JAI or other Java API.


    And I take it it's not supported either by 3D hardware-accelerated
    APIs? Because if it's supported by such hardware, then you can
    bypass the entire software stack and the gains are expressed in
    orders of magnitude (even when called from Java :)
    , Nov 3, 2009
    #13