particle container in Java

bob

What is the best data structure to use for a particle container in
Java?

It needs the following ops to be fast:

insert

delete

iteration

It will be used for fx like explosions, smoke, and fire.
 
Andreas Leitgeb

bob said:
What is the best data structure to use for a particle container in
Java?
It needs the following ops to be fast:
insert
delete
iteration
It will be used for fx like explosions, smoke, and fire.

For "insert" and "delete" it depends by what criteria the
position would be determined.

Maybe LinkedList does it, if deletion and insertion happen
at iterator-positions already obtained.

If you need to "delete by object", then TreeSet might be
worth considering.
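
A rough sketch of the LinkedList case (untested; the Particle class here is
just a stand-in):

import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class ParticleSweep {
    static class Particle {              // placeholder particle type
        float life;
        void update(float dt) { life -= dt; }
        boolean isDead()      { return life <= 0; }
    }

    // one update pass: iterate, update, and delete dead particles at the
    // iterator position, which is O(1) per removal on a LinkedList
    static void sweep(List<Particle> particles, float dt) {
        for (Iterator<Particle> it = particles.iterator(); it.hasNext(); ) {
            Particle p = it.next();
            p.update(dt);
            if (p.isDead()) {
                it.remove();
            }
        }
    }
}

New particles can simply be appended at the tail with add(), which is also O(1).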
 
Roedy Green

What is the best data structure to use for a particle container in
Java?

Possibly a hanging moss structure. See
http://mindprod.com/jgloss/hangingmoss.html

I assume your problem is you have objects larger than a pixel that
move around in some pseudorandom way and you have to keep computing
their new positions and rendering the result?

The advantage of hanging moss is you can find objects quickly within a
region, and you can rapidly find objects close to a given one -- so
can do collision-rebound logic.
--
Roedy Green Canadian Mind Products
http://mindprod.com
Capitalism has spurred the competition that makes CPUs faster and
faster each year, but the focus on money makes software manufacturers
do some peculiar things like deliberately leaving bugs and deficiencies
in the software so they can soak the customers for upgrades later.
Whether software is easy to use, or never loses data, when the company
has a near monopoly, is almost irrelevant to profits, and therefore
ignored. The manufacturer focuses on cheap gimmicks like dancing paper
clips to dazzle naive first-time buyers. The needs of existing
experienced users are almost irrelevant. I see software rental as the
best remedy.
 
BGB

Possibly a hanging moss structure. See
http://mindprod.com/jgloss/hangingmoss.html

I assume your problem is you have objects larger than a pixel that
move around in some pseudorandom way and you have to keep computing
their new positions and rendering the result?

The advantage of hanging moss is you can find objects quickly within a
region, and you can rapidly find objects close to a given one -- so
can do collision-rebound logic.

yep...

in my 3D engine (mostly written in plain C) I use a BSP-tree style
structure.


in this case, it will form an approximate binary tree of all of the
particles (via an estimated centroid and balance normal, which is used
to calculate the current division plane), and will occasionally
re-balance the tree as new particles spawn and die.
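
roughly, the (re)build step in Java-ish terms (just a sketch for this group,
untested; it is simplified to axis-aligned split planes through an estimated
centroid rather than a proper balance normal, and the names/thresholds are
made up):

import java.util.ArrayList;
import java.util.List;

class ParticleBspNode {
    static final int LEAF_MAX = 64;      // made-up leaf size

    List<float[]> particles;             // leaf payload: {x, y, z} triples
    int splitAxis;                       // inner node: axis-aligned split plane
    float splitDist;
    ParticleBspNode front, back;

    // estimate the centroid, split along the axis of greatest spread,
    // and partition the particles into two children
    static ParticleBspNode build(List<float[]> ps) {
        ParticleBspNode n = new ParticleBspNode();
        if (ps.size() <= LEAF_MAX) { n.particles = ps; return n; }

        float[] centroid = new float[3];
        float[] min = { Float.MAX_VALUE, Float.MAX_VALUE, Float.MAX_VALUE };
        float[] max = { -Float.MAX_VALUE, -Float.MAX_VALUE, -Float.MAX_VALUE };
        for (float[] p : ps) {
            for (int i = 0; i < 3; i++) {
                centroid[i] += p[i] / ps.size();
                min[i] = Math.min(min[i], p[i]);
                max[i] = Math.max(max[i], p[i]);
            }
        }
        int axis = 0;
        for (int i = 1; i < 3; i++)
            if (max[i] - min[i] > max[axis] - min[axis]) axis = i;

        List<float[]> fs = new ArrayList<float[]>();
        List<float[]> bs = new ArrayList<float[]>();
        for (float[] p : ps) (p[axis] >= centroid[axis] ? fs : bs).add(p);
        if (fs.isEmpty() || bs.isEmpty()) {   // degenerate split: keep as leaf
            n.particles = ps;
            return n;
        }
        n.splitAxis = axis;
        n.splitDist = centroid[axis];
        n.front = build(fs);
        n.back = build(bs);
        return n;
    }
}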

the tree can also serve to sort the particles (more or less) from
back-to-front, allowing drawing them with Z-buffering disabled and
getting the desired blending effects (if one draws alpha-blended
particles with Z-buffering enabled, they may clip in nasty looking ways,
so best results are to disable Z-buffering).
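
in Java terms, the ordering amounts to something like this (just a rough
sketch, not literally what my engine does, since the BSP traversal gives the
order for free):

import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class BackToFront {
    static class Particle { float x, y, z; }   // placeholder

    // sort so the particles farthest from the eye come first, then draw them
    // in that order with Z-writes disabled
    static void sortBackToFront(List<Particle> ps,
                                final float ex, final float ey, final float ez) {
        Collections.sort(ps, new Comparator<Particle>() {
            public int compare(Particle a, Particle b) {
                return Float.compare(dist2(b, ex, ey, ez), dist2(a, ex, ey, ez));
            }
        });
    }

    static float dist2(Particle p, float ex, float ey, float ez) {
        float dx = p.x - ex, dy = p.y - ey, dz = p.z - ez;
        return dx * dx + dy * dy + dz * dz;
    }
}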

(sadly, particles/models/scene-geometry are not themselves interleaved
and drawn in proper back-to-front ordering, mostly creating issues with
alpha-blended objects/particles/sprites/... interacting badly with windows).


I had previously experimented with adding inter-particle interactions
(attractive and repulsive forces), but this dropped the framerate too
much and caused lots of funky behavior, so generally particles don't
interact (most particle effects neither really need nor benefit from
inter-particle forces).

the issue is mostly the large numbers of particles needed to make things
like fire effects, ... look decent.


I also have some particle effects which use fragment shaders (such as
trying to fake flowing water), but sadly particles+shaders does bad
things to the framerate (so, presently each is drawn as a largish disk).
it looked better with spheres, but this did too much damage to the
framerate.

sadly, making "realistic" flowing water which is more than simply a
visual effect (IOW: has physics, interacts with the player, ...) would
be a somewhat more complex problem.

the more "traditional" way of representing fluids is by using large
brushes/polyhedra representing the fluid (giant cubes of "water" and
similar). however, these can't "flow", and although particle-based water
can flow, it is much more expensive both to simulate and to render. in
this case, fluids would likely exist as semi-solid
frictionless spheres.


or such...
 
Giovanni Azua

Hi,

What is the best data structure to use for a particle container in
Java?

It needs the following ops to be fast:
insert
delete
iteration

It will be used for fx like explosions, smoke, and fire.

Storing the particles is easy, the questions are:

1) how many particles: 10 to the what?
2) particle-particle interactions are O(N^2) so ...

you might want to have a look at the Multipole O(n log n) and Fast Multipole
O(n) methods where you reduce the interaction computation by operating
on Box-Particle or Box-Box interactions rather than always
Particle-Particle.

I implemented this in C++ as a homework assignment a couple of weeks ago for
simulating fluid dynamics. Basically you create a Quad Tree where every node
contains all Particles within its boundaries, then it has 4 child Nodes that
each contain the subset of Particles falling inside them, and so on. You split
recursively until every leaf Node contains at most M Particles. Then to do
the simulation you traverse the Tree in pre- or post-order and find out
whether you can optimize the computation using Box-Box, Box-Particle, or
Particle-Particle interactions. You have to read the algorithm in detail.

Here you find the details and a sample implementation in C++ (HW solution
1):
http://cse-lab.ethz.ch/teaching/classes/mmc2011.html

Regarding insert/delete, each is O(log_4 N): you do a contains check
top-down starting from the root Node ... that's pretty fast I would
say.
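
A very stripped-down Java sketch of the node structure (the homework code is
C++ and does much more; M, MIN_HALF and the {x, y} particle representation
here are made up for illustration):

import java.util.ArrayList;
import java.util.List;

class QuadNode {
    static final int M = 16;             // made-up max particles per leaf
    static final double MIN_HALF = 1e-3; // stop splitting absurdly small cells

    final double cx, cy, half;           // square cell: center and half-extent
    List<double[]> particles = new ArrayList<double[]>();  // {x, y} pairs
    QuadNode[] children;                 // null while this node is a leaf

    QuadNode(double cx, double cy, double half) {
        this.cx = cx; this.cy = cy; this.half = half;
    }

    void insert(double x, double y) {
        if (children != null) {          // internal node: descend
            children[childIndex(x, y)].insert(x, y);
            return;
        }
        particles.add(new double[] { x, y });
        if (particles.size() > M && half > MIN_HALF) split();
    }

    private void split() {               // turn this leaf into an internal node
        children = new QuadNode[4];
        double h = half / 2;
        children[0] = new QuadNode(cx - h, cy - h, h);
        children[1] = new QuadNode(cx + h, cy - h, h);
        children[2] = new QuadNode(cx - h, cy + h, h);
        children[3] = new QuadNode(cx + h, cy + h, h);
        for (double[] p : particles)
            children[childIndex(p[0], p[1])].insert(p[0], p[1]);
        particles = new ArrayList<double[]>();
    }

    private int childIndex(double x, double y) {
        return (x < cx ? 0 : 1) + (y < cy ? 0 : 2);
    }
}

Delete is symmetric: descend to the leaf with the same contains check and
remove the Particle there.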

HTH,
Best regards,
Giovanni
 
Arne Vajhøj

What is the best data structure to use for a particle container in
Java?

It needs the following ops to be fast:

insert

delete

iteration

It will be used for fx like explosions, smoke, and fire.

Some collection from java.util.

ArrayList, LinkedList or one of the others, depending on specifics.
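
For example, if iteration order does not matter, a plain ArrayList where
delete swaps the last element into the hole keeps insert, delete and
iteration all cheap (a sketch only, names made up):

import java.util.ArrayList;

class ParticlePool {
    static class Particle { float x, y, z, life; }   // placeholder

    private final ArrayList<Particle> live = new ArrayList<Particle>();

    void insert(Particle p) {        // amortized O(1): append at the end
        live.add(p);
    }

    void update(float dt) {          // iterate and delete dead ones in one pass
        for (int i = 0; i < live.size(); ) {
            Particle p = live.get(i);
            p.life -= dt;
            if (p.life <= 0) {
                // O(1) delete: move the last particle into this slot
                Particle last = live.remove(live.size() - 1);
                if (i < live.size()) live.set(i, last);
            } else {
                i++;
            }
        }
    }
}

Iteration over the backing array is also cache-friendly, which tends to
matter more than the big-O here.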

Arne
 
BGB

Hi,


Storing the particles is easy, the questions are:

1) how many particles: 10 to the what?

relevant question...

in my case, particles are generally in the 10^3 range (my 3D engine
currently has a hard-coded limit of 10k particles, but I am not sure how
many is "typical").

I am aware that probably around 1000-5000 particles are used for most
particle effects (such as explosions), where one will have maybe 2500
particles for the "fire and sparks", another 1000 or so for smoke
particles, ...

IIRC, flame particles are approx 3 inches, and smoke particles 5 inches.


IIRC, most fire sources generally emit 250 particles per second, with
particles having an approx 5 second lifespan (with a variance of +-3s),
so around 1250 for a typical fire-source (such as a torch).

there may be a number of such torches in a scene (however, a lot of
times many may start sputtering, indicating that likely the particle
limit is being reached).

2) particle-particle interactions are O(N^2) so ...

in a naive case, yes.
with a BSP or similar it is around O(n log2 n).

however, when n=10k, this may still put a dent in the framerate.


it is also like when writing rigid-body physics code:
the vast majority of the time goes not into the actual "physics stuff"
so much as into trying to cull away everything that is not moving and
can't possibly collide.

it is sad in a way:
3D stuff is a good way to show just how little CPU power and RAM one
really has sometimes.


granted, there is a little more fancy-looking stuff one can do when one
doesn't need to try to keep things at 40-60 fps, or when one can do things
with the otherwise absurdly small scenes typical of a tech demo (for
example, being able to simulate a small box of water doesn't really mean
it will scale to a lake).

you might want to have a look at the Multipole O(n log n) and Fast Multipole
O(n) methods where you reduce the interaction computation by operating
on Box-Particle or Box-Box interactions rather than always
Particle-Particle.

I implemented this in C++ as a homework assignment a couple of weeks ago for
simulating fluid dynamics. Basically you create a Quad Tree where every node
contains all Particles within its boundaries, then it has 4 child Nodes that
each contain the subset of Particles falling inside them, and so on. You split
recursively until every leaf Node contains at most M Particles. Then to do
the simulation you traverse the Tree in pre- or post-order and find out
whether you can optimize the computation using Box-Box, Box-Particle, or
Particle-Particle interactions. You have to read the algorithm in detail.

I use a BSP variant personally.

oct-tree would also make sense, but I have seen little to suggest that
an oct-tree is either particularly faster or simpler than a BSP.

IMHO, a quad-tree seems like a bad idea in 3D for managing particles (it
makes sense to separate them on all 3 axes if possible).

Here you find the details and a sample implementation in C++ (HW solution
1):
http://cse-lab.ethz.ch/teaching/classes/mmc2011.html

Regarding insert/delete, each is O(log_4 N): you do a contains check
top-down starting from the root Node ... that's pretty fast I would
say.

yeah, although IME, inserting/deleting is not so much the main time
eater, at least vs possible interactions between particles and/or
rendering them.


if I were doing a fluid, I would likely consider 16 or 32 inch water
particles. I don't think much smaller would be ideal for the framerate
(assuming non-trivial amounts of water).

probably the particles would try to maintain an equilibrium distance,
becoming attractive or repulsive depending on their variation from said
ideal distance.
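
probably something like a spring-style pair force would be the first thing
to try (Java-ish sketch only, constants made up):

class FluidForce {
    static final float REST_DIST = 16.0f;   // made-up equilibrium distance
    static final float STIFFNESS = 0.5f;    // made-up spring constant

    // force on particle a from particle b: repulsive when closer than
    // REST_DIST, attractive when farther away
    static float[] pairForce(float[] a, float[] b) {
        float dx = b[0] - a[0], dy = b[1] - a[1], dz = b[2] - a[2];
        float dist = (float) Math.sqrt(dx * dx + dy * dy + dz * dz);
        if (dist < 1e-6f) return new float[] { 0, 0, 0 };
        float mag = STIFFNESS * (dist - REST_DIST);   // signed magnitude
        float s = mag / dist;
        return new float[] { dx * s, dy * s, dz * s };
    }
}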

also, probably they would need the ability to interact with scene
geometry, ...

but, it could be cool.


I guess the main issue is how many such particles one can reasonably
have in a scene while keeping a good frame-rate (I can't entirely
predict, not having tried doing this).
 
Roedy Green

in my 3D engine (mostly written in plain C) I use a BSP-tree style
structure.

I had never heard of Binary Space Partitioning trees before.
http://en.wikipedia.org/wiki/Binary_space_partitioning

I invented Hanging Moss myself back in the 1970s to optimise the
motions of pen-plotters. The problem was to rapidly find a nearby
thing to draw next. It was astoundingly faster than computing the
distance to everything else and finding the minimum.

It is just a classifying grid. At each square you have a chain of
objects. To find a point near another you chase the chain in the
appropriate square. If you don't find anything you search the next
outer ring of squares.

If you doubly link the chain, you can efficiently delete and insert.
If you singly link, you can do LIFO/FIFO.
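
Roughly, in Java, a sketch of the idea might look like this (names invented,
untested):

import java.util.LinkedList;

class HangingMoss<T> {
    static class Entry<T> {
        final double x, y;
        final T payload;
        Entry(double x, double y, T payload) { this.x = x; this.y = y; this.payload = payload; }
    }

    private final LinkedList<Entry<T>>[] cells;  // one chain per grid square
    private final int cols, rows;
    private final double cellW, cellH;

    @SuppressWarnings("unchecked")
    HangingMoss(double width, double height, int cols, int rows) {
        this.cols = cols;
        this.rows = rows;
        this.cellW = width / cols;
        this.cellH = height / rows;
        cells = new LinkedList[cols * rows];
        for (int i = 0; i < cells.length; i++)
            cells[i] = new LinkedList<Entry<T>>();
    }

    // divide x and y by the grid resolution to pick the chain
    private int index(double x, double y) {
        int cx = Math.min(cols - 1, Math.max(0, (int) (x / cellW)));
        int cy = Math.min(rows - 1, Math.max(0, (int) (y / cellH)));
        return cy * cols + cx;
    }

    void insert(double x, double y, T payload) {
        cells[index(x, y)].add(new Entry<T>(x, y, payload));
    }

    // chase the chain in the home square; if it is empty, search
    // progressively larger rings of squares around it
    T findNear(double x, double y) {
        int cx = (int) (x / cellW), cy = (int) (y / cellH);
        for (int ring = 0; ring < cols + rows; ring++) {
            for (int j = cy - ring; j <= cy + ring; j++) {
                for (int i = cx - ring; i <= cx + ring; i++) {
                    boolean onRing = Math.abs(i - cx) == ring || Math.abs(j - cy) == ring;
                    if (!onRing || i < 0 || j < 0 || i >= cols || j >= rows) continue;
                    LinkedList<Entry<T>> chain = cells[j * cols + i];
                    if (!chain.isEmpty()) return chain.getFirst().payload;
                }
            }
        }
        return null;    // the grid is empty
    }
}

findNear just returns something from the nearest non-empty ring, which is all
the "nearby thing to draw next" case needs.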

--
Roedy Green Canadian Mind Products
http://mindprod.com
 
BGB

I had never heard of Binary Space Partitioning trees before.
http://en.wikipedia.org/wiki/Binary_space_partitioning

I use BSP tree variants for lots of things (rendering, physics,
particles, scene-graph, ...).

early on (before about 2005 or so), I just used them mostly for
geometry, but then later discovered algorithms which allowed updating
and rebuilding them in real-time (or, IOW, a dynamic BSP), which I have
been making use of since (my 3D engine does not use a prebuilt BSP).

there are a few drawbacks (the BSPs are less optimal, thus far my 3D
engine doesn't do inter-brush clipping/CSG, ...), but in general it is a
reasonable tradeoff (not having to pre-build the scene allows a lot more
nifty stuff).

I invented Hanging Moss myself back in the 1970s to optimise the
motions of pen-plotters. The problem was to rapidly find a nearby
thing to draw next. It was astoundingly faster than computing the
distance to everything else and finding the minimum.

yes, it would be...
a BSP is much faster than a linear search as well, FWIW.

another (common) option is oct-trees, but a drawback of oct-trees is
that one either needs a bounding-box for the region, or to decide upon a
maximum region size, before one can build an oct-tree.


FWIW, I did not exist in the 70s (my existence began in the 80s).

It is just a classifying grid. At each square you have a chain of
objects. To find a point near another you chase the chain in the
appropriate square. If you don't find anything you search the next
outer ring of squares.

If you doubly link the chain, you can efficiently delete and insert.
If you singly link, you can do LIFO/FIFO.

so, like a chain-hash?...

I guess it could work.


I guess a downside of BSPs is that they can't really be built
asynchronously and/or scaled to an arbitrary size (at least not with my
current algorithms).

a partial solution (in my 3D engine) may be (or may have been) the
introduction of "regions" (loosely influenced by the regions in
Minecraft), which are currently boxes 16384 units (1.5 inches/unit) on a
side (albeit only 4096 units tall). however,
currently the "region" system exists independently of the BSP (it mostly
manages voxel terrain and similar).

the idea would be to allow an arbitrarily large yet continuous world
(like that in Minecraft) with piecewise loading/unloading as the player
moves around the world.

as-is, there had been some work at trying to integrate voxels with the
pre-existing Quake-style worlds.
 
Roedy Green

so, like a chain-hash?...

there is no hashing in the sense that a HashMap does it. In hanging moss,
you select the grid cell by dividing x and y by the x and y grid
resolution giving two ints to index into an array of chains.

It is a similar concept to a HashMap. HashMap categorizes elements of
its big set into subcategories that it can linearly search. I do the
same thing, except instead of hashing, I divide to home in on the
subcategory.

It has the same problem as your oct-tree in that you need a bounding
box to start.

I have added some comments to the entry at
http://mindprod.com/jgloss/hangingmoss.html


--
Roedy Green Canadian Mind Products
http://mindprod.com
 
Roedy Green

I am aware that probably around 1000-5000 particles are used for most
particle effects (such as explosions), where one will have maybe 2500
particles for the "fire and sparks", another 1000 or so for smoke
particles, ...

are you doing this rendering in "real" time? In other words fast
enough for game rendering, or are you producing movie effects where
you can take days to produce a few seconds of rendering?
--
Roedy Green Canadian Mind Products
http://mindprod.com
 
BGB

are you doing this rendering in "real" time? In other words fast
enough for game rendering, or are you producing movie effects where
you can take days to produce a few seconds of rendering?

real time...

more so, I try where possible to keep things in the 40-60fps range, and
also try to keep it usable on older HW (a laptop of mine from 2003, and
another laptop of mine from 2009 but with a lame Intel GPU, although the
frame-rates are not very good on these).

if I were doing it offline, I could have many more particles than this.


the basic strategy for making a particle explosion in my case is
basically to have lots of fire and smoke particles, which all "explode"
outward from a central point. most of these are small sprites.

most particles are also non-interacting.


for offline rendering, one would likely have much larger numbers of
particles.

this is also why one will limit the max number of particles in a scene,
as otherwise too many particle effects would hurt the framerate (as-is,
if there are too many particle effects, they will start sputtering).
the current limit is 10k particles, but IIRC this limit was set probably
back when I was using a Radeon 9200 or similar. on modern HW, a higher
limit would probably work, but I have no immediate need to change it.
 
Roedy Green

I try where possible to keep things in the 40-60fps range,

Another totally different approach is to fob the work off on the GPU.
I have not coded at low level for a very long time, so I don't know
what modern video cards can do, and how standard the APIs are.

But it seems to me what you are doing is common to many games, and
surely modern video cards have some way of helping out. They have
some skookum processing power. I would think some parallel processing
hardware would really make this sort of thing fly.
--
Roedy Green Canadian Mind Products
http://mindprod.com
 
Giovanni Azua

Hi Roedy,

Another totally different approach is to fob the work off on the GPU.
I have not coded at low level for a very long time, so I don't know
what modern video cards can do, and how standard the APIs are.

But it seems to me what you are doing is common to many games, and
surely modern video cards have some way of helping out. They have
some skookum processing power. I would think some parallel processing
hardware would really make this sort of thing fly.

Absolutely :) when I saw this OP, the first thing that crossed my mind was
"why is he using Java for this ... ?". These are the reasons I would stay
away from Java and rather use C++ to do such a job:

1) Manual optimization for the memory hierarchies e.g. Scalar replacement;
optimizing loops with the level of unrolling that maximizes the cache hit
rate for the underlying system's cache configuration. This needs "manual
analysis and manual doing" since e.g. memory aliasing and function calls
prevent C++ compilers from doing such optimizations automatically. Same
concepts apply to CUDA 3) below.

2) Same reason as 1) to do manual vectorization i.e. SSE, SSE2, SSE3, SSE4
etc e.g.
#include <xmmintrin.h>
float a[4] __attribute__((aligned(16))) = {1.0f, 2.0f, 3.0f, 4.0f}; /* 16-byte aligned (GCC syntax) */
__m128 t = _mm_load_ps(a);   /* load four packed floats */
__m128 b = _mm_mul_ps(t, t); /* multiply element-wise */

3) CUDA nVidia C++ extensions that allow embedding GPU code and invoking it
from the CPU. Here is an example that matches the OP:
http://http.developer.nvidia.com/GPUGems3/gpugems3_ch31.html

It needs a lot of brainwork to get it right, i.e. to be able
to outperform the CPU in numerical methods, e.g. MMM, MVM.
Trying to implement a Fast Multipole with CUDA would be a daunting task
... but it is not impossible:

<http://arxiv.org/pdf/1010.1482>

<http://arxiv.org/abs/1010.1482>

Best regards,
Giovanni
 
BGB

Hi Roedy,

except:
GPGPU requires newer cards (excludes older cards and many laptops and
entry-level PCs with "Intel GMA" GPUs or similar, ...);
doesn't really work as well when the card is already busy doing
"everything else";
it is, at present, a bit of a hassle;
....

more so, it is not so relevant when one's engine is mostly getting
bogged down trying to shove stuff through the video card (in this case,
the raw performance on the CPU is not nearly as important...).

Absolutely :) when I saw this OP, the first thing that crossed my mind was
"why is he using Java for this ... ?". These are the reasons I would stay
away from Java and rather use C++ to do such a job:

1) Manual optimization for the memory hierarchies e.g. Scalar replacement;
optimizing loops with the level of unrolling that maximizes the cache hit
rate for the underlying system's cache configuration. This needs "manual
analysis and manual doing" since e.g. memory aliasing and function calls
prevent C++ compilers from doing such optimizations automatically. Same
concepts apply to CUDA 3) below.

above: probably overkill, FWIW...

though, in my case, pretty much all of my 3D rendering and "core
functionality" is mostly in C (a few things use C++, but these are
elsewhere in the project).

I also use Java and BGBScript (my own language) in a few places, but
this is not anywhere near the 3D renderer (both were used in my case
with the intention of being for "server side" functionality, which
mostly means AI/NPC behavior, world entities/events, and similar).

as-is, much of the "server side" functionality is still in C as well though.


a major downside of Java here was mostly that JNI is not terribly
usable, and that with a naive VM, it tends to spew a lot of garbage (not
so good if one's GC is also not very good).

I was able to make my own language much nicer WRT cross-language
interfacing.

decided to leave out a glob of history here (interactions and
developments in my own project WRT the creation of VMs and relations
between programming languages...).

2) Same reason as 1) to do manual vectorization i.e. SSE, SSE2, SSE3, SSE4
etc e.g.
#include <xmmintrin.h>
float a[4] __attribute__((aligned(16))) = {1.0f, 2.0f, 3.0f, 4.0f}; /* 16-byte aligned (GCC syntax) */
__m128 t = _mm_load_ps(a);   /* load four packed floats */
__m128 b = _mm_mul_ps(t, t); /* multiply element-wise */

I also do this, but mostly wrapped the intrinsics to make them look more
like a nicer (GLSL-style) vector API (with fallback logic for when SSE
isn't enabled as a compiler switch, in which case plain structs and scalar
math are used instead).

sadly, the way it was implemented doesn't currently allow operator
overloading (in C++), so I have to be more like:
vec3 u, v;
float f, g;
u=vec3(1, 2, 3);
v=v3mul(u, u);
f=v3dot(u, v);
g=v3dot(u, v3scale(v, f));
....

but, it works.

it is implemented as a mix of macros and "static inline" functions.


gradually, this is replacing my "older strategy", which was to represent
all vectors directly as arrays of floats, and perform operations by
passing pointers to these float arrays.

the new style can somewhat outperform the old style with SSE enabled,
but is slightly slower without SSE enabled (mostly due to the added
memory-copying in passing/returning structs). either way, it is more
convenient.


3) CUDA nVidia C++ extensions that allow embedding GPU code and invoking it
from the CPU. Here is an example that matches the OP:
http://http.developer.nvidia.com/GPUGems3/gpugems3_ch31.html

It needs a lot of brainwork to get it right, i.e. to be able
to outperform the CPU in numerical methods, e.g. MMM, MVM.
Trying to implement a Fast Multipole with CUDA would be a daunting task
... but it is not impossible:

<http://arxiv.org/pdf/1010.1482>

<http://arxiv.org/abs/1010.1482>

FWIW, there is also OpenCL.

OpenCL is to CUDA sort of like what OpenGL is to DirectX...
 
