Stratified random sampling with random_shuffle ?

S

steflhermitte

Dear cpp-ians,

I want to apply a stratified sampling on an image. Following simplified
example will explain my problem.

The original image I with nrows and ncols is now a vector V of length
(nrow x ncol) and every element of the vector contians its pixel value.

vector float V [nrow * ncol] ;

This means that:
- the first ncol elements = first row of my image
- the elements between ncol and 2*ncol = the second row of my image
and so on ...

Now I want to select a stratified pixel of this vector and sample every
pixel till all pixels are sampled once. In the image it would
correspond to divide my image into small blocks and take a pixel of
each block iteratively till all pixels are done.

If I random_shuffle my vector and start at the beginning V till the end
I perform a random sampling, but I now want to add a certain order in
it so it corresponds to stratified random sampling. Is there anyone
with ideas on how I should do this?

Thanx in advance and kind regards,
Stef
 
K

Kanenas

Dear cpp-ians,
[...]
The original image I with nrows and ncols is now a vector V of length
(nrow x ncol) and every element of the vector contians its pixel value.

vector float V [nrow * ncol] ;
[...]
Now I want to select a stratified pixel of this vector and sample every
pixel till all pixels are sampled once. In the image it would
correspond to divide my image into small blocks and take a pixel of
each block iteratively till all pixels are done.

If I random_shuffle my vector and start at the beginning V till the end
I perform a random sampling, but I now want to add a certain order in
it so it corresponds to stratified random sampling. Is there anyone
with ideas on how I should do this?
As random_shuffle will completely re-order the vector, the original
neighbor relation will be completely lost. I suggest creating a
"StratifiedVector", whose indexing operator returns a
StratifiedVector::Block, which could represent the sequence of
elements in a block from the original vector (like a gslice_array).

template <typename _Scalar,
typename _Container=std::vector<_Scalar> >
class StratifiedVector {
public:
StratifiedVector(_Container&,
size_t block_width, size_t block_height);
//define class Block, iterator &c.
...
};

You could then apply random_shuffle to each block. Alternatively,
define Block::sample() which returns a random element of a Block; if
you don't want repeats, Block::sample could internally use a random
sequence of indices. Depending on how you implemented
StratifiedVector, every block could use have their own sequence for
Block::sample or share a common one. Shuffling within each block will
produce a random sequence for each block but preserve the
stratification. A random shuffling of a StratifiedVector will produce
a random sequence of blocks and thus also preserve stratification.

You could then have two loops, the outer over block indexes and the
inner over the (random) sequence of blocks. Within the inner loop,
take the pixel within the current block. Note that with this
approach, having a different random sequence of blocks for each inner
loop is as expensive as shuffling V (which is what would be
happening). Example:

StratifiedVector<float> StratV(V, block_width, block_height);
//following shuffles could be done in StratifiedVector's
//constructor. If you provide a Block::sample, shuffling within
//blocks is unnecessary.
for (size_t i=0; i < StratV.size(); ++i) {
random_shuffle(StratV.begin(),
StratV.end());
}
random_shuffle(StratV.begin(), StratV.end());

size_t block_size=block_width*block_height;
for (size_t pixel_i=0; pixel_i < block_size; ++pixel_i) {
/* if you want a different sequence of blocks for each inner loop,
perform random_shuffle here rather than before the outer loop.
Expensive. */
//random_shuffle(StratV.begin(), StratV.end());
for (size_t block_i=0; block_i < StratV.size(); ++block_i)
{
/*Test is for blocks in the last row or column. If block_height
divides nrow and block_width divides ncol, the test can be
skipped.*/
if (pixel_i < StratV[block_i].size()) {
foo(StratV[block_i][pixel_i]);

/*self-sampling version; Block::sample() returns a random
element. If you use Block::sample, remove the above loop
which shuffles each block */
//foo(StratV[block_i].sample());
}
}
}

Alternatively, you could shuffle sequences of indices rather than
blocks or pixels. Producing different random sequences of blocks is
less expensive with this approach, but the extra index vectors are
more space expensive and indirection through them is more time
expensive. Also, each block must use the same random index sequence.
Example:

StratifiedVector<float> StratV(V, block_width, block_height);
vector<size_t> SV_idxes(StratV.size());
for (sizt_t i=0; i<SV_idxes.size(); ++i) {
SV_idxes=i;
}
/* could also shuffle SV_idxes inside the outer 'for' loop */
random_shuffle(SV_idxes.begin(), SV_idxes.end());
vector<size_t> Pixel_idxes(block_width * block_height);
for (sizt_t i=0; i<Pixel_idxes.size(); ++i) {
Pixel_idxes=i;
}
random_shuffle(Pixel_idxes.begin(), Pixel_idxes.end());
for (size_t pixel_i=0; pixel_i < block_size; ++pixel_i) {
/* if you want a different sequence of blocks for each inner loop,
perform random_shuffle here rather than after creating
SV_idxes. */
//random_shuffle(SV_idxes.begin(), SV_idxes.end());
for (size_t block_i=0; block_i < SV_idxes.size(); ++block_i)
{
/*Test is for blocks in the last row or column. If block_height
divides nrow and block_width divides ncol, the test can be
skipped.*/
if (Pixel_idxes[pixel_i] < StratV[SV_idxes[block_i]].size()) {
foo(StratV[SV_idxes[block_i]][Pixel_idxes[pixel_i]]);

/* Self-sampling version. If you use Block::sample(), remove
Pixel_idexes and loop which shuffles it. */
//foo(StratV[block_i].sample());
}
}
}


You could also combine the two approaches by shuffling within each
block and shuffling indices of StratifiedVector. This way, each block
will have a different random sequence and shuffling the sequence of
blocks isn't too expensive.

Note StratV.size() equals:
ceil(static_cast<float>(nrow)/block_height)
*ceil(static_cast<float>(ncol)/block_width);
If you know block_width and block_height will always be divisors of
nrow and ncol, you can use solely integer arithmetic in
StratifiedVector::size().

Kanenas
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,013
Latest member
KatriceSwa

Latest Threads

Top