splitting an array.

pereges

I've an array :

{100, 20, -45, -345, -2, 120, 64, 99, 20, 15, 0, 1, 25}

I want to split it into two different arrays such that every number <=
50 goes into left array and every number > 50 goes into right array.
I've done some coding but I feel this code is very inefficient:

void split_array(int *a, int size_of_array)
{
    /* a is the pointer to the array which is going to be partitioned */
    int i, left_size = 0, right_size = 0;

    int *b, *c; /* pointers to new arrays */

    for (i = 0; i < size_of_array; i++)
    {
        if (a[i] <= 50)
            left_size++;
        if (a[i] > 50)
            right_size++;
    }

    b = calloc(sizeof(*b) * left_size);
    c = calloc(sizeof(*c) * right_size);

    if (b == NULL || c == NULL)
    {
        fprintf(stderr, "memory allocation failure: %s %d %s", __FILE__,
                __LINE__, __func__);
        exit(EXIT_FAILURE);
    }

    left_size = right_size = 0;

    for (i = 0; i < size_of_array; i++)
    {
        if (a[i] <= 50)
        {
            b[left_size] = a[i];
            left_size++;
        }
        if (a[i] > 50)
        {
            c[right_size] = a[i];
            right_size++;
        }
    }

    exit(EXIT_SUCCESS);

}

I'm really not comfortable with running similar for loops two times.
Is this bad programming ?
 
Keith Thompson

pereges said:
I've an array :

{100, 20, -45, -345, -2, 120, 64, 99, 20, 15, 0, 1, 25}

I want to split it into two different arrays such that every number <=
50 goes into left array and every number > 50 goes into right array.
I've done some coding but I feel this code is very inefficient:

Apart from a few minor points, it looks ok to me. Traversing the
array twice isn't a big deal. If you had a requirement to traverse it
only once, you could realloc() the new arrays as they fill up, or make
each one bigger than it needs to be initially (the same size as your
original array) and then perhaps shrink them with realloc. I can't
think of a method that's significantly more efficient than what you've
written.
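A minimal sketch of that single-pass variant, assuming a hypothetical helper named split_once and a caller-supplied pivot (neither appears in the original code). Both result arrays start out as large as the input and are shrunk with realloc() once the real lengths are known:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: splits a[0..len-1] around pivot in a single pass.
   On success stores two independently free()able arrays in *left and
   *right, their lengths in *left_len and *right_len, and returns 0.
   Returns -1 on allocation failure. */
int split_once(const int *a, size_t len, int pivot,
               int **left, size_t *left_len,
               int **right, size_t *right_len)
{
    size_t i, nl = 0, nr = 0;
    int *b, *c, *tmp;

    assert(a != NULL || len == 0);

    /* Over-allocate: neither result can hold more than len elements.
       Request at least one element to avoid a zero-size allocation. */
    b = malloc((len ? len : 1) * sizeof *b);
    c = malloc((len ? len : 1) * sizeof *c);
    if (b == NULL || c == NULL) {
        free(b);
        free(c);
        return -1;
    }

    for (i = 0; i < len; i++) {
        if (a[i] <= pivot)
            b[nl++] = a[i];
        else
            c[nr++] = a[i];
    }

    /* Shrink to fit; if realloc fails the larger block is still valid. */
    tmp = realloc(b, (nl ? nl : 1) * sizeof *b);
    if (tmp != NULL)
        b = tmp;
    tmp = realloc(c, (nr ? nr : 1) * sizeof *c);
    if (tmp != NULL)
        c = tmp;

    *left = b;
    *left_len = nl;
    *right = c;
    *right_len = nr;
    return 0;
}
```

This trades one traversal for temporarily holding twice the memory of the input, so whether it is a win depends on the array sizes involved.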
void split_array(int *a, int size_of_array)

I'd probably make size_of_array a size_t, but int is ok if the array
length can't exceed INT_MAX. But "size" usually means the size in
bytes, as in "sizeof"; I'd call the second parameter length_of_array,
or perhaps just len.

{
/* a is the pointer to the array which is going to be partitioned */
int i, left_size =0, right_size = 0;

int *b, *c; /* pointers to new arrays */

for(i =0; i< size_of_array; i++)
{
if(a[i] <= 50)
left_size++;
if(a[i] > 50)
right_size++;


The second test is unnecessary.
}

b = calloc(sizeof(*b) * left_size);
c = calloc(sizeof(*c) * right_size);

calloc() takes two arguments. Either you left out the required
"#include <stdlib.h>", or you're posting code that doesn't match what
you actually compiled. Copy-and-paste, don't re-type.

calloc() zeros the allocated memory, which is a waste of time here.

if( b == NULL || c == NULL)
{
fprintf(stderr, "memory allocation failure: %s %d %s", __FILE__,
__LINE__, __func__);
exit(EXIT_FAILURE);
}

left_size = right_size = 0;

for(i =0; i< size_of_array; i++)
{
if(a[i] <= 50)
{
b[left_size] = a[i];
left_size++;
}
if(a[i] > 50)


Again, the second test is unnecessary.
{
c[right_size] = a[i];
right_size++;
}
}

exit(EXIT_SUCCESS);

}

I'm really not comfortable with running similar for loops two times.
Is this bad programming ?


It's not uncommon to have to traverse an array once to gather
information needed for a second traversal.

I've thought of a tricky approach that requires 1.5 traversals on
average, but it's not significantly better than yours. Create a
single target array of the same size as ``a''. Copy values <= 50
starting at the beginning, and values > 50 starting at the end. When
you're done, reverse the second portion of the array. (This isn't
needed if you don't care about the order; you didn't say whether
that's a requirement or not.)
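
The two-ended fill described above might look roughly like this. It is only a sketch: the name split_two_ended and the pivot parameter are assumptions, not part of the original code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc()ed array of len elements holding
   the values <= pivot first and the values > pivot after them, each group
   in its original order. *left_len receives the number of values <= pivot.
   Returns NULL on allocation failure. */
int *split_two_ended(const int *a, size_t len, int pivot, size_t *left_len)
{
    size_t i, lo = 0, hi = len;
    int *out;

    assert(a != NULL || len == 0);

    out = malloc((len ? len : 1) * sizeof *out);
    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        if (a[i] <= pivot)
            out[lo++] = a[i];   /* grows up from the front */
        else
            out[--hi] = a[i];   /* grows down from the back */
    }

    /* The tail was written back to front; reverse it to restore the
       original order. Skip this loop if the order does not matter. */
    for (i = lo, hi = len; i + 1 < hi; i++) {
        int t = out[i];
        out[i] = out[--hi];
        out[hi] = t;
    }

    *left_len = lo;
    return out;
}
```

The right-hand part then starts at out + *left_len, but it cannot be passed to free() on its own; only the pointer returned by the function can.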

If the order doesn't matter, you can do the split in the original
array (assuming you don't care about keeping the original data; again,
you didn't say whether that's a requirement). This is basically a
single partition step of the Quicksort algorithm.
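
For illustration, one way to write such a partition step is the store-index scheme below. The function name is hypothetical, and this is just one of several common partition schemes, not necessarily the one any particular Quicksort uses:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: rearranges a[0..len-1] in place so that every
   value <= pivot precedes every value > pivot. Returns the number of
   values <= pivot, which is also the index where the right part starts.
   The left part keeps its original order; the right part may not. */
size_t partition_in_place(int *a, size_t len, int pivot)
{
    size_t i, store = 0;

    assert(a != NULL || len == 0);

    for (i = 0; i < len; i++) {
        if (a[i] <= pivot) {
            /* Swap the small value down to the end of the left part. */
            int t = a[i];
            a[i] = a[store];
            a[store] = t;
            store++;
        }
    }
    return store;
}
```

No allocation is needed, at the cost of overwriting the original arrangement of the data.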
 
Keith Thompson

pete said:
/* BEGIN new.c output */

original array:
100 20 -45 -345 -2 120 64 99 20 15 0 1 25

left array:
-345 -45 -2 0 1 15 20 20 25

right array:
64 99 100 120

The OP's solution had the left and right arrays with the elements in
the same order as in the original array. Yours doesn't do this. We
don't know whether that's a requirement or not.
/* END new.c output */



/* BEGIN new.c */
[...]
#include <stdio.h>
#include <stdlib.h>
[...]
int main(void)
{
size_t count;
int array[] = {100,20,-45,-345,-2,120,64,99,20,15,0,1,25}; [...]
qsort(array, sizeof array / sizeof *array, sizeof *array, compar); [...]
/* END new.c */

The OP's solution was O(N). Yours is most likely O(N log N), assuming
a typical qsort() implementation.
 
MJ_India

I've an array :

{100, 20, -45, -345, -2, 120, 64, 99, 20, 15, 0, 1, 25}

I want to split it into two different arrays such that every number <=
50 goes into left array and every number > 50 goes into right array.
I've done some coding but I feel this code is very inefficient:
. . .
I'm really not comfortable with running similar for loops two times.
Is this bad programming ?

1. Take startIndex = 0, endIndex = sizeof(array) - 1;
2. Perform steps 2.1 and 2.2 in loop while startIndex < endIndex
2.1 if array[startIndex] <= 50, startIndex++
2.2 else exchange(array + startIndex, array + (endIndex++))
3. left = array, right = array + endIndex

Implementation is left to you. Moreover this is more an algorithm
question than a C question. I am afraid it was asked in wrong forum.
 
Willem

pereges wrote:
) I've an array :
)
) {100, 20, -45, -345, -2, 120, 64, 99, 20, 15, 0, 1, 25}
)
) I want to split it into two different arrays such that every number <=
) 50 goes into left array and every number > 50 goes into right array.
) I've done some coding but I feel this code is very inefficient:
)
<snip code>: first calculate sizes of arrays, allocate, and copy.
)
) I'm really not comfortable with running similar for loops two times.
) Is this bad programming ?

As others have pointed out, it is only slightly inefficient.

But it rather depends on the exact requirements of the function.
For example:
- Is it required that the function return two malloc()ed pointers ?
(That is, two pointers that can both be free()d ?)
- Is it required that the order of the items is retained ?
- How large are the lists in practice, and is it important that no
memory is wasted ?

SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
pereges

Well, I had given an example of a more general case where the array is
split using some random value. But what if I want to split the array
using the median of the array list in such way that all the elements
<= median go into a left array and all elements > median go into the
right array. It is necessary to create two different arrays in my
function and for that I need to know the max size for each array. To
find the median, you obviously need to sort it.
 
Ben Bacarisse

1. Take startIndex = 0, endIndex = sizeof(array) - 1;
2. Perform steps 2.1 and 2.2 in loop while startIndex < endIndex
2.1 if array[startIndex] <= 50, startIndex++
2.2 else exchange(array + startIndex, array + (endIndex++))

Presumably you intended to write endIndex--.
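
With that correction, the quoted steps could be sketched in C as follows. Two further adjustments here are assumptions that go beyond the quoted steps: the element count is passed in as length (sizeof(array) would give a size in bytes), and endIndex is kept one past the last unexamined element so that the final element is tested too. The names split_in_place and limit are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Swaps the two ints pointed to by x and y. */
static void exchange(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

/* Hypothetical helper: moves every value > limit to the back of the
   array and returns the index at which that right part starts. */
size_t split_in_place(int *array, size_t length, int limit)
{
    size_t startIndex = 0;
    size_t endIndex = length;   /* one past the last unexamined element */

    assert(array != NULL || length == 0);

    while (startIndex < endIndex) {
        if (array[startIndex] <= limit)
            startIndex++;                                      /* step 2.1 */
        else
            exchange(array + startIndex, array + --endIndex);  /* step 2.2 */
    }
    return endIndex;   /* step 3: left = array, right = array + endIndex */
}
```

Neither part keeps its original order with this scheme, and the input array is modified.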
 
Ben Bacarisse

Keith Thompson said:
Richard Heathfield said:
Keith Thompson said:

if(a[i] <= 50)
left_size++;
if(a[i] > 50)
right_size++;

The second test is unnecessary.


s/is unnecessary/can be replaced by else/

<snip>


Right.


I prefer your correction because the whole second test *is*
unnecessary. The OP needs to write 'size_of_array - left_size' in the
allocation but that is all. At first reading I thought that was what
you intended.
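
A sketch of the two-pass version with those points applied: one test per element, the right-hand size derived by subtraction, and malloc() in place of calloc(). The function name, the pivot parameter and the out-parameters are assumptions made for the example; the original function took only the array and its length.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: two-pass split of a[0..len-1] around pivot.
   Returns 0 on success and -1 on allocation failure. */
int split_two_pass(const int *a, size_t len, int pivot,
                   int **left, size_t *left_len,
                   int **right, size_t *right_len)
{
    size_t i, left_size = 0, right_size, li = 0, ri = 0;
    int *b, *c;

    assert(a != NULL || len == 0);

    /* First pass: count only the left side. */
    for (i = 0; i < len; i++)
        if (a[i] <= pivot)
            left_size++;
    right_size = len - left_size;   /* no second test needed */

    /* Request at least one element to avoid a zero-size allocation. */
    b = malloc((left_size ? left_size : 1) * sizeof *b);
    c = malloc((right_size ? right_size : 1) * sizeof *c);
    if (b == NULL || c == NULL) {
        free(b);
        free(c);
        return -1;
    }

    /* Second pass: copy, using else instead of a second comparison. */
    for (i = 0; i < len; i++) {
        if (a[i] <= pivot)
            b[li++] = a[i];
        else
            c[ri++] = a[i];
    }

    *left = b;
    *left_len = left_size;
    *right = c;
    *right_len = right_size;
    return 0;
}
```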
 
Willem

pereges wrote:
) Well, I had given an example of a more general case where the array is
) split using some random value. But what if I want to split the array
) using the median of the array list in such way that all the elements
)<= median go into a left array and all elements > median go into the
) right array. It is necessary to create two different arrays in my
) function and for that I need to know the max size for each array. To
) find the median, you obviously need to sort it.

- If you want to split on the median, then you know the size of the two
arrays beforehand.

- Finding the median can be done in O(N) time theoretically but that
has a lot of overhead.

- If you want to split on some given value and you have to have two
malloc()ed pointers that can be freed, you have no choice but to do
two passes.
However: If all you need are two pointers to memory with the resulting
two arrays, but it is not needed for the second array to be free()able,
then you can malloc() one array which will hold the two results, and
fill it from both ends, as suggested elsethread.
In other words: 'It is necessary to create two different arrays' is not
a good enough description of the requirements.

What is it actually that you are trying to do ? What is the function for ?


SaSW, Willem
 
Ben Bacarisse

Willem said:
pereges wrote:
) ... It is necessary to create two different arrays in my
) function and for that I need to know the max size for each array.
- If you want to split on some given value and you have to have two
malloc()ed pointers that can be freed, you have no choice but to do
two passes.

A bit extra: "and you want the arrays to be as small as possible". If
two independently free-able arrays are required, they could both be
allocated the same size as the original.

What is it actually that you are trying to do ? What is the
function for ?

Seconded.
 
Richard Tobin

To find the median, you obviously need to sort it.

You can find the median using a quicksort-like algorithm in which you
only bother to "sort" the partition containing the median (e.g. if
you initially split 20 items into 8 and 12 you know it's in the 12).
This is O(N).

-- Richard
 
James Dow Allen

... what if I want to split the array
using the median of the array list ... To
find the median, you obviously need to sort it.

There is an *expected-time* O(N) median algorithm
that will be just what you want. The overhead
work done by the median finder will be precisely
the array splitting you want to do anyway!
The algorithm is simply quicksort except that
you needn't actually do the subsorts.
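
A rough sketch of that idea follows. The function name select_kth and the choice of the middle element as pivot are assumptions made for the example; with this pivot choice the running time is linear on typical input but can degrade to quadratic on unlucky input.

```c
#include <assert.h>
#include <stddef.h>

/* Swaps the two ints pointed to by x and y. */
static void swap_int(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

/* Hypothetical helper: rearranges a[0..len-1] so that a[k] holds the
   value it would have in sorted order, everything before index k is
   <= a[k] and everything after it is >= a[k]. Returns a[k]. */
int select_kth(int *a, size_t len, size_t k)
{
    size_t lo = 0, hi = len;   /* a[k] is known to lie in [lo, hi) */

    assert(a != NULL && k < len);

    while (hi - lo > 1) {
        size_t i, store = lo;
        size_t mid = lo + (hi - lo) / 2;
        int pivot;

        /* Park a middle element at the end and use it as the pivot. */
        swap_int(&a[mid], &a[hi - 1]);
        pivot = a[hi - 1];

        /* Partition: values < pivot are collected at the front. */
        for (i = lo; i < hi - 1; i++)
            if (a[i] < pivot)
                swap_int(&a[i], &a[store++]);
        swap_int(&a[store], &a[hi - 1]);   /* pivot is now in its final place */

        if (k == store)
            break;
        else if (k < store)
            hi = store;        /* continue in the left partition only */
        else
            lo = store + 1;    /* continue in the right partition only */
    }
    return a[k];
}
```

Calling select_kth(array, len, len / 2) leaves the array already split around the median, so no separate splitting pass is needed. If the median value occurs more than once, some copies of it may end up to the right of index k.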

Hope this helps.
James Dow Allen
 
Bartc

MJ_India said:
I've an array :

{100, 20, -45, -345, -2, 120, 64, 99, 20, 15, 0, 1, 25}

I want to split it into two different arrays such that every number
<= 50 goes into left array and every number > 50 goes into right
array. I've done some coding but I feel this code is very
inefficient: . . .
I'm really not comfortable with running similar for loops two times.
Is this bad programming ?

1. Take startIndex = 0, endIndex = sizeof(array) - 1;
2. Perform steps 2.1 and 2.2 in loop while startIndex < endIndex
2.1 if array[startIndex] <= 50, startIndex++
2.2 else exchange(array + startIndex, array + (endIndex++))
3. left = array, right = array + endIndex

Implementation is left to you. Moreover this is more an algorithm
question than a C question. I am afraid it was asked in wrong forum.

As specified it seemed the OP wanted two new arrays from the original data.

In this case, because the sizes are initially unknown, then it does become a
C question: how to allocate arrays without doing an extra pass through the
data. These little details are important in C, and once sorted the solution
is likely to be one of the fastest.

Otherwise the solution is trivial in some languages. Here's one I've just
tried:

data:=(100,20, -45, -345, -2, 120, 64, 99, 20, 15, 0, 1, 25)

a:=b:=()
forall x in data do
(x<50 | a | b) &:=x
end

println "Left =",a
println "Right =",b

And many will do it in one line I'm sure ("K" probably in a single
expression).

(I also ignored the new array requirement in my other post, but I'd also
missed the fact that everyone else also made use of sorting. Never mind..)
 
