calculate an average with every data in an array

Discussion in 'Perl Misc' started by Erol Akman, Mar 4, 2009.

  1. Erol Akman

    Erol Akman Guest

    Hi,

    I have following (crazy) task and was hoping to solve it via perl, but
    I need your help, please.

    I have an array of values and need to calculate every possible average
    and sort it by the average data:

    @array = qw(4 6 2 7 1 8 3 5 9 10);


    This should be printed out:
    4+2 = 6, Average =3
    4+6+2 = 12, Average=4
    4+6 = 10, Average=5
    ....
    10+4+6+2+3 = 25, Average=5

    also nice to have, but not must have: I need to see which scalar perl
    used to sum up, maybe "4+6+2 = 12, Average=4, scalar 0,1,3 were summed
    up"

    I know, its nuts, but I really need this.

    Background: I have lots and lots of data like 23.33, 12.25, 84.56 and
    so on and hundreds of values like 54, 32, 45. I don't know which and
    how many of these values where summed up to get the data above. My
    idea was to calculate every possible average to find out which values
    where summed up and averaged out.

    Is this possible? Can you help me?

    Best regards
    Erol
    Erol Akman, Mar 4, 2009
    #1
    1. Advertising

  2. Erol Akman <> writes:

    > Background: I have lots and lots of data like 23.33, 12.25, 84.56 and
    > so on and hundreds of values like 54, 32, 45. I don't know which and
    > how many of these values where summed up to get the data above. My
    > idea was to calculate every possible average to find out which values
    > where summed up and averaged out.
    >
    > Is this possible? Can you help me?


    If I understand the problem, you have, say, 100 integers a_1, a_2,
    ...., a_100 (your 54, 32, 45,... for instance). Some mean person has
    taken some of these integers, computed the average and told you this
    average (eg. 23.33). Your job is to find the integers which were used
    to arrive at this average. Then you are given another average (like
    12.25) and must do the same, etc. Is this correct? (If not, ignore the
    rest.)

    You do realize that the number of subsets of your "hundreds of values"
    is larger than the expected lifetime of the solar system in
    femtoseconds? So even calculating the average of one subset every
    1E-15 seconds wouldn't complete the task in time for it to be
    useful. So in theory your approach is possible (the problem is finite,
    and an exhaustive search on all subsets would certainly do the job),
    but in reality it is only applicable for very small sets of integers.

    Maybe a quantum computer could do it, but I don't know if perl is
    ported to that platfrom yet :)

    --
    Rasmus Villemoes
    <http://rasmusvillemoes.dk/>
    Rasmus Villemoes, Mar 4, 2009
    #2
    1. Advertising

  3. Erol Akman

    Edwards Guest

    On 2009-03-04, Erol Akman <> wrote:
    > I have an array of values and need to calculate every possible average
    > and sort it by the average data:
    >
    > @array = qw(4 6 2 7 1 8 3 5 9 10);
    >
    >
    > This should be printed out:
    > 4+2 = 6, Average =3
    > 4+6+2 = 12, Average=4
    > 4+6 = 10, Average=5
    > ...
    > 10+4+6+2+3 = 25, Average=5
    >
    > also nice to have, but not must have: I need to see which scalar perl
    > used to sum up, maybe "4+6+2 = 12, Average=4, scalar 0,1,3 were summed
    > up"
    >
    > I know, its nuts, but I really need this.
    >
    > Background: I have lots and lots of data like 23.33, 12.25, 84.56 and
    > so on and hundreds of values like 54, 32, 45. I don't know which and
    > how many of these values where summed up to get the data above. My
    > idea was to calculate every possible average to find out which values
    > where summed up and averaged out.
    >
    > Is this possible? Can you help me?


    Coding-wise this seems fairly straightforward, but I see a couple of
    potential problems given the background you provide above. The first,
    which you've probably already realized, is that since there are
    different ways of summing to a given number, the resulting averages
    aren't going to be unique. In your example problem, for instance, I
    get several dozen ways (75, actually) of obtaining an average of "6";
    and while "unique" solutions aren't exactly rare (10 is only the
    average of 10; 9.5 of "9 10", 8.6667 of "7 9 10" etc), they are
    definitely in the minority.

    The other is that since you have to look at combinations of elements
    where the number of elements in the combination is also arbitrary, the
    search space is over what I believe is called the "power set" of your
    set of values. In your example, there are 10 values, so the number of
    possible lists made from sets of these values is 2**10... well,
    (2**10)-1, since of course we don't count the empty set. That ran
    pretty quickly on my machine, but you mention "hundreds of values"
    above and 2**100 is what, 10**30 or so? Which, by itself, would
    merely be "nuts" as you say; but in light of the first problem, I am
    betting that most of the time a given average is going to have an
    incredibly huge number of "lists" associated with it, and the averages
    with only a single associated list (ie, a unique list of numbers out
    of your set of possible values that has the given average) are going
    to be quite rare.

    Sorry to sound so negative, but I guess I am trying to say that you
    might want to try to figure out how long this would take (eg benchmark
    starting with a small subset of your data, ten or so as in your
    example problem, then gradually increase the subset size) relative to
    how badly you really need it...

    --
    Darrin
    Edwards, Mar 4, 2009
    #3
  4. Erol Akman wrote:
    > Hi,
    >
    > I have following (crazy) task and was hoping to solve it via perl, but
    > I need your help, please.
    >
    > I have an array of values and need to calculate every possible average
    > and sort it by the average data:
    >
    > @array = qw(4 6 2 7 1 8 3 5 9 10);
    >
    >
    > This should be printed out:
    > 4+2 = 6, Average =3
    > 4+6+2 = 12, Average=4
    > 4+6 = 10, Average=5
    > ...
    > 10+4+6+2+3 = 25, Average=5
    >
    > also nice to have, but not must have: I need to see which scalar perl
    > used to sum up, maybe "4+6+2 = 12, Average=4, scalar 0,1,3 were summed
    > up"
    >
    > I know, its nuts, but I really need this.
    >
    > Background: I have lots and lots of data like 23.33, 12.25, 84.56 and
    > so on and hundreds of values like 54, 32, 45. I don't know which and
    > how many of these values where summed up to get the data above. My
    > idea was to calculate every possible average to find out which values
    > where summed up and averaged out.
    >
    > Is this possible? Can you help me?
    >
    > Best regards
    > Erol


    possible? yes.
    practical? NO.
    useful? I doubt it.
    complete in your lifetime? doubtful.
    getting the CORRECT permutation? throw a dart.
    Michael Austin, Mar 4, 2009
    #4
  5. Erol Akman wrote:

    > Background: I have lots and lots of data like 23.33, 12.25, 84.56 and
    > so on and hundreds of values like 54, 32, 45. I don't know which and
    > how many of these values where summed up to get the data above. My
    > idea was to calculate every possible average to find out which values
    > where summed up and averaged out.
    >
    > Is this possible? Can you help me?


    At first sight this does not look possible because the brute force
    attack, simply trying all the possibilities, requires something like
    10^30 iterations. However it does remind me of a digital filter
    optimisation problem that two of us did solve over one long summer with
    a VAX and a Transputer array (remember those?). The final version could
    find optimal integer settings for a very large filter in about 4 hours
    of solid bash on the array.

    The trick was to make a first guess, then decide which direction would
    take us closer to what we wanted, effectively deciding which way was
    uphill. Having found a hilltop we then tried making a grid of jumps
    around the neighbourhood to see if we could find a higher hill, then
    climb that, and so on until we reached a point where every change made
    the filter worse.

    It seems to me that every entry in your raw data is either above or
    below the value you want, so you need to find a selection of above and
    below values which pull the average in both directions with equal
    weight. Some sort of "put this in and take that out" iteration might
    well converge in a reasonable time.
    Robert Billing, Mar 5, 2009
    #5
  6. Robert Billing <> writes:

    > Erol Akman wrote:
    >
    >> Background: I have lots and lots of data like 23.33, 12.25, 84.56 and
    >> so on and hundreds of values like 54, 32, 45. I don't know which and

    >
    > It seems to me that every entry in your raw data is either above or
    > below the value you want, so you need to find a selection of above and
    > below values which pull the average in both directions with equal
    > weight. Some sort of "put this in and take that out" iteration might
    > well converge in a reasonable time.


    Yes, the OP definitely must use some fancier algorithm than brute
    force to achieve his goal. An obvious observation is that the average
    of some integers is a rational number. 23.33 looks suspicially like it
    is really 23 + 1/3 = 70/3, so one could limit the search to (3
    integers summing to 70) or (6 integers summing to 140) or (9 integers
    summing to 210) or ... I don't know if there are standard algorithms
    to search for solutions for each problem in ().

    This approach of course assumes he knows the floating point data with
    sufficient precision to make an educated guess at the true rational
    representation (is 84.56 really 84 + 5/9, etc?).

    Also, it is still unclear if he wants all subsets with a given average
    or just one.

    --
    Rasmus Villemoes
    <http://rasmusvillemoes.dk/>
    Rasmus Villemoes, Mar 5, 2009
    #6
  7. Erol Akman

    Dr.Ruud Guest

    Erol Akman wrote:

    > I have following (crazy) task and was hoping to solve it via perl, but
    > I need your help, please.
    >
    > I have an array of values and need to calculate every possible average
    > and sort it by the average data:
    >
    > @array = qw(4 6 2 7 1 8 3 5 9 10);
    >
    >
    > This should be printed out:
    > 4+2 = 6, Average =3
    > 4+6+2 = 12, Average=4
    > 4+6 = 10, Average=5
    > ...
    > 10+4+6+2+3 = 25, Average=5
    >
    > also nice to have, but not must have: I need to see which scalar perl
    > used to sum up, maybe "4+6+2 = 12, Average=4, scalar 0,1,3 were summed
    > up"
    >
    > I know, its nuts, but I really need this.
    >
    > Background: I have lots and lots of data like 23.33, 12.25, 84.56 and
    > so on and hundreds of values like 54, 32, 45. I don't know which and
    > how many of these values where summed up to get the data above. My
    > idea was to calculate every possible average to find out which values
    > where summed up and averaged out.
    >
    > Is this possible? Can you help me?


    It seems to me that you have a goal value, like when finding out which
    invoices relate to a single payment.
    --
    Ruud
    Dr.Ruud, Mar 7, 2009
    #7
  8. Erol Akman

    Erol Akman Guest

    Thank you very much for your thoughts and efforts. It made me realize
    how nuts this task really is ;-)

    Taken into account that I could make a few educated guesses as Rasmus
    Villemoes suggested, I would not have "hundreds of values" but 20 or
    even less.

    You've said, that you already ran a calculation on your computer with
    10 values (?), how did you do that? Could you provide me your code? I
    just know the basics:

    #! /usr/bin/perl
    # calculate an average

    use strict;
    use warnings;

    my @array = qw(1 2 3 4 5 6 7 8 9 10 11);
    my $sum = 0;
    $sum += $_ for(@array);
    my $avg = $sum / scalar(@array);
    print "average: ",$avg,"\n";
    print "Values that have been used:","\n";
    print join("\n",@array), "\n";
    print "Number of values: ", scalar(@array), "\n";

    Thanks in advance!
    Erol
    Erol Akman, Mar 9, 2009
    #8
  9. Erol Akman

    Edwards Guest

    Not sure if this was directed at me; if not, sorry, please ignore.

    On 2009-03-09, Erol Akman <> wrote:
    > Thank you very much for your thoughts and efforts. It made me realize
    > how nuts this task really is ;-)
    >
    > Taken into account that I could make a few educated guesses as Rasmus
    > Villemoes suggested, I would not have "hundreds of values" but 20 or
    > even less.
    >
    > You've said, that you already ran a calculation on your computer with
    > 10 values (?), how did you do that? Could you provide me your code?


    Well, I was a bit afraid to do that because it's so ugly; keep in mind
    this was just a quick hack that I threw together based only on your
    initial description of the problem. The output isn't really formatted
    the way you requested, but it should be pretty self-explanatory (the
    number in parentheses after a given average is how many lists of
    values had that average). I've reformatted it a bit (it was originally
    one long line at the command line) and added some comments...

    trurl:~>perl -e 'my @data = qw(4 6 2 7 1 8 3 5 9 10);
    my %avg_list;
    for my $iter (1..2**@data-1) {
    # There must be a better way to get the bit flags from each number,
    # but I couldn't figure out how to get unpack to do what I wanted...
    my @chosen = split "", sprintf("%0" . @data . "b", $iter);
    my @indices = grep($chosen[$_], (0..@data-1));
    # The eval was just a silly way to do the average in one line; you're
    # probably better off doing it the straightforward way (as you posted)
    my $this_avg = (eval join("+", @data[@indices]))/@indices;
    push @{$avg_list{$this_avg}}, [@data[@indices]];
    }

    for my $this_avg (sort {$a <=> $b} keys (%avg_list)) {
    print "$this_avg (", scalar(@{$avg_list{$this_avg}}), "):\n";
    for my $list (@{$avg_list{$this_avg}}) {
    print "\t@{$list}\n";
    }
    }' | more


    Please consider this code "not really tested in any serious way".

    --
    Darrin
    Edwards, Mar 9, 2009
    #9
  10. Erol Akman

    Erol Akman Guest

    Hi Darrin,

    thank you very much for your reply and your script. It works very well
    and made realize, that my problem is hard to solve.

    Thx

    -----
    Erol
    Erol Akman, Mar 25, 2009
    #10
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. gaga
    Replies:
    4
    Views:
    403
    James Kanze
    Apr 6, 2007
  2. =?Utf-8?B?SXJ3YW5zeWFo?=
    Replies:
    4
    Views:
    2,449
    =?Utf-8?B?SXJ3YW5zeWFo?=
    Oct 30, 2007
  3. Mike D

    calculating average from an array

    Mike D, Nov 19, 2004, in forum: ASP General
    Replies:
    3
    Views:
    119
    dlbjr
    Nov 19, 2004
  4. ed
    Replies:
    11
    Views:
    327
    John W. Krahn
    Oct 4, 2005
  5. ela
    Replies:
    8
    Views:
    196
Loading...

Share This Page