# Computer Science 101 Question - extracting subsets of a collection

Discussion in 'Java' started by Roger Varley, Aug 10, 2005.

1. ### Roger VarleyGuest

Hi

Can someone point me to an efficient algorythm/method that will allow
me to efficiently extract subsets of a collection such that if I have a
collection of n objects, I need to extract all combinations of (n-1)
objects, (n-2) objects etc. So if I have a series of 5 numbers
1,2,3,4,5 then I need all combinations of 4 numbers (1234, 1235 etc),
all combinations of 3 numbers (123, 234, 345 etc)

Regards
Roger

Roger Varley, Aug 10, 2005

2. ### jan VGuest

> Can someone point me to an efficient algorythm/method that will allow
> me to efficiently extract subsets of a collection such that if I have a
> collection of n objects, I need to extract all combinations of (n-1)
> objects, (n-2) objects etc.

Think recursion.

jan V, Aug 10, 2005

3. ### Roger VarleyGuest

Thanks Jan, but I think I'm going to need a bit more of a nudge in the
right direction than that. I should've also pointed out that I just
used the numbers as an example, I actually have a collection of objects
and I am not concerned about the order in which the objects appear
within the subset - so I don't mind wether I get "123", "321" or "213"
as long as I get one of them, but not the others.

Regards
Roger

Roger Varley, Aug 10, 2005
4. ### jan VGuest

> Thanks Jan, but I think I'm going to need a bit more of a nudge in the
> right direction than that.

Analyse the problem for the simplest cases: (a) and (a,b). Then you can
build on that by looking at (a,b,c)... and so on. That should allow you to
write a recursive routine which produces the required combinations. [If you
ask me for any more hints, you'll force me to write it all out in code, and
you don't want that, do you ? ;-) ]

> I should've also pointed out that I just
> used the numbers as an example, I actually have a collection of objects

That will not make any difference to the algorithm, only its detailed
implementation.

> and I am not concerned about the order in which the objects appear
> within the subset - so I don't mind wether I get "123", "321" or "213"
> as long as I get one of them, but not the others.

Then you'll need to involve the services of one of the Set classes.

jan V, Aug 10, 2005
5. ### Steve WamplerGuest

On Wed, 10 Aug 2005 15:23:14 +0000, jan V wrote:
>...
>> and I am not concerned about the order in which the objects appear
>> within the subset - so I don't mind wether I get "123", "321" or "213"
>> as long as I get one of them, but not the others.

>
> Then you'll need to involve the services of one of the Set classes.

Hmmm, the Set classes aren't needed unless he's worried about
duplicating the subsets at different parts of the tree (that is,
the subset "23" can be generated from both "123" and "234"...)
It's not clear from the original posting whether that's a problem
or not. Perhaps the original poster can clarify?

Steve Wampler, Aug 10, 2005
6. ### Roger VarleyGuest

you guys are taking me to CS109 pretty quickly here I don't think
I'll duplicate subsets in the way Steve is talking about. I will always
be starting from the root of n objects and generating the combinations
of (n-1), (n-2) ... from the original n objects. I won't be trying to
generate my (n-2) groups from my (n-1) groups - unless Jan's recursive
method (when I've worked it out) takes me down that route.

Regards & thanks
Roger

Roger Varley, Aug 10, 2005
7. ### Steve WamplerGuest

On Wed, 10 Aug 2005 09:16:07 -0700, Roger Varley wrote:

> you guys are taking me to CS109 pretty quickly here I don't think
> I'll duplicate subsets in the way Steve is talking about. I will always
> be starting from the root of n objects and generating the combinations
> of (n-1), (n-2) ... from the original n objects. I won't be trying to
> generate my (n-2) groups from my (n-1) groups - unless Jan's recursive
> method (when I've worked it out) takes me down that route.

Oh, it will, it will...

Unless you do some special bookkeeping (using the Set classes Jan
mentioned is one way), you'll get lots of duplicates with a
recursive solution. Here's an example output, without the special
checking:
------------------------
abcd
abc
ab
a
b
ac
a
c
bc
b
c
abd
ab
a
b
a
d
bd
b
d
acd
ac
a
c
a
d
cd
c
d
bcd
bc
b
c
bd
b
d
cd
c
d
---------------------------

where what I think you want is:

----------------------------
abcd
abc
abd
acd
bcd
ab
ac
bc
bd
cd
a
b
c
d
------------------

Directly generating the 2nd solution is more interesting, which is
why I suspect that's what the assignment is.

This must be near the end of the semester in the 101 class...

Steve Wampler, Aug 10, 2005
8. ### Eric SosmanGuest

jan V wrote:
>>Can someone point me to an efficient algorythm/method that will allow
>>me to efficiently extract subsets of a collection such that if I have a
>>collection of n objects, I need to extract all combinations of (n-1)
>>objects, (n-2) objects etc.

>
>
> Think recursion.

Even easier: Think binary numbers. Hint: What happens
to the bits of a binary number as you increment it from 0 to
(2^n)-1?

--

Eric Sosman, Aug 10, 2005
9. ### jan VGuest

> > Think recursion.
>
> Even easier: Think binary numbers. Hint: What happens
> to the bits of a binary number as you increment it from 0 to
> (2^n)-1?

Now I'm stumped. I can easily picture the "animating dance" of bits
switching on and off as you increment from 0 to whatever, but I don't see
the connection with the OP's problem.. can you give us an extra hint?

jan V, Aug 10, 2005
10. ### Raymond DeCampoGuest

jan V wrote:
>>>Think recursion.

>>
>> Even easier: Think binary numbers. Hint: What happens
>>to the bits of a binary number as you increment it from 0 to
>>(2^n)-1?

>
>
> Now I'm stumped. I can easily picture the "animating dance" of bits
> switching on and off as you increment from 0 to whatever, but I don't see
> the connection with the OP's problem.. can you give us an extra hint?
>
>

abcd
0001
----
d

abcd
0010
----
c

abcd
0011
----
cd

abcd
0100
----
b

abcd
0101
----
b d

etc.

Ray

--
XML is the programmer's duct tape.

Raymond DeCampo, Aug 10, 2005
11. ### Lasse Reichstein NielsenGuest

Re: Computer Science 101 Question - extracting subsets of acollection

"jan V" <> writes:

> Now I'm stumped. I can easily picture the "animating dance" of bits
> switching on and off as you increment from 0 to whatever, but I don't see
> the connection with the OP's problem.. can you give us an extra hint?

What if each number represents a subset?

/L
--
Lasse Reichstein Nielsen -
DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
'Faith without judgement merely degrades the spirit divine.'

Lasse Reichstein Nielsen, Aug 10, 2005
12. ### jan VGuest

> > Now I'm stumped. I can easily picture the "animating dance" of bits
> > switching on and off as you increment from 0 to whatever, but I don't

see
> > the connection with the OP's problem.. can you give us an extra hint?

>
> abcd
> 0001
> ----
> d
>
> abcd
> 0010
> ----
> c
>
> abcd
> 0011
> ----
> cd

That is *so* neat. Don't you just love efficient *and* elegant solutions
like that. Cool.

jan V, Aug 10, 2005
13. ### Steve WamplerGuest

On Wed, 10 Aug 2005 22:02:30 +0000, jan V wrote:
>>...
>> abcd
>> 0010
>> ----
>> c
>>
>> abcd
>> 0011
>> ----
>> cd

>
> That is *so* neat. Don't you just love efficient *and* elegant solutions
> like that. Cool.

Hmmm, I agree that's a neat approach, but isn't there a fair amount
of gunk that has to be done to complete the solution? For example,
you have to 'walk the bits' to extract the symbols in each subset,
and all the subsets are 'jumbled' (i.e. the length (n-1),(n-2), etc.
collections are interleaved). If I recall correctly, the
OP needed each collection separately. Granted, that's all just
tedium layered on top of the above elegance, and certainly not
hard.

Another approach is to do a breadth-first traversal of the
solution space (which is a DAG), but I'm pretty sure that
approach is well beyond CS 101...

Steve Wampler, Aug 10, 2005
14. ### Eric SosmanGuest

jan V wrote:
>>>Now I'm stumped. I can easily picture the "animating dance" of bits
>>>switching on and off as you increment from 0 to whatever, but I don't

>
> see
>
>>>the connection with the OP's problem.. can you give us an extra hint?

>>
>>abcd
>>0001
>>----
>> d
>>
>>abcd
>>0010
>>----
>> c
>>
>>abcd
>>0011
>>----
>> cd

>
>
> That is *so* neat. Don't you just love efficient *and* elegant solutions
> like that. Cool.

For extra credit, give a simple formula for the number
of subsets of a set of N elements. For extra extra credit,
give a plausible reason why the set of all subsets of a
given set is known as its "power set." ;-)

<off-topic>

Georg Cantor showed that the cardinality of a set is
strictly less than the cardinality of its power set. This is
"obvious" for finite sets, but Cantor proved that it's true
for infinite sets as well -- thus establishing the existence
of not just one infinity, but an infinitude of infinities of
different sizes.

</off-topic>

We now resume our regularly scheduled programming.

--

Eric Sosman, Aug 10, 2005
15. ### Roger VarleyGuest

Actually it's not a "class" question. I used the CS101 in the heading
tongue in cheek because I suspected that I was asking a pretty basic
question. I've spent the last <mumble> years programming in a
commercial enviornment using languages that don't permit recursion (RPG
anyone?) and I'm only just coming to the Java party.

Regards
Roger

Roger Varley, Aug 11, 2005
16. ### jan VGuest

> commercial enviornment using languages that don't permit recursion (RPG

Get the sharp wooden stake, and the garlic, QUICK !! RPG in 2005, for
*goodness* sake............. The company you work for must be led by truly
enlightened bosses/managers, eh ? Sheesh.

jan V, Aug 11, 2005
17. ### Eric SosmanGuest

jan V wrote:
>>commercial enviornment using languages that don't permit recursion (RPG

>
>
> Get the sharp wooden stake, and the garlic, QUICK !! RPG in 2005, for
> *goodness* sake............. The company you work for must be led by truly
> enlightened bosses/managers, eh ? Sheesh.

That's right: It is a capital offense to fail to stay

(If you don't think the computer industry is fashion-
driven, you're not thinking.)

--

Eric Sosman, Aug 11, 2005
18. ### jan VGuest

> That's right: It is a capital offense to fail to stay abreast of the

That's not exactly what I said, is it?

> (If you don't think the computer industry is fashion-driven, you're not

thinking.)

I totally agree with you that there's a strong bandwagon force. See my post
hinting at such in relation to XML.

But I definitely don't consider Java itself as a fad. Companies have had
almost an eternity (nearly 10 years) to decide whether Java could bring them
the benefits that the rest of the world using Java are experiencing... I
think any company still using Cobol/RPG/etc.. in this day and age needs to
have their CTO/IT manager fired on the spot.

jan V, Aug 11, 2005
19. ### Eric SosmanGuest

[OT] Re: Computer Science 101 Question - extracting subsets of acollection

jan V wrote:
>> That's right: It is a capital offense to fail to stay abreast of the

>
>
> That's not exactly what I said, is it?
>
>
>> (If you don't think the computer industry is fashion-driven, you're not

>
> thinking.)
>
> I totally agree with you that there's a strong bandwagon force. See my post
> hinting at such in relation to XML.
>
> But I definitely don't consider Java itself as a fad. Companies have had
> almost an eternity (nearly 10 years) to decide whether Java could bring them
> the benefits that the rest of the world using Java are experiencing... I
> think any company still using Cobol/RPG/etc.. in this day and age needs to
> have their CTO/IT manager fired on the spot.

If you'll pardon my saying so: bullshit.

You've just been promoted to CTO of a company that's been
using COBOL and RPG and Autocoder for the last forty years.
Over that time their programming staff has averaged, say, a
dozen people -- so they've got 480 person-years invested in
their software. (This is by no means a "large" enterprise;
plenty of shops have much larger investments.)

Even though it's a modest investment, let's cut it down
a little more just to be generous. Let's say that over forty
years they've decommissioned and replaced two-thirds of the
software they developed. The investment in the code they're
actually using today is a mere 160 person-years.

So: Your first directive to the staff is "All that stuff
you're doing is obsolete, unfashionable, and objectionable.
I forbid you to do any more work on it. You're all going to
Fad Of The Moment school, and when you come back you'll all
get busy replacing the Bad Stuff with Fad Stuff. Make it so."

So your staff of a dozen drop everything and go away to
school, and they all come back trained to the gills and able
to work five times more productively with Fad than they ever
did with Bad. How long will it take them to rewrite all the
existing applications?

Thirty-two months.

Are you going to spend two and two-thirds years of your
IT budget for no net advance in the application services you
can deliver?

What are you going to do when the CEO calls for a new
IT initiative requiring new applications? Tell him "Sorry,
but we're booked solid through April 2009?"

Who deserves to be fired on the spot: the manager who
makes productive use of a legacy, or the one who pisses
away his inheritance?

Okay, so I've made up a lot of numbers out of thin air
(I've made them up to be as favorable to your position as I
could, by the way.) Still, they're thin-air numbers and not
very convincing. So let's boil it down to one decision you
might need to make as CTO: One of your applications depends
on the dates of the Daylight Savings seasonal time adjustment,
and the U.S. Congress has just passed a law changing the way
forecast the exact date on which your application will start
misbehaving if it isn't fixed. One of your programmers says
"No problem, boss: The folks who wrote that RPG program made
it easy to carry out adjustments of this sort; it'll be a
three-line change."

--

Eric Sosman, Aug 11, 2005
20. ### Dale KingGuest

Re: [OT] Re: Computer Science 101 Question - extracting subsets ofa collection

Eric Sosman wrote:
>
> jan V wrote:
>
>>> That's right: It is a capital offense to fail to stay abreast of the

>>
>>
>>That's not exactly what I said, is it?
>>
>>
>>
>>> (If you don't think the computer industry is fashion-driven, you're not

>>
>>thinking.)
>>
>>I totally agree with you that there's a strong bandwagon force. See my post
>>hinting at such in relation to XML.
>>
>>But I definitely don't consider Java itself as a fad. Companies have had
>>almost an eternity (nearly 10 years) to decide whether Java could bring them
>>the benefits that the rest of the world using Java are experiencing... I
>>think any company still using Cobol/RPG/etc.. in this day and age needs to
>>have their CTO/IT manager fired on the spot.

>
>
> If you'll pardon my saying so: bullshit.
>
> You've just been promoted to CTO of a company that's been
> using COBOL and RPG and Autocoder for the last forty years.
> Over that time their programming staff has averaged, say, a
> dozen people -- so they've got 480 person-years invested in
> their software. (This is by no means a "large" enterprise;
> plenty of shops have much larger investments.)
>
> Even though it's a modest investment, let's cut it down
> a little more just to be generous. Let's say that over forty
> years they've decommissioned and replaced two-thirds of the
> software they developed. The investment in the code they're
> actually using today is a mere 160 person-years.
>
> So: Your first directive to the staff is "All that stuff
> you're doing is obsolete, unfashionable, and objectionable.
> I forbid you to do any more work on it. You're all going to
> Fad Of The Moment school, and when you come back you'll all
> get busy replacing the Bad Stuff with Fad Stuff. Make it so."
>
> So your staff of a dozen drop everything and go away to
> school, and they all come back trained to the gills and able
> to work five times more productively with Fad than they ever
> did with Bad. How long will it take them to rewrite all the
> existing applications?
>
> Thirty-two months.
>
> Are you going to spend two and two-thirds years of your
> IT budget for no net advance in the application services you
> can deliver?
>
> What are you going to do when the CEO calls for a new
> IT initiative requiring new applications? Tell him "Sorry,
> but we're booked solid through April 2009?"
>
> Who deserves to be fired on the spot: the manager who
> makes productive use of a legacy, or the one who pisses
> away his inheritance?
>
> Okay, so I've made up a lot of numbers out of thin air
> (I've made them up to be as favorable to your position as I
> could, by the way.) Still, they're thin-air numbers and not
> very convincing. So let's boil it down to one decision you
> might need to make as CTO: One of your applications depends
> on the dates of the Daylight Savings seasonal time adjustment,
> and the U.S. Congress has just passed a law changing the way
> the adjustment is made. The clock is ticking, and you can
> forecast the exact date on which your application will start
> misbehaving if it isn't fixed. One of your programmers says
> "No problem, boss: The folks who wrote that RPG program made
> it easy to carry out adjustments of this sort; it'll be a
> three-line change."

I'd say the answer lies somewhere between the two extremes. It is not a
sin if someone is still using Cobol or RPG. Many people have legacy
software to deal with and it takes a lot of resources to convert over.

But anyone that doesn't have a plan in place to move away from that
outdated technology is in for a world of hurt. You can't just bury your
head in the sand and say I've got a lot invested in this. That just
isn't sustainable long term.

What happens when the equipment breaks down. Will you still be able to
get the hardware to run the old software on?

I would venture to say the majority of that programming staff with all
that experience is nearing retirement. Who are you going to get to
replace them? That 480 person year investment is going to plummet very
quikcly. New programmers entering the workforce aren't learning Cobol
and RPG and have no interest in working with 40 year old technology. It
will quickly become impossible to find someone to even make that 3 line
change.

I heard someone talking recently about the same thing in relation to the
space shuttle. The shuttle program has been going for 30 years. Most of
the people who worked on it in the beginning are retiring and the new
people don't have the experience.

--
Dale King

Dale King, Aug 14, 2005