Computer Science 101 Question - extracting subsets of a collection

R

Roger Varley

Hi

Can someone point me to an efficient algorythm/method that will allow
me to efficiently extract subsets of a collection such that if I have a
collection of n objects, I need to extract all combinations of (n-1)
objects, (n-2) objects etc. So if I have a series of 5 numbers
1,2,3,4,5 then I need all combinations of 4 numbers (1234, 1235 etc),
all combinations of 3 numbers (123, 234, 345 etc)

Regards
Roger
 
J

jan V

Can someone point me to an efficient algorythm/method that will allow
me to efficiently extract subsets of a collection such that if I have a
collection of n objects, I need to extract all combinations of (n-1)
objects, (n-2) objects etc.

Think recursion.
 
R

Roger Varley

Thanks Jan, but I think I'm going to need a bit more of a nudge in the
right direction than that. I should've also pointed out that I just
used the numbers as an example, I actually have a collection of objects
and I am not concerned about the order in which the objects appear
within the subset - so I don't mind wether I get "123", "321" or "213"
as long as I get one of them, but not the others.

Regards
Roger
 
J

jan V

Thanks Jan, but I think I'm going to need a bit more of a nudge in the
right direction than that.

Analyse the problem for the simplest cases: (a) and (a,b). Then you can
build on that by looking at (a,b,c)... and so on. That should allow you to
write a recursive routine which produces the required combinations. [If you
ask me for any more hints, you'll force me to write it all out in code, and
you don't want that, do you ? ;-) ]
I should've also pointed out that I just
used the numbers as an example, I actually have a collection of objects

That will not make any difference to the algorithm, only its detailed
implementation.
and I am not concerned about the order in which the objects appear
within the subset - so I don't mind wether I get "123", "321" or "213"
as long as I get one of them, but not the others.

Then you'll need to involve the services of one of the Set classes.
 
S

Steve Wampler

...

Then you'll need to involve the services of one of the Set classes.

Hmmm, the Set classes aren't needed unless he's worried about
duplicating the subsets at different parts of the tree (that is,
the subset "23" can be generated from both "123" and "234"...)
It's not clear from the original posting whether that's a problem
or not. Perhaps the original poster can clarify?
 
R

Roger Varley

you guys are taking me to CS109 pretty quickly here :) I don't think
I'll duplicate subsets in the way Steve is talking about. I will always
be starting from the root of n objects and generating the combinations
of (n-1), (n-2) ... from the original n objects. I won't be trying to
generate my (n-2) groups from my (n-1) groups - unless Jan's recursive
method (when I've worked it out) takes me down that route.

Regards & thanks
Roger
 
S

Steve Wampler

you guys are taking me to CS109 pretty quickly here :) I don't think
I'll duplicate subsets in the way Steve is talking about. I will always
be starting from the root of n objects and generating the combinations
of (n-1), (n-2) ... from the original n objects. I won't be trying to
generate my (n-2) groups from my (n-1) groups - unless Jan's recursive
method (when I've worked it out) takes me down that route.

Oh, it will, it will... :)

Unless you do some special bookkeeping (using the Set classes Jan
mentioned is one way), you'll get lots of duplicates with a
recursive solution. Here's an example output, without the special
checking:
------------------------
abcd
abc
ab
a
b
ac
a
c
bc
b
c
abd
ab
a
b
ad
a
d
bd
b
d
acd
ac
a
c
ad
a
d
cd
c
d
bcd
bc
b
c
bd
b
d
cd
c
d
---------------------------

where what I think you want is:

----------------------------
abcd
abc
abd
acd
bcd
ab
ac
bc
ad
bd
cd
a
b
c
d
------------------

Directly generating the 2nd solution is more interesting, which is
why I suspect that's what the assignment is.

This must be near the end of the semester in the 101 class...
 
E

Eric Sosman

jan said:
Think recursion.

Even easier: Think binary numbers. Hint: What happens
to the bits of a binary number as you increment it from 0 to
(2^n)-1?
 
J

jan V

Think recursion.
Even easier: Think binary numbers. Hint: What happens
to the bits of a binary number as you increment it from 0 to
(2^n)-1?

Now I'm stumped. I can easily picture the "animating dance" of bits
switching on and off as you increment from 0 to whatever, but I don't see
the connection with the OP's problem.. can you give us an extra hint?
 
R

Raymond DeCampo

jan said:
Now I'm stumped. I can easily picture the "animating dance" of bits
switching on and off as you increment from 0 to whatever, but I don't see
the connection with the OP's problem.. can you give us an extra hint?

abcd
0001
----
d

abcd
0010
----
c

abcd
0011
----
cd

abcd
0100
----
b

abcd
0101
----
b d

etc.

Ray
 
L

Lasse Reichstein Nielsen

jan V said:
Now I'm stumped. I can easily picture the "animating dance" of bits
switching on and off as you increment from 0 to whatever, but I don't see
the connection with the OP's problem.. can you give us an extra hint?

What if each number represents a subset?

/L
 
J

jan V

Now I'm stumped. I can easily picture the "animating dance" of bits
abcd
0001
----
d

abcd
0010
----
c

abcd
0011

That is *so* neat. Don't you just love efficient *and* elegant solutions
like that. Cool.
 
S

Steve Wampler

That is *so* neat. Don't you just love efficient *and* elegant solutions
like that. Cool.

Hmmm, I agree that's a neat approach, but isn't there a fair amount
of gunk that has to be done to complete the solution? For example,
you have to 'walk the bits' to extract the symbols in each subset,
and all the subsets are 'jumbled' (i.e. the length (n-1),(n-2), etc.
collections are interleaved). If I recall correctly, the
OP needed each collection separately. Granted, that's all just
tedium layered on top of the above elegance, and certainly not
hard.

Another approach is to do a breadth-first traversal of the
solution space (which is a DAG), but I'm pretty sure that
approach is well beyond CS 101...
 
E

Eric Sosman

jan said:
That is *so* neat. Don't you just love efficient *and* elegant solutions
like that. Cool.

For extra credit, give a simple formula for the number
of subsets of a set of N elements. For extra extra credit,
give a plausible reason why the set of all subsets of a
given set is known as its "power set." ;-)

<off-topic>

Georg Cantor showed that the cardinality of a set is
strictly less than the cardinality of its power set. This is
"obvious" for finite sets, but Cantor proved that it's true
for infinite sets as well -- thus establishing the existence
of not just one infinity, but an infinitude of infinities of
different sizes.

</off-topic>

We now resume our regularly scheduled programming.
 
R

Roger Varley

Actually it's not a "class" question. I used the CS101 in the heading
tongue in cheek because I suspected that I was asking a pretty basic
question. I've spent the last <mumble> years programming in a
commercial enviornment using languages that don't permit recursion (RPG
anyone?) and I'm only just coming to the Java party.

Regards
Roger
 
J

jan V

commercial enviornment using languages that don't permit recursion (RPG

Get the sharp wooden stake, and the garlic, QUICK !! RPG in 2005, for
*goodness* sake............. The company you work for must be led by truly
enlightened bosses/managers, eh ? Sheesh.
 
E

Eric Sosman

jan said:
Get the sharp wooden stake, and the garlic, QUICK !! RPG in 2005, for
*goodness* sake............. The company you work for must be led by truly
enlightened bosses/managers, eh ? Sheesh.

That's right: It is a capital offense to fail to stay
abreast of the latest fad.

(If you don't think the computer industry is fashion-
driven, you're not thinking.)
 
J

jan V

That's right: It is a capital offense to fail to stay abreast of the
latest fad.

That's not exactly what I said, is it?
(If you don't think the computer industry is fashion-driven, you're not
thinking.)

I totally agree with you that there's a strong bandwagon force. See my post
hinting at such in relation to XML.

But I definitely don't consider Java itself as a fad. Companies have had
almost an eternity (nearly 10 years) to decide whether Java could bring them
the benefits that the rest of the world using Java are experiencing... I
think any company still using Cobol/RPG/etc.. in this day and age needs to
have their CTO/IT manager fired on the spot.
 
E

Eric Sosman

jan said:
latest fad.

That's not exactly what I said, is it?



thinking.)

I totally agree with you that there's a strong bandwagon force. See my post
hinting at such in relation to XML.

But I definitely don't consider Java itself as a fad. Companies have had
almost an eternity (nearly 10 years) to decide whether Java could bring them
the benefits that the rest of the world using Java are experiencing... I
think any company still using Cobol/RPG/etc.. in this day and age needs to
have their CTO/IT manager fired on the spot.

If you'll pardon my saying so: bullshit.

You've just been promoted to CTO of a company that's been
using COBOL and RPG and Autocoder for the last forty years.
Over that time their programming staff has averaged, say, a
dozen people -- so they've got 480 person-years invested in
their software. (This is by no means a "large" enterprise;
plenty of shops have much larger investments.)

Even though it's a modest investment, let's cut it down
a little more just to be generous. Let's say that over forty
years they've decommissioned and replaced two-thirds of the
software they developed. The investment in the code they're
actually using today is a mere 160 person-years.

So: Your first directive to the staff is "All that stuff
you're doing is obsolete, unfashionable, and objectionable.
I forbid you to do any more work on it. You're all going to
Fad Of The Moment school, and when you come back you'll all
get busy replacing the Bad Stuff with Fad Stuff. Make it so."

So your staff of a dozen drop everything and go away to
school, and they all come back trained to the gills and able
to work five times more productively with Fad than they ever
did with Bad. How long will it take them to rewrite all the
existing applications?

Thirty-two months.

Are you going to spend two and two-thirds years of your
IT budget for no net advance in the application services you
can deliver?

What are you going to do when the CEO calls for a new
IT initiative requiring new applications? Tell him "Sorry,
but we're booked solid through April 2009?"

Who deserves to be fired on the spot: the manager who
makes productive use of a legacy, or the one who pisses
away his inheritance?

Okay, so I've made up a lot of numbers out of thin air
(I've made them up to be as favorable to your position as I
could, by the way.) Still, they're thin-air numbers and not
very convincing. So let's boil it down to one decision you
might need to make as CTO: One of your applications depends
on the dates of the Daylight Savings seasonal time adjustment,
and the U.S. Congress has just passed a law changing the way
the adjustment is made. The clock is ticking, and you can
forecast the exact date on which your application will start
misbehaving if it isn't fixed. One of your programmers says
"No problem, boss: The folks who wrote that RPG program made
it easy to carry out adjustments of this sort; it'll be a
three-line change."

What's your response?
 
D

Dale King

Eric said:
If you'll pardon my saying so: bullshit.

You've just been promoted to CTO of a company that's been
using COBOL and RPG and Autocoder for the last forty years.
Over that time their programming staff has averaged, say, a
dozen people -- so they've got 480 person-years invested in
their software. (This is by no means a "large" enterprise;
plenty of shops have much larger investments.)

Even though it's a modest investment, let's cut it down
a little more just to be generous. Let's say that over forty
years they've decommissioned and replaced two-thirds of the
software they developed. The investment in the code they're
actually using today is a mere 160 person-years.

So: Your first directive to the staff is "All that stuff
you're doing is obsolete, unfashionable, and objectionable.
I forbid you to do any more work on it. You're all going to
Fad Of The Moment school, and when you come back you'll all
get busy replacing the Bad Stuff with Fad Stuff. Make it so."

So your staff of a dozen drop everything and go away to
school, and they all come back trained to the gills and able
to work five times more productively with Fad than they ever
did with Bad. How long will it take them to rewrite all the
existing applications?

Thirty-two months.

Are you going to spend two and two-thirds years of your
IT budget for no net advance in the application services you
can deliver?

What are you going to do when the CEO calls for a new
IT initiative requiring new applications? Tell him "Sorry,
but we're booked solid through April 2009?"

Who deserves to be fired on the spot: the manager who
makes productive use of a legacy, or the one who pisses
away his inheritance?

Okay, so I've made up a lot of numbers out of thin air
(I've made them up to be as favorable to your position as I
could, by the way.) Still, they're thin-air numbers and not
very convincing. So let's boil it down to one decision you
might need to make as CTO: One of your applications depends
on the dates of the Daylight Savings seasonal time adjustment,
and the U.S. Congress has just passed a law changing the way
the adjustment is made. The clock is ticking, and you can
forecast the exact date on which your application will start
misbehaving if it isn't fixed. One of your programmers says
"No problem, boss: The folks who wrote that RPG program made
it easy to carry out adjustments of this sort; it'll be a
three-line change."

I'd say the answer lies somewhere between the two extremes. It is not a
sin if someone is still using Cobol or RPG. Many people have legacy
software to deal with and it takes a lot of resources to convert over.

But anyone that doesn't have a plan in place to move away from that
outdated technology is in for a world of hurt. You can't just bury your
head in the sand and say I've got a lot invested in this. That just
isn't sustainable long term.

What happens when the equipment breaks down. Will you still be able to
get the hardware to run the old software on?

I would venture to say the majority of that programming staff with all
that experience is nearing retirement. Who are you going to get to
replace them? That 480 person year investment is going to plummet very
quikcly. New programmers entering the workforce aren't learning Cobol
and RPG and have no interest in working with 40 year old technology. It
will quickly become impossible to find someone to even make that 3 line
change.

I heard someone talking recently about the same thing in relation to the
space shuttle. The shuttle program has been going for 30 years. Most of
the people who worked on it in the beginning are retiring and the new
people don't have the experience.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,581
Members
45,056
Latest member
GlycogenSupporthealth

Latest Threads

Top