Software Design
 
Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
User Name:
Password:
Remember me
Go Back   Dev Shed ForumsProgramming Languages - MoreSoftware Design

Reply
Add This Thread To:
  Del.icio.us   Digg   Google   Spurl   Blink   Furl   Simpy   Y! MyWeb 
Thread Tools Search this Thread Rate Thread Display Modes
 
Unread Dev Shed Forums Sponsor:
  #1  
Old April 10th, 2003, 06:09 PM
banton banton is offline
Junior Member
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Apr 2003
Location: California
Posts: 6 banton User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
Challenging algorithm problem.

Hi all,
I'll try to describe the problem with my poor english
I have N sets of consecutive values, for example:

Set 1 = { 1,2,3,4,5 }
Set 2 = { 4,5,6 }
Set 3 = { 3,4,5,6,7,8,9 }
Set 4 = { 0,1,2,3 }

I need to construct all the possible sets of N values, each of which is taken from each of the N sets, for example:

{ 1,4,3,0 } { 1,4,3,1 } { 1,4,3,2 } { 1,4,3,3 }
{ 1,4,4,0 } { 1,4,4,1 } { 1,4,4,2 ] { 1,4,4,3 }

and so on...

There would be 5x3x7x4=420 possible sets, but:
two sets are equivalent if they contain the same values (as definition of SET), for example:

{ 3,4,4,0 } is equivalent to { 4,4,3,0 } and one of these can be "discarded"

What's the best algorithm to build all the possible sets (without duplicate sets)?

The average case will be N=20 and set size of 10.

Challenging huh?

Antonio

Last edited by banton : April 15th, 2003 at 01:59 PM.

Reply With Quote
  #2  
Old April 10th, 2003, 06:40 PM
infamous41md's Avatar
infamous41md infamous41md is offline
not a fan of fascism (n00b)
Dev Shed Frequenter (2500 - 2999 posts)
 
Join Date: Feb 2003
Location: ct
Posts: 2,756 infamous41md User rank is Sergeant (500 - 2000 Reputation Level)infamous41md User rank is Sergeant (500 - 2000 Reputation Level)infamous41md User rank is Sergeant (500 - 2000 Reputation Level)infamous41md User rank is Sergeant (500 - 2000 Reputation Level)infamous41md User rank is Sergeant (500 - 2000 Reputation Level) 
Time spent in forums: 2 Days 11 h 4 m 29 sec
Reputation Power: 26
im thinking here.... and maybe a method to use would be to build a set, then compare it to the sets built, if there is an identical one, discard it. surely there are other ways that may be more efficient tho... are you concerened with performance here?
edit: i was thinking of using an array of pointers to arrays, or perhaps a vector of pointers to arrays.

Reply With Quote
  #3  
Old April 10th, 2003, 08:25 PM
banton banton is offline
Junior Member
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Apr 2003
Location: California
Posts: 6 banton User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
Yes, I'm concerned about performance.
If you have 20 sets of 10 values you need to have a loop of 7,766,279,631,452,241,920 (10^20) iterations when the actual final set number will be much lower.

A small improvement can be using steps like this:
1. I get the first set and build a set for each element
2. To each one-value set I add the second set values
3. Discard the duplicated two-values sets
4. Add the third set values
5. Discard the 3-values duplicate sets
and so on...

For example:
{1,2,3}
{2,3}
{4,5}

1. {1} {2} {3}
2. {12} {13} {22} {23} {32} {33}
3. {12} {13} {22} {23} {33}
4. {124} {125} {134} {135} {224} {225} {234} {235} {334} {335}
5. nothing to do in this case

There must be a better algorithm to solve this problem in less then O(x^N) :-\

Antonio.

( the symbol ^ means power )

Reply With Quote
  #4  
Old April 10th, 2003, 09:43 PM
infamous41md's Avatar
infamous41md infamous41md is offline
not a fan of fascism (n00b)
Dev Shed Frequenter (2500 - 2999 posts)
 
Join Date: Feb 2003
Location: ct
Posts: 2,756 infamous41md User rank is Sergeant (500 - 2000 Reputation Level)infamous41md User rank is Sergeant (500 - 2000 Reputation Level)infamous41md User rank is Sergeant (500 - 2000 Reputation Level)infamous41md User rank is Sergeant (500 - 2000 Reputation Level)infamous41md User rank is Sergeant (500 - 2000 Reputation Level) 
Time spent in forums: 2 Days 11 h 4 m 29 sec
Reputation Power: 26
heh... it seems as tho ur prolly better equipped to solve this then myself, but i always enjoy a challenge so i will try to help.

edit: first let me get a few things straight.
1)is the minimun set length 1?
2)is there a maximum length?
3)can u use only one element from each set when creating a new one?

Last edited by infamous41md : April 10th, 2003 at 09:47 PM.

Reply With Quote
  #5  
Old April 11th, 2003, 02:53 AM
Maldor's Avatar
Maldor Maldor is offline
Muhhnnn !!
Dev Shed Intermediate (1500 - 1999 posts)
 
Join Date: Apr 2003
Posts: 1,530 Maldor User rank is Second Lieutenant (5000 - 10000 Reputation Level)Maldor User rank is Second Lieutenant (5000 - 10000 Reputation Level)Maldor User rank is Second Lieutenant (5000 - 10000 Reputation Level)Maldor User rank is Second Lieutenant (5000 - 10000 Reputation Level)Maldor User rank is Second Lieutenant (5000 - 10000 Reputation Level)Maldor User rank is Second Lieutenant (5000 - 10000 Reputation Level)Maldor User rank is Second Lieutenant (5000 - 10000 Reputation Level) 
Time spent in forums: 1 Week 1 Day 7 h 38 m 2 sec
Reputation Power: 83
You can gain a bit on performances, taking on memory space.

1- You are generating your sets one by one (wich is not especially complex)
2- associated to your set, you generate a second set with elements sorted: {1,2,5,3} -> {1,2,3,5} (Can even be 1235 which is even better but implies that your elements are 0<x<9)
3- You insert those seconds set in a sorted list (trishell would fit)
4- You generate the next set and his associated second set (key) and search for the key in the sorted key list.

In this way, you will decrease your Average complexity while keeping quite the same Max complexity...

Don't really have time to calculate the complexity atm...

Reply With Quote
  #6  
Old April 12th, 2003, 04:29 AM
HiFidelity HiFidelity is offline
Junior Member
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Apr 2003
Location: UK
Posts: 13 HiFidelity User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
A better algorithm would be to check your initial sets eg.
{1,2,3}
{2,3} <---2 repeats
{1,5} <---1 repeat
for repeated numbers and if the number of repeats in any set is >=2 to cut the stem accordingly.

Looking for repeated pairs (or more) in your method would take a really long time with large sets.

This way you you can ignore {3,2} at step 3, and you know that the total number of sets will be the product of the number of elements in each set minus the sum of (the product of the number of elements in each set after a repeated pair) = 3x2x2 - 2 = 10 here.

What do you need it for?

Reply With Quote
  #7  
Old April 15th, 2003, 01:56 PM
banton banton is offline
Junior Member
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Apr 2003
Location: California
Posts: 6 banton User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
Yes, hiFidelity.
I understand your logic but how do you generate the result sets once you know that there will be duplicate sets?
Do you already have an algorithm in mind? If so, please explain it to me.

I agree with you, generating all the possible set and then looking for duplicate it's not the best idea but only the first algorithm that pops up in mind.

What do I need it for? It's an application that simualates a complex behaviour from a set of input parameter. Each parameter can have a value within a range.

Antonio.

Reply With Quote
  #8  
Old April 16th, 2003, 05:23 AM
Onag Onag is offline
Registered User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Apr 2003
Location: Seattle, WA
Posts: 14 Onag User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
Arrange the sets in ascending order based on the first number in each set. Now store each set of 2+ duplicates for each set. For example, given the following 3 initial sets:

A {1,2,3}
B {2,3,4,5}
C {3,4}

You would store these duplicates in separate sets:

dA {} - always leave first set empty
dB {2,3}
dC {3,4}

Note how each set only cares about duplicates in the sets prior to it. For example, set B doesn't care about the contents of set C or any set that migh come after it.

Now we start building the new sets. As we build them, we check the stored duplicates to see if they're needed.

From A: {1} - don't know if it's a duplicate yet
From B: {1,2} - RED FLAG - 2 is in the duplicate set for B

If any elements prior to this one are also in the duplicate set and are greater than this element, we can skip this set, as it is/will be a duplicate.* This is not the case (1 is not in the duplicate set for B), so we continue.

From C: {1,2,3} - ANOTHER RED FLAG - 3 is in the duplicate set for C

We check to see if any prior elements (1 or 2) are also in the duplicate set for 3, which they aren't. This set is good.

Continuing on, we come up with the following sets:

{1,2,3} - from above
{1,2,4}
{1,3,3}
{1,3,4}

Let's take a look at the next set more closely.

From A {1} - we never know with the first element
From B {4} - RED FLAG - but everything checks out
From C {3} - RED FLAG - 3 is in the duplicate set for C

Upon checking for prior elements that are also in the duplicate set for C, we find that the second element (4) is in the set as well, and is also greater than the third element. This is a duplicate set and can be skipped.

Let's continue.

{1,4,4}
{1,5,3}
{1,5,4}
{2,2,3}
{2,2,4}
{2,3,3}
{2,3,4}
{2,4,3} - Duplicate - 3 and 4 are both in duplicate set for C and 4 is greater than 3 - ignored
{2,4,4}
{2,5,3}
{2,5,4}
{3,2,3} - Duplicate - ignored
{3,2,4} - Duplicate - ignored
{3,3,3}
{3,3,4}
{3,4,3} - Duplicate - ignored
{3,4,4}
{3,5,3}
{3,5,4}

That's it. We're left with only the unique sets. I don't know if this makes sense or if it even helps you at all. You'll of course want to check my logic, as I've not tested this on any significant data. It seems like it should work, though.

* You could also ignore sets where the prior elements are less than the current element. The goal is, given the two sets {1, 2} and {2, 1}, one of them needs to be deleted. Which one is arbitrary.

-Nag

Reply With Quote
  #9  
Old April 16th, 2003, 06:02 PM
banton banton is offline
Junior Member
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Apr 2003
Location: California
Posts: 6 banton User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
Thumbs up

Onag, this is a great algorithm.
I implemented it and it seems working perfectly.
Attached there is the java code if anyone wants to try it out.

Thank you.
Antonio.
Attached Files
File Type: java combinations.java (6.5 KB, 361 views)

Reply With Quote
  #10  
Old April 22nd, 2003, 01:26 PM
banton banton is offline
Junior Member
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Apr 2003
Location: California
Posts: 6 banton User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
Thumbs up

The following algorithm is 200 times faster, if anyone is interested. For instance, for 8 set of 10 values, with an overlapping rate of 75%, it generates the same output (about 3.3 million results) in 300 milliseconds, while the algorithm from Nag it takes about 50 seconds.

public class TestComb {
int len; int[] vals;
// the current set
int[][] data;
TestComb(int[][] data)
{
this.data = data;
len = data.length;
vals = new int[len];
for (int x = 0; x < len; x++)
{ // initialize first set
vals[x] = data[x][0];
}
int x;
long count = 0; int millions = 0;
long time = System.currentTimeMillis();
do {
count++;
if (count == 1000000)
{
count = 0; millions++; System.out.print((System.currentTimeMillis()-time)/millions + " ");
}

//print(vals);
x = len;
// current index to update
while (x > 0)

{ // stop when no value can be updated
x--;
if (++vals[x] <= data[x][1])
{ // is val[x]+1 still valid
while (++x < len)
{ // set all other indexes to min allowed values
setMin(x);
}
break;
}
}
}
while (x > 0);
System.out.println((count + millions*1000000) + " valid sets");
}

void setMin(int idx)
{
vals[idx] = data[idx][0];
for (int x = 0; x < idx; x++)
{
if (vals[x] <= data[idx][1])
{
vals[idx] = Math.max(vals[x], vals[idx]);
}
}
}

void print(int[] val)
{
for (int x = 0; x < val.length; x++)
{
if (x == 0)
{ System.out.print("{"); }
else { System.out.print(','); }
System.out.print(val[x]); } System.out.println("}");
}

public static void main(String[] args)
{ int[][] data = {{0,0}, {0,1}, {0,4}, {1,5}, {1,4}}; // since using consecutive values, only need first and last values
new TestComb(data); // data must be pre sorted by first value
}
}

Antonio

Reply With Quote
  #11  
Old April 23rd, 2003, 03:11 PM
banton banton is offline
Junior Member
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Apr 2003
Location: California
Posts: 6 banton User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
I am trying to calculate the number of valid results that input sets will generate before actually starting processing, and that's the next 'challange'. I know this is going to be strictly a math problem and I guess I am not good in math !! :-)
For those of you who are, if anyone has an idea how to find that out please let me know.

Why do I need that?
In the user interface it will be nice to let the user know how long it will take to process his/her request before actually starting processing it. Also a progress bar will display the total and current number of results already processed.

Antonio.

Reply With Quote
  #12  
Old May 26th, 2003, 01:24 PM
md_doc
Guest
Dev Shed Newbie (0 - 499 posts)
 
Posts: n/a  
Time spent in forums:
Reputation Power:
ugh never mind just reread what you wrote and my post makes no sense now

Reply With Quote
Reply

Viewing: Dev Shed ForumsProgramming Languages - MoreSoftware Design > Challanging algorithm problem.


Thread Tools  Search this Thread 
Search this Thread:

Advanced Search
Display Modes  Rate This Thread 
Rate This Thread:


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
View Your Warnings | New Posts | Latest News | Latest Threads | Shoutbox
Forum Jump


Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support |