Software Design
 
Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
User Name:
Password:
Remember me
Go Back   Dev Shed ForumsProgramming Languages - MoreSoftware Design

Reply
Add This Thread To:
  Del.icio.us   Digg   Google   Spurl   Blink   Furl   Simpy   Y! MyWeb 
Thread Tools Search this Thread Rate Thread Display Modes
 
Unread Dev Shed Forums Sponsor:
  #1  
Old July 4th, 2008, 03:45 PM
Liglogthepanda Liglogthepanda is offline
Registered User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Jul 2008
Posts: 1 Liglogthepanda User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 29 m 57 sec
Reputation Power: 0
Red face Efficient sorting into categories with choices

So here's the problem.

I have 8 categories, and about a hundred items. Each item can fit into 1, 2, 3 or 4 of these categories (and we know which ones each item can fit into). The task is to have 'the most even distribution' of the items among the categories, i.e. it is best if all the categories have something in them before putting more than one into a category. Once an item has been placed in a category, it cannot be repeated in another.

Example 1, with eight items
Categories labelled, A to H. Items are represented by X's.
Code:
A | B | C | D | E | F | G | H
----------------------------
X | X | X | X | X | X | X | X .......Table (1)

A | B | C | D | E | F | G | H
----------------------------
X | X | X | X | X |   | X | X 
X |   |   |   |   |   |   |   .......Table (2)

A | B | C | D | E | F | G | H
----------------------------
X | X | X | X | X |   |   | X 
X |   | X |   |   |   |   |   .......Table (3)

In the above case (with eight items), 1 is better than 2 or 3. Basically, to fill a row as quickly as possible is the task, given that some items will only fit into some categories, and that you can't put the same item into two categories in the end!

Example 2, with ten items
Categories labelled, A to H. Items are represented by X's.
Code:
A | B | C | D | E | F | G | H
----------------------------
X | X | X | X | X | X | X | X .......Table (1)
X |   | X |   |   |   |   |

A | B | C | D | E | F | G | H
----------------------------
X | X | X | X | X |   | X | X 
X |   | X | X |   |   |   |   .......Table (2)

A | B | C | D | E | F | G | H
----------------------------
X | X | X | X | X |   |   |  
X | X | X | X |   |   |   |
  | X |   |   |   |   |   |   .......Table (3)

Again, 1 is better than 2 or 3.

Example data:
Code:
Item X could fit into category A.
Item Y could fit into categories A or B.
Item Z could fit into category C.
Item W could fit into categories B or F.


The second stage after this is filling the next row up as quickly as possible, but I'm not too concerned about this at the moment (one step at a time, right!)

I'm having some difficulty thinking of an algorithm for this, and I don't know what to search for in my favourite search engine (which may or may not begin with a G), so help, please!

PS I'll be using PHP/MySQL, not that that really matters at this stage!

Reply With Quote
  #2  
Old July 4th, 2008, 04:48 PM
NovaX NovaX is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Jul 2005
Location: Bay Area, California
Posts: 489 NovaX User rank is Captain (20000 - 30000 Reputation Level)NovaX User rank is Captain (20000 - 30000 Reputation Level)NovaX User rank is Captain (20000 - 30000 Reputation Level)NovaX User rank is Captain (20000 - 30000 Reputation Level)NovaX User rank is Captain (20000 - 30000 Reputation Level)NovaX User rank is Captain (20000 - 30000 Reputation Level)NovaX User rank is Captain (20000 - 30000 Reputation Level)NovaX User rank is Captain (20000 - 30000 Reputation Level)NovaX User rank is Captain (20000 - 30000 Reputation Level) 
Time spent in forums: 2 Weeks 1 h 18 m 18 sec
Reputation Power: 241
Send a message via ICQ to NovaX
My first thought is to treat it as a hash table, where each row is a bucket. Just like a hash table, you want the buckets to be fairly even and can modify closed hashing to rehash as needed. If you weaken the constraints by allowing less than perfect distribution, you can use a hashtable using open-hashing. This would avoid the complexities of rehashing for a row and when the item appears again, trying to figure out if it was hashed somewhere else. If you can do post processing after all the data has been added, you could try to make minor improvements to the balancing.

Basically, I'd see how strict the requirements are and whether it must be done row-oriented (versus column-oriented) and be an online algorithm. If you're just making it overly complex by wanting perfect results, then a simply hashing algorithm is fast and simple. If the requirements are strict, then hashing may not be the best approach but can help you flush out ideas.

Reply With Quote
Reply

Viewing: Dev Shed ForumsProgramming Languages - MoreSoftware Design > Efficient sorting into categories with choices


Thread Tools  Search this Thread 
Search this Thread:

Advanced Search
Display Modes  Rate This Thread 
Rate This Thread:


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
View Your Warnings | New Posts | Latest News | Latest Threads | Shoutbox
Forum Jump


Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
  
 





© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 3 hosted by Hostway
Stay green...Green IT