I'm wondering if anyone can give me any guidance as to the best way to achieve the following ranking system.
It's complex I know, but here is an overview of what I'm trying to do. To make it easier, here are the tables of the objects I'm talking about.
We have discussion topics ...
Code:
CREATE TABLE `tbl_discussion` (
`D_ID` int(11) unsigned NOT NULL AUTO_INCREMENT,
`U_ID` int(11) unsigned NOT NULL,
`title` varchar(32) NOT NULL,
`content` blob,
PRIMARY KEY (`D_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
... and we have evidence (data sets) ...
Code:
CREATE TABLE `tbl_evidence` (
`E_ID` int(11) unsigned NOT NULL AUTO_INCREMENT,
`ET_ID` int(11) unsigned NOT NULL,
`S_ID` int(11) unsigned NOT NULL,
`U_ID` int(11) unsigned NOT NULL,
`title` varchar(64) NOT NULL,
`summary` blob NOT NULL,
`url` varchar(255) NOT NULL,
PRIMARY KEY (`E_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
ET_ID in this table links to a define a 'base score' for certain types of data based on reliability / credibility. So automated digital tracking data would have a high score, part time manual logging low, etc.
In case it helps, here is that table.
Code:
CREATE TABLE `tbl_evidence_type` (
`ET_ID` int(11) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(32) NOT NULL,
`base_point` tinyint(2) NOT NULL DEFAULT '0',
PRIMARY KEY (`ET_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT
For my initial testing, I'm linking these with ...
Code:
CREATE TABLE `tbl_discussion_evidence` (
`E_ID` int(11) unsigned NOT NULL,
`T_ID` int(11) unsigned NOT NULL,
PRIMARY KEY (`E_ID`,`T_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
What I want to do is work out what pieces of evidence are linked together most frequently and give them a score based upon this relationship.
More specifically, I want to be able to run this (perhaps as a trigger) after a piece of evidence is linked.
It needs to find all topics which it linked to, and then find all of the other evidence that is linked to those topics.
I can get this far, with something like ...
Code:
SELECT
tbl_discussion_evidence.E_ID,
count(tbl_discussion_evidence.T_ID)
FROM
tbl_discussion_evidence
WHERE
tbl_discussion_evidence.T_ID IN (SELECT
tbl_discussion_evidence.T_ID
FROM
tbl_discussion_evidence
WHERE
tbl_discussion_evidence.E_ID = 1)
GROUP BY
tbl_discussion_evidence.E_ID
... where 1 = the ID of the evidence being linked.
But this is where it gets much more complex, and where either I can't see how to make it work, or my approach is incorrect.
When you link a piece of evidence to a discussion, all the links between the evidence need to be evaluated for all evidence linked to that topic.
Points are awarded as 2% of base point for every link over 3 to a maximum of 8 links (10%)
E1 base = 60
E2 base = 50
E3 base = 30
6 links found between E1 and E2
3 links found between E2 and E3
8 links found between E3 and E1
E1 awarded:
- 6% of 50 (base of E2) = 3
- 10% of 30 (base of E3) = 3
E2 awarded:
- 6% of 60 (base of E1) = 3.6
E3 awarded:
- 10% of 60 (base of E1) = 6
I understand this is probably easiest done via multiple queries and PHP, however, given that all of the required information is at hand on the database side, it seems somehow redundant.
So, does anyone have any suggestions as to an alternate approach or how to advance my current one?
Any help appreciated, this is doing my head in.