Database Management
 
Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
User Name:
Password:
Remember me
Go Back   Dev Shed ForumsDatabasesDatabase Management

Reply
Add This Thread To:
  Del.icio.us   Digg   Google   Spurl   Blink   Furl   Simpy   Y! MyWeb 
Thread Tools Search this Thread Rate Thread Display Modes
 
Unread Dev Shed Forums Sponsor:
  #1  
Old February 10th, 2004, 06:02 PM
saodl saodl is offline
Registered User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Feb 2004
Location: Los Angeles, CA
Posts: 21 saodl User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
Question Database design: relationship convergence

SUMMARY:
I have many similar tables that should all be related to each other through many to many relationships, and I am not sure what the best way to do this is.
I am currently considering storing all the relationships in a central 'convergence' table. I am not certain this is the best way. Any advice would be appreciated.

DETAILS:

I am working on a database for a multi-lingual bible study tool. In order for the cross-language implementation to work, words in the English (and other - Chinese at the moment) need to be related to the greek (or hebrew) word(s) they from which they are translated, as well as to each other. This lets me produce an 'interlinear' form:

ie (using greek transliterations)
en: me and you
gk: ego kai su

For this reason (and in order to link them to definitions) words are stored individually in the database. At the moment, because it is cleaner and because I may add more languages/versions later, I am keeping each version/language in it's own table.

Conceptually most words in every version should connect to 1 or more words in any other version. For two tables this is no problem and I use a standard many to many construct:

english: enID*, word
greek: gkID*, word
en_gk_relation: {enID(fk), gkID(fk)}*

However when I start getting more tables, it becomes more complex:
  • I can define a relationship table for every connection. However, at the moment I will have 4-6 tables and that will grow in the future- that means 6-15 relationship tables. (n-1)! grows pretty fast. I could limit myself to defining in advance which tables can be related, but that is counter to the way this tool is meant to be used.
  • I can create a few expected-use relationships, and let the rest be created when requested by a stored procedure ("Building index for first use"). That would come with its own problems, but it would keep the database from being unnecessarily bloated.
  • I could let english be related to chinese Through the greek table (en <*-*> gk <*-*> ch), but the Chinese and English may be similar to each other in ways they are not similar to the Greek. This makes sense after a fashion because the real semantic relationship is that they were both translated from the greek.
  • I am currently considering using a centralized relation table of the form {enID(fk), gkID(fk), chID(fk)...}*. This seems like it could work but it also seems unwieldy.
  • Other options...?

One final problem I have is that sometimes there is simply no match for a greek word in the english (or vice-versa). In the reference table this effectively produces a NULL value, which violates the primary key. I can circumvent the error (put a 0), but that's not really any better from a functional standpoint.

In these cases, I would still like to include the missing word in as near the right order as possible (after the previous greek word, or before the next, according to the greek order). However with a NULL or a substitute value, it is rather difficult to get any closer than inside the right verse...

Thanks for your help and opinions.

siri

Last edited by saodl : February 10th, 2004 at 06:40 PM.

Reply With Quote
  #2  
Old February 11th, 2004, 09:03 AM
r937's Avatar
r937 r937 is offline
SQL Consultant
Click here for more information.
 
Join Date: Feb 2003
Location: Toronto Canada
Posts: 18,235 r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level) 
Time spent in forums: 1 Month 4 Weeks 1 Day 15 h 57 m 34 sec
Reputation Power: 1054
Re: Database design: relationship convergence

Quote:
Originally posted by saodl
At the moment, because it is cleaner and because I may add more languages/versions later, I am keeping each version/language in it's own table.
seems to me that one table is cleaner, and a heckuva lot easier to add new languages to

just add langID, foreign key to language table, where you can give its name, explanatory notes about the language, etc.
Quote:
I am currently considering using a centralized relation table of the form {enID(fk), gkID(fk), chID(fk)...}*. This seems like it could work but it also seems unwieldy.
try just one many-to-many table, with {langfromID, fromID, langtoID, toID}
Quote:
One final problem I have is that sometimes there is simply no match for a greek word in the english (or vice-versa). In the reference table this effectively produces a NULL value, which violates the primary key. I can circumvent the error (put a 0), but that's not really any better from a functional standpoint.
true, that would violate a pk constraint, but this would matter little, as that table does not have children, so may not need a pk at all... note that foreign keys can be null, no problem
__________________
r937.com | rudy.ca

pre-order my book Simply SQL from Amazon

Reply With Quote
  #3  
Old February 12th, 2004, 02:37 PM
saodl saodl is offline
Registered User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Feb 2004
Location: Los Angeles, CA
Posts: 21 saodl User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
Thanks for taking a look at my problem Rudy. I was actually kinda hoping you would. ^_^

You're suggesting that I use a database organized like this?:

easyPkID, langId, bibleLoc, verseOrder, Text, otherInfo

1, en, mat 1:1, 1, The book, otherStuff
...
135527, gk, mat 1:1, 1, Bibloj, otherStuff
...
272135, ch, mat 1:1, 1, [chinese], otherStuff

and so on?
(yes- lang, loc, order together are a cantidate key, but it's more convenient here to refer to a numeric pk.)

Then use a many to many table to link translations together:

fromID, toID
1, 135527
1, 272135
135527, 272135

Is that right? I hadn't considered that...when I thought of combined tables, I thought of built in relationships:

easyPkID, en_bibleLoc, en_verseOrder, en_Text, en_Stuff, gk_bibleLoc, ...

Which is why I considered seperate tables better...for some reason it didn't occur to me to put english, greek, and chinese text all in the same column.

I'm not certain what I should do, but I am certain that you have more experience than me so I'm gonna have to think about this a little. Now it looks to me like I was making a classic foolish mistake.

Reply With Quote
  #4  
Old February 12th, 2004, 02:58 PM
r937's Avatar
r937 r937 is offline
SQL Consultant
Click here for more information.
 
Join Date: Feb 2003
Location: Toronto Canada
Posts: 18,235 r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level) 
Time spent in forums: 1 Month 4 Weeks 1 Day 15 h 57 m 34 sec
Reputation Power: 1054
classic? perhaps not in the way you mean

foolish? hardly -- the experience you gain by building different designs is very valuable

mistake? no

nothing is a mistake if it works

my rules for good design are

1. it works
2. it's efficient
3. it's easy to understand, and therefore maintain

Reply With Quote
  #5  
Old February 12th, 2004, 05:39 PM
saodl saodl is offline
Registered User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Feb 2004
Location: Los Angeles, CA
Posts: 21 saodl User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
pre-script: I am thinking out loud here, and don't require a response or that you read this - unless you feel like it. Thanks again for your help.

The way I want to think of this data is that there is an ideal correct original, and each language/version is a rendering of this. As such most of the words in each version have a direct correlation in terms of being a translation of the original.

This seems much a
ID, value, interpretation1, interpretation2, interpretation3, etc.
type of problem. So I'm representing different possibilities for the same value as

normalization suggests that this type of problem is solved with-

origID*, value

and

{origID(fk), version}*, interpretation

Because I don't have access to the Original value, I can eliminate the first table- leaving me with what you suggested.

Seperating the resulting table into multiple tables by version doesn't really do anything for me except make maintainance and programming harder.

While there is something to be said for denormalizing in the interest of efficiency and understanding, creating and maintaining a pre-joined table gets excessively complex after the 3rd or 4th translation/version. In the end, the simpler design fits the your 3 rules better.

Reply With Quote
  #6  
Old February 12th, 2004, 05:45 PM
r937's Avatar
r937 r937 is offline
SQL Consultant
Click here for more information.
 
Join Date: Feb 2003
Location: Toronto Canada
Posts: 18,235 r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level)r937 User rank is General 1st Grade (Above 100000 Reputation Level) 
Time spent in forums: 1 Month 4 Weeks 1 Day 15 h 57 m 34 sec
Reputation Power: 1054
not sure where you are going with the versioning

i still see only one table for you, or at most two, one for the text and one for the relationships

regarding my rules, i just wrote about them the other day with regard to queries: http://searchdatabase.techtarget.co...x285649,00.html

Reply With Quote
  #7  
Old February 12th, 2004, 06:06 PM
saodl saodl is offline
Registered User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Feb 2004
Location: Los Angeles, CA
Posts: 21 saodl User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
Versioning just has to do with different versions of the bible...I have two greek versions at the moment, and 2 each also of chinese and english. Misuse of keyword.

I was just thinking out loud and internalizing why you were right. Since I work out of my house and have no one to talk to, I was typing it instead.

Read your article (and several others) already...picked the column up off your website. Good stuff.

Reply With Quote
Reply

Viewing: Dev Shed ForumsDatabasesDatabase Management > Database design: relationship convergence


Thread Tools  Search this Thread 
Search this Thread:

Advanced Search
Display Modes  Rate This Thread 
Rate This Thread:


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
View Your Warnings | New Posts | Latest News | Latest Threads | Shoutbox
Forum Jump


Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
  
 





© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 1 hosted by Hostway
Stay green...Green IT