Dev Shed Lounge
 
#1
June 5th, 2003, 03:41 AM
John Cook
Search engine friendly directories for links database

I'm constructing a link database and was moving along merrily, with just about the whole setup running off a single PHP page, links.php?catid=23, where catid is the category (e.g. categories on Star Trek, cartoons, etc.). The database has a table Categories with fields catid, category, parentid, and a table Links with all the link info, including the field catid to show where each link is listed. Easy peasy.

Then I bumped into three words I wish I'd never seen: "search engine friendly". It appears I have to get rid of the ?catid=23 at the end of my URL if I want to get into the search engines. No biggie, I could just as easily do links.php/23/ (I haven't got links/23/ to work yet, but I'm still working on it). But then I looked at some directory websites and they've got crazy directory structures like:
http://www.yahoo.com/Science_Fictio...k/Deep_Space_9/ (note - this isn't real, I just made it up by way of example)

It looks pretty but is very problematic from a programming point of view - it's much easier to call up a category based on just a single number, the category's primary key catid. Anyway, my question is: can anyone suggest the best, or at least a good workable, way to structure the directories for a links database so that it's search engine friendly? And if it's something like Yahoo's, maybe a general tip on how to structure the database beneath it. Thanx!

#2
June 5th, 2003, 06:37 AM
Sepodati
You can have URLs such as

http://www.yahoo.com/Science_Fictio...k/Deep_Space_9/

and still have your categories based on numbers. Yeah, your query is probably going to have WHERE category_name = 'Science Fiction' instead of WHERE category_id = 9, for example, but it can still be efficient if you put an index on the category_name column. You still use the 'category_id' column to link tables together in JOINs.
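A minimal sketch of that idea, assuming the Categories and Links tables from post #1 and using Python with an in-memory SQLite database purely for illustration. The index name, the sample rows, and the helper functions are all hypothetical; the point is only that the name is used once to find the row, while the numeric catid still does the joining.

```python
import sqlite3

# Hypothetical schema modelled on the tables described in post #1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Categories (catid INTEGER PRIMARY KEY, category TEXT, parentid INTEGER);
    CREATE TABLE Links (linkid INTEGER PRIMARY KEY, catid INTEGER, url TEXT);
    -- The index keeps lookups by name cheap.
    CREATE INDEX idx_category_name ON Categories (category);
    INSERT INTO Categories VALUES (9, 'Science Fiction', 0);
    INSERT INTO Links VALUES (1, 9, 'http://example.com/sf');
""")


def url_segment_to_name(segment):
    """Turn a URL segment such as 'Science_Fiction' back into a category name."""
    return segment.replace("_", " ")


def links_for_category(name):
    """Find the category by name, then join to Links on the numeric key."""
    rows = conn.execute(
        "SELECT l.url FROM Categories c "
        "JOIN Links l ON l.catid = c.catid "
        "WHERE c.category = ? ORDER BY l.linkid",
        (name,),
    )
    return [url for (url,) in rows]
```

Calling links_for_category(url_segment_to_name("Science_Fiction")) would return the links filed under that category; an unknown name simply yields an empty list.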

To a search engine,

website.com/23/
website.com/Science_Fiction/

are about the same. It's going to "like" both of them. To your users, though, the second one is easier to remember and tells them what to expect at that address.

My $0.02...

---John Holmes...

#3
June 5th, 2003, 06:37 AM
Sepodati
Also, is this specific to a certain programming language and/or database... or do you just want to discuss it in "general" terms?

#4
June 5th, 2003, 07:26 AM
Hero Zzyzzx
You can use both and avoid the "where" on the name altogether: put the numeric ID first in the URL and the readable name after it.

Everything after the 23 is just search engine fluff. That's how it works at my woefully-in-need-of-an-update resume site ( http://www.geekuprising.com ) (go to the "projects" section).

You can use mod_rewrite or the path_info in the environment variables.

#5
June 5th, 2003, 07:54 AM
Mirax
A nice article for noobs on mod_rewrite

#6
June 5th, 2003, 09:57 AM
jpenn
You can also use Apache's ForceType directive as an alternative if your server does not have mod_rewrite enabled.

#7
June 5th, 2003, 11:13 AM
computer

#8
June 5th, 2003, 06:47 PM
John Cook
> Also, is this specific to a certain programming language and/or database... or do you just want to discuss it in "general" terms

More just general terms - I have a general idea of how to do the techy stuff like mod rewrites, etc. I'm just wondering what's the best way to do the directory structure. Hero's solution of putting the catid in the directory structure, with the rest being fluff, seems like the easiest way to get the best of both worlds. I'm just wondering if there are other ways people have done their directory structure. If you do something like domain.com/science_fiction/star_trek/klingons/, it seems the only way to make that practical is to do a search for the category "klingons" to get your catid. So the rest of the directory structure is "fluff", as you say. Plus you have to make sure your last directory is unique - I can't have "fan_fiction" as that might occur more than once - it'd have to be "Star_Trek_Fan_Fiction" and "Star_Wars_Fan_Fiction".
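One possible way around the uniqueness problem, sketched in Python under assumptions rather than taken from anything in this thread: resolve the path one segment at a time using (parentid, name) as the lookup key. The sample rows, the convention that parentid 0 means top level, and the function name are all hypothetical. With this approach a name only has to be unique among its siblings, at the cost of one lookup per path segment.

```python
# Hypothetical rows in the shape (catid, category, parentid).
# parentid 0 is assumed to mean "top-level category".
ROWS = [
    (1, "science_fiction", 0),
    (2, "star_trek", 1),
    (3, "star_wars", 1),
    (4, "fan_fiction", 2),
    (5, "fan_fiction", 3),
]

# Keyed by (parentid, name), so the same name may appear under different parents.
BY_PARENT_AND_NAME = {(parentid, name): catid for catid, name, parentid in ROWS}


def resolve_path(path):
    """Walk the path segment by segment; return the final catid, or None if unknown."""
    segments = [s for s in path.split("/") if s]
    if not segments:
        return None
    current = 0  # start at the assumed root
    for segment in segments:
        current = BY_PARENT_AND_NAME.get((current, segment))
        if current is None:
            return None
    return current
```

Here both /science_fiction/star_trek/fan_fiction/ and /science_fiction/star_wars/fan_fiction/ resolve to different catids even though the last segment is the same.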

#9
June 5th, 2003, 07:25 PM
Hero Zzyzzx
?

There isn't an actual directory- I can't tell from your last email if you get that or not.

Anyway- a pure mod-rewrite solution. This should be in a location container:

Code:
RewriteEngine On 
RewriteRule ^/cat/(.*) /perl/doc.cgi?rm=doc&category=$1 [L,P]

Anything coming in as:
Code:
http://www.domain.com/cat/300/Foo/Bar/Baz.html

would be rewritten internally by apache and passed to your script as:
Code:
http://www.domain.com/perl/doc.cgi?rm=doc&category=300/Foo/Bar/Baz.html


You can just retrieve the category parameter, split it on the forward slash, and take the first value in the resulting array as your real parameter. The only thing you care about is the category id. The rest of it you can change with impunity.

Using a numeric ID is preferable anyway: if you key on names and decide to rename a category, you'll have to maintain some kind of changelog map to make sure you're not giving folks 404s. IDs don't ever have to change.
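The split-and-take-the-first-value step might look like this. It is a sketch in Python rather than the Perl behind doc.cgi, and the function name is made up for illustration; the input is the value that arrives in the category parameter after the rewrite.

```python
import re


def category_id_from_param(value):
    """Return the leading numeric ID from a value like '300/Foo/Bar/Baz.html'.

    Everything after the first slash is ignored. Returns None when the
    first segment is not a plain run of digits.
    """
    first = value.split("/")[0]
    if re.fullmatch(r"[0-9]+", first) is None:
        return None
    return int(first)
```

Validating the first segment before converting it keeps a malformed URL from turning into an exception or an unintended query.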

#10
June 5th, 2003, 10:19 PM
John Cook
Hero, I did realise there were no actual directories, so I can see how my asking about "directory structure" was confusing. I know I have to use mod-rewrites or Apache look-backs - I'm just at the earlier stage of deciding how I'm going to structure the URLs.

I've been thinking about how you do it - I want to have a single category page which lists all the subcategories dynamically. The only way I can think to do it with your system is to have a column "path" in the Categories table so each category has a path field containing something like "cat/300/Foo/Bar/Baz.html". You'd need that so all the hyperlinks for each subcategory will contain the correct URL. Is that how you do it?
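An alternative to a stored "path" column, offered only as a sketch: derive the path when building the links by following parentid up to the root. The sample rows, the parentid 0 convention for top-level categories, and the function name are assumptions for illustration, and this is not necessarily how Hero's site does it.

```python
# Hypothetical rows: catid -> (category, parentid); parentid 0 marks a top-level category.
CATEGORIES = {
    100: ("Foo", 0),
    200: ("Bar", 100),
    300: ("Baz", 200),
}


def category_url(catid, categories=CATEGORIES):
    """Build a URL like /cat/300/Foo/Bar/Baz.html by walking the parentid chain."""
    names = []
    seen = set()
    current = catid
    while current != 0:
        if current in seen:
            # Guard against bad data that would otherwise loop forever.
            raise ValueError("cycle in parentid chain at catid %d" % current)
        seen.add(current)
        name, parent = categories[current]
        names.append(name)
        current = parent
    names.reverse()  # collected leaf-first, but the URL reads root-first
    return "/cat/{}/{}.html".format(catid, "/".join(names))
```

Deriving the path means a renamed category needs no path updates on its descendants; storing it instead trades that for fewer lookups per page.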

#11
June 7th, 2003, 01:32 AM
SilkySmooth
Hi John,

I have been playing with this myself now for about a year on a variety of web sites and in dealing with a number of professional SEO Guys I have built up a wealth of information.

To start with I would like to clarify a couple of things...

Don't use underscores in your links; they get lost in the search engines. Use a dash instead.

Don't bother with your category IDs in the URL. Engines such as Google do actually look at the URL when they rank your page, so just a category name will give you a better PR.

Following the recommendation above does NOT mean you need a WHERE clause on the category name in your SQL, provided you have the access needed to use RewriteMap on your server setup. You said you are OK on the technical side of things, so go and read up here: http://httpd.apache.org/docs/mod/mo...html#RewriteMap

It basically allows you to create a file in the form of

categoryA idA
categoryB idB

etc., and then perform a lookup in your rewrite rule to get the correct ID, which is much more efficient to use in your WHERE clause than the name.
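A map file in that key/value form could be generated from the category data with something like the following Python sketch. The slug rule (lowercase, dashes between words), the sample data, and the function names are assumptions for illustration; wiring the file into an actual RewriteMap directive is left to the Apache documentation linked above.

```python
import re


def slugify(name):
    """Lowercase the name and join its alphanumeric runs with dashes."""
    return "-".join(re.findall(r"[a-z0-9]+", name.lower()))


def rewrite_map_lines(categories):
    """Return 'key value' lines, one per category, ordered by catid.

    `categories` maps catid -> category name. Two categories that produce
    the same key are rejected, since a map key can point to only one ID.
    """
    lines = []
    seen = {}
    for catid, name in sorted(categories.items()):
        key = slugify(name)
        if key in seen:
            raise ValueError(
                "duplicate key %r for catids %d and %d" % (key, seen[key], catid)
            )
        seen[key] = catid
        lines.append("{} {}".format(key, catid))
    return lines
```

Writing "\n".join(rewrite_map_lines(...)) to a file whenever categories change would keep the map in step with the database.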

HTH

#12
June 7th, 2003, 08:27 AM
John Cook
Thanx for the advice. I'd already decided to not include the catid in the URL - I like the purity of www.domain.com/Category/Subcategory/

However, I was planning to use underscores in the URL so thanx for the warning about that! Avoiding unnecessary SQL is also useful so thanx for the tips!

#13
June 7th, 2003, 10:12 AM
Hero Zzyzzx
Outside of everything else, I'll tell you that our sites that have URLs like:
http://www.domain.com/cat/400
http://www.domain.com/cat/401
http://www.domain.com/page/33442
have been heavily spidered by Google, probably 80-90% of the pages are grabbed monthly. It works!
