Python Programming
 
  #1  
October 7th, 2003, 06:38 AM
Johie
Generate possible sitemaps

Hello,

I'm trying to write a program that scans a site and collects all of its links. From these links the program has to generate all possible sitemaps.
Generating all the sitemaps is the problem. If a site has 4 pages, there are 100 possible sitemaps.
I'm still thinking about how to do this. Does anyone have any tips or solutions?

I hope I'm clear enough.
Greetings from the Netherlands,

Johie

  #2  
October 7th, 2003, 03:27 PM
Scorpions4ever
How do you generate 100 sitemaps out of 4 links? Assuming we have 4 links linkA, linkB, linkC, linkD, what is the sitemap supposed to look like?
  #3  
October 7th, 2003, 03:42 PM
Johie
With the 4 links, there are several possible tree structures.

Root----LinkA-LinkB-LinkC-LinkD
Root----LinkA-LinkC-LinkB-LinkD

Or

Root----LinkA-LinkB
  |-----LinkC-LinkD

Root----LinkD-LinkB
  |-----LinkC-LinkA

Or

Root----LinkA-LinkB-LinkC
  |-----LinkD

If you count all possible combinations, you get 100 results.
I hope that answers your question and makes clearer what I mean.
Greetings,
Johie
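One way to make "all possible tree structures" concrete is sketched below. It assumes a sitemap is a tree with a fixed root in which every page has exactly one parent (the root or another page) and the order of siblings does not matter. That definition is an assumption, not something stated in the thread, and the names `enumerate_sitemaps` and `reaches_root` are made up for this sketch. Under that reading four pages give 125 trees (Cayley's formula, (n+1)^(n-1)), so the figure of 100 quoted above presumably rests on a different definition.

```python
from itertools import product


def reaches_root(tree, page, root):
    """Follow parent links from `page`; return False if a cycle is hit."""
    seen = set()
    while page != root:
        if page in seen:
            return False
        seen.add(page)
        page = tree[page]
    return True


def enumerate_sitemaps(pages, root="Root"):
    """Yield every tree as a dict mapping page -> parent, with `root` on top.

    Brute force: try every parent assignment and keep the ones in which
    every page eventually leads back to the root.
    """
    nodes = [root] + list(pages)
    for parents in product(nodes, repeat=len(pages)):
        tree = dict(zip(pages, parents))
        if all(reaches_root(tree, page, root) for page in pages):
            yield tree


links = ["LinkA", "LinkB", "LinkC", "LinkD"]
sitemaps = list(enumerate_sitemaps(links))
print(len(sitemaps))  # 125
```

The brute-force loop tries (n+1)^n parent assignments, so it is only usable for a handful of pages.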

  #4  
October 7th, 2003, 07:26 PM
netytan
Sorry, I'm lost. How do you get LinkA, LinkB, LinkC and LinkD to yield 100 possible combinations? It seems like a lot to me, but I might be mistaken. Just out of interest, why do you need to generate EVERY possible sitemap combination?

As for getting the links in the first place, you might want to take a look at urlopen() in the urllib module, which lets you read a web page like any other file. You'll then have to extract the links from it, which you can do pretty easily with Python's re (regular expressions) module.

Have fun,
Mark.
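A minimal sketch of that suggestion follows. It is written for Python 3, where `urlopen()` lives in `urllib.request` rather than directly in `urllib`. The helper names `extract_links` and `fetch_links` and the regular expression are illustrative assumptions; the pattern only catches quoted `href` values in `<a>` tags and will miss unusual markup.

```python
import re
from urllib.request import urlopen

# Matches href="..." or href='...' inside an <a ...> tag (case-insensitive).
HREF_RE = re.compile(r'<a\s[^>]*?href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def extract_links(html):
    """Return the href values of all <a> tags found in an HTML string."""
    return HREF_RE.findall(html)


def fetch_links(url):
    """Download a page and return the links it contains."""
    with urlopen(url) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_links(html)


sample = '<a href="a.html">A</a> <A HREF=\'b.html\'>B</A>'
print(extract_links(sample))  # ['a.html', 'b.html']
```

For messy real-world pages an HTML parser is usually more robust than a regular expression.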
  #5  
October 14th, 2003, 07:41 AM
Johie
Hi,

After a few trials I saw that this is not feasible: it takes far too long when there are many more links.

But I have another question.
I'd like to parse a website (that part is not the problem), but it doesn't work when the site uses frames.

Does anyone know how to parse the frames for links?

I hope this is clear.

Greetings,
Johie

  #6  
October 14th, 2003, 07:49 AM
netytan
Probably the best way to parse a frameset is to get the page references from the main page (manually or as part of the program) and then read and parse all the pages connected to the frames. Not too hard.

Mark.
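The approach above could be sketched roughly as follows, using Python 3's standard `html.parser` and `urllib.parse`. The class name `FrameCollector`, the helper `frame_urls` and the example.com address are hypothetical; the idea is just to pull the `src` of each `<frame>` (or `<iframe>`) out of the frameset page and resolve it against that page's URL, so each framed page can then be fetched and scanned for links.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameCollector(HTMLParser):
    """Collect absolute URLs of the pages referenced by <frame>/<iframe> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.frame_urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag in ("frame", "iframe"):
            src = dict(attrs).get("src")
            if src:
                self.frame_urls.append(urljoin(self.base_url, src))


def frame_urls(html, base_url):
    """Return the URLs of all framed pages referenced in `html`."""
    collector = FrameCollector(base_url)
    collector.feed(html)
    return collector.frame_urls


page = '<frameset cols="20%,80%"><frame src="nav.html"><frame src="/main.html"></frameset>'
print(frame_urls(page, "http://example.com/site/index.html"))
# ['http://example.com/site/nav.html', 'http://example.com/main.html']
```

Each returned URL can then be downloaded and parsed for links like any ordinary page.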

  #7  
October 14th, 2003, 08:33 AM
irishtek
sitemap

With only 4 links you are limited to 24 possible sitemaps.
At 5 links you would have 120, but the "root" is always in the first position, as you described it, so it is not affected by this.

Unless you left out information, such as each page linking to every other page... then you would have 108 possibilities.

This wouldn't be too hard as a loop,
but with additional links this kind of program would get slower.

Perhaps if we knew how/why you need every possible sitemap, and more about the site structure, something simpler could be suggested.
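The figures 24 and 120 are 4! and 5!, the number of orderings of the links behind a fixed root. A short sketch of that loop, assuming a sitemap is just one chain of pages (the function name `linear_sitemaps` is made up for illustration), also shows how quickly the count grows:

```python
from itertools import permutations
from math import factorial


def linear_sitemaps(pages, root="Root"):
    """Yield each ordering of `pages` as a single chain starting at `root`."""
    for order in permutations(pages):
        yield (root,) + order


links = ["LinkA", "LinkB", "LinkC", "LinkD"]
chains = list(linear_sitemaps(links))
print(len(chains))          # 24
print("-".join(chains[0]))  # Root-LinkA-LinkB-LinkC-LinkD

# The number of orderings is n!, which is why more links get slow quickly.
for n in range(4, 11):
    print(n, factorial(n))
```

At 10 links there are already 3,628,800 orderings, so enumerating them all stops being practical well before a site of realistic size.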
