Perl Programming
 
  #1  
December 1st, 2012, 11:25 AM
mhsiao45
Help Please - how to track website changes?

Hi,

I posted this question in the PHP forum, and one of the members said that, given the choice, they would do this in Perl.
My question here is, would anyone be interested in helping develop this? How tough does this sound?

Ideally, I'd like to get some sort of email notice, whenever a certain change is made (removal of certain links).


ANOTHER MEMBER'S APPROACH:

This would not necessarily need PHP. If I were doing this I would periodically grab the pages of interest and store them somewhere. I'd then do a diff between the latest page and the corresponding previously stored page. I would then analyze those diffs and send the appropriate notifications.

As a matter of personal preference I would use Perl to do this.
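
The store-and-diff idea described above could be sketched roughly like this. It is written in Python purely for illustration (the same steps translate directly to Perl), and the function name and the `snapshots` directory are hypothetical choices, not anything from the original suggestion. Fetching the page is left out; the function takes the page text as a string.

```python
import difflib
from pathlib import Path


def diff_against_snapshot(name, new_text, snapshot_dir="snapshots"):
    """Compare new_text with the stored copy for `name`, then store new_text.

    Returns the unified-diff lines. The list is empty when nothing changed
    or when the page is being seen for the first time.
    """
    folder = Path(snapshot_dir)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{name}.html"

    # Load the previous snapshot (if any) before overwriting it.
    old_text = path.read_text(encoding="utf-8") if path.exists() else None
    path.write_text(new_text, encoding="utf-8")

    if old_text is None:
        return []
    return list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile="previous", tofile="latest", lineterm=""))
```

Lines in the result that start with a single `-` are the ones that disappeared from the page, which is where a removed manufacturer link would show up.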



HERE IS MY ORIGINAL QUESTION BELOW:



Sorry if I have posted this in completely the wrong place. I have no knowledge of web programming whatsoever.

I was hoping someone could help out.

Is there a way to track certain changes to other websites?

I work for a manufacturers' sales rep firm. Ideally, we want to rep the best manufacturers. As you can imagine, the best are usually already taken, unless a manufacturer feels that its rep firm isn't doing well for them. At that point, the rep firm gets dropped, and usually the rep firm will remove the manufacturer from its website.

That's where we want to take action and pick up these manufacturers while they are open. However, it's very hard to find out without constantly checking every rep firm's website. Either that, or just word of mouth.

It would be great if there was a way to track the removal of, say, certain links (manufacturers' links) or images from a certain web page. Ideally, we would want to get notified when a change occurs, and then we can go see what link was removed.


For example, this rep firm lists all their manufacturers here:
http://www.yando.com/caline.htm

Say CREE Microwave is removed. We would want some way to be notified when that happens.

Any ideas on how to achieve this?

  #2  
December 1st, 2012, 11:37 AM
keath
It's easy in Perl, and in several other languages as well. A simple Perl script could be launched from cron to check websites at whatever interval you wanted.

Are you looking to do the work yourself, or to hire someone? If you are doing this yourself, you'll want to start with LWP, specifically the LWP::UserAgent module, and give the resulting pages to HTML::Parser.

Not mandatory, but the List::Compare module would be really handy as well.
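
To give a feel for the parse-and-compare step, here is a minimal sketch in Python rather than Perl. It is only an analogue of the modules named above: `html.parser.HTMLParser` stands in for HTML::Parser, and a plain set difference stands in for List::Compare. The class and function names are made up for this example, and fetching the page (the LWP::UserAgent part) is left out.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value.strip())


def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


def removed_links(old_html, new_html):
    """Links present in the old page but missing from the new one."""
    return sorted(extract_links(old_html) - extract_links(new_html))
```

Comparing extracted links instead of raw HTML means cosmetic edits to the page do not trigger a notification; only a link that actually vanished does.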

  #3  
December 1st, 2012, 12:28 PM
mhsiao45
Hi Keith,

Definitely not myself. I took a C++ class 15 years ago, and that's about the extent of my programming knowledge.

Are you, or someone else you can recommend, willing to help out?
Just let me know what you would charge, and I can talk to my boss and the rest of my team about it.

If you could PM me the contact details, I can contact you.
Thanks



  #4  
December 2nd, 2012, 09:37 AM
keath
Someone else here may be interested, but I don't hire out, for reasons I'm not entirely sure of. There is probably a better forum, or another section of this forum, for jobs like that.

The one thing to be aware of is that your competitors' websites will probably change their presentation from time to time, which could necessitate changes to a tracking script. In other words, it's the sort of script that is going to need occasional maintenance.
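
One way a script might soften that maintenance problem is to treat a sudden loss of most links as a probable redesign rather than as a batch of dropped manufacturers. The sketch below is in Python for illustration only; the function name, the status strings, and the 50% threshold are all assumptions picked for the example.

```python
def classify_change(old_links, new_links, max_drop=0.5):
    """Return (status, removed) for two collections of link URLs.

    If more than `max_drop` of the old links vanished at once, the page
    layout probably changed and the script itself needs a look.
    """
    old_links, new_links = set(old_links), set(new_links)
    removed = old_links - new_links
    if not removed:
        return "no-removals", []
    if len(removed) > max_drop * len(old_links):
        return "layout-changed?", sorted(removed)
    return "links-removed", sorted(removed)
```

A "layout-changed?" result would go to whoever maintains the script, while "links-removed" would go to the sales team.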

  #5  
December 6th, 2012, 05:54 AM
winblows


If you just wanted notification when the page changed, you could do the following:

1. Grab the web page via LWP::Simple.

2. Calculate an MD5 checksum of the page.
- Take into account any variable parts of the HTML that you may need to strip first, such as headers containing generation dates/times.

3. Compare against the most recently stored checksum. If it differs, send an SMS or email to select recipients.
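
Steps 2 and 3 above might look something like the following. This is a Python sketch for illustration rather than Perl; the function names are made up, and the two "volatile" patterns are only guesses at the sort of content a page might regenerate on every request. Real pages would need their own patterns. Fetching the page and sending the SMS or email are left out.

```python
import hashlib
import re

# Hypothetical patterns for parts of a page that change on every request.
VOLATILE_PATTERNS = [
    re.compile(r"<!--.*?-->", re.DOTALL),                        # HTML comments
    re.compile(r"Page generated (at|on) [^<]*", re.IGNORECASE),  # timestamp text
]


def page_checksum(html):
    """MD5 hex digest of the page with the volatile parts stripped out."""
    for pattern in VOLATILE_PATTERNS:
        html = pattern.sub("", html)
    return hashlib.md5(html.encode("utf-8")).hexdigest()


def has_changed(html, stored_checksum):
    """True when the page no longer matches the stored checksum."""
    return page_checksum(html) != stored_checksum
```

MD5 is used here only as a cheap change detector, not for security. Note that a checksum only says that something changed, not what, so someone still has to look at the page to see which link went away.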

I can code this in Perl, PHP or C# if you're interested.
