December 1st, 2012, 12:25 PM
Help please - how to track website changes?
I posted this question in the PHP forum, and one of the members said that if they had a preference, they would do this in Perl.
My question here is, would anyone be interested in helping develop this? How tough does this sound?
Ideally, I'd like to get some sort of email notice, whenever a certain change is made (removal of certain links).
ANOTHER MEMBER'S APPROACH:
This would not necessarily need PHP. If I were doing this I would periodically grab the pages of interest and store them somewhere. I'd then do a diff on the latest page and the corresponding previously stored page. I would then analyze those diffs and send the appropriate notifications.
As a matter of personal preference I would use Perl to do this.
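The store-and-diff idea above can be sketched roughly like this. It is in Python rather than the Perl the member prefers, and it assumes the two copies of the page are already in hand as strings; fetching, storing and emailing are left out. The function name and the sample "Acme" link are made up for illustration.

```python
import difflib

def removed_lines(previous_page, latest_page):
    """Return the lines that are in the stored copy but not in the latest copy."""
    diff = difflib.ndiff(previous_page.splitlines(), latest_page.splitlines())
    # ndiff marks lines that exist only in the first input with a "- " prefix.
    return [line[2:] for line in diff if line.startswith("- ")]

if __name__ == "__main__":
    # Hypothetical page snapshots; a real script would load them from disk.
    stored = "<a href='/cree'>CREE Microwave</a>\n<a href='/acme'>Acme</a>"
    latest = "<a href='/acme'>Acme</a>"
    for line in removed_lines(stored, latest):
        print("Removed:", line)
```

A line-based diff like this will also report lines that merely changed, so the "analyze those diffs" step would still have to decide which removals matter.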
HERE IS MY ORIGINAL QUESTION BELOW:
Sorry if I have posted this in completely the wrong place. I have no knowledge of web programming whatsoever.
I was hoping someone could help out.
Is there a way to track certain changes to other websites?
I work for a manufacturers' sales rep firm. Ideally, we want to rep the best manufacturers. As you can imagine, the best are usually already taken, unless that manufacturer feels its rep firm isn't doing well for them. At that point, the rep firm gets dropped, and usually the rep firm will remove the manufacturer from its website.
That's where we want to take action and pick up these manufacturers while they are open. However, it's very hard to find out without constantly checking every rep firm's website. Either that, or just word of mouth.
It would be great if there was a way to track the removal of say, certain links (manufacturer's links) or images from a certain web page. Ideally, we would want to get notified when a change occurs, and then we can go see what link was removed.
For example, this rep firm lists all their manufacturers here:
Say CREE Microwave is removed. We would want some way to be notified when that happens.
Any ideas on how to achieve this?
December 1st, 2012, 12:37 PM
It's easy in Perl, and in several other languages as well. A simple Perl script could be launched from cron to check websites at whatever interval you wanted.
Are you looking to do the work yourself, or to hire someone? If you are doing this yourself, you'll want to start with LWP, specifically the LWP::UserAgent module, and give the resulting pages to HTML::Parser.
Not mandatory, but the List::Compare module would be really handy as well.
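To give a feel for what such a script does, here is a rough sketch in Python rather than Perl: the standard `html.parser` module plays the part of HTML::Parser, and a plain set difference plays the part of List::Compare. Downloading the page (the LWP step) is left out, and the class name, function names and example URLs are invented for illustration.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)

def extract_links(html):
    """Return the set of link targets found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links

def removed_links(previous_html, latest_html):
    """Links present on the previous copy of the page but missing now."""
    return sorted(extract_links(previous_html) - extract_links(latest_html))

if __name__ == "__main__":
    # Hypothetical snapshots of a rep firm's manufacturer list.
    before = ('<a href="http://cree.example/">CREE Microwave</a>'
              '<a href="http://acme.example/">Acme</a>')
    after = '<a href="http://acme.example/">Acme</a>'
    print(removed_links(before, after))
```

Comparing extracted links instead of raw HTML means a cosmetic redesign of the page is less likely to trigger a false alarm, as long as the links themselves stay the same.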
December 1st, 2012, 01:28 PM
Definitely not myself. I took a C++ class 15 years ago, and that's about the extent of my programming knowledge.
Are you or someone else you can recommend willing to help out?
Just let me know what you would charge, and I can talk to my boss and the rest of my team about it.
If you could PM me the contact details, I can contact you.
Originally Posted by keath
December 2nd, 2012, 10:37 AM
Someone else here may be interested, but I don't hire out for some reason I'm not sure of. I think there is probably a better forum, or another section of the forum for jobs like that.
The one thing to be aware of is that your competitors' websites will probably change their presentation from time to time, which could necessitate changes to a tracking script. In other words, it's the sort of script that is going to need occasional maintenance.
December 6th, 2012, 06:54 AM
If you just wanted notification when the page changed, you could do the following:
1. Grab the web page via LWP::Simple
2. Calculate the MD5 checksum of the page
- Take into account any variable content in the HTML that you may need to strip first, such as headers containing generation dates/times.
3. Compare against most recently stored checksum - if it differs, send an SMS or email to select recipients.
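Steps 2 and 3 might look something like the sketch below, written in Python as one possible choice. It assumes the page has already been fetched into a string, and it assumes the only volatile content is inside HTML comments; a real page may need different stripping rules. The function names are made up, and sending the SMS or email is left out.

```python
import hashlib
import re

# Assumption: generation timestamps and similar noise live in HTML comments.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)

def page_checksum(html):
    """MD5 hex digest of the page after stripping the volatile parts."""
    stable = COMMENT_RE.sub("", html)
    return hashlib.md5(stable.encode("utf-8")).hexdigest()

def has_changed(html, stored_checksum):
    """True if the page no longer matches the most recently stored checksum."""
    return page_checksum(html) != stored_checksum

if __name__ == "__main__":
    # Hypothetical snapshots that differ only in a generation timestamp.
    monday = "<!-- generated 2012-12-01 --><p>CREE Microwave</p>"
    friday = "<!-- generated 2012-12-06 --><p>CREE Microwave</p>"
    print(has_changed(friday, page_checksum(monday)))
```

A checksum only says that something changed, not what, so someone would still need to look at the page after each notification.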
I can code this in Perl, PHP or C# if you're interested.