#1
  1. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date: Aug 2007
    Posts: 29
    Rep Power: 0

    Help Please - how to track website changes?


    Hi,

    I posted this question in the PHP forum, and one of the members said that, given the choice, they would do this in Perl.
    My question here is: would anyone be interested in helping develop this? How tough does this sound?

    Ideally, I'd like to get some sort of email notice, whenever a certain change is made (removal of certain links).


    ANOTHER MEMBER'S APPROACH:

    This would not necessarily need PHP. If I were doing this I would periodically grab the pages of interest and store them somewhere. I'd then do a diff on the latest page and the corresponding previously stored page. I would then analyze those diffs and make the appropriate notifications.

    As a matter of personal preference I would use Perl to do this.
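
    A rough sketch of that store-and-diff idea, written in Python rather than Perl purely for illustration. The function name `removed_lines` is made up for this example, and the two arguments are assumed to be the previously stored copy of a page and the freshly fetched copy, both as plain strings.

```python
import difflib


def removed_lines(old_page, new_page):
    """Return lines that appear in the stored page but not in the latest one."""
    diff = difflib.ndiff(old_page.splitlines(), new_page.splitlines())
    # ndiff prefixes removed lines with "- ", added lines with "+ ",
    # unchanged lines with "  " and hint lines with "? ".
    return [line[2:].strip() for line in diff if line.startswith("- ")]
```

    Anything this returns is a candidate for a notification. A line-based diff will also flag purely cosmetic edits, so the result would still need to be filtered for the links of interest.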



    HERE IS MY ORIGINAL QUESTION BELOW:



    Sorry if I have posted this in completely the wrong place. I have no knowledge of web programming whatsoever.

    I was hoping someone could help out.

    Is there a way to track certain changes to other websites?

    I work for a manufacturers' sales rep firm. Ideally, we want to rep the best manufacturers. As you can imagine, the best are usually already taken, unless a manufacturer feels that its rep firm isn't doing well for them. At that point, the rep firm gets dropped, and usually the rep firm will remove the manufacturer from its website.

    That's where we want to take action and pick up these manufacturers while they are open. However, it's very hard to find out without constantly checking every rep firm's website or relying on word of mouth.

    It would be great if there were a way to track the removal of, say, certain links (manufacturers' links) or images from a certain web page. Ideally, we would get notified when a change occurs, and then we could go see which link was removed.


    For example, this rep firm lists all their manufacturers here:
    http://www.yando.com/caline.htm

    Say CREE Microwave is removed. We would want some way to be notified when that happens.

    Any ideas on how to achieve this?
  2. #2
  3. !~ /m$/
    Devshed Specialist (4000 - 4499 posts)

    Join Date: May 2004
    Location: Reno, NV
    Posts: 4,263
    Rep Power: 1810

    It's easy in Perl, and in several other languages as well. A simple Perl script could be launched from cron to check websites at whatever interval you want.

    Are you looking to do the work yourself, or to hire someone? If you are doing this yourself, you'll want to start with LWP, specifically the LWP::UserAgent module, and give the resulting pages to HTML::Parser.

    Not mandatory, but the List::Compare module would be really handy as well.
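
    For anyone who wants to see the shape of that pipeline, here is a minimal sketch in Python, used only as a stand-in for the Perl modules named above: `urllib.request` plays the role of LWP::UserAgent, `html.parser` the role of HTML::Parser, and a plain set difference the role of List::Compare. The names `LinkCollector`, `extract_links`, `fetch` and `removed_links` are invented for this example, and it assumes the manufacturers appear as ordinary `<a href>` links.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)


def extract_links(html):
    """Parse an HTML string and return the set of link targets in it."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


def fetch(url):
    """Download a page as text (assumes the page decodes as UTF-8)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def removed_links(old_html, new_html):
    """Links present in the stored copy but missing from the new copy."""
    return sorted(extract_links(old_html) - extract_links(new_html))
```

    A cron job would call `fetch` on each page of interest, compare the result with the copy saved on the previous run using `removed_links`, and send a notice if the list is not empty.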
  4. #3
  5. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date: Aug 2007
    Posts: 29
    Rep Power: 0

    Hi Keith,

    Definitely not myself. I took a C++ class 15 years ago, and that's about the extent of my programming knowledge.

    Are you or someone else you can recommend willing to help out?
    Just let me know what you would charge, and I can talk to my boss and the rest of my team about it.

    If you could PM me your contact details, I can contact you.
    Thanks



  6. #4
  7. !~ /m$/
    Devshed Specialist (4000 - 4499 posts)

    Join Date: May 2004
    Location: Reno, NV
    Posts: 4,263
    Rep Power: 1810

    Someone else here may be interested, but I don't hire out for some reason I'm not sure of. I think there is probably a better forum, or another section of the forum for jobs like that.

    The one thing to be aware of is that your competitors' websites will probably change their presentation from time to time, which could necessitate changes to a tracking script. In other words, it's the sort of script that is going to need occasional maintenance.
  8. #5
  9. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date: Dec 2012
    Posts: 5
    Rep Power: 0

    Originally Posted by keath
    Someone else here may be interested, but I don't hire out for some reason I'm not sure of. I think there is probably a better forum, or another section of the forum for jobs like that.

    The one thing to be aware of is that your competitors' websites will probably change their presentation from time to time, which could necessitate changes to a tracking script. In other words, it's the sort of script that is going to need occasional maintenance.

    If you just wanted notification when the page changed, you could do the following:

    1. Grab the web page via LWP::Simple

    2. Calculate the MD5 checksum of the page
    - Take into account any variable parts of the HTML that you may need to strip first, such as headers containing generation dates/times.

    3. Compare against most recently stored checksum - if it differs, send an SMS or email to select recipients.

    I can code this in Perl, PHP or C# if you're interested.
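
    To give an idea of steps 2 and 3, here is a minimal sketch in Python (chosen only for illustration). It assumes the page text has already been downloaded, that the only volatile content lives inside HTML comments, and that the last checksum is kept in a small text file. The names `VOLATILE`, `page_checksum` and `has_changed` are invented for this example, and the notification step is left out.

```python
import hashlib
import re
from pathlib import Path

# Assumption: volatile content such as "<!-- generated 2012-12-01 10:00 -->"
# only appears inside HTML comments. Real pages may need other patterns.
VOLATILE = re.compile(r"<!--.*?-->", re.DOTALL)


def page_checksum(html):
    """MD5 of the page with the volatile parts stripped out."""
    stable = VOLATILE.sub("", html)
    return hashlib.md5(stable.encode("utf-8")).hexdigest()


def has_changed(html, state_file):
    """Compare against the stored checksum, then store the new one."""
    state = Path(state_file)
    new_sum = page_checksum(html)
    old_sum = state.read_text().strip() if state.exists() else None
    state.write_text(new_sum)
    # The first run has nothing to compare against, so it reports no change.
    return old_sum is not None and old_sum != new_sum
```

    MD5 is fine here because the checksum is only used to spot changes, not for security. Note that this tells you that the page changed, not what changed, so someone still has to look at the page after the alert.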
