#1
  1. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2013
    Posts
    22
    Rep Power
    0

    Is this possible?


    Hi everyone,

    Is it possible to construct a Python program to scrape data off of a website and save it to different cells in an Excel spreadsheet?

    Thanks,
    -Dennis
  2. #2
  3. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Location
    Iran
    Posts
    149
    Rep Power
    139
    Hello there,


    Yes, I think it is possible. As one example (among other possibilities), you can take a look at the pyExcelerator package for generating Excel workbooks.
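    To illustrate, here is a rough sketch of generating a workbook and writing values into individual cells, assuming pyExcelerator exposes the Workbook/add_sheet/write interface at the top level (the same one its successor xlwt kept); the sheet name and sample data are made up:

        # Rough sketch: write values into spreadsheet cells with the
        # pyExcelerator-style Workbook API (sheet name and data are examples).
        from pyExcelerator import Workbook

        wb = Workbook()
        ws = wb.add_sheet('Businesses')          # one worksheet in the workbook

        rows = [('Acme Ltd', '555-0100'),
                ('Widget Co', '555-0101')]

        for row_idx, (name, phone) in enumerate(rows):
            ws.write(row_idx, 0, name)           # write(row, column, value)
            ws.write(row_idx, 1, phone)

        wb.save('businesses.xls')                # produces an .xls workbook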

    As for extracting the data, it depends first of all on how the script connects to the website and on the format of the data. As an example (just to give an idea), take a look at the following:

    http://stackoverflow.com/questions/10807081/script-to-extract-data-from-web-page
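    For the "connecting to the website" part, a minimal sketch using only the standard library might look like this (the URL and the pattern searched for are just placeholders):

        # Rough sketch: fetch a page and pull data out of it.
        # The URL and the regular expression are illustrative only.
        import re
        import urllib.request

        url = 'http://example.com/listing'       # hypothetical page to scrape
        html = urllib.request.urlopen(url).read().decode('utf-8', 'replace')

        # If the data appears in a regular, predictable format,
        # even a simple regular expression can extract it.
        phones = re.findall(r'\(\d{3}\)\s*\d{3}-\d{4}', html)
        print(phones)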
    Regards,
    Dariyoosh
  4. #3
  5. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    9
    Rep Power
    0
    The best way I have found is to use BeautifulSoup to scrape the data and then write it to a .csv file.

    You can then import the .csv file into Excel and save it as a spreadsheet.
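    A minimal sketch of that approach (the URL, tag names and CSS classes are made up, and it assumes the beautifulsoup4 package is installed):

        # Rough sketch: scrape a listing page with BeautifulSoup and write
        # each entry as a row in a CSV file. The markup used is hypothetical.
        import csv
        import urllib.request
        from bs4 import BeautifulSoup

        url = 'http://example.com/businesses'            # hypothetical listing page
        html = urllib.request.urlopen(url).read()
        soup = BeautifulSoup(html, 'html.parser')

        with open('businesses.csv', 'w', newline='') as f:
            writer = csv.writer(f)
            writer.writerow(['name', 'phone'])           # header row
            for entry in soup.find_all('div', class_='listing'):
                name = entry.find('h2').get_text(strip=True)
                phone = entry.find('span', class_='phone').get_text(strip=True)
                writer.writerow([name, phone])           # one row per business

    Excel can open the resulting businesses.csv directly, and from there it can be saved as a regular spreadsheet file.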
  6. #4
  7. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2013
    Posts
    22
    Rep Power
    0

    Thanks


    Hi Guys/Gals,

    Thanks for the suggestions and input. I will take a look at these possibilities; it would save me a lot of time. What I am doing for my girlfriend is going to a business listing site, looking up businesses, and entering their company info into a spreadsheet, so it would be excellent to have a program to help me with this! Thanks

    Dennis D.
  8. #5
  9. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    9
    Rep Power
    0
    I've done something similar myself in the past, and Beautiful Soup is the way to go.

    Just remember not to be too brutal with the sites you are scraping. I never hit them more than once a second, and I always try to obey the site's robots.txt file. It's just polite to do so.

    You also need to understand EXACTLY how the site is structured in order to get the information you want.
    After that it's just a case of writing out the information into a CSV file.
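    A sketch of those politeness points, using the standard library's robots.txt parser and a one-second pause between requests (the site and paths are placeholders):

        # Rough sketch: obey robots.txt and never hit the site more than
        # once a second. Site and paths are illustrative only.
        import time
        import urllib.request
        import urllib.robotparser

        rp = urllib.robotparser.RobotFileParser()
        rp.set_url('http://example.com/robots.txt')
        rp.read()                                        # fetch and parse robots.txt

        pages = ['/businesses?page=1', '/businesses?page=2']   # hypothetical pages
        for path in pages:
            url = 'http://example.com' + path
            if not rp.can_fetch('*', url):               # skip anything robots.txt disallows
                continue
            html = urllib.request.urlopen(url).read()
            # ... parse `html` and append rows to the CSV here ...
            time.sleep(1)                                # at most one request per second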


    Good luck.
