Python Programming
 
#1
October 24th, 2012, 08:40 PM
Clarklight
Need help: can't install BeautifulSoup, can't web scrape

I am brand new to coding and need to write some code for a work project. I hope people here can help me. I have read and tried over and over, and I have no idea why I can't get my code to work.

Here is my situation: I need to get some web scraping done, but I can't even install BeautifulSoup on my Mac. I followed everyone else's instructions and typed python setup.py install, and this is what I got:

>>> python setup.py install
File "<stdin>", line 1
python setup.py install
^
SyntaxError: invalid syntax

I don't really know how to get it installed.

Then I have to scrape data from this site:
drexel.bncollege.c 0m/webapp/wcs/stores/servlet/TBWizardView?catalogId=10001&storeId=31061&langId=-1
(I had to change the "o" to "0" because I can't post URLs in posts.)


I need every single campus, every term, every department, every course, every section, and every book with its price. Then I have to put this data into an Excel file, each item under its own column heading.
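One possible way to get data like this into Excel is to write a CSV file, which Excel opens directly, with the column headings as the first row. The following is only a minimal sketch, written for Python 3 with the standard library. The column names follow the list above; the write_rows helper and the sample row are made up for illustration and are not taken from the site.

```python
import csv
import io

# Column headings taken from the post; everything else here is hypothetical.
COLUMNS = ["Campus", "Term", "Department", "Course", "Section", "Book", "Price"]


def write_rows(rows, out_file):
    """Write a header row followed by one line per scraped book."""
    writer = csv.DictWriter(out_file, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Made-up record, only to show the shape of the data.
sample_rows = [
    {"Campus": "Example Campus", "Term": "Winter 12-13", "Department": "CS",
     "Course": "101", "Section": "001", "Book": "Example Title", "Price": "10.00"},
]

# Write into an in-memory buffer so the result can be inspected.
buffer = io.StringIO()
write_rows(sample_rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To produce a real file instead of an in-memory string, the same function can be given a file opened with open("books.csv", "w", newline=""), where books.csv is just an example name.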

I made an attempt at scraping, but I have no idea how to put the results into Excel.
Below is what I wrote. I just used the code from a tutorial, because I don't know where else I can

from urllib import urlopen
from BeautifulSoup import BeautifulSoup
import re

# The URL is altered because I can't post URLs in posts; the real one
# needs its scheme prefix and "com" instead of "c 0m".
webpage = urlopen('drexel.bncollege.c 0m/webapp/wcs/stores/servlet/TBWizardView?catalogId=10001&storeId=31061&langId=-1').read()

# Patterns for the page title and for <link rel=... href="..."/> tags
patFinderTitle = re.compile('<title>(.*)</title>')
patFinderLink = re.compile('<link rel.*href="(.*)"/>')

findPatTitle = re.findall(patFinderTitle, webpage)
findPatLink = re.findall(patFinderLink, webpage)

# Text that marks the start of the part of each linked page to keep
marker = '<div align="left">Schedule for Winter Quarter 12-13</div>'

# Start at index 1 as in the tutorial, but never run past the matches found
for i in range(1, min(20, len(findPatTitle), len(findPatLink))):
    print findPatTitle[i]
    print findPatLink[i]

    articlePage = urlopen(findPatLink[i]).read()

    # Keep the 1000 characters that start at the marker; skip pages without it
    divBegin = articlePage.find(marker)
    if divBegin == -1:
        continue
    article = articlePage[divBegin:divBegin + 1000]

    soup = BeautifulSoup(article)

    paraglist = soup.findAll('p')

    for para in paraglist:
        print para
        print "\n"
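The patterns in the code above use a greedy .* match, so two tags on the same line would be captured as a single match. As an alternative to regular expressions, here is a rough sketch that pulls out the title and the link hrefs with the standard library's html.parser. It is written for Python 3, unlike the Python 2 code in the post, and it is not BeautifulSoup. The TitleAndLinkParser class and the sample HTML are invented for illustration.

```python
from html.parser import HTMLParser


class TitleAndLinkParser(HTMLParser):
    """Collect the <title> text and the href of every <link> tag."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />
        if tag == "title":
            self.in_title = True
        elif tag == "link":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


# Made-up page, standing in for the downloaded HTML.
sample_html = (
    '<html><head><title>Example page</title>'
    '<link rel="stylesheet" href="style.css"/>'
    '<link rel="alternate" href="feed.xml"/>'
    '</head><body><p>Hello</p></body></html>'
)

parser = TitleAndLinkParser()
parser.feed(sample_html)
print(parser.title)
print(parser.links)
```

Both link tags sit on one line in the sample, which is the case where the greedy pattern would merge them.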

Thanks for your time and help.

#2
October 24th, 2012, 09:51 PM
b49P23TIvg
I don't know anything about Beautiful Soup, however...

You should run this program from the operating system shell, not from the Python shell.


$ python setup.py install

or, if you happen to use a DOS computer,
A:> python setup.py install
__________________
[code]Code tags[/code] are essential for python code!

#3
October 24th, 2012, 11:49 PM
Clarklight
Quote:
Originally Posted by b49P23TIvg
I don't know anything about Beautiful Soup, however...

You should run this program from the operating system shell, not from the Python shell.


$ python setup.py install

or, if you happen to use a DOS computer,
A:> python setup.py install


I ran it in Terminal on my Mac, though, and it still gave me that error. I have no idea what I'm supposed to do.

#4
October 25th, 2012, 07:34 AM
b49P23TIvg
$

That is your shell prompt.

>>>

This is a Python prompt.

Running the python setup command in a terminal is correct.

Starting Python first, and then giving the python setup command, is wrong.
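To illustrate why the >>> prompt rejects the command: it tries to read the line as Python source, and "python setup.py install" is not valid Python. A small sketch, assuming Python 3; the is_python_code helper is a made-up name that just reports whether a string parses as Python.

```python
import ast


def is_python_code(text):
    """Return True if text parses as Python source, False otherwise."""
    try:
        ast.parse(text)
    except SyntaxError:
        return False
    return True


# The install command is a shell command, so it fails to parse as Python.
print(is_python_code("python setup.py install"))  # False
# An ordinary Python statement parses fine.
print(is_python_code("import sys"))               # True
```

The SyntaxError in the first post comes from the same kind of failure, raised by the interactive interpreter.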




Correct procedure:
1) Open a terminal.
2) Change directory to the directory holding the Beautiful Soup setup.py file. Use the cd command.
3) python setup.py install
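After step 3, one quick way to check whether the install worked is to ask Python if it can find the module. This is only a sketch, assuming Python 3 and the module name BeautifulSoup used in the import line of the first post; is_installed is a hypothetical helper, and it has to run with the same Python that ran setup.py.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# json ships with Python, so this shows what a positive result looks like.
print(is_installed("json"))
# Prints True only if the install in step 3 succeeded for this interpreter.
print(is_installed("BeautifulSoup"))
```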


© 2003-2013 by Developer Shed. All rights reserved.