can somebody help me make my script more efficient??

October 24th, 2003, 05:09 PM
Junior Member (Join Date: Jul 2003, Posts: 10)
I wrote this little script that goes to finance.yahoo.com and retrieves a stock quote.
Later this will be a function in a larger program, where it will need to run more often. It seems to take about half a second after I enter the symbol before it displays the info. I realize this may be due to network lag, but I'm not sure how efficient my parsing is. I'm not very familiar with Python; is there a more efficient way to parse the information out of the HTML?
Thanks,
Corey
Code:
#!/usr/bin/python
import urllib2

# Get ticker symbol and build the URL from it
quote = raw_input("Please enter the stock's symbol:")
web_url = "http://finance.yahoo.com/q?s=" + quote

# Retrieve the webpage and store its lines in a list
# (named lines rather than list, so the builtin is not shadowed)
lines = urllib2.urlopen(web_url).readlines()

# puts the line containing Last Trade price into price_string
# puts the line containing Trade Time into time_string
# (both start empty so a missing label does not raise a NameError)
price_string = ""
time_string = ""
for x in range(len(lines)):
    for i in range(len(lines[x])):
        if lines[x][i:i+11] == "Last Trade:":
            price_string = lines[x]
        if lines[x][i:i+11] == "Trade Time:":
            time_string = lines[x]

# parses html out of price_string; puts it in quote
quote = ""
wtf = 0  # 1 while inside a tag; set up front so text before the first '<' works
for i in range(len(price_string)):
    if price_string[i] == '<':
        wtf = 1
        continue
    if price_string[i] == '>':
        wtf = 0
        continue
    if wtf == 0:
        quote = quote + price_string[i]
print quote

# parses html out of time_string; puts it in time
time = ""
html = 0  # same in-tag flag as wtf above
for i in range(len(time_string)):
    if time_string[i] == '<':
        html = 1
        continue
    if time_string[i] == '>':
        html = 0
        continue
    if html == 0:
        time = time + time_string[i]
print time

October 24th, 2003, 05:25 PM
Junior Member (Join Date: Sep 2003, Posts: 10)
I've been working with Python for a bit and your code looks sound, but I'm tired and there could be an extra bit of code in there that's slowing it down :/ So my suggestion is to try it on a faster connection; if it's still too slow, debug it step by step, removing and replacing code as you go.

October 24th, 2003, 05:26 PM
Junior Member (Join Date: Oct 2003, Location: Tucson AZ, Posts: 29)
I'm not sure if it would be more efficient, as I don't know how Python handles its resources... but you could download the URL into a file, clean up the resources used to access the web, and then parse the file.
I don't know if it would be quicker, because you have to store it in a file. However, it might turn out faster as the amount of usage goes up. Again, it depends on how resources are handled.

October 24th, 2003, 05:26 PM
Junior Member (Join Date: Sep 2003, Posts: 10)
Oh yeah, and what version of Python are you running? Maybe that can make a difference as well... py 2.3.3 is the best version by far ^_^

October 24th, 2003, 05:36 PM
Junior Member (Join Date: Jul 2003, Posts: 10)
I'm on a fast connection and ping 40ms to finance.yahoo.com.
I think urlopen does store it in a temporary file, then I put that into a list for parsing. The website is always changing though and I always need the new info, so I can't cache it.
And I am using the newest version of Python.
I made it so that after it gets the last piece of info, which will always be below the other piece in the HTML code, it'll break out of the loop... but there's no discernible difference in the execution time. I'm pretty sure that the delay is mostly the time it takes their webserver to respond.
I could make it skip the first 100 or so lines (which is just CSS crap)... But mostly I'd just like to know, out of curiosity, if my parsing algorithm is efficient or not.
Thanks,
Corey
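For reference, a minimal sketch of the early exit described in the post above. It is written for Python 3 rather than the Python 2 used in this thread, and the `find_labelled_lines` helper and the sample page lines are made up for illustration. It stops as soon as both labels have been seen, so it does not rely on which label comes first.

```python
def find_labelled_lines(lines, labels):
    """Return {label: first line containing it}, stopping once all are found."""
    found = {}
    for line in lines:
        for label in labels:
            # `label in line` is a plain substring test; it replaces the
            # character-by-character slicing loop from the original script.
            if label not in found and label in line:
                found[label] = line
        if len(found) == len(labels):
            break  # nothing left to look for, skip the rest of the page
    return found


# Hypothetical stand-in for the downloaded page, one list item per line.
sample_page = [
    "<style>body { margin: 0 }</style>",
    "<td>Last Trade:</td><td><b>27.46</b></td>",
    "<td>Trade Time:</td><td>4:00PM ET</td>",
    "<p>footer that is never scanned</p>",
]

result = find_labelled_lines(sample_page, ["Last Trade:", "Trade Time:"])
print(result["Last Trade:"])
print(result["Trade Time:"])
```

As the later timings in this thread suggest, this mostly tidies the code; the download is likely to dominate the total time either way.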

October 24th, 2003, 09:27 PM
Contributing User (Join Date: Oct 2003, Location: Canada, Posts: 185)
Re: can somebody help me make my script more efficient??
Quote: Originally posted by A|pha_N3rd
I wrote this little script that goes to finance.yahoo.com and retrieves a stock quote. [...]
There is nothing wrong with using someone else's code, man. Just make sure you give credit to the person who first wrote it. In this case Inkdm posted that exact code on a different website. I suggest starting here to get the basics of Python.

October 25th, 2003, 12:46 AM
Contributing User (Join Date: Jul 2003, Posts: 133)
I timed the different parts of your code. Don't worry about the loops, they are fast enough. It is the downloading of the webpage that takes time.
Code:
retrieve: 4.8
puts * 2: 0.18
parse price: 0.0005
parse time: 0.0005
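
The post does not say how these numbers were measured. One way to collect per-step timings like this, sketched in Python 3 with the standard `time.perf_counter` clock, is shown below. The `timed` helper and the two stand-in steps are hypothetical; the sleep only imitates waiting on a web server.

```python
import time


def timed(label, func, *args):
    """Run func(*args), print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.4f} s")
    return result, elapsed


def fake_download():
    # Stand-in for the network fetch; sleeps instead of hitting a server.
    time.sleep(0.05)
    return "<td>Last Trade:</td><td><b>27.46</b></td>\n" * 200


def count_label_lines(page):
    # Stand-in for the parsing step.
    return sum(1 for line in page.splitlines() if "Last Trade:" in line)


page, download_seconds = timed("retrieve", fake_download)
matches, parse_seconds = timed("parse", count_label_lines, page)
```

Timing each step separately like this shows where the wait actually comes from before any code gets rewritten.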

October 25th, 2003, 12:54 PM
Hello World :) (Join Date: Mar 2003, Location: Hull, UK)
You can do the same thing in 3 lines using regular expressions. Whether or not this is faster I don't know, but in theory it should/could be.
Anyway, here's my code (just to promote the power of regex); you might have to make a few changes to it in order to get the formatting you want, though.
Code:
#!/usr/bin/env python
import re, urllib
code = raw_input('Please enter the stock\'s symbol:')
tokens = re.compile('(Trade Time:.+?|Last Trade:.+?)\n')
source = urllib.urlopen('http://finance.yahoo.com/q?s=%s' % code).read()
source = tokens.findall(re.sub('<.+?>', '', source))
print source
This will give you a list containing two values, 'Trade Time:value' and 'Last Trade:value'. With a few small changes to the regex you should be able to get any of the other values too!
Mark.
__________________
programming language development: www.netytan.com – Hula
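
For readers on Python 3, a rough sketch of the same regex idea that runs without a network connection. The `sample_source` fragment is hypothetical and does not reproduce the real Yahoo page; `extract_fields` is an illustrative helper, not part of the code posted above.

```python
import re

# Hypothetical fragment standing in for the downloaded page source.
sample_source = (
    "<tr><td>Last Trade:</td><td><b>27.46</b></td></tr>\n"
    "<tr><td>Trade Time:</td><td>4:00PM ET</td></tr>\n"
    "<tr><td>Volume:</td><td>1,234,500</td></tr>\n"
)

TAG = re.compile(r"<.+?>")                       # non-greedy: one tag at a time
FIELD = re.compile(r"(Trade Time|Last Trade):(.+)")  # label, then rest of line


def extract_fields(source):
    """Strip tags, then return {label: value} for the labels of interest."""
    text = TAG.sub("", source)
    return {label: value.strip() for label, value in FIELD.findall(text)}


fields = extract_fields(sample_source)
print(fields)
```

Returning a dictionary instead of a list of 'label:value' strings is a design choice of this sketch; it saves splitting the strings again later.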

October 26th, 2003, 03:14 AM
Contributing User (Join Date: Jul 2003, Posts: 133)
The problem with regexes is that they're slow. It's almost always faster to parse something in another manner; it's just that regexes are more powerful and easier to use. Don't use them in this case.

October 26th, 2003, 04:23 AM
Junior Member (Join Date: Jul 2003, Posts: 10)
If you're going to accuse me of plagiarising a few lines of crappy code, at least cite the original. I can assure you that I wrote it myself...
percivall, thanks. Executing the line that fetches the website 20 times takes ~10 seconds for me. If I didn't have to wait for the first one to be done it'd be faster... How do I execute 2 lines of code at the same time??
netytan, wow, that's amazing! It took me 30 minutes of sorting through webpages to figure out how to do it, and you did it in 3 lines. I think I'm going to learn more about Python.
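One common answer to the "2 lines at the same time" question is to run the fetches in threads, so the waits overlap. Below is a sketch using Python 3's `concurrent.futures`; the Python 2.3 discussed in this thread would need the lower-level `threading` module instead. `fetch_quote_page` is a hypothetical stand-in that sleeps rather than downloading anything, and the ticker symbols are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_quote_page(symbol):
    # Hypothetical stand-in for the real download: the sleep plays the role
    # of waiting on the web server.
    time.sleep(0.1)
    return f"page for {symbol}"


symbols = ["AAA", "BBB", "CCC", "DDD"]  # made-up ticker symbols

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(symbols)) as pool:
    # map() starts all fetches and yields results in the order of `symbols`.
    pages = list(pool.map(fetch_quote_page, symbols))
elapsed = time.perf_counter() - start

print(pages)
print(f"{len(symbols)} fetches took {elapsed:.2f} s")
```

Because each fetch spends most of its time waiting, the four stand-in calls should finish in roughly the time of one rather than four in a row.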

October 26th, 2003, 04:59 AM
Contributing User (Join Date: Jul 2003, Posts: 133)
(continued from my last post) ... Though in this case, the lesser speed of regexes will make absolutely no difference, so use them.

October 26th, 2003, 08:39 AM
Hello World :) (Join Date: Mar 2003, Location: Hull, UK)
Thanks. At first glance your code seems really OMG lol, but it actually makes a lot of sense if you sit down and read it (of course, having the page source in front of you does help a lot).
I'm a big regex fan; they are definitely one of the most powerful tools to have in a language IMO! I'd hate to sit down and write a parser from scratch every time I want something parsed, especially since webpages change from time to time.
Quote:
The problem with regexes is that they're slow. It's almost always faster to parse something in another manner, it's just that regexes are more powerful and easier...
Not sure about this one, perc. I don't really see how using Python's re module could be slower than parsing a webpage with multiple for loops (not that these are slow!), bearing in mind that the re module's matching engine is written in C. Of course you have the added import time, but how efficient is that!
Maybe you could time the two scripts if you can? Oh, just out of interest, what are you using for this, pystone?
Edit: Infinite, we're on Python 2.3.2; 2.3.3 hasn't been released yet, dude.
Have fun guys,
Mark.
Last edited by netytan : October 26th, 2003 at 09:04 AM.
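
One way to run the comparison asked for above is the standard `timeit` module. The Python 3 sketch below times a character-by-character tag stripper against a precompiled regex on a single hypothetical line of markup. The function names are illustrative, and no result is claimed here, since the numbers depend on the machine and the Python version.

```python
import re
import timeit

TAG = re.compile(r"<[^>]*>")


def strip_tags_loop(line):
    """Character-by-character tag stripper, in the style of the original script."""
    out = []
    in_tag = False
    for ch in line:
        if ch == "<":
            in_tag = True
        elif ch == ">":
            in_tag = False
        elif not in_tag:
            out.append(ch)
    return "".join(out)


def strip_tags_regex(line):
    """Same job on well-formed markup, done with a precompiled regex."""
    return TAG.sub("", line)


# Hypothetical line of markup to strip.
line = "<tr><td>Last Trade:</td><td><b>27.46</b></td></tr>"

loop_seconds = timeit.timeit(lambda: strip_tags_loop(line), number=20000)
regex_seconds = timeit.timeit(lambda: strip_tags_regex(line), number=20000)
print(f"loop:  {loop_seconds:.4f} s")
print(f"regex: {regex_seconds:.4f} s")
```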