#1
can somebody help me make my script more efficient??
I wrote this little script that goes to finance.yahoo.com and retrieves a stock quote.
Later this will be a function in a larger program, and it will need to be executed more often. It seems to take about half a second after inputting the symbol before it displays the info. I realize this may be due to lag, but I'm not sure how efficient my parsing is. I'm not very familiar with Python; is there a more efficient way to parse the information out of the HTML??
Thanks,
Corey
Code:
#!/usr/bin/python
import urllib2

#Get ticker symbol and parse it into a URL
quote=raw_input("Please enter the stock's symbol:")
web_url = "http://finance.yahoo.com/q?s=" + quote

#Retrieve the webpage and store it into a list
list = (urllib2.urlopen(web_url)).readlines()

#puts the line containing Last Trade price into price_string
#puts the line containing Trade Time into time_string
for x in range(len(list)):
    for i in range(len(list[x])):
        if list[x][i:i+11]=="Last Trade:":
            price_string = list[x]
        if list[x][i:i+11]=="Trade Time:":
            time_string = list[x]

#parses html out of price_string; puts it in quote
quote=""
wtf=0   #1 while inside a tag; set first so a line not starting with '<' works
for i in range(len( price_string)):
    if price_string[i]=='<':
        wtf=1
        continue
    if price_string[i]=='>':
        wtf=0
        continue
    if wtf==0:
        quote=quote+ price_string[i]
print quote

#parses html out of time_string; puts it in time
time=""
html=0  #1 while inside a tag; set first for the same reason as above
for i in range(len( time_string)):
    if time_string[i]=='<':
        html=1
        continue
    if time_string[i]=='>':
        html=0
        continue
    if html==0:
        time=time+ time_string[i]
print time
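A minimal sketch of a tighter version of the line search above, written for Python 3 (the script in this post is Python 2). The function name find_labeled_lines and the sample lines are made up for illustration; the real page layout is not assumed. The idea is to let the `in` operator do the substring search instead of slicing at every index, and to stop reading once both labels have been found.

```python
def find_labeled_lines(lines, labels):
    """Return {label: first line containing it}, stopping once all are found."""
    found = {}
    for line in lines:
        for label in labels:
            if label not in found and label in line:
                found[label] = line
        if len(found) == len(labels):
            break  # nothing left to look for, so skip the rest of the page
    return found


# Made-up sample lines standing in for the downloaded page.
sample = [
    "<html><head><style>css here</style></head>",
    "<td>Last Trade:</td><td><b>26.61</b></td>",
    "<td>Trade Time:</td><td>4:00PM ET</td>",
    "<td>Footer</td>",
]
result = find_labeled_lines(sample, ["Last Trade:", "Trade Time:"])
print(result["Last Trade:"])
print(result["Trade Time:"])
```

Whether this is noticeably faster depends on how much of the total time is spent parsing rather than downloading.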

#2
I've been working with Python for a bit and your commands look sound... but I am tired, and there could be an extra bit of code in there that's slowing it down :/ So my suggestion is to see if you can get on a faster line and try it out there. If it's still too slow, then debug it step by step, removing and replacing code.

#3
I'm not sure if it would be more efficient, as I don't know how Python handles its resources... but you could download the URL into a file, clean up the resources from accessing the web, and then parse the file.
I don't know if it would be quicker, because you have to store it in a file. However, it's possible that it might be faster as the amount of usage goes up. Again, it depends on how resources are handled.

#4
Oh ya, and what vers of Python are you running? Maybe that can make the difference as well... py 2.3.3 is the best vers by far ^_^

#5
I'm on a fast connection and ping 40ms to finance.yahoo.com.
I think urlopen does store it in a temporary file, then I put that into a list for parsing. The website is always changing, though, and I always need the new info, so I can't cache it. And I am using the newest version of Python.
I made it so that after it gets the last piece of info, which will always be below the other piece in the HTML code, it breaks out of the loop... but there's no discernible difference in the execution time. I'm pretty sure that the delay is mostly the time it takes their webserver to respond. I could make it skip the first 100 or so lines (which is just CSS crap)... But mostly I'd just like to know, out of curiosity, whether my parsing algorithm is efficient or not.
Thanks,
Corey

#6
Re: can somebody help me make my script more efficient??
Quote:

#7
I timed the different parts of your code. Don't worry about the loops, they are fast enough. It is the downloading of the webpage that takes time.
Code:
retrieve: 4.8
puts * 2: 0.18
parse price: 0.0005
parse time: 0.0005
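A rough sketch of how per-step timings like the ones above could be collected, written for Python 3. This is not the code that produced those numbers. The helper names timed and strip_tags are hypothetical, and the sample line is made up so that the example runs without a network connection.

```python
import time


def timed(label, func, *args):
    """Call func(*args), print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    print("%s: %.4f" % (label, elapsed))
    return result, elapsed


def strip_tags(line):
    """Drop everything between '<' and '>', like the loops in the first post."""
    out = []
    in_tag = False
    for ch in line:
        if ch == "<":
            in_tag = True
        elif ch == ">":
            in_tag = False
        elif not in_tag:
            out.append(ch)
    return "".join(out)


# Made-up line standing in for the one holding the last trade price.
text, seconds = timed("parse price", strip_tags,
                      "<td>Last Trade:</td><td><b>26.61</b></td>")
print(text)
```

Wrapping the download call in the same helper would show how the retrieval time compares with the parsing time on a given connection.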

#8
You can do the same thing in 3 lines using regular expressions. Whether or not this is faster I don't know... but in theory it should/could be.
Anyway, here's my code (just to promote the power of regex); you might have to make a few changes to it in order to get the formatting you want, though.
Code:
#!/usr/bin/env python
import re, urllib
code = raw_input('Please enter the stock\'s symbol:')
tokens = re.compile('(Trade Time:.+?|Last Trade:.+?)\n')
source = urllib.urlopen('http://finance.yahoo.com/q?s=%s' % code).read()
source = tokens.findall(re.sub('<.+?>', '', source))
print source
This will give you a list containing two values, 'Trade Time:value' and 'Last Trade:value'. With a few little changes to the regex you should be able to get any of the other values too!
Mark.
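The same two patterns can be tried offline. Below is a small Python 3 sketch (the script above is Python 2) that applies them to a made-up HTML fragment. SAMPLE_HTML and the function name extract_fields are assumptions for illustration; the real page markup may well differ.

```python
import re

TAG = re.compile(r"<.+?>")  # any tag that sits on a single line
FIELD = re.compile(r"(Trade Time:.+?|Last Trade:.+?)\n")


def extract_fields(html):
    """Strip the tags, then return the 'Last Trade:' and 'Trade Time:' lines."""
    text = TAG.sub("", html)
    return FIELD.findall(text)


# Made-up fragment; the real page layout is not assumed here.
SAMPLE_HTML = (
    "<table>\n"
    "<tr><td>Last Trade:</td><td><b>26.61</b></td></tr>\n"
    "<tr><td>Trade Time:</td><td>4:00PM ET</td></tr>\n"
    "</table>\n"
)
print(extract_fields(SAMPLE_HTML))
```

The values come back in the order they appear in the text, so the order of the list follows the page, not the order of the alternatives in the pattern.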

#9
The problem with regexes is that they're slow. It's almost always faster to parse something in another manner, it's just that regexes are more powerful and easier. Don't use them in this case.

#10
If you're going to accuse me of plagiarising a few lines of crappy code, at least cite the original. I can assure you that I wrote it myself...
percivall, thanks. Executing the line that fetches the website 20 times takes ~10 seconds for me. If I didn't have to wait for the first one to be done it'd be faster... How do I execute 2 lines of code at the same time??
netytan, wow, that's amazing! It took me 30 minutes of sorting through webpages to figure out how to do it, and you did it in 3 lines. I think I'm going to learn more about Python.
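One common answer to the "2 lines at the same time" question is threads, since the program spends most of its time waiting on the network. The sketch below is for Python 3 and uses concurrent.futures; on the Python 2.3 discussed in this thread the threading module would be the tool instead. The names fetch_all and fake_fetch are hypothetical, and fake_fetch only sleeps to imitate a slow download, so no real request is made.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_all(symbols, fetch, max_workers=5):
    """Run fetch(symbol) for every symbol on a pool of threads, keeping order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, symbols))


def fake_fetch(symbol):
    """Stand-in for a slow download: sleeps, then returns a made-up string."""
    time.sleep(0.1)
    return "page for " + symbol


symbols = ["AAA", "BBB", "CCC", "DDD"]
start = time.perf_counter()
pages = fetch_all(symbols, fake_fetch)
print(pages)
print("elapsed: %.2f" % (time.perf_counter() - start))
```

To fetch real pages, pass a function that downloads and returns the page for one symbol in place of fake_fetch. The waits then overlap instead of adding up, though the server's own response time still sets a floor.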

#11
(continued from my last post) ... Though in this case, the lesser speed of regexes will make absolutely no difference, so use them.

#12
Thanks. At first glance your code seems to be really OMG lol, but it actually makes a lot of sense if you sit down and read it (of course, having the page source in front of you does help a lot).
I'm a big regex fan; they are definitely one of the most powerful tools to have in a language, IMO! I'd hate to sit down and write a parser from scratch every time I want something parsed, especially since webpages change from time to time.
Quote:
Not sure about this one, perc. I don't really see how using Python's re module could be slower than parsing a webpage with multiple for loops (not that these are slow!), bearing in mind that the re module's matching engine is written in C. Of course you have the added import time, but how efficient is that! Maybe you could time the two scripts if you can?
Oh, just out of interest, what are you using for this, pystone?
Edit: Infinite, we're on Python 2.3.2... 2.3.3 hasn't been released yet, dude.
Have fun guys,
Mark.
Last edited by netytan : October 26th, 2003 at 09:04 AM.
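The loop-versus-regex question can be checked directly with the timeit module. Here is a minimal Python 3 sketch that times both ways of stripping tags from one made-up line. The function names and the sample line are assumptions, and the numbers will vary by machine and Python version, so no result is claimed here.

```python
import re
import timeit

TAG = re.compile(r"<.+?>")


def strip_tags_loop(line):
    """Character-by-character version, like the loops in the first post."""
    out = []
    in_tag = False
    for ch in line:
        if ch == "<":
            in_tag = True
        elif ch == ">":
            in_tag = False
        elif not in_tag:
            out.append(ch)
    return "".join(out)


def strip_tags_regex(line):
    """Regex version, like the re.sub call in post #8."""
    return TAG.sub("", line)


# Made-up line; a real page line would be longer.
LINE = "<tr><td>Last Trade:</td><td><b>26.61</b></td></tr>"

for func in (strip_tags_loop, strip_tags_regex):
    seconds = timeit.timeit(lambda: func(LINE), number=10000)
    print("%s: %.4f s for 10000 runs" % (func.__name__, seconds))
```

Either way, the timings posted in #7 suggest the download dominates, so the choice between the two is mostly about readability.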
Viewing: Dev Shed Forums > Programming Languages > Python Programming > can somebody help me make my script more efficient??