urllib.request - Python 2 vs 3
-
I am trying to get a script (GoogleQuote) that uses urllib to work under Python 3 (it works fine in Python 2.7), but the same URL seems to give an error in Python 3.
Specifically, here is the script in Python 2:
import urllib
import time,datetime

class Quote(object):

    DATE_FMT = '%Y-%m-%d'
    TIME_FMT = '%H:%M:%S'

    def __init__(self):
        self.symbol = ''
        self.date,self.time,self.open_,self.high,self.low,self.close,self.volume = ([] for _ in range(7))

    def append(self,dt,open_,high,low,close,volume):
        self.date.append(dt.date())
        self.time.append(dt.time())
        self.open_.append(float(open_))
        self.high.append(float(high))
        self.low.append(float(low))
        self.close.append(float(close))
        self.volume.append(int(volume))

    def to_csv(self):
        return ''.join(["{0},{1},{2},{3:.2f},{4:.2f},{5:.2f},{6:.2f},{7}\n".format(self.symbol,
                  self.date[bar].strftime('%Y-%m-%d'),self.time[bar].strftime('%H:%M:%S'),
                  self.open_[bar],self.high[bar],self.low[bar],self.close[bar],self.volume[bar])
                  for bar in xrange(len(self.close))])

    def write_csv(self,filename):
        with open(filename,'w') as f:
            f.write(self.to_csv())

    def read_csv(self,filename):
        self.symbol = ''
        self.date,self.time,self.open_,self.high,self.low,self.close,self.volume = ([] for _ in range(7))
        for line in open(filename,'r'):
            symbol,ds,ts,open_,high,low,close,volume = line.rstrip().split(',')
            self.symbol = symbol
            dt = datetime.datetime.strptime(ds+' '+ts,self.DATE_FMT+' '+self.TIME_FMT)
            self.append(dt,open_,high,low,close,volume)
        return True

    def __repr__(self):
        return self.to_csv()

class GoogleQuote(Quote):
    ''' Daily quotes from Google.
    Date format='yyyy-mm-dd' '''
    def __init__(self,symbol,start_date,end_date=datetime.date.today().isoformat()):
        super(GoogleQuote,self).__init__()
        self.symbol = symbol.upper()
        start = datetime.date(int(start_date[0:4]),int(start_date[5:7]),int(start_date[8:10]))
        end = datetime.date(int(end_date[0:4]),int(end_date[5:7]),int(end_date[8:10]))
        url_string = "http://www.google.com/finance/historical?q={0}".format(self.symbol)
        url_string += "&startdate={0}&enddate={1}&output=csv".format(
                          start.strftime('%b %d, %Y'),end.strftime('%b %d, %Y'))
        print url_string
        csv = urllib.urlopen(url_string).readlines()
        csv.reverse()
        for bar in xrange(0,len(csv)-1):
            ds,open_,high,low,close,volume = csv[bar].rstrip().split(',')
            open_,high,low,close = [float(x) for x in [open_,high,low,close]]
            dt = datetime.datetime.strptime(ds,'%d-%b-%y')
            self.append(dt,open_,high,low,close,volume)

if __name__ == '__main__':
    q = GoogleQuote('aapl','2011-01-01','2011-01-03')  # download year to date Apple data
    print(q)  # print it out
If you run this, you will see the url_string that is passed and the result (AAPL prices for a few days in 2011).
Here is the Python 3 version of the same script:
import urllib.request
import time,datetime

class Quote(object):

    DATE_FMT = '%Y-%m-%d'
    TIME_FMT = '%H:%M:%S'

    def __init__(self):
        self.symbol = ''
        self.date,self.time,self.open_,self.high,self.low,self.close,self.volume = ([] for _ in range(7))

    def append(self,dt,open_,high,low,close,volume):
        self.date.append(dt.date())
        self.time.append(dt.time())
        self.open_.append(float(open_))
        self.high.append(float(high))
        self.low.append(float(low))
        self.close.append(float(close))
        self.volume.append(int(volume))

    def to_csv(self):
        return ''.join(["{0},{1},{2},{3:.2f},{4:.2f},{5:.2f},{6:.2f},{7}\n".format(self.symbol,
                  self.date[bar].strftime('%Y-%m-%d'),self.time[bar].strftime('%H:%M:%S'),
                  self.open_[bar],self.high[bar],self.low[bar],self.close[bar],self.volume[bar])
                  for bar in xrange(len(self.close))])

    def write_csv(self,filename):
        with open(filename,'w') as f:
            f.write(self.to_csv())

    def read_csv(self,filename):
        self.symbol = ''
        self.date,self.time,self.open_,self.high,self.low,self.close,self.volume = ([] for _ in range(7))
        for line in open(filename,'r'):
            symbol,ds,ts,open_,high,low,close,volume = line.rstrip().split(',')
            self.symbol = symbol
            dt = datetime.datetime.strptime(ds+' '+ts,self.DATE_FMT+' '+self.TIME_FMT)
            self.append(dt,open_,high,low,close,volume)
        return True

    def __repr__(self):
        return self.to_csv()

class GoogleQuote(Quote):
    ''' Daily quotes from Google.
    Date format='yyyy-mm-dd' '''
    def __init__(self,symbol,start_date,end_date=datetime.date.today().isoformat()):
        super(GoogleQuote,self).__init__()
        self.symbol = symbol.upper()
        start = datetime.date(int(start_date[0:4]),int(start_date[5:7]),int(start_date[8:10]))
        end = datetime.date(int(end_date[0:4]),int(end_date[5:7]),int(end_date[8:10]))
        url_string = "http://www.google.com/finance/historical?q={0}".format(self.symbol)
        url_string += "&startdate={0}&enddate={1}&output=csv".format(
                          start.strftime('%b %d, %Y'),end.strftime('%b %d, %Y'))
        print(url_string)
        req = urllib.request.Request(url_string)
        csv = urllib.request.urlopen(req).readlines()
        csv.reverse()
        for bar in xrange(0,len(csv)-1):
            ds,open_,high,low,close,volume = csv[bar].rstrip().split(',')
            open_,high,low,close = [float(x) for x in [open_,high,low,close]]
            dt = datetime.datetime.strptime(ds,'%d-%b-%y')
            self.append(dt,open_,high,low,close,volume)

if __name__ == '__main__':
    q = GoogleQuote('aapl','2011-01-01','2011-01-03')  # download year to date Apple data
    print(q)  # print it out
When I run this I get an HTTP 400 error. Why does the same url_string produce different results? What have I missed?
-
Have you printed out url_string to see that it is, indeed, exactly the same?
-
Yes, if you run the posted scripts they do just that.
-
FWIW, the reason I want this to run under Python 3 is that I'd like to use it as part of an iOS widget, which requires Python 3.
-
I thought the problem might be the user-agent in the header that urllib.request sends (different in Python 3?). I made a few changes to add a user-agent string, but the problem persists:
#!python3
import urllib.request
import time,datetime

class Quote(object):

    DATE_FMT = '%Y-%m-%d'
    TIME_FMT = '%H:%M:%S'

    def __init__(self):
        self.symbol = ''
        self.date,self.time,self.open_,self.high,self.low,self.close,self.volume = ([] for _ in range(7))

    def append(self,dt,open_,high,low,close,volume):
        self.date.append(dt.date())
        self.time.append(dt.time())
        self.open_.append(float(open_))
        self.high.append(float(high))
        self.low.append(float(low))
        self.close.append(float(close))
        self.volume.append(int(volume))

    def to_csv(self):
        return ''.join(["{0},{1},{2},{3:.2f},{4:.2f},{5:.2f},{6:.2f},{7}\n".format(self.symbol,
                  self.date[bar].strftime('%Y-%m-%d'),self.time[bar].strftime('%H:%M:%S'),
                  self.open_[bar],self.high[bar],self.low[bar],self.close[bar],self.volume[bar])
                  for bar in range(len(self.close))])

    def write_csv(self,filename):
        with open(filename,'w') as f:
            f.write(self.to_csv())

    def read_csv(self,filename):
        self.symbol = ''
        self.date,self.time,self.open_,self.high,self.low,self.close,self.volume = ([] for _ in range(7))
        for line in open(filename,'r'):
            symbol,ds,ts,open_,high,low,close,volume = line.rstrip().split(',')
            self.symbol = symbol
            dt = datetime.datetime.strptime(ds+' '+ts,self.DATE_FMT+' '+self.TIME_FMT)
            self.append(dt,open_,high,low,close,volume)
        return True

    def __repr__(self):
        return self.to_csv()

class GoogleQuote(Quote):
    ''' Daily quotes from Google.
    Date format='yyyy-mm-dd' '''
    def __init__(self,symbol,start_date,end_date=datetime.date.today().isoformat()):
        super(GoogleQuote,self).__init__()
        self.symbol = symbol.upper()
        start = datetime.date(int(start_date[0:4]),int(start_date[5:7]),int(start_date[8:10]))
        end = datetime.date(int(end_date[0:4]),int(end_date[5:7]),int(end_date[8:10]))
        url_string = "http://www.google.com/finance/historical?q={0}".format(self.symbol)
        url_string += "&startdate={0}&enddate={1}&output=csv".format(
                          start.strftime('%b %d, %Y'),end.strftime('%b %d, %Y'))
        print(url_string)
        user_agent = 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1'
        headers = {'User-Agent': user_agent}
        req = urllib.request.Request(url_string,data=None,headers=headers)
        csv = urllib.request.urlopen(req).readlines()
        csv.reverse()
        for bar in range(0,len(csv)-1):
            ds,open_,high,low,close,volume = csv[bar].rstrip().split(',')
            open_,high,low,close = [float(x) for x in [open_,high,low,close]]
            dt = datetime.datetime.strptime(ds,'%d-%b-%y')
            self.append(dt,open_,high,low,close,volume)

if __name__ == '__main__':
    q = GoogleQuote('aapl','2011-01-01','2011-01-03')  # download year to date Apple data
    print(q)  # print it out
-
https://gist.github.com/cd0c1833e328f65a815f00684ce6274a
There were a few issues. urlopen didn't like the spaces, so it is better to use urlencode when you can.
Also, csv was bytes, so you had to decode it to a str. Some of the issues could have been avoided if you had used requests, which handles a lot of that for you, but maybe you are trying to stay lightweight for use in the widget?
-
I agree with @JonB that you should always consider using requests instead of urllib as it allows you to write tighter code and it solves so many frustrating corner cases so that you do not have to.
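For anyone who has to stay with urllib, the two issues mentioned above (spaces in the URL and bytes coming back from readlines) can be handled with the standard library alone. The following is only a rough sketch under some assumptions: the helper names build_url and decode_lines are made up for illustration, the utf-8 encoding is a guess about the response, and the sample bytes are invented rather than real quotes. It makes no network call.

```python
from urllib.parse import urlencode

# Endpoint taken from the original post; it may no longer serve data.
BASE_URL = 'http://www.google.com/finance/historical'


def build_url(symbol, start, end):
    """Return the request URL with every query value percent-encoded."""
    params = [
        ('q', symbol.upper()),
        ('startdate', start),   # e.g. 'Jan 01, 2011' (spaces and a comma)
        ('enddate', end),
        ('output', 'csv'),
    ]
    return BASE_URL + '?' + urlencode(params)


def decode_lines(raw_lines, encoding='utf-8'):
    """Turn the bytes lines from urlopen(...).readlines() into str lines.

    The utf-8 default is an assumption about the response encoding.
    """
    return [line.decode(encoding).rstrip() for line in raw_lines]


url = build_url('aapl', 'Jan 01, 2011', 'Jan 03, 2011')
print(url)

# Made-up sample bytes standing in for a real response body.
sample = [b'Date,Open,High,Low,Close,Volume\n',
          b'3-Jan-11,1.00,2.00,0.50,1.50,100\n']
print(decode_lines(sample))
```

With urlencode the spaces become + and the comma becomes %2C, which is the same shape as a URL that a browser would send for this query.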
Some thoughts on data formatting... You are not using DATE_FMT and TIME_FMT so you can lose them. You could create a FMT in their place and use one of the following two approaches:
import datetime
from collections import namedtuple

now = datetime.datetime.now()

# enhance the fmt string to avoid calls to strftime
FMT = '{},{:%Y-%m-%d},{:%H:%M:%S},{:.2f},{:.2f},{:.2f},{:.2f},{}'
print(FMT.format('AAPL', now, now, 1, 2, 3, 4, 5))

# Or define a namedtuple and a format to match
stock_info = namedtuple('stock_info', 'date time open high low close volume')
stock_record = stock_info(now, now, 1, 2, 3, 4, 5)
FMT = ('{},{date:%Y-%m-%d},{time:%H:%M:%S},{open:.2f},{high:.2f},{low:.2f},'
       '{close:.2f},{volume}')
print(FMT.format('AAPL', **stock_record._asdict()))
The namedtuple works well for historical data, which should not be modified after it is created, and Python's tuple is a compact data structure that does not allow modifications.
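To make the read-only point concrete, here is a small sketch of parsing one CSV row into a namedtuple and then trying to change it. The StockInfo and parse_row names are hypothetical, the row values are invented, and the '%d-%b-%y' date format is the one used in the scripts in this thread.

```python
import datetime
from collections import namedtuple

# Hypothetical record type for one daily bar.
StockInfo = namedtuple('StockInfo', 'date open high low close volume')


def parse_row(line):
    """Parse one 'd-Mon-yy,open,high,low,close,volume' line into a StockInfo."""
    ds, open_, high, low, close, volume = line.strip().split(',')
    date = datetime.datetime.strptime(ds, '%d-%b-%y').date()
    return StockInfo(date, float(open_), float(high), float(low),
                     float(close), int(volume))


row = parse_row('3-Jan-11,1.00,2.00,0.50,1.50,100')  # invented values

# Assigning to a field fails because tuples are immutable.
try:
    row.close = 99.0
except AttributeError:
    print('records are read-only')

# _replace returns a new record and leaves the original untouched.
updated = row._replace(close=1.75)
print(row.close, updated.close)
```

If a correction really is needed, _replace makes the copy explicit instead of silently changing stored history.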
-
@jonb and @ccc Thank you for the suggested improvements. The original code was not mine and for whatever reason the author decided on urllib rather than requests. Ystockquote stopped providing historical quotes so this code is meant to fill in that deficiency.
@JonB Thank you!! for taking the time to provide a working script. It is now installed as part of an iOS widget.
-
@jonb I have been using your code under Python 3 without any problems; however, I am now having a problem with the Python 2 version, which was also working until today. Unfortunately I need both (Python 3 for the iOS widget and Python 2 for a webpage). Below is the Python 2 version of GoogleQuote that I have been using. When I run it, I get a string conversion error for the ticker symbol "VTI", but if I use "ADBE" it works, although it gives me a year's worth of data instead of 3 days. All of this was working fine, and it still works fine in the Python 3 version! Color me confused.
import urllib,time,datetime

class Quote(object):

    DATE_FMT = '%Y-%m-%d'
    TIME_FMT = '%H:%M:%S'

    def __init__(self):
        self.symbol = ''
        self.date,self.time,self.open_,self.high,self.low,self.close,self.volume = ([] for _ in range(7))

    def append(self,dt,open_,high,low,close,volume):
        self.date.append(dt.date())
        self.time.append(dt.time())
        self.open_.append(float(open_))
        self.high.append(float(high))
        self.low.append(float(low))
        self.close.append(float(close))
        self.volume.append(int(volume))

    def to_csv(self):
        return ''.join(["{0},{1},{2},{3:.2f},{4:.2f},{5:.2f},{6:.2f},{7}\n".format(self.symbol,
                  self.date[bar].strftime('%Y-%m-%d'),self.time[bar].strftime('%H:%M:%S'),
                  self.open_[bar],self.high[bar],self.low[bar],self.close[bar],self.volume[bar])
                  for bar in xrange(len(self.close))])

    def write_csv(self,filename):
        with open(filename,'w') as f:
            f.write(self.to_csv())

    def read_csv(self,filename):
        self.symbol = ''
        self.date,self.time,self.open_,self.high,self.low,self.close,self.volume = ([] for _ in range(7))
        for line in open(filename,'r'):
            symbol,ds,ts,open_,high,low,close,volume = line.rstrip().split(',')
            self.symbol = symbol
            dt = datetime.datetime.strptime(ds+' '+ts,self.DATE_FMT+' '+self.TIME_FMT)
            self.append(dt,open_,high,low,close,volume)
        return True

    def __repr__(self):
        return self.to_csv()

class GoogleQuote(Quote):
    ''' Daily quotes from Google.
    Date format='yyyy-mm-dd' '''
    def __init__(self,symbol,start_date,end_date=datetime.date.today().isoformat()):
        super(GoogleQuote,self).__init__()
        self.symbol = symbol.upper()
        start = datetime.date(int(start_date[0:4]),int(start_date[5:7]),int(start_date[8:10]))
        end = datetime.date(int(end_date[0:4]),int(end_date[5:7]),int(end_date[8:10]))
        url_string = "http://www.google.com/finance/historical?q={0}".format(self.symbol)
        url_string += "&startdate={0}&enddate={1}&output=csv".format(
                          start.strftime('%b %d, %Y'),end.strftime('%b %d, %Y'))
        csv = urllib.urlopen(url_string).readlines()
        csv.reverse()
        for bar in xrange(0,len(csv)-1):
            ds,open_,high,low,close,volume = csv[bar].rstrip().split(',')
            open_,high,low,close = [float(x) for x in [open_,high,low,close]]
            dt = datetime.datetime.strptime(ds,'%d-%b-%y')
            self.append(dt,open_,high,low,close,volume)

#if __name__ == '__main__':
q = GoogleQuote('vti','2017-09-08','2017-09-12')
print q
-
@jonb I really need some help with this. If I use this URL https://finance.google.com/finance/historical?q=vti&startdate=Sep+19%2C+2017&enddate=Sep+19%2C+2017&num=30&output=csv in a mac browser I get a CSV file with data for Sep 19. However if I open the same URL on the iPad I get an error. I think that Google Finance is now making it difficult to get historical quotes but I think there must be some way to do it especially given the behavior on the Mac. Any help you can provide to get this script (or any script) that can return a price for a given date would be greatly appreciated.
-
Try getting rid of &num=30
-
With or without the num=30, I cannot get this to work. The script posted above generates a URL without that parameter, and it fails with “could not convert string to float”, presumably because the URL is no longer returning useful data. It worked fine after your changes, but no more.
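One way to tell "the server sent something other than quotes" apart from a genuine parsing bug is to check the first line of the response before converting anything to float. This is only a sketch: the header text in EXPECTED_HEADER is an assumption about what the CSV looked like, and parse_quotes and UnexpectedResponse are made-up names. The sample rows are invented.

```python
# Assumed header line of the CSV; adjust if the real response differs.
EXPECTED_HEADER = 'Date,Open,High,Low,Close,Volume'


class UnexpectedResponse(ValueError):
    """Raised when the response does not look like quote CSV."""


def parse_quotes(lines):
    """Parse already-decoded CSV lines, failing early on a non-CSV response."""
    lines = [ln.strip() for ln in lines if ln.strip()]
    # A leading byte-order mark, if present, is stripped before comparing.
    if not lines or lines[0].lstrip('\ufeff') != EXPECTED_HEADER:
        preview = lines[0][:60] if lines else '<empty response>'
        raise UnexpectedResponse(
            'not quote CSV, response starts with: %r' % preview)
    rows = []
    for ln in lines[1:]:
        ds, open_, high, low, close, volume = ln.split(',')
        rows.append((ds, float(open_), float(high), float(low),
                     float(close), int(volume)))
    return rows


good = ['Date,Open,High,Low,Close,Volume', '3-Jan-11,1.00,2.00,0.50,1.50,100']
print(parse_quotes(good))

try:
    parse_quotes(['<html><head><title>Error</title></head></html>'])
except UnexpectedResponse as exc:
    print(exc)
```

With a check like this, an HTML error page shows up as a clear message containing the start of the response instead of a float conversion error deep in the loop.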
-
Problem seems to be the url should be finance.google.com/finance, not google.com/finance
As an aside, the date format can be yyyymmdd or yyyy-mm-dd, so there is no need to reformat what was originally entered.
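As a rough illustration of both points (the finance.google.com host and passing the dates through unchanged), the URL could be built along these lines. The historical_url name is made up, the strptime call is only there to reject malformed input before it reaches the server, and whether the endpoint still answers is not something this sketch checks.

```python
import datetime
from urllib.parse import urlencode

# Host as suggested in this post; the service may no longer serve data.
BASE_URL = 'https://finance.google.com/finance/historical'


def historical_url(symbol, start_date, end_date):
    """Build the CSV URL, passing yyyy-mm-dd dates through unchanged."""
    for value in (start_date, end_date):
        # Validation only; raises ValueError if the date is malformed.
        datetime.datetime.strptime(value, '%Y-%m-%d')
    query = urlencode({
        'q': symbol.upper(),
        'startdate': start_date,
        'enddate': end_date,
        'output': 'csv',
    })
    return BASE_URL + '?' + query


print(historical_url('vti', '2017-09-08', '2017-09-12'))
```

This drops the int(start_date[0:4]) slicing and the strftime('%b %d, %Y') round trip from the earlier scripts, since the dates no longer need to be rebuilt.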
https://gist.github.com/548bc9c2082f1a085a87c20a0b06e2f6
-
I guess they changed the URL. They also removed the link to the historical prices from their web page. However, the script is working again (at least for the time being).
Thank you!