print leaves console blank
Being fairly new to Python, but not to programming, I have a problem with a script which runs beautifully on Windows with Python 2.7.11, but not in Pythonista on iOS. On iOS, although I can see from the length of the text I want to print to the console that roughly 2000 characters are available, the console (and also a text view, which I tried) stays empty. The same happens with a direct sys.stdout.write followed by a flush. What is wrong?
    import webbrowser
    import httplib
    import sys
    from HTMLParser import HTMLParser

    class MyHTMLParser(HTMLParser):
        t=''

        def handle_data(self, data):
            if len(data) > 40:
                self.t = self.t + '\n' + data

        def get_text(self):
            return self.t

    class textfromweb:
        def get_weather_text(self):
            text = ''
            hc=httplib.HTTPConnection('prognoza.hr:80', timeout=50)
            hc.connect()
            hc.request("GET","http://prognoza.hr/prognoze_e.php?id=jadran_n")
            r = hc.getresponse()
            data = r.read()
            parser = MyHTMLParser()
            parser.feed(data)
            self.text = parser.get_text()
            hc.close()
            return self.text

    t = textfromweb()
    print t.get_weather_text()
If you're trying to print a 20,000 character string, it could just be taking a while to appear. Can you successfully print a smaller string?
2107 characters ;-) Not 20,000 ;-)
I stupidly ran this code. Straight afterwards my iPad Pro started acting crazy. Is this some type of attack at that URL? If it's not, I apologise.
It seems to be some kind of encoding issue. To be honest, I can't quite figure out why exactly this doesn't work, but you're making your life a little harder than it needs to be by using httplib. If you use requests instead, your code becomes much simpler, and it also happens to work:
    from HTMLParser import HTMLParser
    import requests

    class MyHTMLParser(HTMLParser):
        t=''

        def handle_data(self, data):
            if len(data) > 40:
                self.t = self.t + '\n' + data

        def get_text(self):
            return self.t

    class textfromweb:
        def get_weather_text(self):
            r = requests.get('http://prognoza.hr/prognoze_e.php?id=jadran_n', timeout=50)
            data = r.text
            parser = MyHTMLParser()
            parser.feed(data)
            self.text = parser.get_text()
            return self.text

    t = textfromweb()
    print t.get_weather_text()
@Phuket2 There is no attack at this URL, as far as I know.
@omz You helped me out of this trap, thanks! About the encoding issue: it's a Croatian website, and I didn't care about the encoding they are using, yet...
Adding .decode('latin1') also fixes the problem in the original code.
The text included several \xfc-type umlauts.
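A minimal sketch of what that decode step does, assuming it is applied to the raw bytes returned by r.read(). The sample bytes below are made up for illustration and are not taken from the actual page:

```python
# Hypothetical sample: bytes as a Latin-1 encoded page might deliver them.
# \xfc is the Latin-1 byte for a u with umlaut.
raw = b'K\xfchl und windig'

# Latin-1 maps every byte to exactly one character, so this never fails.
text = raw.decode('latin1')
print('%d bytes decoded to %d characters' % (len(raw), len(text)))

# The same bytes are not valid UTF-8, so a UTF-8 decode raises an error.
try:
    raw.decode('utf-8')
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print('valid UTF-8: %s' % valid_utf8)
```

Once the data is decoded, the parser and the console receive text rather than raw bytes.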
@JonB Ah, right, I mixed up encode and decode when I tried that. Happens to me all the time...
@omz The whole encode/decode thing is a mess that I can never get right. to_unicode and to_bytes would be so much more Pythonic and clear!
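Python has no built-in functions with those names, but they are easy to sketch. The helpers below are hypothetical (the names come from the wish above, and the default encoding of UTF-8 is an assumption); they are written so that they run on both Python 2 and Python 3:

```python
def to_unicode(value, encoding='utf-8'):
    """Return value as text, decoding it only if it is a byte string."""
    if isinstance(value, bytes):       # bytes is an alias of str on Python 2
        return value.decode(encoding)
    return value


def to_bytes(value, encoding='utf-8'):
    """Return value as bytes, encoding it only if it is text."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


umlaut_text = to_unicode(b'\xfc', 'latin1')      # bytes in, text out
umlaut_bytes = to_bytes(umlaut_text, 'latin1')   # text in, bytes out
```

Because each helper leaves values of the target type untouched, calling it twice is harmless, which avoids the guesswork of choosing between encode and decode.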
@JonB @omz I can't do anything in that category at all… when I get encoding / decoding errors I usually just try to stick x.decode('utf-8') wherever I can and hope it works 😛
I'm usually at a loss when it comes to making programs that work with non-ASCII characters. It took me several months before I could make wikipedia map work with Unicode characters. If either of you knows any good resources or guides on that, I'd love to hear about them.
@pythonista, as I said, I apologise if I was wrong, which it appears I am. Just after I ran your code, my iPad started going crazy. It could have been water on my screen or something else. Just bad timing.
Again, sorry. I just had to ask the hard way; I am not good enough to know whether I was under attack or not.
@omz @JonB @Webmaster4o If it helps, http://bit.ly/unipain is a good explanation of what Unicode and text encodings are, why they always go wrong, why Python 2 is weird and how to do Unicode properly in Python.
When I get encoding / decoding errors I usually just... switch to Python 3. ;-)
One thing to watch out for is that t is defined at class level, so ALL MyHTMLParser instances share a single t until one of them assigns to self.t.
One piece of syntactic sugar could be to define a HTMLParser.__str__() method in place of the get_text() method:

    def __str__(self):
        return self.t

    # ...
    print(parser)
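Both points can be combined in one small sketch. The class name TextCollector, the chunks list, and the sample HTML are hypothetical; the state is created in __init__ so that each parser owns its own data, and __str__ takes over the role of get_text(). Unlike the original, it joins the pieces with newlines rather than prefixing each one:

```python
try:
    from html.parser import HTMLParser   # Python 3
except ImportError:
    from HTMLParser import HTMLParser    # Python 2, as used in this thread


class TextCollector(HTMLParser):
    def __init__(self):
        # Explicit base-class call; this form works on Python 2 and 3.
        HTMLParser.__init__(self)
        self.chunks = []                 # per-instance state, not class-level

    def handle_data(self, data):
        if len(data) > 40:               # same length threshold as the original
            self.chunks.append(data)

    def __str__(self):
        return '\n'.join(self.chunks)


sample_html = '<p>' + 'x' * 50 + '</p><p>short</p>'
parser = TextCollector()
parser.feed(sample_html)
parser.close()
result = str(parser)

other = TextCollector()                  # a second parser starts out empty
```

With this layout, print(parser) shows the collected text directly, and a second parser does not see anything the first one collected.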
@dgelessus Your article describes my situation perfectly:
your program started belching UnicodeErrors. You kind of knew what to do with those, so you added an encode or a decode where the error was raised, but the UnicodeError happened somewhere else.
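One common way to avoid chasing such errors around is to decode bytes once where they enter the program, work only with text inside, and encode once where the result leaves. The sketch below illustrates that pattern under stated assumptions: the function names, the sample bytes, and the choice of Latin-1 for input and UTF-8 for output are all hypothetical.

```python
import io


def read_text(raw_bytes, encoding):
    # Input boundary: bytes become text exactly once.
    return raw_bytes.decode(encoding)


def shout(text):
    # Inside the program everything is text, so no encode/decode is needed.
    return text.upper()


def write_text(stream, text, encoding='utf-8'):
    # Output boundary: text becomes bytes exactly once.
    stream.write(text.encode(encoding))


raw = b'k\xfchl'                         # hypothetical Latin-1 input
out = io.BytesIO()                       # stands in for a file or socket
write_text(out, shout(read_text(raw, 'latin1')))
encoded = out.getvalue()
```

Keeping the conversions at the edges means a UnicodeError, if one occurs, points at the boundary where the wrong encoding was assumed.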