@JonB Thanks! That's exactly what I was looking for. For anyone else who needs to do this in the future, here's my final code:
    # coding: utf-8
    import threading
    import ui


    class Scraper(object):
        def __init__(self, url, js='document.documentElement.outerHTML'):
            self.js = js
            self.response = ''
            # Create the event before loading so the delegate callback can never fire first.
            self.ready_event = threading.Event()
            self.wv = ui.WebView()
            self.wv.delegate = self
            self.wv.load_url(url)
            # Block the calling thread until the page has finished loading.
            self.ready_event.wait()

        def webview_did_finish_load(self, webview):
            # Called once the page has loaded: evaluate the JavaScript and release the waiter.
            self.response = webview.eval_js(self.js)
            self.ready_event.set()


    def main():
        r = Scraper('https://www.google.com', 'document.title;').response
        print('Response: ' + r)


    if __name__ == '__main__':
        main()
I basically got rid of the callback. The constructor now blocks until the page has loaded, so the response is available as soon as control returns to the main thread, which is where I want to continue processing.
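For anyone who wants to try the same waiting pattern outside Pythonista, here is a minimal sketch that does not need the `ui` module. It is not the code I actually run: `BlockingFetch`, `did_finish_load`, `delay`, and `timeout` are hypothetical names, and a `threading.Timer` stands in for the web view's load-finished callback. It also assumes you would want a timeout on the wait so that a page that never finishes loading cannot hang the script forever.

```python
import threading


class BlockingFetch(object):
    """Hypothetical stand-in for Scraper: blocks until a callback delivers a result."""

    def __init__(self, value, delay=0.1, timeout=5.0):
        self.response = None
        self.ready_event = threading.Event()
        # Stand-in for wv.load_url(): the "load finished" callback fires
        # on another thread after `delay` seconds.
        timer = threading.Timer(delay, self.did_finish_load, args=(value,))
        timer.daemon = True
        timer.start()
        # Block the calling thread until the callback fires or the timeout expires.
        # Event.wait() returns False when the timeout expired first.
        self.timed_out = not self.ready_event.wait(timeout)

    def did_finish_load(self, value):
        # Store the result, then wake up the thread blocked in __init__.
        self.response = value
        self.ready_event.set()


fetch = BlockingFetch('Google')
print('Response: %s (timed out: %s)' % (fetch.response, fetch.timed_out))
```

If `timed_out` is true, `response` is still `None`, so the caller can decide whether to retry or give up instead of waiting indefinitely.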
I'm trying to implement a web scraper that uses a ui.WebView to load the page, rather than mechanize like I normally would.
The problem with my current implementation (below) is that the main thread finishes executing before the page has finished loading. I want the main thread to wait until the page has finished rendering and then receive the HTML of that page.
    # coding: utf-8
    import ui


    class Scraper(object):
        def __init__(self, callback, url, js='document.documentElement.outerHTML'):
            self.wv = ui.WebView()
            self.wv.delegate = self
            self.wv.load_url(url)
            self.callback = callback
            self.js = js

        def webview_did_finish_load(self, webview):
            self.callback(webview.eval_js(self.js))


    # Example:
    def parse_response(response):
        print('Webview finished loading - ' + response)


    def main():
        s = Scraper(parse_response, 'https://www.google.com', 'document.title;')
        # How can I wait for the Web View to finish loading here and return the
        # HTML of the webpage before proceeding on the main thread?
        print('Main thread finished executing')


    if __name__ == '__main__':
        main()
Does anyone know the best way of resolving this issue?