• frenchesco

    @ccc Thanks for your response. It worked well, except that I couldn't get it to work without displaying the WebView, which I was hoping to avoid. I ended up using the method suggested by @JonB to get around this.

    posted in Pythonista
  • frenchesco

    @JonB Thanks! That's exactly what I was looking for. For anyone else who needs to do this in the future, here's my final code:

    # coding: utf-8
    import ui
    import threading
    
    class Scraper (object):
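        def __init__(self, url, js = 'document.documentElement.outerHTML'):
            # Set up all state before loading, so the delegate callback
            # never runs against a half-initialised object.
            self.js = js
            self.response = ''
            self.ready_event = threading.Event()
            self.wv = ui.WebView()
            self.wv.delegate = self
            self.wv.load_url(url)
            # Block the calling thread until the page has finished loading.
            self.ready_event.wait()
        
        def webview_did_finish_load(self, webview):
            # The page has loaded: evaluate the JavaScript, store the result
            # and release the waiting thread.
            self.response = webview.eval_js(self.js)
            self.ready_event.set()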
    
    def main():
        r = Scraper('https://www.google.com', 'document.title;').response
        print('Response: ' + r)
    
    if __name__ == '__main__':
        main()
    

    I basically got rid of the callback and now just read the final response once processing returns to the main thread, since that's where I want to continue.
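
For anyone who wants to try the waiting pattern outside Pythonista, here is a minimal sketch of the same idea in plain Python. It assumes nothing about `ui.WebView`: `BlockingResult` and `fake_page_load` are hypothetical names made up for illustration, and a `threading.Timer` stands in for the WebView calling `webview_did_finish_load` on another thread. The timeout is an addition that my code above does not have; it is only there so the caller cannot hang forever if the page never finishes loading.

```python
import threading


class BlockingResult(object):
    """Hypothetical helper: passes one value from a callback thread to a waiting thread."""

    def __init__(self):
        self._ready = threading.Event()
        self._value = None

    def set(self, value):
        # Called from the callback (for example webview_did_finish_load).
        self._value = value
        self._ready.set()

    def get(self, timeout=None):
        # Blocks the caller until set() has run; returns None if the timeout expires first.
        if not self._ready.wait(timeout):
            return None
        return self._value


def fake_page_load(result):
    # Stand-in for the WebView finishing its load on another thread.
    result.set('Example title')


result = BlockingResult()
threading.Timer(0.1, fake_page_load, args=(result,)).start()
title = result.get(timeout=5)
print('Response: %s' % title)
```

In the Scraper class, the equivalent change would probably be to pass a timeout to `self.ready_event.wait()` and check whether the event was actually set before using `self.response`.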

    posted in Pythonista
  • frenchesco

    I'm trying to implement a web scraper that uses ui.WebView to render a page and then return the HTML from that page. The pages may be rendered with JavaScript, so I can't use requests or mechanize like I normally would.

    With my current implementation (below), I want the main thread to wait until the page has finished rendering and then receive the HTML of that page. At the moment, the main thread finishes executing before the page has finished loading.

    # coding: utf-8
    import ui
    
    class Scraper (object):
        def __init__(self, callback, url, js = 'document.documentElement.outerHTML'):
            self.wv = ui.WebView()
            self.wv.delegate = self
            self.wv.load_url(url)
            self.callback = callback
            self.js = js
        
        def webview_did_finish_load(self, webview):
            self.callback(webview.eval_js(self.js))
    
    # Example:
    def parse_response(response):
        print('Webview finished loading - ' + response)
    
    def main():
        s = Scraper(parse_response, 'https://www.google.com', 'document.title;')
        # How can I wait for the Web View to finish loading here and return the HTML of the webpage before proceeding on the main thread?
        print('Main thread finished executing')
    
    if __name__ == '__main__':
        main()
    

    Does anyone know the best way of resolving this issue?

    posted in Pythonista
