omz:forum

    Open page source in Pythonista

    • Webmaster4o

      There was a recent post about opening page source in Textastic. I don't own Textastic personally, so I wrote a script to open page source in Pythonista:

      import appex
      import urllib2
      from objc_util import *
      #Helper functions
      def openUrl(url):
      	'''Allows webbrowser.open()-esque functionality from the app extension'''
      	app=UIApplication.sharedApplication()
      	app._openURL_(nsurl(url))
      def getDocPath():
      	'''Gets the path to ~/Documents'''
      	split=__file__.split('/')
      	path=split[:split.index('Documents')+1]
      	return '/'.join(path)+'/'
      #Get the url	
      url=appex.get_url()
      #Read page contents
      f=urllib2.urlopen(url)
      source=f.read()
      f.close()
      #Detect the type of page we're viewing
      test=source.lower().strip()
      if '<html>' in test or test.startswith('<!doctype html>'): #Page is HTML
      	extension='.html'
      else: #fallback to .txt
      	extension='.txt'	
      #Where to save the source
      filename='source'+extension
      filepath=getPath()+filename
      #Save the source
      with open(filepath,'w') as f:
      	f.write(source)
      #Close appex window
      appex.finish()
      #Open in pythonista
      openUrl('pythonista://'+filename)
      

      It's under 50 lines so I can justify not putting it in a Gist for the time being :)

      • omz

        Nice! I would suggest a different approach for detecting the content type though:

        # ...
        #Read page contents
        import requests
        r = requests.get(url)
        source = r.text
        ct = r.headers.get('Content-Type', '')  # empty string if the server sends no Content-Type
        # A fancier version could use the mimetypes module to guess the proper file extension...
        extension = '.html' if ct.startswith('text/html') else '.txt'
        # ...
        

        (I'm sure it's also possible to get the response headers with urllib2, I'm just more familiar with requests.)
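
A rough sketch of both ideas mentioned above: reading the Content-Type header without requests, and letting the mimetypes module pick the file extension. The helper names `extension_for` and `fetch` are made up for illustration and are not part of the original script. The import falls back to urllib2 on Python 2, which is an assumption about the interpreter in use.

```python
import mimetypes

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2, as used in the original script


def extension_for(content_type, default='.txt'):
    '''Map a Content-Type header value to a file extension (hypothetical helper).'''
    # Drop parameters such as "; charset=utf-8" before the lookup
    mime = (content_type or '').split(';', 1)[0].strip().lower()
    # guess_extension() returns None for unknown types, so fall back to the default
    return mimetypes.guess_extension(mime) or default


def fetch(url):
    '''Return (body, content_type) for a URL (hypothetical helper).'''
    response = urlopen(url)
    try:
        source = response.read()
        # info() exposes the response headers; the lookup is case-insensitive
        content_type = response.info().get('Content-Type', '')
    finally:
        response.close()
    return source, content_type
```

In the script this would replace the manual '<html>' check, roughly `source, ct = fetch(url)` followed by `extension = extension_for(ct)`. Note that on Python 3 `source` is bytes, so the file would need to be opened with 'wb', and that the extension mimetypes picks for a given type can differ between Python versions.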

        • brumm

          Line 28 should call getDocPath(); getPath() isn't defined anywhere in the script.
