Welcome!
This is the community forum for my apps Pythonista and Editorial.
For individual support questions, you can also send an email. If you have a very short question or just want to say hello — I'm @olemoritz on Twitter.
Screen Scraping
-
What is screen scraping? From what I know, it's like getting info from some database (I think). Also, how can I screen scrape? Thanks for your answers.
-
Screen Scraping is the art and science of:
1) getting all the text from a computer display (terminal, webpage, etc.) and then 2) selecting out only those data fields of interest for storage or further processing.
It used to be about getting data from terminal displays, but these days it is mostly about scraping data off of web pages. The Pythonista tools that I prefer for web scraping are requests (for getting all the HTML of a webpage) and Beautiful Soup 4 (for selecting out only those data fields of interest). bs4 is complicated but it is supercool once you get the hang of it.
Here are two recent examples of web scraping. They follow the model:
    import bs4, requests

    def get_beautiful_soup(url):
        # Download the page and parse it; naming the parser avoids a bs4 warning.
        return bs4.BeautifulSoup(requests.get(url).text, 'html.parser')

    soup = get_beautiful_soup('http://omz-forums.appspot.com/pythonista')
    print(soup.prettify())
    # See: http://www.crummy.com/software/BeautifulSoup/bs4/doc for all the things you can do with the soup.
As you can see by looking at the output, the harder part is selecting out only those data fields of interest. ;-)
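To give a rough idea of what that selecting step can look like, here is a minimal sketch. The HTML snippet, the "topic" class and the get_topics() helper are all made up for illustration; a real page will have its own structure, so you would inspect the prettified output first and adjust the tag and class names to match.

```python
import bs4

# Made-up HTML standing in for a downloaded page (i.e. requests.get(url).text).
SAMPLE_HTML = """
<html><body>
  <h1>Pythonista Forum</h1>
  <ul>
    <li class="topic"><a href="/topic/1">Screen Scraping</a></li>
    <li class="topic"><a href="/topic/2">Drawing with scene</a></li>
    <li class="ad"><a href="/ads/9">Buy stuff</a></li>
  </ul>
</body></html>
"""

def get_topics(html):
    """Return (title, link) pairs for every <li class="topic"> in html."""
    soup = bs4.BeautifulSoup(html, 'html.parser')
    topics = []
    # find_all() keeps only the matching tags, so the "ad" item is skipped.
    for li in soup.find_all('li', class_='topic'):
        a = li.find('a')
        topics.append((a.get_text(strip=True), a['href']))
    return topics

for title, link in get_topics(SAMPLE_HTML):
    print('{} -> {}'.format(title, link))
```

The idea is to narrow the soup down with find_all() to the tags you care about, then pull the text and attributes out of each one.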
If bs4 is too complicated for your purposes, you can do html = requests.get(url).text and then try using str.find() and str.partition() or Python's regular expressions module, re, as a poor man's soup. Happy scraping.
-
Cool! Thanks for the response