This is the community forum for my apps Pythonista and Editorial.
Parsing YAML with Python
-
I have a simple Markdown file with YAML header that looks something like this:
---
title: "Mary had a little lamb"
link: http://google.com
---
I want to extract some of the YAML data to use later in a workflow. So far I have this but it just gives me errors. Ideally once I extract the title key I would also like to strip the quotation marks but not sure how to do that yet either. Maybe there is an easier way?
#coding: utf-8
import console
import yaml
import editor
from StringIO import StringIO
text = StringIO(editor.get_text())
doc = list(yaml.load_all(text))
tweet_link = doc["link"]
tweet_title = doc["title"]
console.hud_alert(tweet_link)
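One likely source of the errors, sketched here without Editorial or PyYAML: list(yaml.load_all(...)) builds a list with one entry per YAML document, so it has to be indexed by position before any key lookup. The list below is a hand-written stand-in for that result (an assumption for illustration, not real parser output). On the quotation marks: a YAML parser returns a quoted scalar without its surrounding quotes, so the title should not need extra stripping once it is parsed. str.strip() is shown only for the case where a value is read as raw text.

```python
# Hand-written stand-in for list(yaml.load_all(text)): one entry per YAML document.
# (Assumed shape for illustration; the real list depends on the text being parsed.)
docs = [{"title": "Mary had a little lamb", "link": "http://google.com"}]

try:
    docs["link"]              # a list cannot be indexed with a string key
except TypeError as error:
    failure = str(error)

front_matter = docs[0]        # pick the first document, then look up keys
tweet_link = front_matter["link"]
tweet_title = front_matter["title"]

# Only needed if a value is read as raw text rather than parsed as YAML:
raw_title = '"Mary had a little lamb"'
unquoted_title = raw_title.strip('"')
```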
-
Try this:
my_dict = yaml.load(editor.get_text())
-
@ccc said:
Try this:
my_dict = yaml.load(editor.get_text())
I got the error "expected a single document, but found another document", pointing to the "---" lines as the culprit.
-
my_dict = yaml.load(editor.get_text().replace('---', ''))
-
That worked great. For anyone following along this is what I ended up with. It checks to see whether or not the key 'link' exists before adding it to the clipboard as well. Thanks!
#coding: utf-8
import yaml
import editor
import clipboard
# Remove the '---' delimiters so the header parses as a single YAML document
m = yaml.load(editor.get_text().replace('---', ''))
tweet = m['title']
if "link" in m:
    tweet = tweet + ' ' + m['link']
clipboard.set(tweet)
-
tweet += ' ' + m['link']
-
Oh nice. Thanks! First weekend using Python (if you couldn't tell)
And I spoke too soon. It worked fine if my text only had YAML front matter. If it has anything after the second '---' I get the error "Error Scanner while scanning a block scalar ... ".
I think this is because it expects the entire thing to be YAML. Is there any way I can stop scanning at the second ---?
-
str.partition() is your friend...
yaml_text = editor.get_text().rpartition('---')[0]
yaml_dict = yaml.load(yaml_text.partition('---')[2] or yaml_text)
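For anyone new to these methods, here is a small sketch of what the two calls return, using a made-up sample string in place of editor.get_text(). Both split the string around one occurrence of the separator and return a (before, separator, after) tuple. partition() uses the first occurrence and rpartition() the last.

```python
sample = "---\ntitle: Hello\n---\nBody text\n"   # made-up sample document

# rpartition splits at the LAST '---'; index 0 is everything before it.
before_last = sample.rpartition("---")[0]

# partition splits at the FIRST '---'; index 2 is everything after it.
after_first = before_last.partition("---")[2]

# If nothing follows the first '---', index 2 is '' (falsy),
# so `or` falls back to the unsliced text.
yaml_text = after_first or before_last
```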
I would avoid short, nondescriptive names like m, especially in your early days of programming. Instead, I would encourage you to use variable names that help you to know the origin or use of the data without having to write comments. This will accelerate your ability to write more complex logic.
-
A Markdown document might contain additional --- for horizontal lines. I think this would break the code of @ccc. If the document is well formed, with the YAML front matter wrapped in two ---, the following line should work even with more --- in the text:
yaml_dict = yaml.load(editor.get_text().partition('---')[2].partition('---')[0])
I'm a python novice myself, so I hope I'm not talking nonsense.
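Another option is to split on lines instead of on every occurrence of the three characters. This sketch assumes the front matter starts on the very first line and that each delimiter is a line containing only ---. The function name extract_front_matter is hypothetical. Done this way, a --- inside a value or a later horizontal line should not change the result.

```python
def extract_front_matter(text):
    """Return the text between the opening and closing '---' lines, or '' if absent."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ""                      # no front matter at the top of the document
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":      # the first closing delimiter ends the header
            return "\n".join(lines[1:index])
    return ""                          # the opening '---' was never closed


sample = (
    '---\n'
    'title: "Mary had a little lamb"\n'
    'link: http://google.com\n'
    '---\n'
    'Body\n\n---\n\nMore body\n'
)
header = extract_front_matter(sample)
```

The returned string could then be passed to yaml.load() in place of the partition() expression.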
-
@Acky, I wanted to make sure it would work if there are no dividing lines. Between what you have written and what I have written, @jrh147 should be able to find a solution to fit the requirements.
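- No horizontal lines, neither works as written: rpartition('---')[0] in ccc and partition('---')[2] in Acky are both empty strings when there is no --- in the text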
- One horizontal line, ccc takes text before horizontal line, Acky takes text after horizontal line
- Two horizontal lines, ccc works, Acky works
- Three or more horizontal lines, ccc takes all text between first and last horizontal lines, Acky takes all text between first and second lines.
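These cases can be checked outside Editorial by applying only the string slicing from the two snippets to small made-up samples. yaml.load() is left out, and the helper names ccc_slice and acky_slice are hypothetical. One thing to watch for when running it: with no --- at all, both slices come back empty, because partition() and rpartition() put the whole string at the far end of the tuple when the separator is missing.

```python
def ccc_slice(text):
    # Slicing from ccc's snippet: cut at the last '---', then take what follows the first.
    yaml_text = text.rpartition("---")[0]
    return yaml_text.partition("---")[2] or yaml_text


def acky_slice(text):
    # Slicing from Acky's snippet: text between the first and second '---'.
    return text.partition("---")[2].partition("---")[0]


samples = {
    "none": "a: 1\n",
    "one": "a: 1\n---\nbody\n",
    "two": "---\na: 1\n---\nbody\n",
    "three": "---\na: 1\n---\nbody\n---\nmore\n",
}

# Map each case name to the pair (ccc result, Acky result).
results = {name: (ccc_slice(text), acky_slice(text)) for name, text in samples.items()}
```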
-
Wow. Amazingly thorough help. Thank you so much. I hadn't thought of additional "---" but appreciate the code should that be a problem in the future. Everything works as expected so again, thank you.