For some reason, bs4 raises a "FeatureNotFound" error when I try to parse XML documents.
This happens with both BeautifulSoup(markdown,'xml') and BeautifulStoneSoup(markdown).
Is this just because lxml is not installed? Is there any way to get BeautifulSoup to use a different XML parser?
Why don't you use the xml module to parse XML? ;) BeautifulSoup is specifically for HTML parsing, and although it may parse XML in many cases, it's not built to do that, so it won't work perfectly.
BeautifulSoup isn't an HTML parser (I think), it's a tool for working with an HTML document. By passing it the "xml" argument it can be switched into XML mode, which (among other things) means that a different parser needs to be used. You are right, the BeautifulSoup objects have some similarities to the xml.etree.ElementTree API, but the best that ElementTree can do is recursive searching - with BeautifulSoup you can e.g. match based on a tag's attributes.
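One caveat on the ElementTree comparison: the standard library's limited XPath support does include attribute predicates, so simple attribute matching works without BeautifulSoup. A minimal sketch follows; the sample document and its tag and attribute names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the thread.
doc = """<library>
  <book lang="en"><title>Dune</title></book>
  <book lang="fr"><title>Candide</title></book>
  <shelf><book lang="en"><title>Emma</title></book></shelf>
</library>"""

root = ET.fromstring(doc)

# ".//book" searches recursively; "[@lang='en']" keeps only the
# elements whose lang attribute equals "en".
english = root.findall(".//book[@lang='en']")
titles = [book.findtext("title") for book in english]
print(titles)
```

BeautifulSoup's matching is still more flexible (regular expressions, functions, several attributes at once), so this only covers the simple cases.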
Yeah, I know I could use xml directly or xmltodict, but I prefer BeautifulSoup's interface (and I have a bunch of useful scripts I've already written that use it).
BeautifulSoup DOES support XML, but it needs a parser (I think it might only support lxml now).
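As far as I know, lxml is the only XML parser bs4 can use, which would explain why the "xml" feature fails without it. Here is a rough sketch of a workaround that checks for lxml first and falls back to the standard library otherwise. The helper names lxml_available and parse_xml are hypothetical, and the fallback returns an ElementTree element rather than a BeautifulSoup object, so calling code has to cope with both.

```python
import importlib.util
import xml.etree.ElementTree as ET


def lxml_available():
    """Return True if the lxml package can be imported (hypothetical helper)."""
    return importlib.util.find_spec("lxml") is not None


def parse_xml(text):
    """Parse XML with BeautifulSoup when bs4 and lxml are both present,
    otherwise with the standard library's ElementTree (hypothetical helper)."""
    if lxml_available() and importlib.util.find_spec("bs4") is not None:
        from bs4 import BeautifulSoup
        # The "xml" feature is the one that raises FeatureNotFound without lxml.
        return BeautifulSoup(text, "xml")
    # Fallback: an xml.etree.ElementTree.Element, with a different API.
    return ET.fromstring(text)


result = parse_xml("<root><item id='1'>a</item></root>")
print(type(result).__name__, "(lxml available: %s)" % lxml_available())
```

On an install without lxml this prints Element, which at least confirms where the error comes from.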
Yet another reason why I'd like to get lxml included in the next release of Pythonista!