Welcome!
This is the community forum for my apps Pythonista and Editorial.
For individual support questions, you can also send an email. If you have a very short question or just want to say hello — I'm @olemoritz on Twitter.
BeautifulSoup XML
-
For some reason, bs4 raises a "FeatureNotFound" error when I try to parse XML documents.
This happens with both BeautifulSoup(markdown, 'xml') and BeautifulStoneSoup(markdown).
Is this just because lxml is not installed? Is there any way to get BeautifulSoup to use a different XML parser?
Thanks.
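A quick way to test the lxml theory, as a minimal sketch: bs4 raises FeatureNotFound when no installed parser provides the requested feature, and lxml is the XML parser it looks for. The helper name has_module below is made up for illustration; it only checks whether a module can be imported.

```python
import importlib


def has_module(name):
    """Return True if the named module can be imported, else False.

    Hypothetical helper for illustration; not part of bs4 or Pythonista.
    """
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# If this prints False, a missing lxml is the likely cause of the error.
print("lxml available:", has_module("lxml"))
```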
-
Why don't you use the xml module to parse XML? ;) BeautifulSoup is specifically for HTML parsing, and although it may parse XML in many cases, it's not built to do that, so it won't work perfectly.
-
BeautifulSoup isn't an HTML parser (I think); it's a tool for working with an HTML document. By passing it the "xml" argument it can be switched into XML mode, which (among other things) means that a different parser needs to be used. You are right, the BeautifulSoup objects have some similarities to the xml.etree.ElementTree API, but the best that ElementTree can do is recursive searching; with BeautifulSoup you can e.g. match based on a tag's attributes.
-
Yeah, I know I could use xml directly or xmltodict, but I prefer BeautifulSoup's interface (and I have a bunch of useful scripts I've already written that use it).
BeautifulSoup DOES support XML, but it needs a parser (I think it might only support lxml now).
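For anyone who lands here without lxml, here is a rough sketch of the stdlib route mentioned above, using xml.etree.ElementTree. The sample document and its tag and attribute names are invented for illustration. It shows the recursive search, plus the simple attribute predicate that ElementTree's limited XPath subset allows; it is not a drop-in replacement for BeautifulSoup's interface.

```python
import xml.etree.ElementTree as ET

# Made-up sample document, for illustration only.
SAMPLE = """<notes>
  <note id="1" tag="todo">Buy milk</note>
  <group>
    <note id="2" tag="done">Write script</note>
  </group>
</notes>"""

root = ET.fromstring(SAMPLE)

# Recursive search: every <note> element at any depth.
all_notes = [n.text for n in root.iter("note")]

# Attribute match using ElementTree's limited XPath subset.
done_notes = [n.text for n in root.findall(".//note[@tag='done']")]

print(all_notes)
print(done_notes)
```

Anything beyond an exact attribute match (regexes, custom filter functions) still needs manual looping here, which is where BeautifulSoup's find_all is more comfortable.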
-
Yet another reason why I'd like to get lxml included in the next release of Pythonista!
-
Add lxml plz.