Parse XML node children list by tag with any prefix in Python


I need to get a list of items, independently of their prefixes. The goal is to create a method (please let me know if one already exists) that takes one argument (the tag name) and returns a list of elements.

For example, given the argument 'item', both <media:item> and <abc:item> should be part of the function's result.

It would be nice to use lxml, but any Python DOM-based parser will do.

Unfortunately, I can't assume that the XML declares its xmlns, which is why I need to handle the prefix myself.

lxml is a good option because it has full support for XPath version 1.0 via the xpath() method, besides many other useful utilities. In XPath, you can ignore an element's namespace by using local-name(), as mentioned in the comments.

lxml is able to deal with an undefined prefix if you set the parser parameter recover=True, but this comes with a catch: local-name() still returns the prefixed name ('prefix:tagname') for an element that has an undefined prefix. There is a hacky way to match this kind of element: find elements whose local name contains :tagname (or, to be more precise, whose local name ends with :tagname rather than merely containing it).
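To make the catch concrete, here is a minimal sketch that parses one element with a declared prefix and one with an undeclared prefix, then asks XPath for each local name. The sample document, the prefixes ok and bad, and the URI urn:example are made up for illustration.

```python
from lxml import etree

# Hypothetical sample: "ok" is declared on the root, "bad" is not declared anywhere.
xml = (
    '<root xmlns:ok="urn:example">'
    "<ok:item>a</ok:item>"
    "<bad:item>b</bad:item>"
    "</root>"
)

# recover=True lets the parser continue past the undefined-prefix error.
parser = etree.XMLParser(recover=True)
root = etree.fromstring(xml, parser=parser)

# Evaluate local-name() with each child as the context node.
names = [child.xpath("local-name()") for child in root]
print(names)
```

The declared prefix is stripped, so the first name comes back as item, while the undeclared one is kept as part of the name (bad:item). That is why a plain local-name()='item' test is not enough on its own.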

The following is a working demo. It uses two expressions combined with the logical operator or: one deals with elements that have an undefined prefix, and the other with elements that have no prefix or a defined prefix:

    from lxml import etree

    xml = """<root foo="bar">
        <media:item>a</media:item>
        <abc:item>b</abc:item>
        <foo:item>c</foo:item>
        <item>d</item>
    </root>"""

    # recover=True lets lxml keep parsing despite the undefined prefixes
    parser = etree.XMLParser(recover=True)
    tree = etree.fromstring(xml, parser=parser)

    tagname = "item"
    # expression to match an element with an undefined prefix
    predicate1 = "contains(local-name(),':{0}')".format(tagname)
    # expression to match an element with a defined prefix or no prefix
    predicate2 = "local-name()='{0}'".format(tagname)

    elements = tree.xpath("//*[{0} or {1}]".format(predicate1, predicate2))
    for e in elements:
        # with_tail=False drops the whitespace that follows each element
        print(etree.tostring(e, encoding="unicode", with_tail=False))

Output:

    <media:item>a</media:item>
    <abc:item>b</abc:item>
    <foo:item>c</foo:item>
    <item>d</item>
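Since the question asks for a method that takes only the tag name, the demo can be wrapped in a small helper. This is a sketch under a few assumptions rather than a definitive implementation: the function name find_by_local_name and the sample document are hypothetical, and the helper also takes the parsed root element as a first argument. XPath 1.0 has no ends-with() function, so the "ends with :tagname" test is emulated with substring() and string-length(). The tag name is passed as an XPath variable instead of being formatted into the expression, which avoids quoting problems.

```python
from lxml import etree


def find_by_local_name(root, tagname):
    """Return all elements named `tagname`, whatever their prefix.

    Hypothetical helper: matches elements whose local name equals `tagname`
    (no prefix, or a declared prefix) and elements whose local name ends
    with ':' + tagname (an undeclared prefix kept by recover=True).
    """
    suffix = ":" + tagname
    expr = (
        "//*[local-name() = $name or "
        "substring(local-name(), "
        "string-length(local-name()) - string-length($suffix) + 1) = $suffix]"
    )
    # lxml binds keyword arguments to the XPath variables $name and $suffix.
    return root.xpath(expr, name=tagname, suffix=suffix)


# Made-up sample: "ok" is declared, "media" is not.
sample = """<root xmlns:ok="urn:example">
    <media:item>a</media:item>
    <ok:item>b</ok:item>
    <item>c</item>
    <media:itemized>d</media:itemized>
    <subitem>e</subitem>
</root>"""

root = etree.fromstring(sample, parser=etree.XMLParser(recover=True))
matches = find_by_local_name(root, "item")
print([e.text for e in matches])
```

Unlike the contains() version, the suffix test does not pick up an element such as media:itemized, whose name merely contains :item.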
