Parsing XML with the lxml library in Python

2024-07-04

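Suppose that, in Python, the contents of "library.xml" have already been stored in the variable res.

Use the lxml library to extract the text of every pubdate tag, keep only the year, and store the years in a list.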

from lxml import etree

# Parse the file and get the root element
tree = etree.parse('library.xml')
root = tree.getroot()

pubdate_years = []
# Select every <pubdate> that is a direct child of a <publisher>
for pubdate in root.xpath('.//publisher/pubdate'):
    text = pubdate.text
    if text:  # skip empty <pubdate/> elements
        # Dates are assumed to look like "YYYY-MM-DD"; keep only the year
        pubdate_years.append(text.strip().split('-')[0])

print(pubdate_years)
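
The snippet above reads library.xml from disk, whereas the task says the content is already held in res. A minimal sketch of parsing from the variable instead might look like the following. The XML assigned to res here is a made-up stand-in, since the real file is not shown in this post, and only the publisher, pubdate and id names come from the XPath expressions used in these notes.

```python
from lxml import etree

# Hypothetical stand-in for the contents of library.xml; the real file's
# structure is not shown in the post.
res = """<library>
  <book>
    <title>Book A</title>
    <publisher id="p0021">
      <name>Press One</name>
      <pubdate>2019-05-01</pubdate>
    </publisher>
  </book>
  <book>
    <title>Book B</title>
    <publisher id="p0022">
      <name>Press Two</name>
      <pubdate>2021-11-15</pubdate>
    </publisher>
  </book>
</library>"""

# fromstring() parses in-memory data and returns the root element directly.
# Encoding to bytes avoids the ValueError that lxml raises for a str
# carrying an XML encoding declaration.
root = etree.fromstring(res.encode("utf-8"))

# /text() selects the text nodes themselves, so no .text lookup is needed
dates = root.xpath(".//publisher/pubdate/text()")
pubdate_years = [d.strip().split("-")[0] for d in dates]
print(pubdate_years)
```

If res was read in binary mode it is already bytes, and the encode() call can be dropped.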

To get all the text inside the node with id="p0021", write the correct XPath.

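from lxml import etree

tree = etree.parse('library.xml')
root = tree.getroot()

# //text() returns every text node under the matched <publisher>,
# including whitespace-only nodes between child elements
texts = root.xpath('//publisher[@id="p0021"]//text()')
print(texts)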
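
With indented XML, the list returned by //text() usually contains whitespace-only strings as well. One possible way to clean it up is sketched below. The XML fragment is hypothetical (the real library.xml is not shown), and the name element is an assumption made for illustration.

```python
from lxml import etree

# Hypothetical fragment; the real library.xml is not shown in the post.
res = b"""<library>
  <publisher id="p0021">
    <name>Press One</name>
    <pubdate>2019-05-01</pubdate>
  </publisher>
</library>"""

root = etree.fromstring(res)

raw = root.xpath('//publisher[@id="p0021"]//text()')
# Drop whitespace-only nodes and trim the remaining ones
texts = [t.strip() for t in raw if t.strip()]
print(texts)

# string() concatenates all descendant text of the first match;
# normalize-space() trims it and collapses runs of whitespace
joined = root.xpath('normalize-space(string(//publisher[@id="p0021"]))')
print(joined)
```

The list form keeps each field separate, while the normalize-space() form gives a single string, which is convenient for display.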