XML Parsing + Fuzzy Matching


XML Parsing

  • This uses xml.etree.ElementTree from the standard library.
  • Assume xmlInfo is a string whose structure looks roughly like this:
    <?xml version='1.0' encoding='UTF-8' standalone='yes' ?>
    <hierarchy rotation="0">
      <node index="0" text="" resource-id="android:id/content" class="android.widget.FrameLayout" package="com.amazon.tv.launcher" content-desc="" checkable="false" checked="false" clickable="false" enabled="true" focusable="false" focused="false" scrollable="false" long-clickable="false" password="false" selected="false" visible-to-user="true" bounds="[0,0][1920,1080]">
        <node index="8" text="Settings" resource-id="" class="android.widget.TextView" package="com.amazon.tv.launcher" content-desc="" checkable="false" checked="false" clickable="true" enabled="true" focusable="true" focused="false" scrollable="false" long-clickable="false" password="false" selected="false" visible-to-user="true" bounds="[1308,56][1484,112]" />
      </node>
    </hierarchy>
  • Assume we only care about the text attribute of each node, plus the resource-id of a few specific classes, and want to save those values.
    import xml.etree.ElementTree as ET

    def parseXml(self, xmlInfo):
        """Collect the text values and selected resource-ids from every <node>."""
        root = ET.fromstring(xmlInfo)
        words = []
        # Walk every <node> element in document order
        for node in root.iter('node'):
            # Keep any non-empty text; .get() avoids a KeyError if the attribute is missing
            if node.get('text', '') != '':
                words.append(node.get('text'))
            # For these two widget classes, also keep the resource-id (even if it is empty)
            if node.get('class') in ('android.widget.ViewAnimator', 'android.widget.TextView'):
                words.append(node.get('resource-id', ''))
        return words
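
The extraction step can be tried on its own with a small, self-contained sketch. The trimmed XML and the names SAMPLE_XML, CLASSES_OF_INTEREST and extract_words are assumptions made for this illustration, not part of the original code.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed stand-in for a UI dump (most attributes omitted).
SAMPLE_XML = """<hierarchy rotation="0">
  <node index="0" text="" resource-id="android:id/content" class="android.widget.FrameLayout">
    <node index="8" text="Settings" resource-id="" class="android.widget.TextView" />
  </node>
</hierarchy>"""

# Classes whose resource-id is collected, mirroring the function above.
CLASSES_OF_INTEREST = {"android.widget.ViewAnimator", "android.widget.TextView"}


def extract_words(xml_info: str) -> list:
    """Collect non-empty text values, plus resource-ids of selected classes."""
    root = ET.fromstring(xml_info)
    words = []
    for node in root.iter("node"):
        text = node.get("text", "")
        if text:
            words.append(text)
        if node.get("class") in CLASSES_OF_INTEREST:
            words.append(node.get("resource-id", ""))
    return words


print(extract_words(SAMPLE_XML))  # ['Settings', '']
```

Note the empty string in the result: the TextView has an empty resource-id, and it is appended anyway. Filtering out empty values before matching may be worthwhile, depending on what the caller expects.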

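Fuzzy Matching

Converting the keyword strings into patterns

  • Each key in the dictionary holds the keywords shown on a button that needs to be clicked.
  • Each value is a re.Pattern: it stores the result of re.compile, so every pattern is compiled only once.
  • If the values were left as uncompiled pattern strings (or an empty list), every if in the final matching step would have to call re.findall(pattern, string), which resolves the pattern string again on each call (a compile, or at best a lookup in re's internal cache) before matching.
  • Each key is split() on whitespace into a list of keywords, which is passed to convert_keyword_to_regx to build the regular expression.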
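    import re

    def makePattenString(self):
        # re.Pattern is only a placeholder; the loop below replaces each value
        self.word_dict = {
            "welcome connect": re.Pattern,
            "now successfully set up": re.Pattern,
            # ....
            }
        # log is assumed to be a logger defined elsewhere in the module
        log.info("word_dict is {}\n".format(self.word_dict))
        for key in self.word_dict.keys():
            word_list = key.split()
            # Compile once here so the matching loop can reuse the pattern
            self.word_dict[key] = re.compile(convert_keyword_to_regx(word_list))
        log.info("new word_dict is {}\n".format(self.word_dict))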
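
As an alternative sketch, the same compile-once dictionary can be built in a single step with a dict comprehension, which removes the need for placeholder values. KEYWORDS and to_regex are hypothetical names used only here; to_regex stands in for convert_keyword_to_regx.

```python
import re

# Illustrative subset of the keyword strings.
KEYWORDS = ("welcome connect", "now successfully set up")


def to_regex(words):
    # Stand-in for convert_keyword_to_regx: (?i).*?word1.*?word2.*?
    return r"(?i).*?" + "".join(w + r".*?" for w in words)


# Each pattern is compiled exactly once, when the dictionary is built.
word_dict = {key: re.compile(to_regex(key.split())) for key in KEYWORDS}

for key, pattern in word_dict.items():
    print(key, "->", pattern.pattern)
```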

Converting keywords into a regular expression

  • The keywords in the list are chained into one pattern of the form: (?i).*?word.*?
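    def convert_keyword_to_regx(keyword: list) -> str:
        """Join the keywords into one case-insensitive, order-sensitive pattern."""
        if not isinstance(keyword, list):
            raise UIOperatorErr("Keyword must be a list of str")
        # (?i) ignores case; .*? allows any text before, between and after keywords
        match_str = r"(?i).*?"
        for word in keyword:
            match_str = match_str + word + r".*?"
        return match_str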
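
A short worked example makes the generated pattern concrete. keywords_to_regex is a hypothetical variant written for this sketch; its optional escape flag is an addition that is not in the original function, included to show what happens when a keyword contains regex metacharacters.

```python
import re


def keywords_to_regex(keywords, escape=False):
    """Hypothetical variant of convert_keyword_to_regx with optional escaping."""
    parts = [re.escape(w) if escape else w for w in keywords]
    return r"(?i).*?" + "".join(part + r".*?" for part in parts)


pattern_str = keywords_to_regex("welcome connect".split())
print(pattern_str)  # (?i).*?welcome.*?connect.*?

pattern = re.compile(pattern_str)
print(bool(pattern.search("Welcome! Please CONNECT to Wi-Fi")))  # True
print(bool(pattern.search("Connect first, welcome later")))      # False: order matters

# Without escaping, "." in a keyword matches any character.
loose = re.compile(keywords_to_regex(["v1.0"]))
strict = re.compile(keywords_to_regex(["v1.0"], escape=True))
print(bool(loose.search("v1x0")), bool(strict.search("v1x0")))  # True False
```

The keywords must appear in the given order, in any letter case, with arbitrary text between them. If the keywords are always plain text, escaping them avoids accidental matches.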

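Matching the patterns against the strings

  • A for loop checks whether any entry in the dictionary matches the information extracted from the XML.
  • words is the information extracted from the XML.
  • word_dict holds the keywords we want to match.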
    self.makePattenString()
    words = self.parseXml(xmlInfo)

    for word in words:
        if self.word_dict["welcome connect"].findall(word):
            # do something
            break

        if self.word_dict["now successfully set up"].findall(word):
            # do something
            break
        ...
        ...
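
As the number of keywords grows, the chain of if blocks can be replaced by a table that maps each keyword string to a handler. The following is a minimal sketch of that idea, not the original implementation: on_welcome, on_setup_done, HANDLERS, PATTERNS and dispatch are hypothetical names, and search is used in place of findall because only a yes/no answer is needed.

```python
import re


def to_regex(words):
    # Stand-in for convert_keyword_to_regx: (?i).*?word1.*?word2.*?
    return r"(?i).*?" + "".join(w + r".*?" for w in words)


# Hypothetical handlers standing in for the "do something" branches.
def on_welcome():
    return "welcome screen"


def on_setup_done():
    return "setup finished"


HANDLERS = {
    "welcome connect": on_welcome,
    "now successfully set up": on_setup_done,
}

# Compile every pattern once, up front.
PATTERNS = {key: re.compile(to_regex(key.split())) for key in HANDLERS}


def dispatch(words):
    """Run the handler for the first word that matches any pattern."""
    for word in words:
        for key, pattern in PATTERNS.items():
            if pattern.search(word):
                return HANDLERS[key]()
    return None


print(dispatch(["Settings", "You are now successfully set up!"]))  # setup finished
```

As in the loop above, processing stops at the first word that matches, and the patterns are tried in dictionary order for each word.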