Parsing XML Data: Easy XML Parsing with the xml2js Library


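Parsing XML files is a common requirement in development. If you want simple access to XML data without compiling a C parser, xml2js is the tool you need. xml2js is a simple XML-to-JavaScript-object converter that supports bi-directional conversion. Under the hood it uses the sax-js and xmlbuilder-js libraries.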

Installation

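The simplest way to install xml2js is with npm: run npm install xml2js, and npm will download xml2js and all of its dependencies. xml2js can also be installed through Bower: run bower install xml2js, and Bower will download xml2js and all of its dependencies.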

Usage

Since you are a very smart developer, you do not need a lengthy tutorial. Parsing XML should be easy, so let's learn directly from a few examples.

Quick usage

Want to parse XML as simply and easily as possible? Use the following code:

var parseString = require('xml2js').parseString;
var xml = "<root>Hello xml2js!</root>";
parseString(xml, function (err, result) {
    console.dir(result);
});

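It can't get much simpler than that, right? This works with xml2js 0.2.3 and later.

CoffeeScript may be a niche language, but the official documentation includes CoffeeScript examples, so this article keeps them. The CoffeeScript version looks like this:
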
{parseString} = require 'xml2js'
xml = "<root>Hello xml2js!</root>"
parseString xml, (err, result) ->
    console.dir result

If you need special options, don't worry: xml2js supports many options (see below), and you can pass them as the second argument:

parseString(xml, {trim: true}, function (err, result) {
});
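
The callback receives a plain JavaScript object, and its shape depends on the options in use. As a rough sketch of how such a result might be read, the snippet below hard-codes two objects instead of calling the library, so it runs without dependencies. The sample XML, the two result shapes (default options versus the explicitArray: false option from the xml2js documentation) and the helper name firstChild are illustrative assumptions, not part of the original article.

```javascript
// Assumed result of parsing, with default options:
//   <catalog><book id="1"><title>XML Basics</title></book></catalog>
// Child elements are wrapped in arrays and attributes live under `$`.
const withArrays = {
  catalog: { book: [{ $: { id: '1' }, title: ['XML Basics'] }] }
};

// Assumed result for the same XML with { explicitArray: false }:
// single children are no longer wrapped in arrays.
const withoutArrays = {
  catalog: { book: { $: { id: '1' }, title: 'XML Basics' } }
};

// Hypothetical helper: return the first child named `childName`,
// whether or not the parser wrapped the children in an array.
function firstChild(node, childName) {
  const child = node[childName];
  return Array.isArray(child) ? child[0] : child;
}

for (const result of [withArrays, withoutArrays]) {
  const book = firstChild(result.catalog, 'book');
  console.log(book.$.id, firstChild(book, 'title'));
}
```

A helper of this kind keeps the calling code unchanged if the parser options are adjusted later.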

Instance methods

If you have been using xml-simple or your own wrapper to process XML, the following usage was added in version 0.1.11 just for you:

<pre class="javascript hljs language-javascript">var fs = require('fs'),
    xml2js = require('xml2js');

var parser = new xml2js.Parser();
fs.readFile(__dirname + '/foo.xml', function(err, data) {
    parser.parseString(data, function (err, result) {
        console.dir(result);
        console.log('Done');
    });
});</pre><p>Look, no event listeners!</p><p>You can also use <a target="_blank" href="https://link.segmentfault.com/?enc=64CJgkgNv%2FHcqakFhQdNRw%3D%3D.pdeyja27cdJxnadxlJiRGBEqxQCXIlkWqjF5l8wAOFzcQAOxS3nXWpEHHBFmePi1">CoffeeScript</a> to reduce the clutter further:</p><div class="widget-codetool" style="display: none;">
</div><pre class="coffeescript hljs language-coffeescript">fs = require 'fs'
xml2js = require 'xml2js'

parser = new xml2js.Parser()
fs.readFile __dirname + '/foo.xml', (err, data) -&gt;
  parser.parseString data, (err, result) -&gt;
    console.dir result
    console.log 'Done.'</pre><p>But what happens if you forget the <code>new</code> keyword when creating a <code>Parser</code>? Starting with 0.2.8 you can leave it out: xml2js adds it for you, so there are no more surprises or mysterious bugs.</p><h3 id="item-2-3">Promise usage</h3><div class="widget-codetool" style="display: none;">
</div><pre class="javascript hljs language-javascript">var xml2js = require('xml2js');
var xml = '&lt;foo&gt;&lt;/foo&gt;';

// 1. With parser
var parser = new xml2js.Parser(/* options */);
parser.parseStringPromise(xml).then(function (result) {
  console.dir(result);
  console.log('Done');
})
.catch(function (err) {
  // Failed
});

// 2. Without parser
xml2js.parseStringPromise(xml /*, options */).then(function (result) {
  console.dir(result);
  console.log('Done');
})
.catch(function (err) {
  // Failed
});</pre><ol><li>With a parser: create an xml2js <code>Parser</code> instance and call its <code>parseStringPromise</code> method. Handle the parsed result in <code>.then()</code> and any parsing error in <code>.catch()</code>.</li><li>Without a parser: call the library-level <code>parseStringPromise</code> function directly and handle the result and errors the same way. This skips the step of creating a parser instance.</li></ol><h2 id="item-3">Using the XML builder</h2><p>Since version 0.4.0, xml2js also supports building XML from objects. Here is an example:</p><div class="widget-codetool" style="display: none;">
</div><pre class="javascript hljs language-javascript">const xml2js = require('xml2js');

const obj = {name: "Super", Surname: "Man", age: 23};

const builder = new xml2js.Builder();
const xml = builder.buildObject(obj);</pre><p>The code above produces the following XML:</p><div class="widget-codetool" style="display: none;">
</div><pre class="xml hljs language-xml">&lt;?xml version="1.0" encoding="UTF-8" standalone="yes"?&gt;
&lt;root&gt;
  &lt;name&gt;Super&lt;/name&gt;
  &lt;Surname&gt;Man&lt;/Surname&gt;
  &lt;age&gt;23&lt;/age&gt;
&lt;/root&gt;</pre><p>Setting the <code>cdata</code> option to <code>true</code> enables writing CDATA.</p><h3 id="item-3-4">Specifying attributes</h3><p>With xml2js you can specify the attributes of XML elements. Here is an example:</p><div class="widget-codetool" style="display: none;">
</div><pre class="javascript hljs language-javascript">const xml2js = require('xml2js');

const obj = {root: {$: {id: "my id"}, _: "my inner text"}};

const builder = new xml2js.Builder();
const xml = builder.buildObject(obj);</pre><p>The code above produces the following XML:</p><div class="widget-codetool" style="display: none;">
</div><pre class="xml hljs language-xml">&lt;?xml version="1.0" encoding="UTF-8" standalone="yes"?&gt;
&lt;root id="my id"&gt;my inner text&lt;/root&gt;</pre><h3 id="item-3-5">Adding xmlns attributes</h3><p>xml2js can also generate XML that declares XML namespace prefix and URI pairs by means of <code>xmlns</code> attributes.</p><p>Example declaring a default namespace on the root element:</p><div class="widget-codetool" style="display: none;">
</div><pre class="javascript hljs language-javascript">const obj = {
  Foo: {
    $: {
      "xmlns": "http://foo.com"
    }
  }
};</pre><p>Calling <code>buildObject(obj)</code> produces the following XML:</p><div class="widget-codetool" style="display: none;">
</div><pre class="xml hljs language-xml">&lt;Foo xmlns="http://foo.com"/&gt;</pre><p>Example declaring non-default namespaces on non-root elements:</p><div class="widget-codetool" style="display: none;">
</div><pre class="javascript hljs language-javascript">const obj = {
  'foo:Foo': {
    $: {
      'xmlns:foo': 'http://foo.com'
    },
    'bar:Bar': {
      $: {
        'xmlns:bar': 'http://bar.com'
      }
    }
  }
};</pre><p>Calling <code>buildObject(obj)</code> produces the following XML:</p><div class="widget-codetool" style="display: none;">
</div><pre class="xml hljs language-xml">&lt;foo:Foo xmlns:foo="http://foo.com"&gt;
  &lt;bar:Bar xmlns:bar="http://bar.com"/&gt;
&lt;/foo:Foo&gt;</pre><h3 id="item-3-6">Processing attribute names, tag names and values</h3><p>Since version 0.4.1 you can optionally provide the parser with attribute name and tag name processors as well as element value processors. Since 0.4.14 you can also provide attribute value processors.</p><p>The following example converts attribute names, tag names and their values to uppercase:</p><div class="widget-codetool" style="display: none;">
</div><pre class="javascript hljs language-javascript">var parseString = require('xml2js').parseString;

function nameToUpperCase(name) {
  return name.toUpperCase();
}

// Convert all attribute names, tag names and their values to uppercase
parseString(xml, {
  tagNameProcessors: [nameToUpperCase],
  attrNameProcessors: [nameToUpperCase],
  valueProcessors: [nameToUpperCase],
  attrValueProcessors: [nameToUpperCase]
}, function (err, result) {
  // Processed data
});</pre><p>The <code>tagNameProcessors</code> and <code>attrNameProcessors</code> options accept an array of functions with the following signature:</p><div class="widget-codetool" style="display: none;">
</div><pre class="javascript hljs language-javascript">function (name) {
  // Do something with `name`
  return name;
}</pre><p>The <code>attrValueProcessors</code> and <code>valueProcessors</code> options accept an array of functions with the following signature:</p><div class="widget-codetool" style="display: none;">
</div><pre class="javascript hljs language-javascript">function (value, name) {
  // `name` will be the node name or attribute name
  // Do something with `value`; the processing can depend on the node or attribute name
  return value;
}</pre><p>xml2js ships with some built-in processors, which can be found in <code>lib/processors.js</code>:</p><ul><li><code>normalize</code>: converts the name to lowercase (used automatically when <code>options.normalize</code> is set to <code>true</code>)</li><li><code>firstCharLowerCase</code>: converts the first character to lowercase. For example, 'MyTagName' becomes 'myTagName'</li><li><code>stripPrefix</code>: strips the XML namespace prefix. For example, <code>&lt;foo:Bar/&gt;</code> becomes 'Bar' (note: the xmlns prefix is not stripped)</li><li><code>parseNumbers</code>: parses integer-like strings as integers and float-like strings as floats. For example, "0" becomes 0 and "15.56" becomes 15.56</li><li><code>parseBooleans</code>: parses boolean-like strings as booleans. For example, "true" becomes true and "false" becomes false</li></ul><p>xml2js makes parsing and building XML in Node.js applications simple and flexible. Whether you need to parse complex XML documents or build custom XML output, xml2js is worth a try.</p><p>Reference: <a target="_blank" href="https://link.segmentfault.com/?enc=etR%2BztW16kPCxB2NU7oIPQ%3D%3D.Cwbulk4XaYVwVQl7fU3ltbURkK1TcyTIHwRRPM8izEKbuj9UTgoUJhTI2v3riukOtnvQ3QQ0bk1sK%2F%2FiZ7u%2BDg%3D%3D">Leonidas-from-XIV/node-xml2js: XML to JavaScript object converter.</a></p>
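<p>As a supplement to the processor signatures described earlier, here is a minimal sketch of two custom processors. The function names <code>numericFields</code>, <code>kebabToCamel</code> and <code>applyProcessors</code>, as well as the field names <code>price</code> and <code>quantity</code>, are hypothetical and chosen only for illustration. <code>applyProcessors</code> merely mimics how processors in an array are applied one after another, so the functions can be tried without loading the library.</p>

```javascript
// Hypothetical value processor using the (value, name) signature:
// converts the text of "price" and "quantity" nodes to numbers and
// leaves every other value unchanged.
const NUMERIC_FIELDS = new Set(['price', 'quantity']);

function numericFields(value, name) {
  if (!NUMERIC_FIELDS.has(name) || value.trim() === '') return value;
  const parsed = Number(value);
  return Number.isNaN(parsed) ? value : parsed;
}

// Hypothetical name processor using the (name) signature:
// 'unit-price' becomes 'unitPrice'.
function kebabToCamel(name) {
  return name.replace(/-([a-z])/g, (match, ch) => ch.toUpperCase());
}

// Stand-in for the library's chaining: each processor receives the
// output of the previous one, in array order.
function applyProcessors(processors, value, name) {
  return processors.reduce((current, fn) => fn(current, name), value);
}

console.log(applyProcessors([kebabToCamel], 'unit-price'));
console.log(applyProcessors([numericFields], '19.90', 'price'));
console.log(applyProcessors([numericFields], '19.90', 'title'));
```

<p>With the library itself, functions like these would be passed in the options object, for example <code>{tagNameProcessors: [kebabToCamel], valueProcessors: [numericFields]}</code>.</p>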