Version 1: text comparison

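So comparing the two versions directly, one piece of text at a time, should be enough. Never mind the details, let's write some code first.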

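But then another problem came up: what I need is a comparison of web pages, and this is a plain-text comparison. It clearly does not meet my requirement.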

Then an idea struck me: why not compare the DOM nodes one by one, and simply append any node that differs as a child node? Not a bad idea.

The idea, concretely:

Version: <p>12</p><p>3</p>
Target: <p>4</p><p>12</p><p>4</p>
Result: <p reduce>4</p><p normal>12</p><p add>3</p><p reduce>4</p>

reduce marks a removed node
add marks an added node
normal marks an unchanged node
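
To make the marking rule above concrete, here is a minimal sketch that diffs two lists of already-split node strings and tags each one as reduce, add, or normal. It uses a longest-common-subsequence table, which is my own assumption rather than something the post prescribes, and the names diffNodes and renderMarked are hypothetical.

```typescript
type Mark = 'normal' | 'add' | 'reduce';

interface MarkedNode {
  html: string;
  mark: Mark;
}

// Hypothetical helper: diff two lists of node strings.
// Nodes only in `version` are marked "add", nodes only in `target` are
// marked "reduce", and nodes present in both are marked "normal".
function diffNodes(version: string[], target: string[]): MarkedNode[] {
  const m = version.length;
  const n = target.length;

  // lcs[i][j] = length of the longest common subsequence of
  // version[i..] and target[j..]
  const lcs: number[][] = Array.from({ length: m + 1 }, () =>
    new Array<number>(n + 1).fill(0),
  );
  for (let i = m - 1; i >= 0; i--) {
    for (let j = n - 1; j >= 0; j--) {
      lcs[i][j] =
        version[i] === target[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both lists from the start, following the table.
  const result: MarkedNode[] = [];
  let i = 0;
  let j = 0;
  while (i < m && j < n) {
    if (version[i] === target[j]) {
      result.push({ html: version[i], mark: 'normal' });
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      result.push({ html: version[i], mark: 'add' });
      i++;
    } else {
      result.push({ html: target[j], mark: 'reduce' });
      j++;
    }
  }
  while (i < m) result.push({ html: version[i++], mark: 'add' });
  while (j < n) result.push({ html: target[j++], mark: 'reduce' });

  return result;
}

// Hypothetical helper: write the mark into the opening tag as a bare attribute.
function renderMarked(nodes: MarkedNode[]): string {
  return nodes
    .map(({ html, mark }) => html.replace(/^<([^\s/>]+)/, `<$1 ${mark}`))
    .join('');
}

const marked = diffNodes(
  ['<p>12</p>', '<p>3</p>'],
  ['<p>4</p>', '<p>12</p>', '<p>4</p>'],
);
console.log(renderMarked(marked));
// <p reduce>4</p><p normal>12</p><p add>3</p><p reduce>4</p>
```

This only compares top-level nodes by exact string equality, so a node with one changed character shows up as one reduce plus one add. Comparing children recursively would refine that.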

So first I need to turn the HTML string into a DOM-like hierarchy.

const MatchTag = /<\/?.+?>/g;

/**
 * Split an HTML string into individual top-level node strings
 */
function splitHtml(html: string): { str: string; tag: boolean }[] {
  const response: { str: string; tag: boolean }[] = [];

  const nodes = Array.from(html.matchAll(MatchTag));

  let start = 0;
  let end = 0;
  const tempNodes: string[] = [];

  for (let i = 0; i < nodes.length; i++) {
    const [value] = nodes[i];
    const { index } = nodes[i];
    end = i + 1;

    // Emit any plain text that precedes this tag
    if (start < index) {
      response.push({ str: html.slice(start, index), tag: false });
    }

    const { name: tagName, close } = getTagName(value);

    tempNodes.push(tagName);

    if (close) {
      throw new Error('process error');
    }

    // Scan forward for the matching closing tag and emit the whole node
    while (tempNodes.length) {
      const current = nodes[end];

      if (!current) {
        // No matching closing tag was found: treat it as a single node
        response.push({
          str: html.slice(index, index + value.length),
          tag: true,
        });
        start = index + value.length;
        end = end + 1;
        tempNodes.length = 0;
        continue;
      }

      const [currentValue] = current;
      const { index: currentIndex } = current;
      const { name, close: closed } = getTagName(currentValue);

      if (!closed) {
        end = end + 1;
        tempNodes.push(name);
        continue;
      }

      let prevName = tempNodes[tempNodes.length - 1];

      while (tempNodes.length && name !== prevName) {
        tempNodes.pop();
        prevName = tempNodes[tempNodes.length - 1];
      }

      if (tempNodes.length <= 1 && name === tagName) {
        response.push({
          str: html.slice(nodes[i].index, currentIndex + currentValue.length),
          tag: true,
        });

        start = currentIndex + currentValue.length;
        i = end;
        tempNodes.pop();
        break;
      }

      end = end + 1;
      tempNodes.pop();
    }

    const current = nodes[nodes.length - 1];

    const [currentValue] = current;
    const { index: currentIndex } = current;
    // Emit any plain text that follows the last tag
    if (i >= nodes.length - 1 && currentIndex + currentValue.length < html.length) {
      response.push({
        str: html.slice(currentIndex + currentValue.length, html.length),
        tag: false,
      });
    }
  }

  if (nodes.length === 0) {
    response.push({ str: html, tag: false });
  }

  return response;
}
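
splitHtml calls a getTagName helper that is not shown in this post. Below is a minimal sketch of what it might look like, inferred only from how it is used above (it must return a name and a close flag). The regular expression and the lowercasing are my assumptions, not the original implementation.

```typescript
// Hypothetical reconstruction: extract the tag name from a single tag string
// such as "<p>", "</td>" or '<p class="a">', and report whether it is a
// closing tag.
function getTagName(tag: string): { name: string; close: boolean } {
  // Group 1 captures the optional leading slash, group 2 the tag name.
  const match = /^<(\/?)\s*([^\s/>]+)/.exec(tag);

  if (!match) {
    return { name: '', close: false };
  }

  // Lowercasing is an assumption so that <P> and </p> are treated as a pair.
  return { name: match[2].toLowerCase(), close: match[1] === '/' };
}

console.log(getTagName('<p class="a">')); // { name: 'p', close: false }
console.log(getTagName('</td>')); // { name: 'td', close: true }
```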

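As you can see, this splits the string into nodes one at a time by using a regular expression to pair each opening tag with its closing tag. A simpler alternative is to let the browser do the parsing, via document.createElement('div').innerHTML.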

After splitting, the nodes might look like this:

str: 123<p>123</p><table><tr><td>123</td></tr></table>

str0: 123
str1: <p>123</p>
str2: <table><tr><td>123</td></tr></table>

One more step is needed: break each node down into a single node (its root tag) plus its child HTML.

/**
 * Split a node's HTML into its parent (root) tag and its child HTML
 */
function splitPCHtml(html: string): { root: string; child: string } {
  const count = Array.from(html.matchAll(MatchTag));
  const first = count[0];
  const last = count[count.length - 1];

  if (!count.length || first === last) {
    return { root: html, child: '' };
  }

  const [value] = first;
  const { index } = last;

  return { root: html.slice(0, value.length), child: html.slice(value.length, index) };
}

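The code above splits a nested node into a single root node and its child HTML. The child HTML is then fed back through the same splitting (splitHtml for sibling nodes, splitPCHtml for each node) until only small nodes remain.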
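
The recursive splitting described above could be wired together roughly as follows. This is only a sketch: HtmlNode and buildTree are hypothetical names, and the two splitters are passed in as parameters so the block stands on its own. In practice they would be the splitHtml and splitPCHtml functions from this post.

```typescript
// Hypothetical tree shape: a root tag (or a text fragment) plus its children.
interface HtmlNode {
  root: string;
  children: HtmlNode[];
}

type Split = (html: string) => { str: string; tag: boolean }[];
type SplitPC = (html: string) => { root: string; child: string };

// Recursively turn an HTML string into a tree of single nodes.
function buildTree(html: string, split: Split, splitPC: SplitPC): HtmlNode[] {
  return split(html).map(({ str, tag }) => {
    // Plain text has no children.
    if (!tag) {
      return { root: str, children: [] };
    }

    // Separate the opening tag from the inner HTML, then recurse into it.
    const { root, child } = splitPC(str);
    return {
      root,
      children: child ? buildTree(child, split, splitPC) : [],
    };
  });
}

// Usage, assuming the functions from this post are in scope:
// const tree = buildTree('123<p>123</p>', splitHtml, splitPCHtml);
```

Two trees built this way can then be compared level by level, root against root and children against children.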

Then, by comparing the corresponding content, I get the feature I need.

This is a simple version, as you will notice once you try to compare a table. I'll keep improving it...
