A Development Experience That Blew Me Away

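I recently joined a new company and was thoroughly humbled. Here is what happened: I was assigned to optimize keyword highlighting in the product search of an app. When I first got the task, I honestly shrugged it off: "Search? Isn't that just an input box plus a button? The user types, the backend returns, the frontend renders, and at most you add keyword highlighting. How complicated can it be?"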

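The requirements were as follows:

Highlighting logic for the input query:

  1. Highlight on a full match

  2. Highlight on a partial match

    a. Highlight the characters of the suggestion (SUG) that are hit by the completed part of the input (presumably, words that can be recalled via pinyin get highlighted): e.g. input = 小儿kechuan (kechuan still being typed), SUG term = 小儿咳喘口服液, 小儿咳喘 is highlighted

    b. When the input is pinyin, highlight the recalled word whose pinyin matches the input: e.g. input = buluofen, SUG term = 布洛芬缓释胶囊, 布洛芬 is highlighted

    c. Highlight contiguous runs of the input, not isolated single characters

    d. When the input contains a wrong character or a typo, highlight the other characters that do match: e.g. input = 不洛芬, SUG term = 布洛芬, 洛芬 is highlighted

    e. Input case: input = MOVEFREE, recalled term = movefree, movefree is highlighted

Keyword retrieval is implemented on the backend, and the data it returns looks like this: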

[
  {
    "content": "阿莫西林",
    "type": 1,
    "query_addition_info": {
      "type": 11002,
      "icon_url": "",
      "show_text": "阿莫西林 | 国产",
      "activity_filter_codes": [],
      "label_type": "DiseaseSymptomEffect",
      "activity": "DISEASESYMPTOMEFFECT",
      "red_words": [
        {
          "word": "阿莫西林",
          "index": 0
        }
      ]
    },
    "shenNongInterveneAdditionInfo": {
      "sugJump": 0,
      "isInterveneLocation": 0
    }
  }
]
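
To make the contract of this payload concrete, here is a minimal sketch of how a client might turn `red_words` into half-open character ranges and drop entries that do not line up with the text. The names `CharRange` and `toRanges` are hypothetical, and the assumption that `index` is a UTF-16 offset into `content` (rather than into `show_text`) is inferred from the sample, not confirmed by the backend.

```typescript
interface RedWord {
  word: string;
  index: number;
}

// Hypothetical helper type: a half-open range [start, end) into the content string.
interface CharRange {
  start: number;
  end: number;
}

// Keep only red words whose text really sits at `index` in `content`
// (compared case-insensitively), and return them as ranges.
function toRanges(content: string, redWords: RedWord[]): CharRange[] {
  const ranges: CharRange[] = [];
  for (const { word, index } of redWords) {
    const end = index + word.length;
    if (index < 0 || word.length === 0 || end > content.length) continue;
    if (content.slice(index, end).toLowerCase() !== word.toLowerCase()) continue;
    ranges.push({ start: index, end });
  }
  return ranges;
}

const sampleRanges = toRanges("阿莫西林", [{ word: "阿莫西林", index: 0 }]);
console.log(sampleRanges); // [ { start: 0, end: 4 } ]
```

Validating on the client is a defensive choice: a mismatched `index` then degrades to plain text instead of highlighting the wrong characters.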

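1. From "isn't it just a search box?" to "distributed algorithms in practice": how search highlighting upgraded my thinking

Reality, however, gave me a loud slap in the face. Once I truly dug into this seemingly simple requirement, I was shocked to find that I had nearly missed a great opportunity to get close to the core ideas of distributed systems.

Underestimating it: what I thought was "simple"

My initial idea was crude and simple:

  1. The frontend listens to the input box
  2. It throws the keyword at the backend
  3. The backend looks it up like a dictionary: SELECT * FROM products WHERE name LIKE '%keyword%'
  4. The frontend gets the data, highlights with a string replace, and we are done

What I completely failed to realize was that this taken-for-granted plan would not survive a minute against a catalog of hundreds of millions of records. A LIKE pattern with a leading wildcard generally cannot use an ordinary B-tree index, so each query tends to degrade into a full table scan, and it offers no tolerance for pinyin or typos either.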
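
For reference, the string-replace highlighting from step 4 can be sketched in a few lines. `naiveHighlight` and `Segment` are hypothetical names used only for this illustration. The point is that a literal, case-sensitive match cannot satisfy the pinyin, typo, or case rules from the requirements.

```typescript
interface Segment {
  text: string;
  isHighLight: boolean;
}

// Naive approach: split the title on the literal keyword and mark every
// occurrence of the keyword as highlighted.
function naiveHighlight(title: string, keyword: string): Segment[] {
  if (!keyword || !title.includes(keyword)) {
    return [{ text: title, isHighLight: false }];
  }
  const segments: Segment[] = [];
  title.split(keyword).forEach((part, i, parts) => {
    if (part) segments.push({ text: part, isHighLight: false });
    if (i < parts.length - 1) segments.push({ text: keyword, isHighLight: true });
  });
  return segments;
}

// Literal input works: 布洛芬 is highlighted.
console.log(naiveHighlight("布洛芬缓释胶囊", "布洛芬"));
// Pinyin input highlights nothing, although rule (b) requires 布洛芬.
console.log(naiveHighlight("布洛芬缓释胶囊", "buluofen"));
```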

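The shock: the universe of algorithms behind a search box

It was only when I handled the red_words highlight data returned by the backend and had to render it precisely on the frontend that it dawned on me: I had been dealing with algorithms all along! A simple search highlight turned out to be a concentrated practice ground for the data structures I had learned.

What I implemented was no longer a simple string replace but a tiny, efficient frontend rendering engine built on five algorithmic ideas:

1. Sorting - the foundation of order

// Sort a copy: Array.prototype.sort sorts in place and would reorder the caller's data
const sortedRedWords = [...redwords].sort((a, b) => a.index - b.index);

Idea: before any complex operation, use a comparison sort to line up the highlight words by position

Solves: avoids position conflicts between intervals in later processing and gives the following steps a reliable input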
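
A small sketch of this step, assuming the `{ word, index }` shape from the payload above. `sortRedWords` is a hypothetical wrapper, and the tie-break on length is an addition for illustration, not something the original code does.

```typescript
interface RedWord {
  word: string;
  index: number;
}

// Sort by start position; on ties put the longer word first so that a
// containing word is processed before the words nested inside it.
function sortRedWords(redWords: readonly RedWord[]): RedWord[] {
  return [...redWords].sort(
    (a, b) => a.index - b.index || b.word.length - a.word.length
  );
}

const unsorted: RedWord[] = [
  { word: "分散片", index: 4 },
  { word: "头孢", index: 0 },
];
const ordered = sortRedWords(unsorted);
console.log(ordered.map((w) => w.word)); // [ '头孢', '分散片' ]
console.log(unsorted[0].word); // '分散片' (the original array is untouched)
```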

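2. Interval splitting - a precise "text scalpel"

const overlapStart = Math.max(segmentStart, index);
const overlapEnd = Math.min(segmentEnd, wordEnd);

Idea: treat the text as a line segment and cut it at the highlight intervals. The core operation is the interval intersection (the larger of the two starts, the smaller of the two ends) familiar from computational geometry and segment trees

Solves: handles a highlight word appearing anywhere in the text, with character-level precision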
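
The two lines above are easier to follow in isolation. Below is a minimal sketch of the intersection and of cutting one segment around it; `Interval`, `intersect`, and `cut` are hypothetical names, and the intervals are assumed to be half-open character offsets.

```typescript
// Half-open interval [start, end) over character offsets.
interface Interval {
  start: number;
  end: number;
}

// Intersection of two half-open intervals, or null when they do not overlap.
function intersect(a: Interval, b: Interval): Interval | null {
  const start = Math.max(a.start, b.start);
  const end = Math.min(a.end, b.end);
  return start < end ? { start, end } : null;
}

// Cut one text segment into up to three pieces around a highlight interval.
function cut(text: string, seg: Interval, word: Interval): string[] {
  const hit = intersect(seg, word);
  if (!hit) return [text];
  return [
    text.slice(0, hit.start - seg.start), // before the highlight
    text.slice(hit.start - seg.start, hit.end - seg.start), // highlighted part
    text.slice(hit.end - seg.start), // after the highlight
  ].filter((piece) => piece.length > 0);
}

console.log(cut("头孢克肟分散片", { start: 0, end: 7 }, { start: 4, end: 7 }));
// [ '头孢克肟', '分散片' ]
```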

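3. Segment merging - squeezing out performance

function mergeSegments(segments) {
  // Merge adjacent text segments that share the same highlight state
  const merged = [];
  for (const segment of segments) {
    const last = merged[merged.length - 1];
    if (last && last.isHighLight === segment.isHighLight) last.text += segment.text;
    else merged.push({ ...segment });
  }
  return merged;
}

Idea: after the final pieces are generated, merge adjacent text segments that have the same state

Solves: keeps the number of DOM nodes as low as the segmentation allows, which helps rendering performance in long lists. This is the key step from "works" to "efficient"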

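4. Greedy thinking - the best move at every step

Time complexity: O(n × m), where n is the number of highlight words and m is the number of current segments, because every highlight word scans the existing segments

Space complexity: O(k), proportional to the final number of segments k
In practice the number of highlight words is small (n < 5), so the algorithm performs well.

Idea: process the highlight words in order, take the best action available in the current state at each step (split or keep), and update the state

Solves: breaks the complex problem of overlapping intervals into a series of manageable subproblems, which keeps the algorithm clear and efficient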
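
As a point of comparison, once the highlight words are sorted by `index`, the same output can be produced with one left-to-right sweep that keeps a cursor into the title, so the cost is the sort plus a single pass over the text. This is a sketch of an alternative, not the implementation used in this article; `sweepHighlight` and `Segment` are hypothetical names.

```typescript
interface RedWord {
  word: string;
  index: number;
}

interface Segment {
  text: string;
  isHighLight: boolean;
}

// One greedy pass: at each word, emit the plain text up to the word, then
// the highlighted part that lies beyond the cursor, then advance the cursor.
function sweepHighlight(title: string, redWords: RedWord[]): Segment[] {
  const sorted = [...redWords].sort((a, b) => a.index - b.index);
  const segments: Segment[] = [];
  let cursor = 0;

  // Append text, merging with the previous segment when the state matches.
  const push = (text: string, isHighLight: boolean) => {
    if (!text) return;
    const last = segments[segments.length - 1];
    if (last && last.isHighLight === isHighLight) last.text += text;
    else segments.push({ text, isHighLight });
  };

  for (const { word, index } of sorted) {
    const start = Math.max(index, cursor); // clamp overlaps to the cursor
    const end = Math.min(index + word.length, title.length);
    if (start >= end) continue; // already covered, or out of range
    push(title.slice(cursor, start), false);
    push(title.slice(start, end), true);
    cursor = end;
  }
  push(title.slice(cursor), false);
  return segments;
}

console.log(sweepHighlight("阿莫西林克拉维酸钾片", [
  { word: "克拉维酸", index: 4 },
  { word: "阿莫西林", index: 0 },
]));
// [ { text: '阿莫西林克拉维酸', isHighLight: true }, { text: '钾片', isHighLight: false } ]
```

Because the cursor only moves forward, nested or crossing words never produce duplicate text, and adjacent highlights are merged as they are emitted.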

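5. Overlapping-interval handling - wisdom for complex cases

Idea: use strict condition checks to detect and handle nesting, containment, or crossing between highlight words

Solves: ensures that the highlighting logic stays coherent and the rendering stays correct however the matches overlap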
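
One way to make the nested, contained, and crossing cases explicit is to normalize the ranges before rendering. The sketch below is the classic merge-intervals routine; `Span` and `mergeSpans` are hypothetical names, and the article's own code handles overlaps during splitting rather than with a separate pass like this.

```typescript
// Half-open character span [start, end).
interface Span {
  start: number;
  end: number;
}

// Merge spans that overlap or touch, so the renderer only ever sees
// disjoint, ordered spans.
function mergeSpans(spans: Span[]): Span[] {
  const sorted = [...spans].sort((a, b) => a.start - b.start);
  const merged: Span[] = [];
  for (const span of sorted) {
    const last = merged[merged.length - 1];
    if (last && span.start <= last.end) {
      last.end = Math.max(last.end, span.end); // nested, crossing, or adjacent
    } else {
      merged.push({ ...span });
    }
  }
  return merged;
}

console.log(mergeSpans([
  { start: 2, end: 6 }, // crosses the span starting at 0
  { start: 0, end: 4 },
  { start: 1, end: 3 }, // nested inside the span starting at 0
  { start: 8, end: 9 }, // disjoint
]));
// [ { start: 0, end: 6 }, { start: 8, end: 9 } ]
```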

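An epiphany: "distributed thinking" in frontend rendering

What amazed me even more is that the interval splitting and merging I solved on the frontend mirrors, in spirit, the sharding and result merging of a backend search engine!

My interval splitting is like a search engine horizontally sharding its data across nodes.

My segment merging is like a coordinating node collecting and sorting the results from each shard.

My greedy flow is like a scatter-gather distributed query.

I realized that I was not just writing a highlight feature. At a micro level, I was practicing the same algorithmic philosophy that backend distributed systems are built on. This gave me a new understanding of "performance optimization": it is not a handful of clever tricks, but data structure and algorithm ideas landing in a concrete business scenario.

Through this project I came to understand the architectural philosophy of search highlighting:

  1. Data boundary: the frontend cannot carry the pinyin conversion and fuzzy matching for hundreds of millions of products
  2. Freshness: newly listed products and search strategy changes must take effect in real time, so only the backend can control them
  3. Compute-intensive work: Chinese word segmentation, pinyin matching, and relevance ranking are all CPU-bound tasks
  4. Consistency: to keep highlighting consistent across all clients, it must be computed centrally on the server

The code is as follows:

1. Type definitions (src/types/search.ts)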

export interface RedWord {
  word: string;
  index: number;
}

export interface QueryAdditionInfo {
  type: number;
  icon_url: string;
  show_text: string;
  label_type: string;
  red_words: RedWord[];
}

export interface SearchItem {
  id: number;
  content: string;
  type: number;
  query_addition_info: QueryAdditionInfo;
}

export interface SearchResult {
  success: boolean;
  data: SearchItem[];
}

export interface TextSegment {
  text: string;
  isHighLight: boolean;
  abKey?: string;
}

2. Mock data (src/mock/data.ts)

import type { SearchItem } from '../types/search';

export const mockSearchData: SearchItem[] = [
  {
    id: 1,
    content: "阿莫西林胶囊 0.25g*24粒",
    type: 1,
    query_addition_info: {
      type: 11002,
      icon_url: "",
      show_text: "阿莫西林 | 国产",
      label_type: "Drug",
      red_words: [
        {
          word: "阿莫西林",
          index: 0
        }
      ]
    }
  },
  {
    id: 2,
    content: "阿莫西林克拉维酸钾片",
    type: 1,
    query_addition_info: {
      type: 11002,
      icon_url: "",
      show_text: "抗生素 | 处方药",
      label_type: "Drug",
      red_words: [
        {
          word: "阿莫西林",
          index: 0
        },
        {
          word: "克拉维酸",
          index: 4
        }
      ]
    }
  },
  {
    id: 3,
    content: "头孢克肟分散片 100mg*6片",
    type: 1,
    query_addition_info: {
      type: 11002,
      icon_url: "",
      show_text: "头孢类 | 抗生素",
      label_type: "Drug",
      red_words: [
        {
          word: "头孢",
          index: 0
        },
        {
          word: "分散片",
          index: 4
        }
      ]
    }
  },
  {
    id: 4,
    content: "布洛芬缓释胶囊 0.3g*20粒",
    type: 1,
    query_addition_info: {
      type: 11002,
      icon_url: "",
      show_text: "止痛药 | 非处方",
      label_type: "Drug",
      red_words: [
        {
          word: "布洛芬",
          index: 0
        }
      ]
    }
  },
  {
    id: 5,
    content: "维生素C泡腾片 1000mg*10片",
    type: 1,
    query_addition_info: {
      type: 11002,
      icon_url: "",
      show_text: "维生素 | 保健品",
      label_type: "Drug",
      red_words: [] // test the case with no red_words
    }
  }
];

// Simulated API request
export const mockSearchAPI = (keyword: string): Promise<any> => {
  return new Promise((resolve) => {
    setTimeout(() => {
      const results = mockSearchData
        .map(item => {
          const redWords = generateRedWords(item.content, keyword);
          return {
            ...item,
            query_addition_info: {
              ...item.query_addition_info,
              red_words: redWords
            }
          };
        })
        .filter(item => 
          item.content.toLowerCase().includes(keyword.toLowerCase()) || 
          item.query_addition_info.show_text.toLowerCase().includes(keyword.toLowerCase())
        );
      
      resolve({
        success: true,
        data: results
      });
    }, 500);
  });
};

// Simulates the backend logic that generates red_words
function generateRedWords(content: string, keyword: string): any[] {
  if (!keyword) return [];
  
  const redWords: any[] = [];
  const lowerContent = content.toLowerCase();
  const lowerKeyword = keyword.toLowerCase();
  
  // Exact match of the whole keyword (first occurrence only)
  const startIndex = lowerContent.indexOf(lowerKeyword);
  if (startIndex !== -1) {
    redWords.push({
      word: content.substring(startIndex, startIndex + keyword.length),
      index: startIndex
    });
  }
  
  // Token match: split the keyword on whitespace. Splitting on '' would yield
  // single characters, which the length check below always skips.
  const keywordParts = keyword.split(/\s+/).filter(Boolean);
  keywordParts.forEach(part => {
    if (part.length < 2) return;
    
    const lowerPart = part.toLowerCase();
    let partIndex = lowerContent.indexOf(lowerPart);
    while (partIndex !== -1) {
      const partEnd = partIndex + part.length;
      // Skip occurrences that overlap a red word collected earlier
      const isOverlap = redWords.some(redWord => 
        partIndex < redWord.index + redWord.word.length && partEnd > redWord.index
      );
      
      if (!isOverlap) {
        redWords.push({
          word: content.substring(partIndex, partEnd),
          index: partIndex
        });
      }
      
      partIndex = lowerContent.indexOf(lowerPart, partIndex + 1);
    }
  });
  
  return redWords;
}
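
The mock above only produces literal substring matches. To exercise rule (b) locally, a pinyin rule could be mocked with a tiny lookup table. Everything here is an assumption made for illustration: the `PINYIN` table is hand-written for a few characters, `pinyinRedWords` is a hypothetical helper, and a real backend would use a full pinyin dictionary and handle polyphonic characters.

```typescript
interface RedWord {
  word: string;
  index: number;
}

// Hand-written toy table for this example only; NOT a real pinyin dictionary.
const PINYIN: Record<string, string> = {
  "布": "bu", "洛": "luo", "芬": "fen", "缓": "huan",
  "释": "shi", "胶": "jiao", "囊": "nang",
};

// Return the first run of characters in `content` whose concatenated
// pinyin equals the input, in the { word, index } shape of red_words.
function pinyinRedWords(content: string, input: string): RedWord[] {
  const target = input.toLowerCase();
  if (!target) return [];
  for (let start = 0; start < content.length; start++) {
    let spelled = "";
    for (let end = start; end < content.length; end++) {
      const syllable = PINYIN[content[end]];
      if (!syllable) break; // character not in the toy table
      spelled += syllable;
      if (spelled === target) {
        return [{ word: content.slice(start, end + 1), index: start }];
      }
      if (!target.startsWith(spelled)) break; // diverged from the input
    }
  }
  return [];
}

console.log(pinyinRedWords("布洛芬缓释胶囊 0.3g*20粒", "buluofen"));
// [ { word: '布洛芬', index: 0 } ]
```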

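3. Highlight algorithm utilities (src/utils/highlightAlgorithms.ts)

export const matchSplitItemTitle = (title: string, redwords: any[], abKey: string = 'A1'): any[] => {
  // Without red_words there is nothing to highlight: the fallback is called
  // with an empty keyword, so it returns the title as one plain segment
  if (!redwords || redwords.length === 0) {
    return splitItemTitleFallback(title, '', abKey);
  }
  
  // 1. Sorting: order the highlight words by start position (on a copy)
  const sortedRedWords = [...redwords].sort((a, b) => a.index - b.index);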
  
  let segments: Array<any & { start?: number; end?: number }> = [{ 
    text: title, 
    isHighLight: false,
    start: 0,
    end: title.length
  }];
  
  // 2. Interval splitting
  for (let i = 0; i < sortedRedWords.length; i++) {
    const redWord = sortedRedWords[i];
    const word = redWord.word;
    const index = redWord.index;
    const wordEnd = index + word.length;
    
    let newSegments: Array<any & { start?: number; end?: number }> = [];
    let segmentProcessed = false;
    
    for (let j = 0; j < segments.length; j++) {
      const segment = segments[j];
      const segmentStart = segment.start ?? 0;
      const segmentEnd = segment.end ?? segment.text.length;
      
      // Overlap check: keep segments that are already highlighted or do not overlap this word
      if (segment.isHighLight || index >= segmentEnd || wordEnd <= segmentStart) {
        newSegments.push(segment);
        continue;
      }
      
      segmentProcessed = true;
      
      let beforeText = '';
      let highlightText = '';
      let afterText = '';
      
      if (index > segmentStart) {
        beforeText = segment.text.substring(0, index - segmentStart);
        newSegments.push({
          text: beforeText,
          isHighLight: false,
          start: segmentStart,
          end: index
        });
      }
      
      const overlapStart = Math.max(segmentStart, index);
      const overlapEnd = Math.min(segmentEnd, wordEnd);
      highlightText = segment.text.substring(overlapStart - segmentStart, overlapEnd - segmentStart);
      
      if (highlightText.length > 0) {
        newSegments.push({
          text: highlightText,
          isHighLight: true,
          abKey: abKey,
          start: overlapStart,
          end: overlapEnd
        });
      }
      
      if (wordEnd < segmentEnd) {
        afterText = segment.text.substring(overlapEnd - segmentStart);
        newSegments.push({
          text: afterText,
          isHighLight: false,
          start: overlapEnd,
          end: segmentEnd
        });
      }
    }
    
    if (segmentProcessed) {
      segments = newSegments;
    }
  }
  
  const finalSegments: any[] = segments.map(segment => ({
    text: segment.text,
    isHighLight: segment.isHighLight,
    abKey: segment.abKey
  }));
  
  // 3. Segment merging
  return mergeSegments(finalSegments);
};

const mergeSegments = (segments: any[]): any[] => {
  if (segments.length <= 1) {
    return segments;
  }
  
  const merged: any[] = [];
  let currentSegment: any = { ...segments[0] };
  
  for (let i = 1; i < segments.length; i++) {
    const segment = segments[i];
    
    if (currentSegment.isHighLight === segment.isHighLight && currentSegment.abKey === segment.abKey) {
      currentSegment.text += segment.text;
    } else {
      merged.push(currentSegment);
      currentSegment = { ...segment };
    }
  }
  
  merged.push(currentSegment);
  return merged;
};

const splitItemTitleFallback = (title: string, keyword: string, abKey: string): any[] => {
  if (!title) {
    return [{ text: title || '', isHighLight: false }];
  }
  
  if (!keyword) {
    return [{ text: title, isHighLight: false }];
  }
  
  const exactMatchIndex = title.indexOf(keyword);
  if (exactMatchIndex !== -1) {
    const splits = title.split(keyword);
    const textArrObj: any[] = [];
    
    splits.forEach((item, index) => {
      if (item) {
        textArrObj.push({
          text: item,
          isHighLight: false,
        });
      }
      
      if (index < splits.length - 1) {
        textArrObj.push({
          text: keyword,
          isHighLight: true,
          abKey,
        });
      }
    });
    
    return textArrObj;
  }
  
  const matchResult = findBestMatch(title, keyword);
  if (matchResult) {
    const { matchText, startIndex } = matchResult;
    const segments: any[] = [];
    
    if (startIndex > 0) {
      segments.push({
        text: title.substring(0, startIndex),
        isHighLight: false,
      });
    }
    
    segments.push({
      text: matchText,
      isHighLight: true,
      abKey,
    });
    
    if (startIndex + matchText.length < title.length) {
      segments.push({
        text: title.substring(startIndex + matchText.length),
        isHighLight: false,
      });
    }
    
    return segments;
  }
  
  return [{ text: title, isHighLight: false }];
};

const findBestMatch = (title: string, keyword: string): { matchText: string; startIndex: number } | null => {
  const lowerTitle = title.toLowerCase();
  const lowerKeyword = keyword.toLowerCase();
  
  let bestMatch: { matchText: string; startIndex: number } | null = null;
  let maxLength = 0;
  
  for (let i = 0; i < lowerTitle.length; i++) {
    for (let j = i + 1; j <= lowerTitle.length; j++) {
      const substring = lowerTitle.substring(i, j);
      
      if (lowerKeyword.includes(substring) && substring.length > maxLength) {
        maxLength = substring.length;
        bestMatch = {
          matchText: title.substring(i, j),
          startIndex: i
        };
      }
    }
  }
  
  return maxLength >= 1 ? bestMatch : null;
};
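
`findBestMatch` is in effect a brute-force longest common substring search: it enumerates every substring of the title and runs a substring search for each one. For longer strings, the standard dynamic-programming formulation does the same job in O(n × m) time for a title of length n and a keyword of length m. The sketch below is an alternative under the same case-insensitive assumption; `longestCommonSubstring` and `Match` are hypothetical names and are not part of the original code.

```typescript
interface Match {
  matchText: string;
  startIndex: number;
}

// Longest common substring via dynamic programming.
// curr[j] is the length of the longest common suffix of the first i
// characters of the title and the first j characters of the keyword;
// prev is the same row for i - 1.
function longestCommonSubstring(title: string, keyword: string): Match | null {
  const a = title.toLowerCase();
  const b = keyword.toLowerCase();
  let prev = new Array<number>(b.length + 1).fill(0);
  let best = 0;
  let bestEnd = 0; // exclusive end offset of the best match in the title

  for (let i = 1; i <= a.length; i++) {
    const curr = new Array<number>(b.length + 1).fill(0);
    for (let j = 1; j <= b.length; j++) {
      if (a[i - 1] === b[j - 1]) {
        curr[j] = prev[j - 1] + 1;
        if (curr[j] > best) {
          best = curr[j];
          bestEnd = i;
        }
      }
    }
    prev = curr;
  }

  if (best === 0) return null;
  const startIndex = bestEnd - best;
  return { matchText: title.slice(startIndex, bestEnd), startIndex };
}

console.log(longestCommonSubstring("布洛芬缓释胶囊", "不洛芬"));
// { matchText: '洛芬', startIndex: 1 }
```

Like the original, the strict comparison keeps the earliest match when several substrings share the maximum length.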

4. Search composable (src/composables/useSearch.ts)

import { ref } from 'vue';
import { mockSearchAPI } from '../mock/data';
import type { SearchItem } from '../types/search';

export const useSearch = () => {
  const results = ref<SearchItem[]>([]);
  const loading = ref(false);
  const error = ref<string | null>(null);
  
  const search = async (keyword: string) => {
    if (!keyword.trim()) {
      results.value = [];
      return;
    }
    
    loading.value = true;
    error.value = null;
    
    try {
      const response = await mockSearchAPI(keyword);
      if (response.success) {
        results.value = response.data;
      } else {
        error.value = 'Search failed';
      }
    } catch (err) {
      error.value = 'Network request failed';
      console.error('Search error:', err);
    } finally {
      loading.value = false;
    }
  };
  
  return {
    results,
    loading,
    error,
    search
  };
};
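
One thing the composable does not guard against is out-of-order responses: if an earlier, slower request resolves after a later one, it overwrites the newer results. A common remedy is a request token that lets only the latest call commit its result. The sketch below is framework-free so that it can run on its own; `createLatestGuard` is a hypothetical helper, not part of Vue or of the original code.

```typescript
// Hands out increasing tokens and remembers only the most recent one.
function createLatestGuard() {
  let latest = 0;
  return {
    // Call when a request starts; keep the returned token.
    next(): number {
      latest += 1;
      return latest;
    },
    // Call when the response arrives; false means a newer request has started.
    isCurrent(token: number): boolean {
      return token === latest;
    },
  };
}

// Usage sketch: two overlapping searches, the first one finishing last.
const guard = createLatestGuard();
const firstToken = guard.next(); // user searched for "阿莫"
const secondToken = guard.next(); // user searched for "阿莫西林" before the first returned
console.log(guard.isCurrent(secondToken)); // true: commit these results
console.log(guard.isCurrent(firstToken)); // false: drop the stale response
```

Inside `search`, this would mean taking a token before `await mockSearchAPI(keyword)` and assigning `results.value` only if `guard.isCurrent(token)` still holds.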

5. Highlight text component (src/components/HighlightText.vue)

<template>
  <span :class="className">
    <span
      v-for="(segment, index) in segments"
      :key="index"
      :class="segment.isHighLight ? 'highlight-text' : ''"
      :style="segment.isHighLight ? highlightStyle : {}"
    >
      {{ segment.text }}
    </span>
  </span>
</template>

<script setup lang="ts">
import { computed } from 'vue';
import type { RedWord, TextSegment } from '../types/search';
import { matchSplitItemTitle } from '../utils/highlightAlgorithms';

interface Props {
  content: string;
  redWords?: RedWord[];
  keyword?: string;
  className?: string;
}

const props = withDefaults(defineProps<Props>(), {
  redWords: () => [],
  keyword: '',
  className: ''
});

const segments = computed<TextSegment[]>(() => 
  matchSplitItemTitle(props.content, props.redWords, 'A1')
);

const highlightStyle = {
  color: '#f00',
  fontWeight: '600'
};
</script>

<style scoped>
.highlight-text {
  color: #f00;
  font-weight: 600;
}
</style>

6. Search results component (src/components/SearchResults.vue)

<template>
  <div>
    <div v-if="loading">Searching...</div>
    <div v-else-if="error">{{ error }}</div>
    <div v-else-if="results.length === 0">No results</div>
    <div v-else>
      <div v-for="item in results" :key="item.id">
        <div>
          <HighlightText :content="item.content" :red-words="item.query_addition_info.red_words" />
          <div>{{ item.query_addition_info.show_text }}</div>
        </div>
      </div>
    </div>
  </div>
</template>
<script setup lang="ts">
import { type SearchItem } from '../types/search';
import HighlightText from './HighlightText.vue';

interface Props {
  results: SearchItem[];
  loading: boolean;
  error: string | null;
}

defineProps<Props>();
</script>

7. Search box component (src/components/SearchBox.vue)

<template>
  <form @submit.prevent="handleSubmit">
    <div>
      <input
        v-model="keyword"
        type="text"
        placeholder="Search by drug name..."
        :disabled="loading"
      />
      <button type="button" @click="handleClear" v-if="keyword.trim()">clear</button>
      <button
        type="submit"
        :disabled="loading || !keyword.trim()"
      >
        {{ loading ? 'Searching...' : 'Search' }}
      </button>
    </div>
  </form>
</template>
<script setup lang="ts">
import { ref } from 'vue';
interface Emits {
  (e: 'search', keyword: string): void;
}
const emit = defineEmits<Emits>();
interface Props {
  loading: boolean;
}
defineProps<Props>();
const keyword = ref('');
const handleSubmit = () => {
  if (keyword.value.trim()) {
    emit('search', keyword.value.trim());
  }
};
const handleClear = () => {
  keyword.value = '';
  emit('search', keyword.value.trim());
}
</script>

8. Root app component (src/App.vue)

<template>
  <div class="app">
    <SearchBox :loading="loading" @search="handleSearch" />
    <SearchResults :results="results" :loading="loading" :error="error" />
  </div>
</template>
<script setup lang="ts">
import SearchBox from './components/SearchBox.vue';
import SearchResults from './components/SearchResults.vue';
import { useSearch } from './composables/useSearch';
const { results, loading, error, search } = useSearch();
const handleSearch = (keyword: string) => {
  search(keyword);
};
</script>
