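I recently joined a new company and was thoroughly humbled. Here is what happened: I was asked to improve keyword highlighting in the product search of one of the company's apps. When I first got the task I was frankly dismissive: "Search? Isn't that just an input box and a button? The user types, the back end returns results, the front end displays them, and at most you add keyword highlighting. How complicated can it be?"
The requirements were as follows: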
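Input keyword highlighting logic:
- Highlight on a full match
- Highlight on a partial match
a. Highlight the part of the SUG term that matches the completed part of the input (terms that can be recalled via pinyin should be highlighted): e.g. input = 小儿kechuan (kechuan still being typed), SUG term = 小儿咳喘口服液, 小儿咳喘 is highlighted
b. When the input is pinyin and the recalled term's pinyin matches the input, highlight it: e.g. input = buluofen, SUG term = 布洛芬缓释胶囊, 布洛芬 is highlighted
c. Highlight contiguous runs of the input, not isolated single characters
d. When the input contains a wrong character or a typo, highlight the other characters that do match: e.g. input = 不洛芬, SUG term = 布洛芬, 洛芬 is highlighted
e. Letter case of the input: input = MOVEFREE, recalled term = movefree, movefree is highlighted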
Keyword retrieval is implemented on the back end, which returns data like this:
[
  {
    "content": "阿莫西林",
    "type": 1,
    "query_addition_info": {
      "type": 11002,
      "icon_url": "",
      "show_text": "阿莫西林 | 国产",
      "activity_filter_codes": [],
      "label_type": "DiseaseSymptomEffect",
      "activity": "DISEASESYMPTOMEFFECT",
      "red_words": [
        {
          "word": "阿莫西林",
          "index": 0
        }
      ]
    },
    "shenNongInterveneAdditionInfo": {
      "sugJump": 0,
      "isInterveneLocation": 0
    }
  }
]
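1. From "it's just a search box" to "distributed algorithms in practice": how my understanding of search highlighting leveled up
Reality, however, gave me a resounding slap in the face. Once I actually dug into this seemingly simple requirement, I was startled to find that I had nearly missed an excellent chance to get close to the core of distributed systems.
Underestimating the problem: what I thought was "simple":
My initial idea was crude:
- The front end listens to the input box
- It throws the keyword at the back end
- The back end looks it up like a dictionary: SELECT * FROM products WHERE name LIKE '%keyword%'
- The front end receives the data, highlights it with a string replacement, and that's it
I had no idea that this "obvious" approach would not survive a minute against hundreds of millions of rows: a LIKE pattern with a leading wildcard cannot use an ordinary B-tree index, so every query turns into a full table scan.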
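The front-end half of that naive plan breaks down too, even before scale becomes a problem. Here is a minimal sketch of the "string replacement" highlighter I had in mind; `naiveHighlight` and the `<em>` markup are hypothetical names for illustration, not part of the project. It only handles literal matches, so the pinyin and typo cases from the requirements produce no highlight at all.

```typescript
// Hypothetical helper: wraps every literal occurrence of `keyword` in <em> tags.
function naiveHighlight(title: string, keyword: string): string {
  if (!keyword) return title;
  return title.split(keyword).join(`<em>${keyword}</em>`);
}

// Literal match: works as expected.
const exact = naiveHighlight("布洛芬缓释胶囊", "布洛芬");
// Pinyin input (requirement b): nothing matches, the title comes back unchanged.
const pinyin = naiveHighlight("布洛芬缓释胶囊", "buluofen");
// Typo input (requirement d): 洛芬 should be highlighted, but nothing is.
const typo = naiveHighlight("布洛芬缓释胶囊", "不洛芬");

console.log(exact);  // <em>布洛芬</em>缓释胶囊
console.log(pinyin); // 布洛芬缓释胶囊
console.log(typo);   // 布洛芬缓释胶囊
```

This is why the match positions have to come from the back end: the front end has no way of knowing that buluofen maps to 布洛芬.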
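The shock: the universe of algorithms behind a search box:
It was only when I handled the red_words highlight data returned by the back end myself, and had to render it precisely on the front end, that it dawned on me: I had been dealing with algorithms all along! A simple search highlight turned out to be a hands-on exercise for the data structures I had studied.
What I built was no longer a simple string replacement but a small, efficient front-end rendering engine whose core rests on five algorithmic ideas: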
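1. Sorting - the foundation of order
const sortedRedWords = [...redwords].sort((a, b) => a.index - b.index);
Idea: before any complex processing, use a comparison sort so the highlight words line up by position. Array.prototype.sort sorts in place, so sorting a copy leaves the back-end data untouched.
Solves: avoids interval position conflicts in later steps and gives the following algorithms a reliable input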
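As a small illustration of the sorting step, the sketch below orders a made-up `red_words` array by `index` without mutating the input. The helper name `sortByIndex` and the sample data are assumptions for the example.

```typescript
interface RedWord {
  word: string;
  index: number;
}

// Returns a new array ordered by start position; the input array is left untouched.
function sortByIndex(redWords: readonly RedWord[]): RedWord[] {
  return [...redWords].sort((a, b) => a.index - b.index);
}

// Sample data deliberately given out of order.
const unsortedWords: RedWord[] = [
  { word: "克拉维酸", index: 4 },
  { word: "阿莫西林", index: 0 },
];
const sortedWords = sortByIndex(unsortedWords);

console.log(sortedWords.map(w => w.index));   // [0, 4]
console.log(unsortedWords.map(w => w.index)); // [4, 0]
```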
2. Interval splitting - a precise "text scalpel"
const overlapStart = Math.max(segmentStart, index);
const overlapEnd = Math.min(segmentEnd, wordEnd);
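Idea: treat the text as a line segment and cut it by each highlight word's interval. The overlap of a segment and a word is [max(segmentStart, index), min(segmentEnd, wordEnd)), the same interval-intersection step that underlies computational geometry and segment trees
Solves: handles a highlight word appearing anywhere in the text, with character-level precision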
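The cut itself can be sketched for a single interval as follows. This is a simplified illustration rather than the project's full routine: `splitByRange` and `TextPiece` are hypothetical names, and the range is half-open, `[start, end)`, clamped to the text length.

```typescript
interface TextPiece {
  text: string;
  isHighLight: boolean;
}

// Splits `text` into up to three pieces around the half-open range [start, end).
function splitByRange(text: string, start: number, end: number): TextPiece[] {
  const s = Math.max(0, Math.min(start, text.length)); // clamp start into the text
  const e = Math.max(s, Math.min(end, text.length));   // clamp end, never before start
  const pieces: TextPiece[] = [];
  if (s > 0) pieces.push({ text: text.slice(0, s), isHighLight: false });
  if (e > s) pieces.push({ text: text.slice(s, e), isHighLight: true });
  if (e < text.length) pieces.push({ text: text.slice(e), isHighLight: false });
  return pieces;
}

// 咳喘 occupies positions 2 and 3 of the title.
const cutPieces = splitByRange("小儿咳喘口服液", 2, 4);
console.log(cutPieces.map(p => p.text)); // ["小儿", "咳喘", "口服液"]
```

Empty pieces are never emitted, so a word at the very start or end of the title yields two pieces instead of three.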
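3. Segment merging - squeezing out performance
function mergeSegments(segments) {
  // Merge adjacent text segments that share the same highlight state
  const merged = [];
  for (const segment of segments) {
    const last = merged[merged.length - 1];
    if (last && last.isHighLight === segment.isHighLight) last.text += segment.text;
    else merged.push({ ...segment });
  }
  return merged;
}
Idea: once the final pieces are generated, merge adjacent text segments that share the same state
Solves: keeps the number of DOM nodes to a minimum, which noticeably helps long-list rendering. This is the key step from "it works" to "it is efficient"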
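4. Greedy approach - one locally optimal step at a time
Time complexity: O(n × m), where n is the number of highlight words and m is the number of existing segments, because every highlight word is checked against every segment
Space complexity: O(k), where k is the final number of segments
In practice the number of highlight words is small (n < 5), so the algorithm performs well.
Idea: process the highlight words in order, take the best action for the current state at each step (split or keep), and update the state
Solves: breaks the complex multi-interval overlap problem into a series of manageable sub-problems, keeping the algorithm clear and efficient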
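For comparison only, and not the approach used in this post's listing, the same greedy idea can also be written as a single left-to-right sweep once the words are sorted, which avoids re-scanning existing segments. The sketch below assumes the `word`/`index` shape of `red_words`; `sweepHighlight` and `HighlightSegment` are hypothetical names.

```typescript
interface RedWord {
  word: string;
  index: number;
}
interface HighlightSegment {
  text: string;
  isHighLight: boolean;
}

// Single sweep: `cursor` marks how much of `title` has already been emitted.
function sweepHighlight(title: string, redWords: readonly RedWord[]): HighlightSegment[] {
  const sorted = [...redWords].sort((a, b) => a.index - b.index);
  const segments: HighlightSegment[] = [];
  let cursor = 0;
  for (const { word, index } of sorted) {
    const start = Math.max(index, cursor);                   // skip text already emitted
    const end = Math.min(index + word.length, title.length); // clamp to the title
    if (end <= start) continue;                              // fully covered or out of range
    if (start > cursor) {
      segments.push({ text: title.slice(cursor, start), isHighLight: false });
    }
    const last = segments[segments.length - 1];
    if (last && last.isHighLight) {
      last.text += title.slice(start, end);                  // contiguous highlight: extend it
    } else {
      segments.push({ text: title.slice(start, end), isHighLight: true });
    }
    cursor = end;
  }
  if (cursor < title.length) {
    segments.push({ text: title.slice(cursor), isHighLight: false });
  }
  return segments;
}

const swept = sweepHighlight("头孢克肟分散片 100mg*6片", [
  { word: "分散片", index: 4 },
  { word: "头孢", index: 0 },
]);
console.log(swept.map(s => s.text)); // ["头孢", "克肟", "分散片", " 100mg*6片"]
```

After the sort, the sweep touches each highlight word once, so the work grows with the number of words plus the title length. With fewer than five words per title, the difference from the O(n × m) version is unlikely to be measurable.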
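5. Overlapping-interval handling - staying correct in complex cases
Idea: use strict conditional checks to detect and handle nesting, containment, or crossing between multiple highlight words
Solves: keeps the highlight logic from getting tangled in any complex matching situation, so the rendered result stays correct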
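One common way to deal with nested or crossing intervals is to normalize them before splitting. The sketch below merges overlapping or touching ranges into disjoint ones. It is an alternative illustration, not the check used in the listing (which skips segments that are already highlighted), and `HighlightRange` / `mergeOverlapping` are hypothetical names.

```typescript
// Half-open character range: [start, end)
interface HighlightRange {
  start: number;
  end: number;
}

// Sorts by start, then folds every range that overlaps or touches the previous one into it.
function mergeOverlapping(ranges: readonly HighlightRange[]): HighlightRange[] {
  const sorted = [...ranges].sort((a, b) => a.start - b.start);
  const merged: HighlightRange[] = [];
  for (const range of sorted) {
    const last = merged[merged.length - 1];
    if (last && range.start <= last.end) {
      last.end = Math.max(last.end, range.end); // nested or crossing: extend the previous range
    } else {
      merged.push({ ...range });                // disjoint: start a new range
    }
  }
  return merged;
}

const normalized = mergeOverlapping([
  { start: 2, end: 6 },  // crosses the first range
  { start: 0, end: 4 },
  { start: 9, end: 10 }, // nested inside the next one
  { start: 8, end: 12 },
]);
console.log(normalized); // [{ start: 0, end: 6 }, { start: 8, end: 12 }]
```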
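The epiphany: "distributed thinking" in front-end rendering
What surprised me even more is that the interval splitting and merging I solved on the front end is, in spirit, the same as the sharding and result merging done by a back-end search engine!
My interval splitting is like a search engine horizontally sharding data across nodes.
My segment merging is like the coordinating node collecting and sorting the results from each shard.
My greedy procedure is like a scatter-gather distributed query.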
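For readers unfamiliar with the term, here is a toy sketch of the scatter-gather pattern the analogy refers to. Everything in it is an assumption for illustration: the shards are plain in-memory arrays, and `Hit` and `scatterGather` are hypothetical names. A real search engine would query remote nodes in parallel.

```typescript
interface Hit {
  id: number;
  score: number;
}

// Toy scatter-gather over in-memory "shards".
function scatterGather(shards: readonly Hit[][], minScore: number, topK: number): Hit[] {
  // Scatter: each shard filters and ranks its own data independently.
  const partials = shards.map(shard =>
    shard
      .filter(hit => hit.score >= minScore)
      .sort((a, b) => b.score - a.score)
      .slice(0, topK)
  );
  // Gather: the coordinator concatenates the partial results and ranks them again.
  return ([] as Hit[])
    .concat(...partials)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

const topHits = scatterGather(
  [
    [{ id: 1, score: 0.9 }, { id: 2, score: 0.2 }],
    [{ id: 3, score: 0.7 }, { id: 4, score: 0.95 }],
  ],
  0.5,
  2
);
console.log(topHits.map(hit => hit.id)); // [4, 1]
```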
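I realized that I was not merely writing a highlight feature; at a micro level I was practicing the same algorithmic philosophy as back-end distributed systems. This gave me a new understanding of "performance optimization": it is not one or two clever tricks, but deep data-structure and algorithm ideas applied to a concrete business scenario.
Through this project I came to understand the architectural reasoning behind search highlighting:
- Data boundary: the front end cannot carry pinyin conversion and fuzzy matching over hundreds of millions of products
- Real-time requirements: newly listed products and search-strategy changes must take effect in real time, so only the back end can control them
- Compute-intensive work: Chinese word segmentation, pinyin matching, and relevance ranking are all CPU-intensive tasks
- Consistency: to guarantee that every client shows the same highlighting, it has to be computed on the server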
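Because the front end simply trusts the positions the server sends, a cheap defensive check before rendering can be worthwhile. The sketch below is an optional addition under stated assumptions, not something the project is known to do: it keeps only the `red_words` entries whose `word` really occurs at `index` in `content`, ignoring letter case as requirement e suggests. `validRedWords` is a hypothetical name.

```typescript
interface RedWord {
  word: string;
  index: number;
}

// Keeps only entries whose `word` occurs at `index` in `content` (case-insensitive).
function validRedWords(content: string, redWords: readonly RedWord[]): RedWord[] {
  const lowerContent = content.toLowerCase();
  return redWords.filter(({ word, index }) =>
    Number.isInteger(index) &&
    index >= 0 &&
    word.length > 0 &&
    lowerContent.startsWith(word.toLowerCase(), index)
  );
}

const checked = validRedWords("Movefree 氨糖", [
  { word: "MOVEFREE", index: 0 }, // differs only in case: kept
  { word: "氨糖", index: 8 },      // off by one (position 8 is the space): dropped
  { word: "氨糖", index: 9 },      // correct position: kept
]);
console.log(checked.map(w => w.index)); // [0, 9]
```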
The code is as follows:
1. Type definitions (src/types/search.ts)
export interface RedWord {
word: string;
index: number;
}
export interface QueryAdditionInfo {
type: number;
icon_url: string;
show_text: string;
label_type: string;
red_words: RedWord[];
}
export interface SearchItem {
id: number;
content: string;
type: number;
query_addition_info: QueryAdditionInfo;
}
export interface SearchResult {
success: boolean;
data: SearchItem[];
}
export interface TextSegment {
text: string;
isHighLight: boolean;
abKey?: string;
}
2. Mock data (src/mock/data.ts)
export const mockSearchData: any[] = [
{
id: 1,
content: "阿莫西林胶囊 0.25g*24粒",
type: 1,
query_addition_info: {
type: 11002,
icon_url: "",
show_text: "阿莫西林 | 国产",
label_type: "Drug",
red_words: [
{
word: "阿莫西林",
index: 0
}
]
}
},
{
id: 2,
content: "阿莫西林克拉维酸钾片",
type: 1,
query_addition_info: {
type: 11002,
icon_url: "",
show_text: "抗生素 | 处方药",
label_type: "Drug",
red_words: [
{
word: "阿莫西林",
index: 0
},
{
word: "克拉维酸",
index: 4
}
]
}
},
{
id: 3,
content: "头孢克肟分散片 100mg*6片",
type: 1,
query_addition_info: {
type: 11002,
icon_url: "",
show_text: "头孢类 | 抗生素",
label_type: "Drug",
red_words: [
{
word: "头孢",
index: 0
},
{
word: "分散片",
index: 4
}
]
}
},
{
id: 4,
content: "布洛芬缓释胶囊 0.3g*20粒",
type: 1,
query_addition_info: {
type: 11002,
icon_url: "",
show_text: "止痛药 | 非处方",
label_type: "Drug",
red_words: [
{
word: "布洛芬",
index: 0
}
]
}
},
{
id: 5,
content: "维生素C泡腾片 1000mg*10片",
type: 1,
query_addition_info: {
type: 11002,
icon_url: "",
show_text: "维生素 | 保健品",
label_type: "Drug",
red_words: [] // covers the case where there are no red_words
}
}
];
// Simulated API request
export const mockSearchAPI = (keyword: string): Promise<any> => {
return new Promise((resolve) => {
setTimeout(() => {
const results = mockSearchData
.map(item => {
const redWords = generateRedWords(item.content, keyword);
return {
...item,
query_addition_info: {
...item.query_addition_info,
red_words: redWords
}
};
})
.filter(item =>
item.content.toLowerCase().includes(keyword.toLowerCase()) ||
item.query_addition_info.show_text.toLowerCase().includes(keyword.toLowerCase())
);
resolve({
success: true,
data: results
});
}, 500);
});
};
// Simulates the back-end logic that generates red_words
function generateRedWords(content: string, keyword: string): any[] {
if (!keyword) return [];
const redWords: any[] = [];
const lowerContent = content.toLowerCase();
const lowerKeyword = keyword.toLowerCase();
// Full match
let startIndex = lowerContent.indexOf(lowerKeyword);
if (startIndex !== -1) {
redWords.push({
word: content.substring(startIndex, startIndex + keyword.length),
index: startIndex
});
}
// Token match: split the keyword on whitespace and match each token of two or more characters
const keywordParts = keyword.split(/\s+/).filter(part => part.length > 0);
keywordParts.forEach(part => {
if (part.length < 2) return;
let partIndex = lowerContent.indexOf(part.toLowerCase());
while (partIndex !== -1) {
const isOverlap = redWords.some(redWord =>
partIndex >= redWord.index && partIndex < redWord.index + redWord.word.length
);
if (!isOverlap) {
redWords.push({
word: content.substring(partIndex, partIndex + part.length),
index: partIndex
});
}
partIndex = lowerContent.indexOf(part.toLowerCase(), partIndex + 1);
}
});
return redWords;
}
3. Highlight algorithm utilities (src/utils/highlightAlgorithms.ts)
export const matchSplitItemTitle = (title: string, redwords: any[], abKey: string = 'A1'): any[] => {
if (!redwords || redwords.length === 0) {
return splitItemTitleFallback(title, '', abKey);
}
// 1. Sort a copy of the highlight words by start position
const sortedRedWords = [...redwords].sort((a, b) => a.index - b.index);
let segments: Array<any & { start?: number; end?: number }> = [{
text: title,
isHighLight: false,
start: 0,
end: title.length
}];
// 2. Interval splitting: cut the current segments around each highlight word
for (let i = 0; i < sortedRedWords.length; i++) {
const redWord = sortedRedWords[i];
const word = redWord.word;
const index = redWord.index;
const wordEnd = index + word.length;
let newSegments: Array<any & { start?: number; end?: number }> = [];
let segmentProcessed = false;
for (let j = 0; j < segments.length; j++) {
const segment = segments[j];
const segmentStart = segment.start || 0;
const segmentEnd = segment.end || segment.text.length;
// Overlap check: keep segments that are already highlighted or do not intersect the word
if (segment.isHighLight || index >= segmentEnd || wordEnd <= segmentStart) {
newSegments.push(segment);
continue;
}
segmentProcessed = true;
let beforeText = '';
let highlightText = '';
let afterText = '';
if (index > segmentStart) {
beforeText = segment.text.substring(0, index - segmentStart);
newSegments.push({
text: beforeText,
isHighLight: false,
start: segmentStart,
end: index
});
}
const overlapStart = Math.max(segmentStart, index);
const overlapEnd = Math.min(segmentEnd, wordEnd);
highlightText = segment.text.substring(overlapStart - segmentStart, overlapEnd - segmentStart);
if (highlightText.length > 0) {
newSegments.push({
text: highlightText,
isHighLight: true,
abKey: abKey,
start: overlapStart,
end: overlapEnd
});
}
if (wordEnd < segmentEnd) {
afterText = segment.text.substring(overlapEnd - segmentStart);
newSegments.push({
text: afterText,
isHighLight: false,
start: overlapEnd,
end: segmentEnd
});
}
}
if (segmentProcessed) {
segments = newSegments;
}
}
const finalSegments: any[] = segments.map(segment => ({
text: segment.text,
isHighLight: segment.isHighLight,
abKey: segment.abKey
}));
// 3. Segment merging: join adjacent segments with the same state
return mergeSegments(finalSegments);
};
const mergeSegments = (segments: any[]): any[] => {
if (segments.length <= 1) {
return segments;
}
const merged: any[] = [];
let currentSegment: any = { ...segments[0] };
for (let i = 1; i < segments.length; i++) {
const segment = segments[i];
if (currentSegment.isHighLight === segment.isHighLight && currentSegment.abKey === segment.abKey) {
currentSegment.text += segment.text;
} else {
merged.push(currentSegment);
currentSegment = { ...segment };
}
}
merged.push(currentSegment);
return merged;
};
const splitItemTitleFallback = (title: string, keyword: string, abKey: string): any[] => {
if (!title) {
return [{ text: title || '', isHighLight: false }];
}
if (!keyword) {
return [{ text: title, isHighLight: false }];
}
const exactMatchIndex = title.indexOf(keyword);
if (exactMatchIndex !== -1) {
const splits = title.split(keyword);
const textArrObj: any[] = [];
splits.forEach((item, index) => {
if (item) {
textArrObj.push({
text: item,
isHighLight: false,
});
}
if (index < splits.length - 1) {
textArrObj.push({
text: keyword,
isHighLight: true,
abKey,
});
}
});
return textArrObj;
}
const matchResult = findBestMatch(title, keyword);
if (matchResult) {
const { matchText, startIndex } = matchResult;
const segments: any[] = [];
if (startIndex > 0) {
segments.push({
text: title.substring(0, startIndex),
isHighLight: false,
});
}
segments.push({
text: matchText,
isHighLight: true,
abKey,
});
if (startIndex + matchText.length < title.length) {
segments.push({
text: title.substring(startIndex + matchText.length),
isHighLight: false,
});
}
return segments;
}
return [{ text: title, isHighLight: false }];
};
const findBestMatch = (title: string, keyword: string): { matchText: string; startIndex: number } | null => {
const lowerTitle = title.toLowerCase();
const lowerKeyword = keyword.toLowerCase();
let bestMatch: { matchText: string; startIndex: number } | null = null;
let maxLength = 0;
for (let i = 0; i < lowerTitle.length; i++) {
for (let j = i + 1; j <= lowerTitle.length; j++) {
const substring = lowerTitle.substring(i, j);
if (lowerKeyword.includes(substring) && substring.length > maxLength) {
maxLength = substring.length;
bestMatch = {
matchText: title.substring(i, j),
startIndex: i
};
}
}
}
return maxLength >= 1 ? bestMatch : null;
};
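The brute-force `findBestMatch` above enumerates every substring of the title and checks each one with `includes`. As a hedged alternative, not the listing's method, the classic longest-common-substring dynamic program finds the same kind of match in O(n × m) time for a title of length n and a keyword of length m. `longestCommonSubstring` is a hypothetical name, and the return shape is assumed to mirror `findBestMatch`.

```typescript
// Longest run of characters shared by `title` and `keyword` (case-insensitive).
function longestCommonSubstring(
  title: string,
  keyword: string
): { matchText: string; startIndex: number } | null {
  const a = title.toLowerCase();
  const b = keyword.toLowerCase();
  // prev[j] = length of the common suffix of a[0..i-1) and b[0..j)
  let prev: number[] = new Array<number>(b.length + 1).fill(0);
  let bestLen = 0;
  let bestEnd = 0; // exclusive end position of the best match in `title`
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = new Array<number>(b.length + 1).fill(0);
    for (let j = 1; j <= b.length; j++) {
      if (a[i - 1] === b[j - 1]) {
        const len = (prev[j - 1] ?? 0) + 1; // extend the run ending at the previous characters
        curr[j] = len;
        if (len > bestLen) {
          bestLen = len;
          bestEnd = i;
        }
      }
    }
    prev = curr;
  }
  if (bestLen === 0) return null;
  const startIndex = bestEnd - bestLen;
  return { matchText: title.slice(startIndex, bestEnd), startIndex };
}

// Requirement d: the typo 不洛芬 should still highlight 洛芬.
const typoMatch = longestCommonSubstring("布洛芬缓释胶囊", "不洛芬");
console.log(typoMatch); // { matchText: "洛芬", startIndex: 1 }
```

Like the original, this returns the first longest match and preserves the title's original casing in `matchText`.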
4. Search hook (src/hooks/useSearch.ts)
import { ref } from 'vue';
import { mockSearchAPI } from '../mock/data';
import type { SearchItem } from '../types/search';
export const useSearch = () => {
const results = ref<SearchItem[]>([]);
const loading = ref(false);
const error = ref<string | null>(null);
const search = async (keyword: string) => {
if (!keyword.trim()) {
results.value = [];
return;
}
loading.value = true;
error.value = null;
try {
const response = await mockSearchAPI(keyword);
if (response.success) {
results.value = response.data;
} else {
error.value = 'Search failed';
}
} catch (err) {
error.value = 'Network request failed';
console.error('Search error:', err);
} finally {
loading.value = false;
}
};
return {
results,
loading,
error,
search
};
};
5. Highlight text component (src/components/HighlightText.vue)
<template>
<span :class="className">
<span
v-for="(segment, index) in segments"
:key="index"
:class="segment.isHighLight ? 'highlight-text' : ''"
:style="segment.isHighLight ? highlightStyle : {}"
>
{{ segment.text }}
</span>
</span>
</template>
<script setup lang="ts">
import { computed } from 'vue';
import type { RedWord, TextSegment } from '../types/search';
import { matchSplitItemTitle } from '../utils/highlightAlgorithms';
interface Props {
content: string;
redWords?: RedWord[];
keyword?: string;
className?: string;
}
const props = withDefaults(defineProps<Props>(), {
redWords: () => [],
keyword: '',
className: ''
});
const segments = computed<TextSegment[]>(() =>
matchSplitItemTitle(props.content, props.redWords, 'A1')
);
const highlightStyle = {
color: '#f00',
fontWeight: '600'
};
</script>
<style scoped>
.highlight-text {
color: #f00;
font-weight: 600;
}
</style>
6. Search results component (src/components/SearchResults.vue)
<template>
<div>
<div v-if="loading">Searching...</div>
<div v-else-if="error">{{ error }}</div>
<div v-else-if="results.length === 0">No results found</div>
<div v-else>
<div v-for="item in results" :key="item.id">
<div>
<HighlightText :content="item.content" :red-words="item.query_addition_info.red_words" />
<div>{{ item.query_addition_info.show_text }}</div>
</div>
</div>
</div>
</div>
</template>
<script setup lang="ts">
import { type SearchItem } from '../types/search';
import HighlightText from './HighlightText.vue';
interface Props {
results: SearchItem[];
loading: boolean;
error: string | null;
}
defineProps<Props>();
</script>
7. Search box component (src/components/SearchBox.vue)
<template>
<form @submit.prevent="handleSubmit">
<div>
<input
v-model="keyword"
type="text"
placeholder="Search by drug name..."
:disabled="loading"
/>
<button type="button" @click="handleClear" v-if="keyword.trim()">clear</button>
<button
type="submit"
:disabled="loading || !keyword.trim()"
>
{{ loading ? 'Searching...' : 'Search' }}
</button>
</div>
</form>
</template>
<script setup lang="ts">
import { ref } from 'vue';
interface Emits {
(e: 'search', keyword: string): void;
}
const emit = defineEmits<Emits>();
interface Props {
loading: boolean;
}
defineProps<Props>();
const keyword = ref('');
const handleSubmit = () => {
if (keyword.value.trim()) {
emit('search', keyword.value.trim());
}
};
const handleClear = () => {
keyword.value = '';
emit('search', keyword.value.trim());
}
</script>
8. Main app component (src/App.vue)
<template>
<div class="app">
<SearchBox :loading="loading" @search="handleSearch" />
<SearchResults :results="results" :loading="loading" :error="error" />
</div>
</template>
<script setup lang="ts">
import SearchBox from './components/SearchBox.vue';
import SearchResults from './components/SearchResults.vue';
import { useSearch } from './hooks/useSearch';
const { results, loading, error, search } = useSearch();
const handleSearch = (keyword: string) => {
search(keyword);
};
</script>