System Design in Practice 184: Graph Databases



Abstract: this article dissects the core architecture, key algorithms, and engineering practices of a graph database, with a complete design and interview talking points.

Have you ever wondered how complex the technical challenges behind an advanced graph database really are?

1. System Overview

1.1 Business Background

Graph databases are purpose-built to store and query graph-structured data such as social networks, recommendation systems, and knowledge graphs. The system must support complex graph-traversal queries, path analysis, and graph-algorithm computation.

1.2 Core Features

  • Graph storage model: efficient storage of nodes, edges, and properties
  • Graph traversal algorithms: depth-first, breadth-first, shortest path
  • Query languages: Cypher, Gremlin, and other graph query languages
  • Index optimization: node indexes, edge indexes, full-text indexes
  • Distributed graphs: graph partitioning, cross-partition queries, consistency guarantees
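To make the storage model above concrete, here is a minimal, self-contained sketch of a labeled property graph in Go. All names (`Graph`, `AddNode`, `AddEdge`, `Neighbors`) are illustrative, not part of any real graph database API:

```go
package main

import "fmt"

type NodeID int

type Node struct {
	ID     NodeID
	Labels []string
	Props  map[string]any
}

type Edge struct {
	Type     string
	From, To NodeID
}

// Graph keeps nodes in a map and outgoing edges in an adjacency list.
type Graph struct {
	nodes map[NodeID]*Node
	out   map[NodeID][]Edge
	next  NodeID
}

func NewGraph() *Graph {
	return &Graph{nodes: map[NodeID]*Node{}, out: map[NodeID][]Edge{}}
}

func (g *Graph) AddNode(labels ...string) NodeID {
	id := g.next
	g.next++
	g.nodes[id] = &Node{ID: id, Labels: labels, Props: map[string]any{}}
	return id
}

func (g *Graph) AddEdge(typ string, from, to NodeID) {
	g.out[from] = append(g.out[from], Edge{Type: typ, From: from, To: to})
}

// Neighbors returns the targets of all outgoing edges of a given type.
func (g *Graph) Neighbors(id NodeID, typ string) []NodeID {
	var result []NodeID
	for _, e := range g.out[id] {
		if e.Type == typ {
			result = append(result, e.To)
		}
	}
	return result
}

func main() {
	g := NewGraph()
	alice := g.AddNode("Person")
	bob := g.AddNode("Person")
	g.AddEdge("FOLLOWS", alice, bob)
	fmt.Println(g.Neighbors(alice, "FOLLOWS")) // [1]
}
```

A real engine would add property indexes, edge IDs, and reverse adjacency, as the sections below do, but the core model is just this: identified nodes, typed directed edges, and an adjacency structure.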

1.3 Technical Challenges

  • Traversal performance: efficient traversal and querying of large-scale graphs
  • Storage optimization: compact storage and fast access for graph data
  • Query optimization: execution-plan optimization for complex graph queries
  • Distributed processing: partitioning graph data and executing cross-partition queries
  • Consistency guarantees: transactional consistency for distributed graph operations

2. Architecture Design

2.1 Overall Architecture

┌─────────────────────────────────────────────────────────────┐
│                 Graph Database Architecture                 │
├─────────────────────────────────────────────────────────────┤
│  Query Layer                                                │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐           │
│  │Query Parser │ │ Exec Engine │ │Result Proc. │           │
│  └─────────────┘ └─────────────┘ └─────────────┘           │
├─────────────────────────────────────────────────────────────┤
│  Graph Engine                                               │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐           │
│  │  Traversal  │ │Index Manager│ │ Txn Manager │           │
│  └─────────────┘ └─────────────┘ └─────────────┘           │
├─────────────────────────────────────────────────────────────┤
│  Storage Layer                                              │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐           │
│  │ Node Store  │ │ Edge Store  │ │ Prop. Store │           │
│  └─────────────┘ └─────────────┘ └─────────────┘           │
└─────────────────────────────────────────────────────────────┘

3. Core Component Design

3.1 Graph Storage Engine


type GraphStorageEngine struct {
    nodeStore     *NodeStore
    edgeStore     *EdgeStore
    propertyStore *PropertyStore
    indexManager  *GraphIndexManager
    txnManager    *TransactionManager
}

type Node struct {
    ID         NodeID
    Labels     []string
    Properties map[string]interface{}
    InEdges    []EdgeID
    OutEdges   []EdgeID
}

type Edge struct {
    ID         EdgeID
    Type       string
    FromNode   NodeID
    ToNode     NodeID
    Properties map[string]interface{}
}

type NodeStore struct {
    storage  map[NodeID]*Node
    freeList []NodeID
    nextID   NodeID
    mutex    sync.RWMutex
}

func (ns *NodeStore) CreateNode(labels []string, properties map[string]interface{}) (*Node, error) {
    ns.mutex.Lock()
    defer ns.mutex.Unlock()

    // Reuse a recycled ID from the free list if one is available.
    var nodeID NodeID
    if len(ns.freeList) > 0 {
        nodeID = ns.freeList[len(ns.freeList)-1]
        ns.freeList = ns.freeList[:len(ns.freeList)-1]
    } else {
        nodeID = ns.nextID
        ns.nextID++
    }

    node := &Node{
        ID:         nodeID,
        Labels:     labels,
        Properties: properties,
        InEdges:    make([]EdgeID, 0),
        OutEdges:   make([]EdgeID, 0),
    }

    ns.storage[nodeID] = node
    return node, nil
}

func (ns *NodeStore) GetNode(nodeID NodeID) (*Node, error) {
    ns.mutex.RLock()
    defer ns.mutex.RUnlock()

    node, exists := ns.storage[nodeID]
    if !exists {
        return nil, ErrNodeNotFound
    }

    return node, nil
}

type EdgeStore struct {
    storage      map[EdgeID]*Edge
    outEdgeIndex map[NodeID][]EdgeID
    inEdgeIndex  map[NodeID][]EdgeID
    typeIndex    map[string][]EdgeID
    freeList     []EdgeID
    nextID       EdgeID
    mutex        sync.RWMutex
}

func (es *EdgeStore) CreateEdge(edgeType string, fromNode, toNode NodeID, properties map[string]interface{}) (*Edge, error) {
    es.mutex.Lock()
    defer es.mutex.Unlock()

    // Reuse a recycled ID from the free list if one is available.
    var edgeID EdgeID
    if len(es.freeList) > 0 {
        edgeID = es.freeList[len(es.freeList)-1]
        es.freeList = es.freeList[:len(es.freeList)-1]
    } else {
        edgeID = es.nextID
        es.nextID++
    }

    edge := &Edge{
        ID:         edgeID,
        Type:       edgeType,
        FromNode:   fromNode,
        ToNode:     toNode,
        Properties: properties,
    }

    es.storage[edgeID] = edge

    // Update adjacency and type indexes.
    es.outEdgeIndex[fromNode] = append(es.outEdgeIndex[fromNode], edgeID)
    es.inEdgeIndex[toNode] = append(es.inEdgeIndex[toNode], edgeID)
    es.typeIndex[edgeType] = append(es.typeIndex[edgeType], edgeID)

    return edge, nil
}

func (es *EdgeStore) GetOutgoingEdges(nodeID NodeID) ([]*Edge, error) {
    es.mutex.RLock()
    defer es.mutex.RUnlock()

    edgeIDs, exists := es.outEdgeIndex[nodeID]
    if !exists {
        return []*Edge{}, nil
    }

    edges := make([]*Edge, 0, len(edgeIDs))
    for _, edgeID := range edgeIDs {
        if edge, exists := es.storage[edgeID]; exists {
            edges = append(edges, edge)
        }
    }

    return edges, nil
}

3.2 Graph Traversal Engine

type GraphTraversalEngine struct {
    storage     *GraphStorageEngine
    pathCache   *PathCache
    algorithms  map[string]TraversalAlgorithm
}

type TraversalAlgorithm interface {
    Traverse(startNode NodeID, condition TraversalCondition) (*TraversalResult, error)
}

type TraversalCondition struct {
    MaxDepth      int
    EdgeTypes     []string
    NodeLabels    []string
    PropertyFilter map[string]interface{}
    Direction     TraversalDirection
}

type BreadthFirstTraversal struct {
    engine *GraphTraversalEngine
}

func (bft *BreadthFirstTraversal) Traverse(startNode NodeID, condition TraversalCondition) (*TraversalResult, error) {
    visited := make(map[NodeID]bool)
    queue := []NodeID{startNode}
    result := &TraversalResult{
        Nodes: make([]*Node, 0),
        Edges: make([]*Edge, 0),
        Paths: make([]*Path, 0),
    }
    
    depth := 0
    
    for len(queue) > 0 && depth < condition.MaxDepth {
        levelSize := len(queue)
        
        for i := 0; i < levelSize; i++ {
            currentNodeID := queue[0]
            queue = queue[1:]
            
            if visited[currentNodeID] {
                continue
            }
            visited[currentNodeID] = true
            
            // Fetch the current node
            node, err := bft.engine.storage.nodeStore.GetNode(currentNodeID)
            if err != nil {
                continue
            }
            
            // Check whether the node matches the condition
            if bft.matchesNodeCondition(node, condition) {
                result.Nodes = append(result.Nodes, node)
            }
            
            // Fetch adjacent edges
            var edges []*Edge
            switch condition.Direction {
            case TraversalDirectionOut:
                edges, _ = bft.engine.storage.edgeStore.GetOutgoingEdges(currentNodeID)
            case TraversalDirectionIn:
                edges, _ = bft.engine.storage.edgeStore.GetIncomingEdges(currentNodeID)
            case TraversalDirectionBoth:
                outEdges, _ := bft.engine.storage.edgeStore.GetOutgoingEdges(currentNodeID)
                inEdges, _ := bft.engine.storage.edgeStore.GetIncomingEdges(currentNodeID)
                edges = append(outEdges, inEdges...)
            }
            
            // Walk the adjacent edges
            for _, edge := range edges {
                if bft.matchesEdgeCondition(edge, condition) {
                    result.Edges = append(result.Edges, edge)
                    
                    // Enqueue the neighboring node
                    var nextNodeID NodeID
                    if edge.FromNode == currentNodeID {
                        nextNodeID = edge.ToNode
                    } else {
                        nextNodeID = edge.FromNode
                    }
                    
                    if !visited[nextNodeID] {
                        queue = append(queue, nextNodeID)
                    }
                }
            }
        }
        
        depth++
    }
    
    return result, nil
}

type ShortestPathAlgorithm struct {
    engine *GraphTraversalEngine
}

func (spa *ShortestPathAlgorithm) FindShortestPath(startNode, endNode NodeID, condition TraversalCondition) (*Path, error) {
    // Dijkstra's algorithm
    distances := make(map[NodeID]float64)
    previous := make(map[NodeID]NodeID)
    visited := make(map[NodeID]bool)
    
    // Initialize distances
    distances[startNode] = 0
    
    // Priority queue ordered by tentative distance
    pq := &PriorityQueue{}
    heap.Init(pq)
    heap.Push(pq, &PQItem{NodeID: startNode, Distance: 0})
    
    for pq.Len() > 0 {
        current := heap.Pop(pq).(*PQItem)
        currentNodeID := current.NodeID
        
        if visited[currentNodeID] {
            continue
        }
        visited[currentNodeID] = true
        
        if currentNodeID == endNode {
            break
        }
        
        // Fetch outgoing edges
        edges, err := spa.engine.storage.edgeStore.GetOutgoingEdges(currentNodeID)
        if err != nil {
            continue
        }
        
        for _, edge := range edges {
            if !spa.matchesEdgeCondition(edge, condition) {
                continue
            }
            
            neighborID := edge.ToNode
            if visited[neighborID] {
                continue
            }
            
            // Compute the edge weight
            weight := spa.calculateEdgeWeight(edge)
            newDistance := distances[currentNodeID] + weight
            
            if oldDistance, exists := distances[neighborID]; !exists || newDistance < oldDistance {
                distances[neighborID] = newDistance
                previous[neighborID] = currentNodeID
                heap.Push(pq, &PQItem{NodeID: neighborID, Distance: newDistance})
            }
        }
    }
    
    // Reconstruct the path
    if _, exists := distances[endNode]; !exists {
        return nil, ErrPathNotFound
    }
    
    path := &Path{
        Nodes: make([]NodeID, 0),
        Edges: make([]EdgeID, 0),
        Cost:  distances[endNode],
    }
    
    // Walk back from the end node to the start node
    current := endNode
    for current != startNode {
        path.Nodes = append([]NodeID{current}, path.Nodes...)
        prev := previous[current]
        
        // Find the connecting edge
        edge := spa.findEdgeBetween(prev, current)
        if edge != nil {
            path.Edges = append([]EdgeID{edge.ID}, path.Edges...)
        }
        
        current = prev
    }
    path.Nodes = append([]NodeID{startNode}, path.Nodes...)
    
    return path, nil
}

3.3 Query Language Processor

type CypherQueryProcessor struct {
    parser    *CypherParser
    planner   *QueryPlanner
    executor  *QueryExecutor
    optimizer *QueryOptimizer
}

type CypherQuery struct {
    Match     []*MatchClause
    Where     *WhereClause
    Return    *ReturnClause
    OrderBy   *OrderByClause
    Limit     *LimitClause
    Create    []*CreateClause
    Set       []*SetClause
    Delete    []*DeleteClause
}

type MatchClause struct {
    Pattern   *GraphPattern
    Optional  bool
}

type GraphPattern struct {
    Nodes         []*NodePattern
    Relationships []*RelationshipPattern
}

type NodePattern struct {
    Variable   string
    Labels     []string
    Properties map[string]interface{}
}

type RelationshipPattern struct {
    Variable   string
    Type       string
    Direction  RelationshipDirection
    Properties map[string]interface{}
    StartNode  string
    EndNode    string
}

func (cqp *CypherQueryProcessor) ExecuteQuery(cypherQuery string) (*QueryResult, error) {
    // 1. Parse the Cypher query
    query, err := cqp.parser.Parse(cypherQuery)
    if err != nil {
        return nil, err
    }
    
    // 2. Optimize the query
    optimizedQuery := cqp.optimizer.Optimize(query)
    
    // 3. Build the execution plan
    plan, err := cqp.planner.CreateExecutionPlan(optimizedQuery)
    if err != nil {
        return nil, err
    }
    
    // 4. Execute the plan
    return cqp.executor.Execute(plan)
}

type QueryExecutor struct {
    engine    *GraphTraversalEngine
    operators map[string]QueryOperator
}

type QueryOperator interface {
    Execute(context *ExecutionContext) (*OperatorResult, error)
}

type NodeScanOperator struct {
    labels     []string
    properties map[string]interface{}
    variable   string
}

func (nso *NodeScanOperator) Execute(context *ExecutionContext) (*OperatorResult, error) {
    result := &OperatorResult{
        Records: make([]*Record, 0),
    }
    
    // Scan all nodes
    nodes := context.Engine.storage.nodeStore.GetAllNodes()
    
    for _, node := range nodes {
        if nso.matchesNode(node) {
            record := &Record{
                Variables: map[string]interface{}{
                    nso.variable: node,
                },
            }
            result.Records = append(result.Records, record)
        }
    }
    
    return result, nil
}

type ExpandOperator struct {
    startVariable string
    endVariable   string
    edgeVariable  string
    edgeTypes     []string
    direction     TraversalDirection
    minHops       int
    maxHops       int
}

func (eo *ExpandOperator) Execute(context *ExecutionContext) (*OperatorResult, error) {
    result := &OperatorResult{
        Records: make([]*Record, 0),
    }
    
    for _, inputRecord := range context.InputRecords {
        startNode := inputRecord.Variables[eo.startVariable].(*Node)
        
        // Run graph traversal
        condition := TraversalCondition{
            MaxDepth:   eo.maxHops,
            EdgeTypes:  eo.edgeTypes,
            Direction:  eo.direction,
        }
        
        traversalResult, err := context.Engine.algorithms["bfs"].Traverse(startNode.ID, condition)
        if err != nil {
            continue
        }
        
        // Create a record for each traversal path
        for _, path := range traversalResult.Paths {
            if len(path.Nodes) >= eo.minHops+1 {
                record := inputRecord.Copy()
                record.Variables[eo.endVariable] = path.Nodes[len(path.Nodes)-1]
                if eo.edgeVariable != "" {
                    record.Variables[eo.edgeVariable] = path.Edges
                }
                result.Records = append(result.Records, record)
            }
        }
    }
    
    return result, nil
}

3.4 Index Management

type GraphIndexManager struct {
    nodeIndexes     map[string]*NodeIndex
    edgeIndexes     map[string]*EdgeIndex
    fullTextIndexes map[string]*FullTextIndex
    compositeIndexes map[string]*CompositeIndex
}

type NodeIndex struct {
    label      string
    property   string
    indexType  IndexType
    btree      *BTree
    hashIndex  map[interface{}][]NodeID
}

func (gim *GraphIndexManager) CreateNodeIndex(label, property string, indexType IndexType) error {
    indexKey := fmt.Sprintf("%s.%s", label, property)
    
    index := &NodeIndex{
        label:     label,
        property:  property,
        indexType: indexType,
    }
    
    switch indexType {
    case IndexTypeBTree:
        index.btree = NewBTree(64)
    case IndexTypeHash:
        index.hashIndex = make(map[interface{}][]NodeID)
    }
    
    gim.nodeIndexes[indexKey] = index
    
    // Build the index for existing nodes
    return gim.buildNodeIndex(index)
}

func (gim *GraphIndexManager) buildNodeIndex(index *NodeIndex) error {
    // Scan all matching nodes to build the index
    nodes := gim.getAllNodesWithLabel(index.label)
    
    for _, node := range nodes {
        if value, exists := node.Properties[index.property]; exists {
            gim.addToNodeIndex(index, value, node.ID)
        }
    }
    
    return nil
}

func (gim *GraphIndexManager) addToNodeIndex(index *NodeIndex, value interface{}, nodeID NodeID) {
    switch index.indexType {
    case IndexTypeBTree:
        index.btree.Insert(value, nodeID)
    case IndexTypeHash:
        index.hashIndex[value] = append(index.hashIndex[value], nodeID)
    }
}

func (gim *GraphIndexManager) QueryNodeIndex(label, property string, value interface{}) ([]NodeID, error) {
    indexKey := fmt.Sprintf("%s.%s", label, property)
    index, exists := gim.nodeIndexes[indexKey]
    if !exists {
        return nil, ErrIndexNotFound
    }
    
    switch index.indexType {
    case IndexTypeBTree:
        return index.btree.Find(value), nil
    case IndexTypeHash:
        return index.hashIndex[value], nil
    }
    
    return nil, ErrUnsupportedIndexType
}

type FullTextIndex struct {
    label      string
    properties []string
    invertedIndex map[string][]NodeID
    analyzer   *TextAnalyzer
}

func (fti *FullTextIndex) AddDocument(nodeID NodeID, text string) {
    tokens := fti.analyzer.Analyze(text)
    
    for _, token := range tokens {
        fti.invertedIndex[token] = append(fti.invertedIndex[token], nodeID)
    }
}

func (fti *FullTextIndex) Search(query string) ([]NodeID, error) {
    tokens := fti.analyzer.Analyze(query)
    
    if len(tokens) == 0 {
        return []NodeID{}, nil
    }
    
    // Start from the postings of the first token
    result := fti.invertedIndex[tokens[0]]
    
    // Intersect with the postings of each remaining token
    for i := 1; i < len(tokens); i++ {
        tokenResults := fti.invertedIndex[tokens[i]]
        result = fti.intersect(result, tokenResults)
    }
    
    return result, nil
}

// intersect assumes both posting lists are sorted by NodeID.
func (fti *FullTextIndex) intersect(list1, list2 []NodeID) []NodeID {
    result := make([]NodeID, 0)
    i, j := 0, 0
    
    for i < len(list1) && j < len(list2) {
        if list1[i] == list2[j] {
            result = append(result, list1[i])
            i++
            j++
        } else if list1[i] < list2[j] {
            i++
        } else {
            j++
        }
    }
    
    return result
}
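The AND-semantics search above can be exercised end to end with a standalone sketch. The whitespace/lowercase tokenizer is a stand-in for a real analyzer, and postings are kept sorted so the merge-style intersection applies; all names are illustrative:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

type NodeID int

// FullTextIndex maps each token to a sorted posting list of node IDs.
type FullTextIndex struct {
	postings map[string][]NodeID
}

func NewFullTextIndex() *FullTextIndex {
	return &FullTextIndex{postings: map[string][]NodeID{}}
}

// tokenize is a trivial analyzer: lowercase + whitespace split.
func tokenize(text string) []string {
	return strings.Fields(strings.ToLower(text))
}

func (idx *FullTextIndex) Add(id NodeID, text string) {
	for _, tok := range tokenize(text) {
		idx.postings[tok] = append(idx.postings[tok], id)
		// Keep each posting list sorted so intersection can merge linearly.
		list := idx.postings[tok]
		sort.Slice(list, func(i, j int) bool { return list[i] < list[j] })
	}
}

// intersect merges two sorted posting lists, keeping common IDs.
func intersect(a, b []NodeID) []NodeID {
	out := []NodeID{}
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] == b[j]:
			out = append(out, a[i])
			i++
			j++
		case a[i] < b[j]:
			i++
		default:
			j++
		}
	}
	return out
}

// Search returns nodes whose text contains every query token (AND semantics).
func (idx *FullTextIndex) Search(query string) []NodeID {
	toks := tokenize(query)
	if len(toks) == 0 {
		return nil
	}
	result := idx.postings[toks[0]]
	for _, tok := range toks[1:] {
		result = intersect(result, idx.postings[tok])
	}
	return result
}

func main() {
	idx := NewFullTextIndex()
	idx.Add(1, "graph database engine")
	idx.Add(2, "graph traversal")
	idx.Add(3, "relational database")
	fmt.Println(idx.Search("graph database")) // [1]
}
```

Sorting on every insert is O(n log n) per token; a production index would append in ID order or batch-sort, but the query path is the same.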

4. Distributed Graph Processing

4.1 Graph Partitioning Strategy

type GraphPartitioner struct {
    strategy      PartitionStrategy
    partitionMap  map[NodeID]PartitionID
    edgeCuts      map[EdgeID][]PartitionID
    replicaFactor int
}

type PartitionStrategy interface {
    PartitionNode(node *Node) PartitionID
    PartitionEdge(edge *Edge) []PartitionID
}

type HashPartitioner struct {
    partitionCount int
    hasher         hash.Hash32
}

func (hp *HashPartitioner) PartitionNode(node *Node) PartitionID {
    hp.hasher.Reset()
    binary.Write(hp.hasher, binary.BigEndian, node.ID)
    hash := hp.hasher.Sum32()
    return PartitionID(hash % uint32(hp.partitionCount))
}

type EdgeCutPartitioner struct {
    partitionCount int
    nodePartitions map[NodeID]PartitionID
}

func (ecp *EdgeCutPartitioner) PartitionEdge(edge *Edge) []PartitionID {
    fromPartition := ecp.nodePartitions[edge.FromNode]
    toPartition := ecp.nodePartitions[edge.ToNode]
    
    if fromPartition == toPartition {
        return []PartitionID{fromPartition}
    }
    
    // A cross-partition edge must be stored on both partitions
    return []PartitionID{fromPartition, toPartition}
}

type DistributedGraphEngine struct {
    partitioner   *GraphPartitioner
    partitions    map[PartitionID]*GraphPartition
    coordinator   *QueryCoordinator
    communicator  *PartitionCommunicator
}

type GraphPartition struct {
    id            PartitionID
    localStorage  *GraphStorageEngine
    replicaNodes  map[NodeID]*Node
    ghostNodes    map[NodeID]*Node
    messageQueue  chan *PartitionMessage
}

func (dge *DistributedGraphEngine) ExecuteDistributedQuery(query *CypherQuery) (*QueryResult, error) {
    // 1. Analyze and decompose the query
    subQueries := dge.coordinator.DecomposeQuery(query)
    
    // 2. Execute sub-queries in parallel
    results := make(chan *PartialResult, len(subQueries))
    
    for partitionID, subQuery := range subQueries {
        go func(pid PartitionID, sq *SubQuery) {
            partition := dge.partitions[pid]
            result, err := partition.ExecuteSubQuery(sq)
            results <- &PartialResult{
                PartitionID: pid,
                Result:      result,
                Error:       err,
            }
        }(partitionID, subQuery)
    }
    
    // 3. Collect the partial results
    partialResults := make([]*PartialResult, 0, len(subQueries))
    for i := 0; i < len(subQueries); i++ {
        partialResults = append(partialResults, <-results)
    }
    
    // 4. Merge into the final result
    return dge.coordinator.MergeResults(partialResults)
}
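The fan-out/merge pattern behind `ExecuteDistributedQuery` can be isolated into a small runnable sketch. Partitions are simulated here with in-memory maps, and all names (`Partition`, `scatterGather`) are hypothetical:

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Partition simulates one shard's local store.
type Partition struct {
	id   int
	data map[string]int
}

func (p *Partition) Query(key string) (int, bool) {
	v, ok := p.data[key]
	return v, ok
}

type partial struct {
	partitionID int
	value       int
	found       bool
}

// scatterGather fans a lookup out to every partition in parallel,
// then collects and merges the partial results.
func scatterGather(parts []*Partition, key string) []int {
	results := make(chan partial, len(parts)) // buffered: sends never block
	var wg sync.WaitGroup
	for _, p := range parts {
		wg.Add(1)
		go func(p *Partition) { // one goroutine per partition
			defer wg.Done()
			v, ok := p.Query(key)
			results <- partial{partitionID: p.id, value: v, found: ok}
		}(p)
	}
	wg.Wait()
	close(results)

	// Merge phase: keep only hits, return in deterministic order.
	var merged []int
	for r := range results {
		if r.found {
			merged = append(merged, r.value)
		}
	}
	sort.Ints(merged)
	return merged
}

func main() {
	parts := []*Partition{
		{id: 0, data: map[string]int{"alice": 10}},
		{id: 1, data: map[string]int{"bob": 20}},
		{id: 2, data: map[string]int{"alice": 30}},
	}
	fmt.Println(scatterGather(parts, "alice")) // [10 30]
}
```

The buffered channel plus `WaitGroup` gives the same "launch all, then collect all" shape as the article's coordinator, without per-partition RPC details.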

4.2 Cross-Partition Query Processing

type QueryCoordinator struct {
    partitioner   *GraphPartitioner
    optimizer     *DistributedQueryOptimizer
    communicator  *PartitionCommunicator
}

func (qc *QueryCoordinator) DecomposeQuery(query *CypherQuery) map[PartitionID]*SubQuery {
    subQueries := make(map[PartitionID]*SubQuery)
    
    // Determine which partitions the query touches
    involvedPartitions := qc.analyzeQueryPartitions(query)
    
    for _, partitionID := range involvedPartitions {
        subQuery := &SubQuery{
            OriginalQuery: query,
            PartitionID:   partitionID,
            LocalOps:      make([]*QueryOperation, 0),
            RemoteOps:     make([]*QueryOperation, 0),
        }
        
        // Generate local operations for this partition
        qc.generateLocalOperations(subQuery, query)
        
        subQueries[partitionID] = subQuery
    }
    
    return subQueries
}

type PartitionCommunicator struct {
    connections map[PartitionID]*grpc.ClientConn
    messagePool sync.Pool
}

func (pc *PartitionCommunicator) SendMessage(targetPartition PartitionID, message *PartitionMessage) error {
    conn, exists := pc.connections[targetPartition]
    if !exists {
        return ErrPartitionNotConnected
    }
    
    client := NewGraphServiceClient(conn)
    ctx, cancel := context.WithTimeout(context.Background(), time.Second*30)
    defer cancel()
    
    _, err := client.ProcessMessage(ctx, message)
    return err
}

func (pc *PartitionCommunicator) BroadcastMessage(message *PartitionMessage) error {
    errors := make([]error, 0)
    
    for partitionID := range pc.connections {
        if err := pc.SendMessage(partitionID, message); err != nil {
            errors = append(errors, err)
        }
    }
    
    if len(errors) > 0 {
        return fmt.Errorf("broadcast failed: %v", errors)
    }
    
    return nil
}

By combining a purpose-built graph storage model, efficient traversal algorithms, and distributed processing, a graph database provides strong support for complex graph queries and analytics.


🎯 Scenario

You pick up your phone to use a service backed by this graph database. Behind that seemingly simple action, the system faces three core challenges:

  • Challenge 1 (high concurrency): how do you keep latency low at millions of QPS?
  • Challenge 2 (high availability): how do you keep the service up when nodes fail?
  • Challenge 3 (data consistency): how do you keep data correct in a distributed environment?

📈 Capacity Estimation

Assume 10 million DAU with 50 requests per user per day.

| Metric | Value |
| --- | --- |
| Total data volume | 10 TB+ |
| Daily write volume | ~100 GB |
| Write TPS | ~50K/s |
| Read QPS | ~200K/s |
| P99 read latency | < 10 ms |
| Node count | 10-50 |
| Replication factor | 3 |
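A quick sanity check of the estimate: 10M DAU times 50 requests/day is 500M requests/day, or roughly 5.8K average QPS. The higher read/write figures in the table imply a peak factor on top of the average; the factor of 10 below is an assumed rule of thumb, not a figure from the article:

```go
package main

import "fmt"

// estimateQPS derives average and peak QPS from daily totals.
// The peak factor (peak ≈ 10x average) is an assumed rule of thumb.
func estimateQPS(dau, reqPerUser, peakFactor int) (daily, avg, peak int) {
	daily = dau * reqPerUser
	avg = daily / 86_400 // seconds per day
	peak = avg * peakFactor
	return
}

func main() {
	daily, avg, peak := estimateQPS(10_000_000, 50, 10)
	fmt.Println(daily) // 500000000 requests/day
	fmt.Println(avg)   // 5787 average QPS
	fmt.Println(peak)  // 57870 peak QPS
}
```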

❓ Common Interview Questions

Q1: What are the core design principles of a graph database?

See the architecture section above. The core principles are high availability (automatic failure recovery), high performance (low latency, high throughput), scalability (horizontal scaling), and consistency (data-correctness guarantees). In an interview, expand on each with concrete scenarios.

Q2: What are the main challenges for a graph database at large scale?

  1. Performance bottlenecks: as data and request volume grow, a single node can no longer keep up
  2. Consistency: guaranteeing data consistency in a distributed environment
  3. Failure recovery: automatic failover and data recovery when nodes fail
  4. Operational complexity: cluster management, monitoring, and upgrades

Q3: How do you keep a graph database highly available?

  1. Multi-replica redundancy (at least 3 replicas)
  2. Automatic failure detection and failover (heartbeats + leader election)
  3. Data persistence and backups
  4. Rate limiting and graceful degradation (to prevent cascading failures)
  5. Multi-datacenter / active-active deployment

Q4: What are the key levers for performance optimization?

  1. Caching (avoid repeated computation and IO)
  2. Asynchronous processing (move non-critical work off the hot path)
  3. Batching (fewer network round trips)
  4. Data sharding (parallel processing)
  5. Connection pool reuse

Q5: How does a graph database compare with alternative approaches?

See the comparison table below. Selection depends on team skill set, data scale, latency requirements, consistency needs, and operational cost. There is no silver bullet; weigh the trade-offs against your business scenario.



| Option | Complexity | Cost | Applicability |
| --- | --- | --- | --- |
| Option 1 | Simple implementation | Low | Small scale |
| Option 2 | Moderate complexity | Medium | Medium scale |
| Option 3 | High complexity ⭐ recommended | High | Large-scale production |

🚀 Architecture Evolution Path

Phase 1: Single-Machine MVP (< 100K users)

  • Monolithic application + single-machine database
  • Prioritize feature validation and fast iteration
  • Fits: early-stage product validation

Phase 2: Basic Distributed (100K-1M users)

  • Horizontally scaled application layer (stateless services + load balancing)
  • Database primary/replica split (read/write separation)
  • Redis cache for hot data
  • Fits: the business growth stage

Phase 3: Production-Grade High Availability (> 1M users)

  • Microservice split with independent deployment and scaling
  • Database sharding (partitioned by business dimension)
  • Message queues to decouple asynchronous flows
  • Multi-datacenter deployment with geographic disaster recovery
  • End-to-end monitoring + automated operations

✅ Architecture Design Checklist

| Check Item | Notes |
| --- | --- |
| High availability | Multi-replica deployment, automatic failover, 99.9% SLA |
| Scalability | Stateless services scale horizontally; data layer is sharded |
| Data consistency | Strong consistency on the critical path, eventual elsewhere |
| Security | Authentication/authorization + encryption + audit logs |
| Monitoring & alerting | Metrics + Logging + Tracing (the three pillars) |
| Disaster recovery & backup | Multi-datacenter deployment, regular backups, RPO < 1 min |
| Performance optimization | Multi-level caching + async processing + connection pools |
| Canary releases | Per-user/per-region canaries with fast rollback |

⚖️ Key Trade-off Analysis

🔴 Trade-off 1: Consistency vs. Availability

  • Strong consistency (CP): for scenarios that cannot tolerate errors, such as financial transactions
  • High availability (AP): for scenarios that tolerate brief inconsistency, such as social feeds
  • This system's choice: strong consistency on the critical path, eventual consistency elsewhere

🔴 Trade-off 2: Synchronous vs. Asynchronous

  • Synchronous processing: low latency but limited throughput; for core interactive paths
  • Asynchronous processing: high throughput but higher latency; for background computation
  • This system's choice: synchronous on the critical path, asynchronous off it