[剪映小助手 Source Code Deep Dive] 11: Text and Subtitle System

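Chapter 11: Text and Subtitle System

11.1 Overview of the Text and Subtitle System

Text and subtitles are an essential part of any video editor: they determine how accessible the content is and directly shape the viewing experience. The text and subtitle system in 剪映小助手 is modular and supports multiple languages, multiple styles, and multiple animation effects for text.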

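11.1.1 System Architecture

The core architecture of the text and subtitle system follows these design principles:

Layered architecture

  • Data layer: stores and manages text content
  • Business layer: handles creating, editing, and styling text
  • Rendering layer: performs the final rendering and output of text
  • Interface layer: exposes a unified API to external callers

Plugin-based extension

  • Custom text effect plugins
  • Extension interfaces for text styles
  • Third-party font integration
  • Custom animation effects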
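
A plugin mechanism of this kind is often built around a small registry. The following is a minimal sketch under stated assumptions, not the project's actual API: `EffectRegistry`, `register`, and `apply` are hypothetical names, and an effect is assumed to be any callable that takes a string and returns a string.

```python
from typing import Callable, Dict

# Hypothetical type: an effect transforms text and returns the result.
TextEffect = Callable[[str], str]


class EffectRegistry:
    """Illustrative registry that maps effect names to callables."""

    def __init__(self) -> None:
        self._effects: Dict[str, TextEffect] = {}

    def register(self, name: str) -> Callable[[TextEffect], TextEffect]:
        """Return a decorator that registers an effect under `name`."""
        def decorator(func: TextEffect) -> TextEffect:
            if name in self._effects:
                raise ValueError(f"effect already registered: {name}")
            self._effects[name] = func
            return func
        return decorator

    def apply(self, name: str, text: str) -> str:
        """Apply a registered effect; unknown names leave the text unchanged."""
        effect = self._effects.get(name)
        return effect(text) if effect else text


registry = EffectRegistry()


@registry.register("uppercase")
def uppercase(text: str) -> str:
    """Example plugin: convert the text to upper case."""
    return text.upper()
```

Registering through a decorator keeps third-party effects decoupled from the core: a plugin only needs to import the registry, not the rendering code.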

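Performance optimization

  • Text caching to avoid repeated rendering
  • Asynchronous processing to keep the UI responsive
  • Memory pooling to make better use of resources
  • GPU acceleration for processing large volumes of text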
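
The caching strategy can be illustrated with a small least-recently-used cache. This is a sketch only: `RenderCache` and `get_or_render` are hypothetical names, and the cached value stands in for a rendered image.

```python
from collections import OrderedDict
from typing import Callable, Hashable


class RenderCache:
    """Illustrative LRU cache for rendered text, keyed by content and style."""

    def __init__(self, capacity: int = 64) -> None:
        self.capacity = capacity
        self.misses = 0
        self._entries: "OrderedDict[Hashable, object]" = OrderedDict()

    def get_or_render(self, key: Hashable, render: Callable[[], object]) -> object:
        """Return the cached value for `key`, rendering it on a miss."""
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        self.misses += 1
        value = render()
        self._entries[key] = value
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return value
```

The key should contain every property that affects the rendered pixels (content, font, colors, frame size); a key that omits one of them returns stale images.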

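11.1.2 Core Features

The text and subtitle system offers the following features for video production:

Multilingual support

  • Full support for the Unicode character set
  • Right-to-left text rendering (Arabic, Hebrew)
  • Complex text layout (Indic scripts, Thai, and others)
  • Font fallback so that every character can be displayed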
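
Font fallback can be sketched as splitting the text into runs, each assigned to the first font that covers its characters. In this sketch the coverage sets are hypothetical stand-ins for a real glyph-coverage lookup, and `split_by_font` is an illustrative name rather than a function from the project.

```python
from typing import List, Set, Tuple


def split_by_font(text: str, fonts: List[Tuple[str, Set[str]]],
                  last_resort: str = "fallback") -> List[Tuple[str, str]]:
    """Split text into (font_name, run) pairs.

    Each character goes to the first font in `fonts` whose coverage set
    contains it; characters no font covers go to `last_resort`.
    """
    runs: List[Tuple[str, str]] = []
    for ch in text:
        chosen = next((name for name, covered in fonts if ch in covered), last_resort)
        if runs and runs[-1][0] == chosen:
            # Extend the current run while the chosen font stays the same
            runs[-1] = (chosen, runs[-1][1] + ch)
        else:
            runs.append((chosen, ch))
    return runs


# Hypothetical coverage data for two fonts
fonts = [("Latin", set("abc ")), ("CJK", set("字幕"))]
runs = split_by_font("ab字幕c", fonts)
```

Here `runs` holds three entries: a Latin run, a CJK run, and a final Latin run, each of which can then be drawn with its own font.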

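Rich style system

  • Control over font family, size, and color
  • Text decoration such as bold, italic, and underline
  • Adjustable letter spacing, line spacing, and paragraph spacing
  • Text alignment (left, center, right, justified)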
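
Justified alignment differs from the other three modes because the spare width is distributed between words rather than placed at one side. A minimal sketch, assuming word widths have already been measured in pixels (`justify_gaps` is a hypothetical helper, not part of the project):

```python
from typing import List


def justify_gaps(word_widths: List[int], space_width: int, line_width: int) -> List[int]:
    """Return the gap in pixels after each word except the last.

    The gaps are chosen so that the words plus the gaps fill `line_width`.
    """
    gaps = len(word_widths) - 1
    if gaps <= 0:
        return []
    extra = line_width - sum(word_widths) - gaps * space_width
    if extra <= 0:
        # The line is already full: fall back to normal spacing
        return [space_width] * gaps
    base, remainder = divmod(extra, gaps)
    # Leftover pixels go to the leftmost gaps, one pixel each
    return [space_width + base + (1 if i < remainder else 0) for i in range(gaps)]
```

For three words of widths 30, 40, and 20 with a 5-pixel space on a 110-pixel line, each of the two gaps becomes 10 pixels. The last line of a paragraph is normally left unjustified.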

Visual effects

  • Text stroke and border effects
  • Shadow and glow effects
  • Gradient and pattern fills
  • Opacity control and blend modes
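
Opacity control ultimately comes down to compositing a semi-transparent text pixel over the frame. The standard "source over" rule for non-premultiplied colors can be sketched as follows; `source_over` is an illustrative helper, and channels are assumed to be floats in the range 0 to 1.

```python
from typing import Tuple

RGBA = Tuple[float, float, float, float]  # channels in [0, 1], non-premultiplied


def source_over(src: RGBA, dst: RGBA) -> RGBA:
    """Composite `src` over `dst` using the source-over rule."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    out_a = sa + da * (1.0 - sa)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)  # both pixels fully transparent

    def blend(s: float, d: float) -> float:
        # Weight each color by its effective alpha, then normalize
        return (s * sa + d * da * (1.0 - sa)) / out_a

    return (blend(sr, dr), blend(sg, dg), blend(sb, db), out_a)
```

White text at 50% opacity over an opaque black frame yields mid-gray, and a fully transparent source leaves the destination unchanged.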

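Animation and interaction

  • Typewriter and fade-in/fade-out effects
  • Path animation and morphing animation
  • Scrolling subtitles and bullet-comment (danmaku) effects
  • Interactive text responses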
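
A typewriter effect reveals the text character by character as the animation progresses. A minimal sketch, assuming progress is a value between 0 and 1 (`typewriter_visible_text` is a hypothetical helper):

```python
def typewriter_visible_text(content: str, progress: float) -> str:
    """Return the prefix of `content` that is visible at the given progress."""
    progress = max(0.0, min(1.0, progress))  # clamp to the valid range
    visible = int(len(content) * progress)   # number of characters revealed so far
    return content[:visible]


half = typewriter_visible_text("Hello 字幕", 0.5)
```

This slices by Unicode code point, so combining marks or emoji sequences may be revealed in parts; a production version would split on grapheme clusters instead.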

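11.2 Text Segment Architecture

11.2.1 The TextSegment Base Class

A text segment is the core component of the subtitle system; it manages and renders a single unit of text. The full TextSegment implementation follows: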

from typing import Dict, List, Optional, Tuple, Any, Union
from dataclasses import dataclass, field
from enum import Enum
import time
import re
from PIL import Image, ImageDraw, ImageFont, ImageColor
import numpy as np
import cv2

class TextAlignment(Enum):
    """Text alignment."""
    LEFT = "left"
    CENTER = "center"
    RIGHT = "right"
    JUSTIFY = "justify"

class TextDirection(Enum):
    """Text direction."""
    LTR = "ltr"  # left to right
    RTL = "rtl"  # right to left
    TTB = "ttb"  # top to bottom

class TextOverflow(Enum):
    """How overflowing text is handled."""
    CLIP = "clip"           # clip
    ELLIPSIS = "ellipsis"   # truncate with an ellipsis
    WRAP = "wrap"          # wrap onto new lines
    SCROLL = "scroll"       # scroll

@dataclass
class TextMetrics:
    """Text measurement results."""
    width: int = 0
    height: int = 0
    ascent: int = 0
    descent: int = 0
    line_height: int = 0
    character_count: int = 0
    line_count: int = 1
    bounds: Tuple[int, int, int, int] = (0, 0, 0, 0)  # x1, y1, x2, y2

@dataclass
class TextSegment:
    """A single text segment."""
    
    # Basic attributes
    id: str = ""
    content: str = ""
    start_time: float = 0.0
    end_time: float = 0.0
    layer: int = 0
    visible: bool = True
    locked: bool = False
    
    # Position and size
    x: float = 0.0  # relative position (0-1)
    y: float = 0.0  # relative position (0-1)
    width: float = 1.0  # relative width (0-1)
    height: float = 0.1  # relative height (0-1)
    rotation: float = 0.0  # rotation angle in degrees
    scale_x: float = 1.0
    scale_y: float = 1.0
    
    # Text attributes
    font_family: str = "Arial"
    font_size: int = 24
    font_weight: str = "normal"  # normal, bold, light
    font_style: str = "normal"   # normal, italic, oblique
    text_color: str = "#FFFFFF"
    background_color: Optional[str] = None
    
    # Layout attributes
    alignment: TextAlignment = TextAlignment.LEFT
    direction: TextDirection = TextDirection.LTR
    line_spacing: float = 1.2
    letter_spacing: float = 0.0
    word_spacing: float = 0.0
    
    # Stroke and decoration
    stroke_width: int = 0
    stroke_color: str = "#000000"
    underline: bool = False
    strikethrough: bool = False
    
    # Shadow effect
    shadow_enabled: bool = False
    shadow_color: str = "#000000"
    shadow_offset_x: int = 2
    shadow_offset_y: int = 2
    shadow_blur: int = 4
    shadow_opacity: float = 0.5
    
    # Animation attributes
    animation_type: str = "none"  # none, typewriter, fade, slide, etc.
    animation_duration: float = 1.0
    animation_delay: float = 0.0
    animation_easing: str = "linear"
    
    # Internal state: excluded from the constructor, repr, and comparisons
    _cached_image: Optional[Image.Image] = field(
        default=None, init=False, repr=False, compare=False)
    _cached_font: Optional[ImageFont.FreeTypeFont] = field(
        default=None, init=False, repr=False, compare=False)
    _text_metrics: Optional[TextMetrics] = field(
        default=None, init=False, repr=False, compare=False)
    _last_render_time: float = field(
        default=0.0, init=False, repr=False, compare=False)
    _render_cache_key: str = field(
        default="", init=False, repr=False, compare=False)
    
    def __post_init__(self):
        """Post-initialization: assign an id if missing and validate the fields."""
        if not self.id:
            self.id = f"text_{int(time.time() * 1000)}"
        self._validate_properties()
    
    def _validate_properties(self) -> None:
        """Validate the fields, replacing invalid values with safe defaults."""
        # Validate color formats
        if not self._is_valid_color(self.text_color):
            self.text_color = "#FFFFFF"
        
        if self.background_color and not self._is_valid_color(self.background_color):
            self.background_color = None
        
        # Clamp the font size to 1-200
        if self.font_size < 1:
            self.font_size = 1
        elif self.font_size > 200:
            self.font_size = 200
        
        # Clamp position and size to their valid ranges
        self.x = max(0.0, min(1.0, self.x))
        self.y = max(0.0, min(1.0, self.y))
        self.width = max(0.01, min(1.0, self.width))
        self.height = max(0.01, min(1.0, self.height))
    
    def _is_valid_color(self, color: str) -> bool:
        """Return True if Pillow can parse the color string."""
        try:
            ImageColor.getrgb(color)
            return True
        except ValueError:
            return False
    
    def is_active_at_time(self, current_time: float) -> bool:
        """Return True if the segment is visible at the given time."""
        return (self.visible and 
                self.start_time <= current_time <= self.end_time)
    
    def get_duration(self) -> float:
        """Return the duration of the segment."""
        return self.end_time - self.start_time
    
    def set_content(self, content: str) -> None:
        """Set the text content."""
        if self.content != content:
            self.content = content
            self._invalidate_cache()
    
    def set_position(self, x: float, y: float) -> None:
        """Set the position; values are clamped to 0-1."""
        if self.x != x or self.y != y:
            self.x = max(0.0, min(1.0, x))
            self.y = max(0.0, min(1.0, y))
            self._invalidate_cache()
    
    def set_size(self, width: float, height: float) -> None:
        """Set the size; values are clamped to 0.01-1."""
        if self.width != width or self.height != height:
            self.width = max(0.01, min(1.0, width))
            self.height = max(0.01, min(1.0, height))
            self._invalidate_cache()
    
    def set_font(self, family: str, size: int, weight: str = "normal", 
                 style: str = "normal") -> None:
        """Set the font."""
        if (self.font_family != family or self.font_size != size or 
            self.font_weight != weight or self.font_style != style):
            self.font_family = family
            self.font_size = size
            self.font_weight = weight
            self.font_style = style
            self._invalidate_cache()
    
    def set_colors(self, text_color: str, background_color: Optional[str] = None) -> None:
        """Set the text and background colors; invalid colors are ignored."""
        if (self.text_color != text_color or 
            self.background_color != background_color):
            if self._is_valid_color(text_color):
                self.text_color = text_color
            if background_color is None or self._is_valid_color(background_color):
                self.background_color = background_color
            self._invalidate_cache()
    
    def set_shadow(self, enabled: bool, color: str = "#000000", 
                  offset_x: int = 2, offset_y: int = 2, blur: int = 4,
                  opacity: float = 0.5) -> None:
        """Set the shadow effect."""
        self.shadow_enabled = enabled
        if enabled:
            if self._is_valid_color(color):
                self.shadow_color = color
            self.shadow_offset_x = offset_x
            self.shadow_offset_y = offset_y
            self.shadow_blur = blur
            self.shadow_opacity = max(0.0, min(1.0, opacity))
        self._invalidate_cache()
    
    def set_stroke(self, width: int, color: str = "#000000") -> None:
        """Set the stroke effect."""
        self.stroke_width = max(0, width)
        if self.stroke_width > 0 and self._is_valid_color(color):
            self.stroke_color = color
        self._invalidate_cache()
    
    def set_animation(self, animation_type: str, duration: float = 1.0,
                     delay: float = 0.0, easing: str = "linear") -> None:
        """Set the animation."""
        self.animation_type = animation_type
        self.animation_duration = max(0.1, duration)
        self.animation_delay = max(0.0, delay)
        self.animation_easing = easing
        # The animation changes the rendered pixels, so drop any cached image
        self._invalidate_cache()
    
    def _invalidate_cache(self) -> None:
        """Invalidate all cached render state."""
        self._cached_image = None
        self._cached_font = None
        self._text_metrics = None
        self._render_cache_key = ""
    
    def _get_cache_key(self) -> str:
        """Return the cache key, building it on first use."""
        if not self._render_cache_key:
            key_parts = [
                self.content,
                self.font_family,
                str(self.font_size),
                self.font_weight,
                self.font_style,
                self.text_color,
                str(self.background_color),
                self.alignment.value,
                str(self.stroke_width),
                self.stroke_color,
                str(self.shadow_enabled),
                str(self.scale_x),
                str(self.scale_y)
            ]
            self._render_cache_key = "|".join(key_parts)
        return self._render_cache_key
    
    def render(self, frame_width: int, frame_height: int, 
               current_time: float = 0.0) -> Image.Image:
        """Render the text segment at the given time."""
        if not self.is_active_at_time(current_time):
            return Image.new('RGBA', (frame_width, frame_height), (0, 0, 0, 0))
        
        # Compute the animation progress
        animation_progress = self._calculate_animation_progress(current_time)
        
        # Reuse the cached image only when the same timestamp is rendered again;
        # the setters clear it through _invalidate_cache()
        if (self._cached_image is not None and 
            self._last_render_time == current_time):
            return self._cached_image.copy()
        
        # Create the text image
        text_image = self._create_text_image(frame_width, frame_height)
        
        # Apply the animation whenever one is configured, including at progress 0,
        # so that a fade starts fully transparent instead of fully visible
        if self.animation_type != "none":
            text_image = self._apply_animation(text_image, animation_progress)
        
        # Apply scale and rotation
        text_image = self._apply_transforms(text_image)
        
        # Cache the result
        self._cached_image = text_image.copy()
        self._last_render_time = current_time
        
        return text_image
    
    def _create_text_image(self, frame_width: int, frame_height: int) -> Image.Image:
        """Create the image that holds the rendered text."""
        # Convert the relative size to pixels
        pixel_width = int(self.width * frame_width)
        pixel_height = int(self.height * frame_height)
        
        # Create an image with a transparent background
        text_image = Image.new('RGBA', (pixel_width, pixel_height), (0, 0, 0, 0))
        draw = ImageDraw.Draw(text_image)
        
        # Get the font
        font = self._get_font()
        
        # Lay out the text
        lines = self._wrap_text(self.content, draw, font, pixel_width)
        
        # Compute the text position
        text_metrics = self._calculate_text_metrics(lines, font, draw)
        y_position = self._calculate_vertical_position(text_metrics, pixel_height)
        
        # Draw the background
        if self.background_color:
            self._draw_background(draw, pixel_width, pixel_height, text_metrics)
        
        # Draw the text line by line
        for i, line in enumerate(lines):
            line_y = y_position + i * text_metrics.line_height
            self._draw_text_line(draw, line, font, line_y, pixel_width, text_metrics)
        
        return text_image
    
    def _get_font(self) -> ImageFont.FreeTypeFont:
        """Return the font object, loading and caching it on first use."""
        if self._cached_font is None:
            try:
                # Try to load the requested font
                font_path = self._get_font_path(self.font_family)
                self._cached_font = ImageFont.truetype(font_path, self.font_size)
            except Exception:
                # Fall back to the default font
                self._cached_font = ImageFont.load_default()
        return self._cached_font
    
    def _get_font_path(self, font_family: str) -> str:
        """Return the path of the font file for the given family."""
        # Map each font family to candidate file names
        font_mappings = {
            "Arial": ["arial.ttf", "Arial.ttf", "LiberationSans-Regular.ttf"],
            "Times New Roman": ["times.ttf", "Times New Roman.ttf", "LiberationSerif-Regular.ttf"],
            "Helvetica": ["helvetica.ttf", "Helvetica.ttf", "NimbusSans-Regular.ttf"],
            "Courier New": ["cour.ttf", "Courier New.ttf", "LiberationMono-Regular.ttf"],
            "Verdana": ["verdana.ttf", "Verdana.ttf"],
            "Georgia": ["georgia.ttf", "Georgia.ttf"],
            "Impact": ["impact.ttf", "Impact.ttf"]
        }
        
        # Get the candidate file names; unknown families are tried as-is
        candidates = font_mappings.get(font_family, [font_family])
        
        # Look for the font file in the common system font directories
        import os
        system_font_dirs = [
            "/usr/share/fonts/truetype",
            "/usr/share/fonts/TTF",
            "/System/Library/Fonts",
            "/Library/Fonts",
            "C:/Windows/Fonts",
            "C:/WINNT/Fonts"
        ]
        
        for candidate in candidates:
            for font_dir in system_font_dirs:
                font_path = os.path.join(font_dir, candidate)
                if os.path.exists(font_path):
                    return font_path
        
        # Nothing found: return a default name and let the loader resolve it
        return "arial.ttf"
    
    def _wrap_text(self, text: str, draw: ImageDraw.Draw, 
                  font: ImageFont.FreeTypeFont, max_width: int) -> List[str]:
        """Wrap text to max_width, breaking only at whitespace.

        Text without spaces (for example Chinese) is therefore never wrapped.
        """
        if not text:
            return [""]
        
        lines = []
        paragraphs = text.split('\n')
        
        for paragraph in paragraphs:
            words = paragraph.split()
            if not words:
                # Keep empty or whitespace-only paragraphs as blank lines
                lines.append("")
                continue
            
            current_line = words[0]
            
            for word in words[1:]:
                test_line = current_line + " " + word
                bbox = draw.textbbox((0, 0), test_line, font=font)
                line_width = bbox[2] - bbox[0]
                
                if line_width <= max_width:
                    current_line = test_line
                else:
                    lines.append(current_line)
                    current_line = word
            
            lines.append(current_line)
        
        return lines
    
    def _calculate_text_metrics(self, lines: List[str], 
                               font: ImageFont.FreeTypeFont,
                               draw: ImageDraw.Draw) -> TextMetrics:
        """Compute and cache the text metrics."""
        if self._text_metrics is None:
            metrics = TextMetrics()
            metrics.line_count = len(lines)
            metrics.character_count = sum(len(line) for line in lines)
            
            # Compute the maximum line width and the total height
            max_width = 0
            total_height = 0
            
            for line in lines:
                bbox = draw.textbbox((0, 0), line, font=font)
                line_width = bbox[2] - bbox[0]
                line_height = bbox[3] - bbox[1]
                
                max_width = max(max_width, line_width)
                total_height += int(line_height * self.line_spacing)
            
            # Get the font's ascent and descent
            ascent, descent = font.getmetrics()
            
            metrics.width = max_width
            metrics.height = total_height or (ascent + descent)
            metrics.ascent = ascent
            metrics.descent = descent
            metrics.line_height = int((ascent + descent) * self.line_spacing)
            metrics.bounds = (0, 0, max_width, total_height)
            
            self._text_metrics = metrics
        
        return self._text_metrics
    
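    def _calculate_vertical_position(self, metrics: TextMetrics, 
                                   container_height: int) -> int:
        """Compute the starting Y position.

        Note: this reuses the horizontal alignment value, so CENTER centers the
        text vertically and RIGHT places it at the bottom of the box.
        """
        # Derive the starting Y position from the alignment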
        if self.alignment == TextAlignment.CENTER:
            return max(0, (container_height - metrics.height) // 2)
        elif self.alignment == TextAlignment.RIGHT:
            return max(0, container_height - metrics.height)
        else:  # LEFT or JUSTIFY
            return 0
    
    def _draw_background(self, draw: ImageDraw.Draw, width: int, height: int,
                        metrics: TextMetrics) -> None:
        """Fill the whole text box with the background color."""
        if self.background_color:
            color = ImageColor.getrgb(self.background_color)
            draw.rectangle([0, 0, width, height], fill=color)
    
    def _draw_text_line(self, draw: ImageDraw.Draw, text: str, 
                       font: ImageFont.FreeTypeFont, y: int, 
                       container_width: int, metrics: TextMetrics) -> None:
        """Draw a single line of text."""
        # Compute the horizontal position
        x = self._calculate_horizontal_position(text, font, draw, 
                                              container_width, metrics)
        
        # Draw the shadow
        if self.shadow_enabled:
            self._draw_shadow(draw, text, x, y, font)
        
        # Draw the stroke
        if self.stroke_width > 0:
            self._draw_stroke(draw, text, x, y, font)
        
        # Draw the text itself
        text_color = ImageColor.getrgb(self.text_color)
        draw.text((x, y), text, font=font, fill=text_color)
        
        # Draw underline and strikethrough
        if self.underline or self.strikethrough:
            self._draw_decorations(draw, text, x, y, font, metrics)
    
    def _calculate_horizontal_position(self, text: str, font: ImageFont.FreeTypeFont,
                                     draw: ImageDraw.Draw, container_width: int,
                                     metrics: TextMetrics) -> int:
        """Compute the starting X position from the alignment."""
        bbox = draw.textbbox((0, 0), text, font=font)
        text_width = bbox[2] - bbox[0]
        
        if self.alignment == TextAlignment.CENTER:
            return max(0, (container_width - text_width) // 2)
        elif self.alignment == TextAlignment.RIGHT:
            return max(0, container_width - text_width)
        else:  # LEFT or JUSTIFY
            return 0
    
    def _draw_shadow(self, draw: ImageDraw.Draw, text: str, x: int, y: int,
                    font: ImageFont.FreeTypeFont) -> None:
        """Draw the shadow as an offset, semi-transparent copy of the text.

        Simplified: shadow_blur is not applied here. A blurred shadow needs a
        separate layer that is blurred and then composited onto the target image.
        """
        shadow_color = ImageColor.getrgb(self.shadow_color)[:3]
        shadow_alpha = int(self.shadow_opacity * 255)
        
        # Compute the shadow position
        shadow_x = x + self.shadow_offset_x
        shadow_y = y + self.shadow_offset_y
        
        # Draw the shadow text with the configured opacity
        draw.text((shadow_x, shadow_y), text, font=font,
                  fill=shadow_color + (shadow_alpha,))
    
    def _draw_stroke(self, draw: ImageDraw.Draw, text: str, x: int, y: int,
                    font: ImageFont.FreeTypeFont) -> None:
        """Draw the stroke."""
        stroke_color = ImageColor.getrgb(self.stroke_color)
        
        # Draw the text at every offset within the stroke radius
        for dx in range(-self.stroke_width, self.stroke_width + 1):
            for dy in range(-self.stroke_width, self.stroke_width + 1):
                if abs(dx) + abs(dy) <= self.stroke_width:
                    draw.text((x + dx, y + dy), text, font=font, fill=stroke_color)
    
    def _draw_decorations(self, draw: ImageDraw.Draw, text: str, x: int, y: int,
                         font: ImageFont.FreeTypeFont, metrics: TextMetrics) -> None:
        """Draw underline and strikethrough."""
        bbox = draw.textbbox((x, y), text, font=font)
        
        if self.underline:
            # Underline: 2 pixels below the text bounding box
            line_y = bbox[3] + 2
            draw.line([(bbox[0], line_y), (bbox[2], line_y)], 
                     fill=ImageColor.getrgb(self.text_color), width=1)
        
        if self.strikethrough:
            # Strikethrough: vertical middle of the text bounding box
            line_y = (bbox[1] + bbox[3]) // 2
            draw.line([(bbox[0], line_y), (bbox[2], line_y)], 
                     fill=ImageColor.getrgb(self.text_color), width=1)
    
    def _calculate_animation_progress(self, current_time: float) -> float:
        """Return the eased animation progress in the range 0-1."""
        if self.animation_type == "none":
            return 0.0
        
        # Compute the animation time window
        animation_start = self.start_time + self.animation_delay
        animation_end = animation_start + self.animation_duration
        
        if current_time < animation_start:
            return 0.0
        elif current_time > animation_end:
            return 1.0
        else:
            progress = (current_time - animation_start) / self.animation_duration
            return self._apply_easing(progress, self.animation_easing)
    
    def _apply_easing(self, progress: float, easing: str) -> float:
        """Apply the easing function; unknown names behave like linear."""
        if easing == "linear":
            return progress
        elif easing == "ease_in":
            return progress * progress
        elif easing == "ease_out":
            return 1 - (1 - progress) * (1 - progress)
        elif easing == "ease_in_out":
            if progress < 0.5:
                return 2 * progress * progress
            else:
                return 1 - 2 * (1 - progress) * (1 - progress)
        else:
            return progress
    
    def _apply_animation(self, image: Image.Image, progress: float) -> Image.Image:
        """Apply the configured animation effect."""
        if self.animation_type == "fade":
            return self._apply_fade_animation(image, progress)
        elif self.animation_type == "typewriter":
            return self._apply_typewriter_animation(image, progress)
        elif self.animation_type == "slide":
            return self._apply_slide_animation(image, progress)
        else:
            return image
    
    def _apply_fade_animation(self, image: Image.Image, progress: float) -> Image.Image:
        """Apply the fade-in animation."""
        # Scale the alpha channel by the progress
        alpha_array = np.array(image)[:, :, 3]
        alpha_array = (alpha_array * progress).astype(np.uint8)
        
        # Write the new alpha channel back
        result_array = np.array(image)
        result_array[:, :, 3] = alpha_array
        
        return Image.fromarray(result_array, 'RGBA')
    
    def _apply_typewriter_animation(self, image: Image.Image, progress: float) -> Image.Image:
        """Apply the typewriter animation."""
        # Simplified: this fades the whole image in, exactly like the fade
        # animation; a real typewriter effect would reveal one character at a time
        alpha_array = np.array(image)[:, :, 3]
        alpha_array = (alpha_array * progress).astype(np.uint8)
        
        result_array = np.array(image)
        result_array[:, :, 3] = alpha_array
        
        return Image.fromarray(result_array, 'RGBA')
    
    def _apply_slide_animation(self, image: Image.Image, progress: float) -> Image.Image:
        """Apply the slide animation (slides in from the right)."""
        # Compute the slide offset
        width, height = image.size
        offset_x = int(width * (1 - progress))
        
        # Create the result image
        result = Image.new('RGBA', (width, height), (0, 0, 0, 0))
        
        # Compute where to paste and how much is visible
        paste_x = max(0, offset_x)
        paste_width = min(width, width - offset_x)
        
        if paste_width > 0:
            # Crop the visible part and paste it
            crop_box = (0, 0, paste_width, height)
            cropped = image.crop(crop_box)
            result.paste(cropped, (paste_x, 0))
        
        return result
    
    def _apply_transforms(self, image: Image.Image) -> Image.Image:
        """Apply scale and rotation."""
        # Scale; keep at least one pixel so resize() does not fail
        if self.scale_x != 1.0 or self.scale_y != 1.0:
            width, height = image.size
            new_width = max(1, int(width * self.scale_x))
            new_height = max(1, int(height * self.scale_y))
            image = image.resize((new_width, new_height), Image.LANCZOS)
        
        # Rotate
        if self.rotation != 0.0:
            image = image.rotate(self.rotation, expand=True, fillcolor=(0, 0, 0, 0))
        
        return image
    
    def to_dict(self) -> Dict[str, Any]:
        """Convert to a dictionary."""
        return {
            "id": self.id,
            "content": self.content,
            "start_time": self.start_time,
            "end_time": self.end_time,
            "layer": self.layer,
            "visible": self.visible,
            "locked": self.locked,
            "x": self.x,
            "y": self.y,
            "width": self.width,
            "height": self.height,
            "rotation": self.rotation,
            "scale_x": self.scale_x,
            "scale_y": self.scale_y,
            "font_family": self.font_family,
            "font_size": self.font_size,
            "font_weight": self.font_weight,
            "font_style": self.font_style,
            "text_color": self.text_color,
            "background_color": self.background_color,
            "alignment": self.alignment.value,
            "direction": self.direction.value,
            "line_spacing": self.line_spacing,
            "letter_spacing": self.letter_spacing,
            "word_spacing": self.word_spacing,
            "stroke_width": self.stroke_width,
            "stroke_color": self.stroke_color,
            "underline": self.underline,
            "strikethrough": self.strikethrough,
            "shadow_enabled": self.shadow_enabled,
            "shadow_color": self.shadow_color,
            "shadow_offset_x": self.shadow_offset_x,
            "shadow_offset_y": self.shadow_offset_y,
            "shadow_blur": self.shadow_blur,
            "shadow_opacity": self.shadow_opacity,
            "animation_type": self.animation_type,
            "animation_duration": self.animation_duration,
            "animation_delay": self.animation_delay,
            "animation_easing": self.animation_easing
        }
    
    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> 'TextSegment':
        """Create a segment from a dictionary."""
        # Work on a copy so the caller's dictionary is not modified
        data = dict(data)
        
        # Convert enum values
        if "alignment" in data:
            data["alignment"] = TextAlignment(data["alignment"])
        if "direction" in data:
            data["direction"] = TextDirection(data["direction"])
        
        # Drop cache-related fields
        data.pop("_cached_image", None)
        data.pop("_cached_font", None)
        data.pop("_text_metrics", None)
        data.pop("_last_render_time", None)
        data.pop("_render_cache_key", None)
        
        return cls(**data)
    
    def copy(self) -> 'TextSegment':
        """Create a copy with a new id."""
        data = self.to_dict()
        data["id"] = f"{self.id}_copy"
        return TextSegment.from_dict(data)
    
    def __str__(self) -> str:
        """Short string representation."""
        return f"TextSegment(id='{self.id}', content='{self.content[:20]}...', " \
               f"time={self.start_time:.1f}-{self.end_time:.1f}s)"
    
    def __repr__(self) -> str:
        """Detailed string representation."""
        return f"TextSegment(id='{self.id}', content='{self.content}', " \
               f"font='{self.font_family}', size={self.font_size}, " \
               f"pos=({self.x:.2f}, {self.y:.2f}), size=({self.width:.2f}, {self.height:.2f}))"
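
Because `_wrap_text` splits on whitespace, text without spaces, such as Chinese subtitles, stays on one line. One possible alternative is sketched below under stated assumptions: `wrap_by_width` is a hypothetical helper, and `measure` is a caller-supplied function that returns the width of a string (with Pillow, something like `lambda s: draw.textlength(s, font=font)` could be passed). It breaks at a space when the line has one and between characters otherwise.

```python
from typing import Callable, List


def wrap_by_width(text: str, max_width: float,
                  measure: Callable[[str], float]) -> List[str]:
    """Greedy wrap: break at spaces when possible, otherwise between characters."""
    lines: List[str] = []
    for paragraph in text.split("\n"):
        current = ""
        last_space = -1  # index in `current` of the most recent space
        for ch in paragraph:
            candidate = current + ch
            if not current or measure(candidate) <= max_width:
                # Still fits (or the line is empty, so take at least one character)
                current = candidate
                if ch == " ":
                    last_space = len(current) - 1
            elif ch == " ":
                # Overflow on a space: end the line here and drop the space
                lines.append(current)
                current, last_space = "", -1
            elif last_space >= 0:
                # Break at the last space and carry the partial word over
                lines.append(current[:last_space])
                current, last_space = current[last_space + 1:] + ch, -1
            else:
                # No space on this line (e.g. CJK text): break before this character
                lines.append(current)
                current = ch
        lines.append(current)
    return lines


# Using len() as a stand-in measure: every character is one unit wide
wrapped = wrap_by_width("这是一段很长的中文字幕", 4, len)
```

With `len` as the measure, the eleven-character string above is split into lines of four, four, and three characters. A fuller implementation would also respect line-breaking rules such as not starting a line with closing punctuation.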

11.2.2 Text Style System

The text style system manages and applies text styles, offering a wide range of style options and flexible configuration:

@dataclass
class TextStyle:
    """A reusable text style."""
    
    name: str = ""
    description: str = ""
    category: str = "custom"
    
    # Font attributes
    font_family: str = "Arial"
    font_size: int = 24
    font_weight: str = "normal"
    font_style: str = "normal"
    
    # Color attributes
    text_color: str = "#FFFFFF"
    background_color: Optional[str] = None
    gradient_start: Optional[str] = None
    gradient_end: Optional[str] = None
    gradient_angle: float = 0.0
    
    # Effect attributes
    stroke_width: int = 0
    stroke_color: str = "#000000"
    shadow_enabled: bool = False
    shadow_color: str = "#000000"
    shadow_offset_x: int = 2
    shadow_offset_y: int = 2
    shadow_blur: int = 4
    shadow_opacity: float = 0.5
    
    # Layout attributes
    alignment: TextAlignment = TextAlignment.LEFT
    line_spacing: float = 1.2
    letter_spacing: float = 0.0
    word_spacing: float = 0.0
    
    # Decoration attributes
    underline: bool = False
    strikethrough: bool = False
    overline: bool = False
    
    # Animation attributes
    default_animation: str = "none"
    animation_duration: float = 1.0
    animation_delay: float = 0.0
    animation_easing: str = "linear"
    
    # Preset markers
    is_preset: bool = False
    preset_id: Optional[str] = None
    
    def __post_init__(self):
        """Post-initialization: generate a name if none was given."""
        if not self.name:
            self.name = f"Style_{int(time.time())}"
    def apply_to_segment(self, segment: TextSegment) -> None:
        """Apply this style to a text segment."""
        # Font attributes
        segment.font_family = self.font_family
        segment.font_size = self.font_size
        segment.font_weight = self.font_weight
        segment.font_style = self.font_style
        
        # Color attributes
        segment.text_color = self.text_color
        segment.background_color = self.background_color
        
        # Effect attributes
        segment.stroke_width = self.stroke_width
        segment.stroke_color = self.stroke_color
        segment.shadow_enabled = self.shadow_enabled
        segment.shadow_color = self.shadow_color
        segment.shadow_offset_x = self.shadow_offset_x
        segment.shadow_offset_y = self.shadow_offset_y
        segment.shadow_blur = self.shadow_blur
        segment.shadow_opacity = self.shadow_opacity
        
        # Layout attributes
        segment.alignment = self.alignment
        segment.line_spacing = self.line_spacing
        segment.letter_spacing = self.letter_spacing
        segment.word_spacing = self.word_spacing
        
        # Decoration attributes
        segment.underline = self.underline
        segment.strikethrough = self.strikethrough
        
        # Animation attributes
        segment.animation_type = self.default_animation
        segment.animation_duration = self.animation_duration
        segment.animation_delay = self.animation_delay
        segment.animation_easing = self.animation_easing
        
        # The fields were assigned directly, bypassing the setters,
        # so clear the segment's cached font, metrics, and image
        segment._invalidate_cache()
    
    def copy(self) -> 'TextStyle':
        """Create a copy of the style; the copy is never a preset."""
        return TextStyle(
            name=f"{self.name}_copy",
            description=self.description,
            category=self.category,
            font_family=self.font_family,
            font_size=self.font_size,
            font_weight=self.font_weight,
            font_style=self.font_style,
            text_color=self.text_color,
            background_color=self.background_color,
            gradient_start=self.gradient_start,
            gradient_end=self.gradient_end,
            gradient_angle=self.gradient_angle,
            stroke_width=self.stroke_width,
            stroke_color=self.stroke_color,
            shadow_enabled=self.shadow_enabled,
            shadow_color=self.shadow_color,
            shadow_offset_x=self.shadow_offset_x,
            shadow_offset_y=self.shadow_offset_y,
            shadow_blur=self.shadow_blur,
            shadow_opacity=self.shadow_opacity,
            alignment=self.alignment,
            line_spacing=self.line_spacing,
            letter_spacing=self.letter_spacing,
            word_spacing=self.word_spacing,
            underline=self.underline,
            strikethrough=self.strikethrough,
            overline=self.overline,
            default_animation=self.default_animation,
            animation_duration=self.animation_duration,
            animation_delay=self.animation_delay,
            animation_easing=self.animation_easing,
            is_preset=False,
            preset_id=None
        )
    
    def to_dict(self) -> Dict[str, Any]:
        """Convert to a dictionary."""
        return {
            "name": self.name,
            "description": self.description,
            "category": self.category,
            "font_family": self.font_family,
            "font_size": self.font_size,
            "font_weight": self.font_weight,
            "font_style": self.font_style,
            "text_color": self.text_color,
            "background_color": self.background_color,
            "gradient_start": self.gradient_start,
            "gradient_end": self.gradient_end,
            "gradient_angle": self.gradient_angle,
            "stroke_width": self.stroke_width,
            "stroke_color": self.stroke_color,
            "shadow_enabled": self.shadow_enabled,
            "shadow_color": self.shadow_color,
            "shadow_offset_x": self.shadow_offset_x,
            "shadow_offset_y": self.shadow_offset_y,
            "shadow_blur": self.shadow_blur,
            "shadow_opacity": self.shadow_opacity,
            "alignment": self.alignment.value,
            "line_spacing": self.line_spacing,
            "letter_spacing": self.letter_spacing,
            "word_spacing": self.word_spacing,
            "underline": self.underline,
            "strikethrough": self.strikethrough,
            "overline": self.overline,
            "default_animation": self.default_animation,
            "animation_duration": self.animation_duration,
            "animation_delay": self.animation_delay,
            "animation_easing": self.animation_easing,
            "is_preset": self.is_preset,
            "preset_id": self.preset_id
        }
    
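    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> 'TextStyle':
        """Create a style from a dictionary without mutating the caller's dict."""
        values = dict(data)  # shallow copy, so the input keeps its plain string value
        if "alignment" in values:
            values["alignment"] = TextAlignment(values["alignment"])
        return cls(**values)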

class TextStyleManager:
    """文本样式管理器"""
    
    def __init__(self):
        self.styles: Dict[str, TextStyle] = {}
        self.categories: Dict[str, List[str]] = {}
        self._initialize_preset_styles()
    
    def _initialize_preset_styles(self) -> None:
        """初始化预设样式"""
        preset_styles = [
            # 基础样式
            TextStyle(
                name="经典白字",
                description="经典白色字幕",
                category="basic",
                font_family="Arial",
                font_size=24,
                text_color="#FFFFFF",
                stroke_width=2,
                stroke_color="#000000",
                is_preset=True,
                preset_id="classic_white"
            ),
            
            TextStyle(
                name="经典黑字",
                description="经典黑色字幕",
                category="basic",
                font_family="Arial",
                font_size=24,
                text_color="#000000",
                background_color="#FFFFFF",
                background_opacity=0.8,
                is_preset=True,
                preset_id="classic_black"
            ),
            
            # 现代样式
            TextStyle(
                name="现代简约",
                description="现代简约风格",
                category="modern",
                font_family="Helvetica",
                font_size=28,
                text_color="#333333",
                alignment=TextAlignment.CENTER,
                letter_spacing=1.0,
                is_preset=True,
                preset_id="modern_clean"
            ),
            
            # 艺术样式
            TextStyle(
                name="霓虹发光",
                description="霓虹灯发光效果",
                category="artistic",
                font_family="Impact",
                font_size=32,
                text_color="#FF00FF",
                shadow_enabled=True,
                shadow_color="#FF00FF",
                shadow_blur=8,
                shadow_opacity=0.8,
                is_preset=True,
                preset_id="neon_glow"
            ),
            
            # 动画样式
            TextStyle(
                name="打字机效果",
                description="逐字显示的打字机效果",
                category="animation",
                font_family="Courier New",
                font_size=20,
                text_color="#00FF00",
                default_animation="typewriter",
                animation_duration=2.0,
                is_preset=True,
                preset_id="typewriter_effect"
            )
        ]
        
        # 添加预设样式
        for style in preset_styles:
            self.add_style(style)
    
    def add_style(self, style: TextStyle) -> bool:
        """添加样式"""
        if style.name in self.styles:
            return False
        
        self.styles[style.name] = style
        
        # 添加到分类索引
        if style.category not in self.categories:
            self.categories[style.category] = []
        self.categories[style.category].append(style.name)
        
        return True
    
    def remove_style(self, name: str) -> bool:
        """移除样式"""
        if name not in self.styles:
            return False
        
        style = self.styles[name]
        del self.styles[name]
        
        # 从分类索引中移除
        if style.category in self.categories:
            if name in self.categories[style.category]:
                self.categories[style.category].remove(name)
        
        return True
    
    def get_style(self, name: str) -> Optional[TextStyle]:
        """获取样式"""
        return self.styles.get(name)
    
    def get_styles_by_category(self, category: str) -> List[TextStyle]:
        """按分类获取样式"""
        style_names = self.categories.get(category, [])
        return [self.styles[name] for name in style_names if name in self.styles]
    
    def get_all_styles(self) -> List[TextStyle]:
        """获取所有样式"""
        return list(self.styles.values())
    
    def get_preset_styles(self) -> List[TextStyle]:
        """获取预设样式"""
        return [style for style in self.styles.values() if style.is_preset]
    
    def get_custom_styles(self) -> List[TextStyle]:
        """获取自定义样式"""
        return [style for style in self.styles.values() if not style.is_preset]
    
    def export_styles(self, file_path: str) -> bool:
        """导出样式"""
        try:
            import json
            styles_data = {
                name: style.to_dict() 
                for name, style in self.styles.items() 
                if not style.is_preset  # 只导出自定义样式
            }
            
            with open(file_path, 'w', encoding='utf-8') as f:
                json.dump(styles_data, f, indent=2, ensure_ascii=False)
            
            return True
        except Exception as e:
            print(f"导出样式失败: {e}")
            return False
    
    def import_styles(self, file_path: str) -> bool:
        """导入样式"""
        try:
            import json
            
            with open(file_path, 'r', encoding='utf-8') as f:
                styles_data = json.load(f)
            
            for name, style_data in styles_data.items():
                style = TextStyle.from_dict(style_data)
                style.is_preset = False  # 导入的样式标记为自定义
                self.add_style(style)
            
            return True
        except Exception as e:
            print(f"导入样式失败: {e}")
            return False
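
The `to_dict` / `from_dict` pair is what `export_styles` and `import_styles` rely on. The following is a minimal, self-contained sketch of that round trip. `MiniStyle` and `Align` are hypothetical three-field stand-ins for `TextStyle` and `TextAlignment`, not the classes from the listing, and the JSON text is kept in memory instead of being written to a file.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Any, Dict


class Align(Enum):
    """Hypothetical stand-in for TextAlignment."""
    LEFT = "left"
    CENTER = "center"


@dataclass
class MiniStyle:
    """Hypothetical three-field stand-in for TextStyle."""
    name: str
    font_size: int = 24
    alignment: Align = Align.LEFT

    def to_dict(self) -> Dict[str, Any]:
        data = asdict(self)
        # Enums are not JSON-serializable, so store the plain string value
        data["alignment"] = self.alignment.value
        return data

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "MiniStyle":
        values = dict(data)  # copy so the caller's dict is left untouched
        if "alignment" in values:
            values["alignment"] = Align(values["alignment"])
        return cls(**values)


def round_trip(style: MiniStyle) -> MiniStyle:
    """Serialize to JSON text and back, as an export followed by an import would."""
    text = json.dumps(style.to_dict(), ensure_ascii=False)
    return MiniStyle.from_dict(json.loads(text))


original = MiniStyle(name="现代简约", font_size=28, alignment=Align.CENTER)
restored = round_trip(original)
print(restored == original)
```

Because the dataclass generates `__eq__`, comparing the restored object with the original is a cheap check that no field was dropped or changed during serialization.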

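11.3 Subtitle Tracks and Time Management

11.3.1 Subtitle Track Design

A subtitle track manages every text segment on the timeline and provides operations such as adding, removing, moving, and editing segments: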

from typing import List, Dict, Optional, Tuple, Any
import bisect
import time  # SubtitleTrack uses time.time() for generated IDs and cache timestamps
from collections import defaultdict

class SubtitleTrack:
    """字幕轨道类"""
    
    def __init__(self, name: str = "字幕轨道", track_id: str = ""):
        self.track_id = track_id or f"track_{int(time.time() * 1000)}"
        self.name = name
        self.segments: List[TextSegment] = []
        self.visible: bool = True
        self.locked: bool = False
        self.muted: bool = False
        self.layer: int = 0
        self.color: str = "#4CAF50"
        
        # 索引优化
        self._time_index: Dict[float, List[int]] = defaultdict(list)
        self._id_index: Dict[str, int] = {}
        self._layer_index: Dict[int, List[int]] = defaultdict(list)
        
        # 缓存
        self._sorted_segments: Optional[List[TextSegment]] = None
        self._last_update_time: float = 0.0
    
    def add_segment(self, segment: TextSegment) -> bool:
        """添加文本片段"""
        if self.locked:
            return False
        
        if segment.id in self._id_index:
            return False
        
        # 添加到列表
        self.segments.append(segment)
        index = len(self.segments) - 1
        
        # 更新索引
        self._id_index[segment.id] = index
        self._time_index[segment.start_time].append(index)
        self._layer_index[segment.layer].append(index)
        
        # 使缓存失效
        self._invalidate_cache()
        
        return True
    
    def remove_segment(self, segment_id: str) -> bool:
        """Remove a text segment by ID."""
        if self.locked or segment_id not in self._id_index:
            return False
        
        # Remove from the list; every later segment shifts down by one position
        index = self._id_index[segment_id]
        self.segments.pop(index)
        
        # The stored positions are now stale, so rebuild all three indices
        # (this also drops the removed segment's entries)
        self._rebuild_indices()
        
        # Invalidate the cache
        self._invalidate_cache()
        
        return True
    
    def get_segment(self, segment_id: str) -> Optional[TextSegment]:
        """获取文本片段"""
        if segment_id not in self._id_index:
            return None
        
        index = self._id_index[segment_id]
        return self.segments[index]
    
    def get_segments_at_time(self, current_time: float) -> List[TextSegment]:
        """Return the segments active at the given time, ordered by layer."""
        # _time_index is keyed by exact start time, so it cannot answer
        # "which segments span this instant"; scan every segment once instead
        result = [s for s in self.segments if s.is_active_at_time(current_time)]
        
        # Sort by layer
        result.sort(key=lambda s: s.layer)
        
        return result
    
    def get_segments_in_range(self, start_time: float, end_time: float) -> List[TextSegment]:
        """获取时间范围内的文本片段"""
        result = []
        
        for segment in self.segments:
            if (segment.start_time < end_time and segment.end_time > start_time):
                result.append(segment)
        
        # 按开始时间和层级排序
        result.sort(key=lambda s: (s.start_time, s.layer))
        
        return result
    
    def move_segment(self, segment_id: str, new_start_time: float, 
                    new_end_time: Optional[float] = None) -> bool:
        """移动文本片段"""
        if self.locked:
            return False
        
        segment = self.get_segment(segment_id)
        if not segment:
            return False
        
        # Capture the old start time and duration before changing anything
        old_start_time = segment.start_time
        duration = segment.get_duration()
        segment.start_time = new_start_time
        
        if new_end_time is not None:
            segment.end_time = new_end_time
        else:
            # Keep the duration unchanged
            segment.end_time = new_start_time + duration
        
        # 更新索引
        if old_start_time != new_start_time:
            self._time_index[old_start_time].remove(self._id_index[segment_id])
            if not self._time_index[old_start_time]:
                del self._time_index[old_start_time]
            
            self._time_index[new_start_time].append(self._id_index[segment_id])
        
        # 使缓存失效
        self._invalidate_cache()
        
        return True
    
    def split_segment(self, segment_id: str, split_time: float) -> Optional[str]:
        """分割文本片段"""
        if self.locked:
            return None
        
        segment = self.get_segment(segment_id)
        if not segment or not (segment.start_time < split_time < segment.end_time):
            return None
        
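        # Create the second half as a copy that starts at the split point
        new_segment = segment.copy()
        new_segment.id = f"{segment_id}_split_{int(time.time() * 1000)}"
        new_segment.start_time = split_time
        
        # Shorten the original only after the new segment has been added,
        # so a failed insert leaves the track unchanged
        if self.add_segment(new_segment):
            segment.end_time = split_time
            return new_segment.id
        
        return None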
    
    def merge_segments(self, segment_ids: List[str]) -> Optional[str]:
        """合并文本片段"""
        if self.locked or len(segment_ids) < 2:
            return None
        
        segments = []
        for seg_id in segment_ids:
            segment = self.get_segment(seg_id)
            if segment:
                segments.append(segment)
        
        if len(segments) < 2:
            return None
        
        # 按时间排序
        segments.sort(key=lambda s: s.start_time)
        
        # 创建合并后的片段
        merged_segment = segments[0].copy()
        merged_segment.id = f"merged_{int(time.time() * 1000)}"
        merged_segment.content = " ".join(s.content for s in segments)
        merged_segment.start_time = segments[0].start_time
        merged_segment.end_time = segments[-1].end_time
        
        # 移除原片段
        for segment in segments:
            self.remove_segment(segment.id)
        
        # 添加合并后的片段
        if self.add_segment(merged_segment):
            return merged_segment.id
        
        return None
    
    def _rebuild_indices(self) -> None:
        """重建索引"""
        self._id_index.clear()
        self._time_index.clear()
        self._layer_index.clear()
        
        for i, segment in enumerate(self.segments):
            self._id_index[segment.id] = i
            self._time_index[segment.start_time].append(i)
            self._layer_index[segment.layer].append(i)
    
    def _invalidate_cache(self) -> None:
        """使缓存失效"""
        self._sorted_segments = None
        self._last_update_time = time.time()
    
    def get_sorted_segments(self) -> List[TextSegment]:
        """获取排序后的片段列表"""
        if self._sorted_segments is None:
            self._sorted_segments = sorted(self.segments, key=lambda s: s.start_time)
        
        return self._sorted_segments
    
    def get_duration(self) -> float:
        """获取轨道持续时间"""
        if not self.segments:
            return 0.0
        
        return max(segment.end_time for segment in self.segments)
    
    def get_active_segments(self, current_time: float) -> List[TextSegment]:
        """获取当前激活的片段"""
        return [s for s in self.segments if s.is_active_at_time(current_time)]
    
    def has_overlapping_segments(self) -> bool:
        """检查是否有重叠的片段"""
        sorted_segments = self.get_sorted_segments()
        
        for i in range(len(sorted_segments) - 1):
            current = sorted_segments[i]
            next_segment = sorted_segments[i + 1]
            
            if current.end_time > next_segment.start_time:
                return True
        
        return False
    
    def resolve_overlaps(self) -> None:
        """解决片段重叠问题"""
        if self.locked:
            return
        
        sorted_segments = self.get_sorted_segments()
        
        for i in range(len(sorted_segments) - 1):
            current = sorted_segments[i]
            next_segment = sorted_segments[i + 1]
            
            if current.end_time > next_segment.start_time:
                # 调整结束时间
                current.end_time = next_segment.start_time
    
    def duplicate_segment(self, segment_id: str, offset: float = 0.0) -> Optional[str]:
        """复制文本片段"""
        if self.locked:
            return None
        
        segment = self.get_segment(segment_id)
        if not segment:
            return None
        
        # 创建副本
        new_segment = segment.copy()
        new_segment.id = f"{segment_id}_dup_{int(time.time() * 1000)}"
        
        # 应用时间偏移
        if offset != 0.0:
            new_segment.start_time += offset
            new_segment.end_time += offset
        
        # 添加新片段
        if self.add_segment(new_segment):
            return new_segment.id
        
        return None
    
    def apply_style_to_all(self, style: TextStyle) -> None:
        """应用样式到所有片段"""
        if self.locked:
            return
        
        for segment in self.segments:
            style.apply_to_segment(segment)
    
    def clear_all_segments(self) -> None:
        """清除所有片段"""
        if self.locked:
            return
        
        self.segments.clear()
        self._id_index.clear()
        self._time_index.clear()
        self._layer_index.clear()
        self._invalidate_cache()
    
    def to_dict(self) -> Dict[str, Any]:
        """转换为字典"""
        return {
            "track_id": self.track_id,
            "name": self.name,
            "visible": self.visible,
            "locked": self.locked,
            "muted": self.muted,
            "layer": self.layer,
            "color": self.color,
            "segments": [segment.to_dict() for segment in self.segments]
        }
    
    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> 'SubtitleTrack':
        """从字典创建"""
        track = cls(
            name=data.get("name", "字幕轨道"),
            track_id=data.get("track_id", "")
        )
        
        track.visible = data.get("visible", True)
        track.muted = data.get("muted", False)
        track.layer = data.get("layer", 0)
        track.color = data.get("color", "#4CAF50")
        
        # Add segments before restoring the lock: add_segment() rejects
        # every insert on a locked track
        for segment_data in data.get("segments", []):
            segment = TextSegment.from_dict(segment_data)
            track.add_segment(segment)
        
        track.locked = data.get("locked", False)
        
        return track
    
    def __len__(self) -> int:
        """获取片段数量"""
        return len(self.segments)
    
    def __iter__(self):
        """迭代器"""
        return iter(self.segments)
    
    def __str__(self) -> str:
        """字符串表示"""
        return f"SubtitleTrack(name='{self.name}', segments={len(self.segments)}, " \
               f"duration={self.get_duration():.1f}s)"
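
The track imports `bisect` but answers time queries with a full scan. One way to prune that scan, sketched below, is to keep the segments sorted by start time and use `bisect_right` to skip everything that starts after the query time. `Seg` and `active_at` are hypothetical names, and the half-open interval `[start, end)` is an assumption about how `is_active_at_time` behaves.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List


@dataclass
class Seg:
    """Hypothetical minimal segment: an ID, a time interval, and a layer."""
    id: str
    start: float
    end: float
    layer: int = 0


def active_at(sorted_segs: List[Seg], starts: List[float], t: float) -> List[Seg]:
    """Return segments whose [start, end) interval contains t, ordered by layer.

    sorted_segs must be sorted by start, and starts must hold the matching
    start times in the same order.
    """
    # Segments at positions >= cutoff start after t and cannot be active
    cutoff = bisect_right(starts, t)
    hits = [s for s in sorted_segs[:cutoff] if t < s.end]
    return sorted(hits, key=lambda s: s.layer)


segs = sorted(
    [Seg("a", 0.0, 5.0), Seg("b", 4.0, 8.0, layer=1), Seg("c", 10.0, 12.0)],
    key=lambda s: s.start,
)
starts = [s.start for s in segs]
print([s.id for s in active_at(segs, starts, 4.5)])  # ['a', 'b']
```

The binary search only removes segments that have not started yet, so the worst case stays linear when many long segments start early. A guaranteed logarithmic bound would need a structure such as an interval tree.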

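11.4 Subtitle Rendering Engine

11.4.1 High-Performance Rendering Engine

The subtitle rendering engine draws text segments onto video frames. The listing below combines a render cache, an optional thread pool, and basic render statistics: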

import threading
from collections import defaultdict  # used by SubtitleManager.event_listeners
from concurrent.futures import ThreadPoolExecutor
import time
from typing import Optional, Tuple, List, Dict, Any
import numpy as np
from PIL import Image, ImageDraw, ImageFont

class SubtitleRenderer:
    """字幕渲染引擎"""
    
    def __init__(self, max_workers: int = 4):
        self.max_workers = max_workers
        self.executor = ThreadPoolExecutor(max_workers=max_workers)
        
        # 渲染缓存
        self.render_cache: Dict[str, Tuple[Image.Image, float]] = {}
        self.cache_size_limit = 1000
        self.cache_timeout = 30.0  # 30秒
        
        # 性能统计
        self.render_stats = {
            "total_renders": 0,
            "cache_hits": 0,
            "cache_misses": 0,
            "avg_render_time": 0.0,
            "total_render_time": 0.0
        }
        
        # 渲染选项
        self.enable_cache = True
        self.enable_multithreading = True
        self.quality_level = "high"  # low, medium, high
        self.antialiasing = True
        
        # 字体缓存
        self.font_cache: Dict[str, ImageFont.FreeTypeFont] = {}
        self.font_cache_limit = 50
        
        # 清理线程
        self.cleanup_thread = None
        self.stop_cleanup = False
        self._start_cleanup_thread()
    
    def _start_cleanup_thread(self) -> None:
        """启动缓存清理线程"""
        def cleanup_worker():
            while not self.stop_cleanup:
                time.sleep(10)  # 每10秒清理一次
                self._cleanup_cache()
        
        self.cleanup_thread = threading.Thread(target=cleanup_worker, daemon=True)
        self.cleanup_thread.start()
    
    def _cleanup_cache(self) -> None:
        """Evict expired and excess cache entries."""
        current_time = time.time()
        
        # Iterate over a snapshot: this runs on the cleanup thread, and looping
        # over the live dict raises RuntimeError if another thread inserts an
        # entry mid-loop
        for key, (_, timestamp) in list(self.render_cache.items()):
            if current_time - timestamp > self.cache_timeout:
                self.render_cache.pop(key, None)
        
        # Enforce the size limit by dropping the oldest entries first
        excess = len(self.render_cache) - self.cache_size_limit
        if excess > 0:
            oldest_first = sorted(list(self.render_cache.items()), key=lambda item: item[1][1])
            for key, _ in oldest_first[:excess]:
                self.render_cache.pop(key, None)
        
        # Trim the font cache. Fonts carry no timestamp, so entries are
        # dropped in insertion order (oldest inserted first)
        excess_fonts = len(self.font_cache) - self.font_cache_limit
        if excess_fonts > 0:
            for key in list(self.font_cache)[:excess_fonts]:
                self.font_cache.pop(key, None)
    
    def render_subtitle(self, segment: TextSegment, frame_size: Tuple[int, int],
                       current_time: float, background_frame: Optional[Image.Image] = None) -> Image.Image:
        """渲染字幕"""
        start_time = time.time()
        self.render_stats["total_renders"] += 1
        
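        # Check the cache. The key does not cover the background frame, so
        # only background-free renders are cached
        use_cache = self.enable_cache and background_frame is None
        cache_key = self._generate_cache_key(segment, frame_size, current_time)
        if use_cache:
            # get() avoids a KeyError if the cleanup thread evicts the entry
            # between the membership test and the lookup
            cached = self.render_cache.get(cache_key)
            if cached is not None:
                self.render_stats["cache_hits"] += 1
                return cached[0].copy()
        
        self.render_stats["cache_misses"] += 1
        
        # Render the subtitle
        if self.enable_multithreading and self.max_workers > 1:
            # Runs on the pool, but result() blocks until the task finishes,
            # so a single call gains no parallelism from this path
            future = self.executor.submit(self._render_segment_internal, 
                                          segment, frame_size, current_time, background_frame)
            result = future.result()
        else:
            # Single-threaded rendering
            result = self._render_segment_internal(segment, frame_size, current_time, background_frame)
        
        # Cache the result
        if use_cache:
            self.render_cache[cache_key] = (result.copy(), time.time())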
        
        # 更新统计
        render_time = time.time() - start_time
        self.render_stats["total_render_time"] += render_time
        self.render_stats["avg_render_time"] = (
            self.render_stats["total_render_time"] / self.render_stats["total_renders"]
        )
        
        return result
    
    def _generate_cache_key(self, segment: TextSegment, frame_size: Tuple[int, int],
                           current_time: float) -> str:
        """生成缓存键"""
        segment_key = segment._get_cache_key()
        frame_key = f"{frame_size[0]}x{frame_size[1]}"
        time_key = f"{current_time:.3f}"
        return f"{segment_key}|{frame_key}|{time_key}"
    
    def _render_segment_internal(self, segment: TextSegment, frame_size: Tuple[int, int],
                                 current_time: float, background_frame: Optional[Image.Image]) -> Image.Image:
        """内部渲染方法"""
        # Create the base frame; Image.alpha_composite requires both images in RGBA
        if background_frame is not None:
            base_frame = background_frame.convert('RGBA')
        else:
            base_frame = Image.new('RGBA', frame_size, (0, 0, 0, 0))
        
        # 渲染文本
        text_image = segment.render(frame_size[0], frame_size[1], current_time)
        
        # 合成到基础帧
        if text_image.size != base_frame.size:
            # 调整文本图像大小
            text_image = text_image.resize(base_frame.size, Image.LANCZOS)
        
        # Alpha合成
        result = Image.alpha_composite(base_frame, text_image)
        
        return result
    
    def render_multiple_segments(self, segments: List[TextSegment], 
                                frame_size: Tuple[int, int], current_time: float,
                                background_frame: Optional[Image.Image] = None) -> Image.Image:
        """渲染多个字幕片段"""
        if not segments:
            return background_frame or Image.new('RGBA', frame_size, (0, 0, 0, 0))
        
        # Create the base frame; Image.alpha_composite requires both images in RGBA
        if background_frame is not None:
            result = background_frame.convert('RGBA')
        else:
            result = Image.new('RGBA', frame_size, (0, 0, 0, 0))
        
        # 按层级排序
        sorted_segments = sorted(segments, key=lambda s: s.layer)
        
        # 渲染每个片段
        for segment in sorted_segments:
            if segment.is_active_at_time(current_time):
                text_image = self.render_subtitle(segment, frame_size, current_time)
                result = Image.alpha_composite(result, text_image)
        
        return result
    
    def render_to_frame(self, frame: np.ndarray, segments: List[TextSegment],
                       current_time: float, position: Optional[Tuple[int, int]] = None) -> np.ndarray:
        """渲染到视频帧"""
        if not segments:
            return frame
        
        # 转换PIL图像
        frame_height, frame_width = frame.shape[:2]
        if len(frame.shape) == 3 and frame.shape[2] == 3:
            # RGB to RGBA
            pil_frame = Image.fromarray(frame, 'RGB').convert('RGBA')
        else:
            pil_frame = Image.fromarray(frame, 'RGBA')
        
        # 渲染字幕
        rendered_subtitle = self.render_multiple_segments(
            segments, (frame_width, frame_height), current_time, pil_frame
        )
        
        # 转换回numpy数组
        if len(frame.shape) == 3 and frame.shape[2] == 3:
            # RGBA to RGB
            result_array = np.array(rendered_subtitle.convert('RGB'))
        else:
            result_array = np.array(rendered_subtitle)
        
        return result_array
    
    def get_render_statistics(self) -> Dict[str, Any]:
        """获取渲染统计"""
        cache_hit_rate = 0.0
        if self.render_stats["total_renders"] > 0:
            cache_hit_rate = self.render_stats["cache_hits"] / self.render_stats["total_renders"]
        
        return {
            "total_renders": self.render_stats["total_renders"],
            "cache_hits": self.render_stats["cache_hits"],
            "cache_misses": self.render_stats["cache_misses"],
            "cache_hit_rate": cache_hit_rate,
            "avg_render_time": self.render_stats["avg_render_time"],
            "total_render_time": self.render_stats["total_render_time"],
            "cache_size": len(self.render_cache),
            "font_cache_size": len(self.font_cache)
        }
    
    def clear_cache(self) -> None:
        """清空缓存"""
        self.render_cache.clear()
        self.font_cache.clear()
        self.render_stats = {
            "total_renders": 0,
            "cache_hits": 0,
            "cache_misses": 0,
            "avg_render_time": 0.0,
            "total_render_time": 0.0
        }
    
    def shutdown(self) -> None:
        """关闭渲染器"""
        self.stop_cleanup = True
        if self.cleanup_thread:
            self.cleanup_thread.join(timeout=1.0)
        
        # 关闭线程池
        self.executor.shutdown(wait=True)
        
        # 清空缓存
        self.clear_cache()

class SubtitleManager:
    """字幕管理器"""
    
    def __init__(self):
        self.tracks: List[SubtitleTrack] = []
        self.style_manager = TextStyleManager()
        self.renderer = SubtitleRenderer()
        
        # 默认轨道
        self.default_track = SubtitleTrack("默认字幕轨道")
        self.tracks.append(self.default_track)
        
        # 配置
        self.auto_save = True
        self.auto_save_interval = 30.0  # 秒
        self.enable_undo_redo = True
        self.max_undo_steps = 50
        
        # 撤销/重做栈
        self.undo_stack: List[Dict[str, Any]] = []
        self.redo_stack: List[Dict[str, Any]] = []
        
        # 事件监听器
        self.event_listeners: Dict[str, List[callable]] = defaultdict(list)
        
        # 自动保存线程
        self.auto_save_thread = None
        self.stop_auto_save = False
        self._start_auto_save()
    
    def _start_auto_save(self) -> None:
        """启动自动保存"""
        if not self.auto_save:
            return
        
        def auto_save_worker():
            while not self.stop_auto_save:
                time.sleep(self.auto_save_interval)
                if self.auto_save:
                    self._auto_save()
        
        self.auto_save_thread = threading.Thread(target=auto_save_worker, daemon=True)
        self.auto_save_thread.start()
    
    def _auto_save(self) -> None:
        """自动保存"""
        try:
            # 这里可以实现自动保存逻辑
            print("自动保存字幕数据...")
        except Exception as e:
            print(f"自动保存失败: {e}")
    
    def add_track(self, track: Optional[SubtitleTrack] = None) -> SubtitleTrack:
        """添加轨道"""
        if track is None:
            track = SubtitleTrack(f"字幕轨道 {len(self.tracks) + 1}")
        
        self.tracks.append(track)
        self._emit_event("track_added", {"track": track})
        
        return track
    
    def remove_track(self, track_id: str) -> bool:
        """移除轨道"""
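        track = self.get_track(track_id)
        # Compare with None: SubtitleTrack defines __len__, so an empty track is falsy
        if track is None or len(self.tracks) <= 1:
            return False
        
        self.tracks.remove(track)
        # Keep default_track pointing at a track that still exists
        if track is self.default_track:
            self.default_track = self.tracks[0]
        self._emit_event("track_removed", {"track_id": track_id})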
        
        return True
    
    def get_track(self, track_id: str) -> Optional[SubtitleTrack]:
        """获取轨道"""
        for track in self.tracks:
            if track.track_id == track_id:
                return track
        return None
    
    def add_text_segment(self, content: str, start_time: float, end_time: float,
                        track_id: Optional[str] = None, **kwargs) -> Optional[str]:
        """添加文本片段"""
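        track = self.get_track(track_id) if track_id else self.default_track
        # Compare with None: SubtitleTrack defines __len__, so an empty track is
        # falsy and "if not track" would reject the first segment of every track
        if track is None:
            return None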
        
        segment = TextSegment(
            content=content,
            start_time=start_time,
            end_time=end_time,
            **kwargs
        )
        
        if track.add_segment(segment):
            self._emit_event("segment_added", {"segment": segment, "track": track})
            return segment.id
        
        return None
    
    def render_frame(self, frame: np.ndarray, current_time: float,
                    track_ids: Optional[List[str]] = None) -> np.ndarray:
        """渲染帧"""
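        if track_ids:
            # Explicitly requested tracks; test against None because an empty track is falsy
            requested = (self.get_track(tid) for tid in track_ids)
            tracks = [track for track in requested if track is not None]
        else:
            tracks = [track for track in self.tracks if track.visible]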
        
        # 收集所有激活的片段
        all_segments = []
        for track in tracks:
            if not track.muted:
                segments = track.get_active_segments(current_time)
                all_segments.extend(segments)
        
        # 渲染字幕
        if all_segments:
            return self.renderer.render_to_frame(frame, all_segments, current_time)
        
        return frame
    
    def add_event_listener(self, event_name: str, callback: callable) -> None:
        """添加事件监听器"""
        self.event_listeners[event_name].append(callback)
    
    def remove_event_listener(self, event_name: str, callback: callable) -> bool:
        """移除事件监听器"""
        if event_name in self.event_listeners:
            if callback in self.event_listeners[event_name]:
                self.event_listeners[event_name].remove(callback)
                return True
        return False
    
    def _emit_event(self, event_name: str, data: Dict[str, Any]) -> None:
        """触发事件"""
        for callback in self.event_listeners.get(event_name, []):
            try:
                callback(data)
            except Exception as e:
                print(f"事件处理错误: {e}")
    
    def save_project(self, file_path: str) -> bool:
        """保存项目"""
        try:
            import json
            
            project_data = {
                "version": "1.0",
                "tracks": [track.to_dict() for track in self.tracks],
                "styles": {
                    name: style.to_dict() 
                    for name, style in self.style_manager.styles.items()
                    if not style.is_preset
                },
                "settings": {
                    "auto_save": self.auto_save,
                    "auto_save_interval": self.auto_save_interval,
                    "enable_undo_redo": self.enable_undo_redo,
                    "max_undo_steps": self.max_undo_steps
                }
            }
            
            with open(file_path, 'w', encoding='utf-8') as f:
                json.dump(project_data, f, indent=2, ensure_ascii=False)
            
            return True
        except Exception as e:
            print(f"保存项目失败: {e}")
            return False
    
    def load_project(self, file_path: str) -> bool:
        """加载项目"""
        try:
            import json
            
            with open(file_path, 'r', encoding='utf-8') as f:
                project_data = json.load(f)
            
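            # Reset existing data. Rebuilding the style manager restores the
            # presets (which are never saved) and clears its category index
            self.tracks.clear()
            self.style_manager = TextStyleManager()
            
            # Load tracks
            for track_data in project_data.get("tracks", []):
                track = SubtitleTrack.from_dict(track_data)
                self.tracks.append(track)
            
            # Keep at least one track and point default_track at a loaded one
            if not self.tracks:
                self.tracks.append(SubtitleTrack("默认字幕轨道"))
            self.default_track = self.tracks[0]
            
            # Load custom styles
            for name, style_data in project_data.get("styles", {}).items():
                style = TextStyle.from_dict(style_data)
                self.style_manager.add_style(style)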
            
            # 加载设置
            settings = project_data.get("settings", {})
            self.auto_save = settings.get("auto_save", True)
            self.auto_save_interval = settings.get("auto_save_interval", 30.0)
            self.enable_undo_redo = settings.get("enable_undo_redo", True)
            self.max_undo_steps = settings.get("max_undo_steps", 50)
            
            return True
        except Exception as e:
            print(f"加载项目失败: {e}")
            return False
    
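    def shutdown(self) -> None:
        """Shut down the manager."""
        # Signal the auto-save thread to stop, then wait up to one second for it
        self.stop_auto_save = True
        if self.auto_save_thread:
            self.auto_save_thread.join(timeout=1.0)
        
        # Shut down the renderer
        self.renderer.shutdown()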

# Usage example
if __name__ == "__main__":
    # Create the subtitle manager
    manager = SubtitleManager()
    
    # Add a subtitle segment
    segment_id = manager.add_text_segment(
        content="Welcome to 剪映小助手!",
        start_time=0.0,
        end_time=5.0,
        font_size=32,
        text_color="#FFFFFF",
        stroke_width=2,
        stroke_color="#000000"
    )
    
    # Apply a preset style ("经典白字" is the preset's registered name, so it stays untranslated)
    style = manager.style_manager.get_style("经典白字")
    if style:
        segment = manager.default_track.get_segment(segment_id)
        if segment:
            style.apply_to_segment(segment)
    
    # Render a test frame: a black 1920x1080 image at t = 2.0 s
    test_frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
    rendered_frame = manager.render_frame(test_frame, 2.0)
    
    print("Subtitle system initialized")
    print(f"Track count: {len(manager.tracks)}")
    print(f"Style count: {len(manager.style_manager.styles)}")
    print(f"Render statistics: {manager.renderer.get_render_statistics()}")
    
    # Stop the auto-save thread and the renderer before exiting
    manager.shutdown()
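
The event methods in the listing above (remove_event_listener and _emit_event) can be tried in isolation. The following is a minimal sketch, not the project's code: EventBus, add_event_listener, emit and the "segment_added" event name are hypothetical names chosen for illustration. It shows why delivering to a snapshot of the listener list matters when a listener unsubscribes itself, and that one failing listener does not block the rest.

```python
from typing import Any, Callable, Dict, List

Listener = Callable[[Dict[str, Any]], None]


class EventBus:
    """Hypothetical stand-in for the manager's listener registry."""

    def __init__(self) -> None:
        self.event_listeners: Dict[str, List[Listener]] = {}

    def add_event_listener(self, event_name: str, callback: Listener) -> None:
        self.event_listeners.setdefault(event_name, []).append(callback)

    def remove_event_listener(self, event_name: str, callback: Listener) -> bool:
        listeners = self.event_listeners.get(event_name, [])
        if callback in listeners:
            listeners.remove(callback)
            return True
        return False

    def emit(self, event_name: str, data: Dict[str, Any]) -> int:
        """Call every listener; return how many ran without raising."""
        delivered = 0
        # Iterate over a snapshot so a listener may unsubscribe itself safely.
        for callback in list(self.event_listeners.get(event_name, [])):
            try:
                callback(data)
                delivered += 1
            except Exception as exc:
                print(f"listener failed: {exc}")
        return delivered


bus = EventBus()
seen: List[str] = []


def once(data: Dict[str, Any]) -> None:
    """Record the segment id, then unsubscribe."""
    seen.append(data["id"])
    bus.remove_event_listener("segment_added", once)


def failing(data: Dict[str, Any]) -> None:
    raise ValueError("boom")


bus.add_event_listener("segment_added", once)
bus.add_event_listener("segment_added", failing)
bus.add_event_listener("segment_added", lambda d: seen.append("always"))

first = bus.emit("segment_added", {"id": "s1"})
second = bus.emit("segment_added", {"id": "s2"})
```

Without the list(...) copy, removing `once` during delivery would shift the remaining listeners and the loop would skip the one that follows it.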

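11.5 Subtitle System Summary

11.5.1 Core Components Recap

The text and subtitle system consists of the following core components:

TextSegment (text segment)

  • Manages and renders a single unit of text
  • Supports rich style attributes (font, color, effects, animation)
  • Caches rendered output to speed up rendering
  • Supports multiple alignment modes and text directions

TextStyle (text style)

  • Encapsulates the visual attributes of text
  • Supports preset and custom styles
  • Provides style import and export
  • Supports style inheritance and reuse

SubtitleTrack (subtitle track)

  • Manages the collection of text segments on the timeline
  • Provides efficient segment lookup and manipulation
  • Supports detecting and resolving overlapping segments
  • Provides a range of segment editing operations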
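
As an illustration of overlap detection and resolution on a track, here is a minimal sketch under stated assumptions: Span is a hypothetical stand-in that carries only the id, start_time and end_time fields of a segment, and trimming the earlier segment is just one possible resolution policy. It is not claimed to be how SubtitleTrack implements this.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Span:
    """Hypothetical stand-in for a segment: only id and timing."""
    id: str
    start_time: float
    end_time: float


def find_overlaps(spans: List[Span]) -> List[Tuple[str, str]]:
    """Return the id pairs of all spans whose time ranges intersect."""
    ordered = sorted(spans, key=lambda s: s.start_time)
    overlaps = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start_time >= a.end_time:
                break  # sorted by start, so no later span can overlap `a`
            overlaps.append((a.id, b.id))
    return overlaps


def trim_overlaps(spans: List[Span]) -> List[Span]:
    """Resolve overlaps by ending each span where the next one starts."""
    ordered = sorted(spans, key=lambda s: s.start_time)
    result = []
    for current, nxt in zip(ordered, ordered[1:] + [None]):
        if nxt is not None and current.end_time > nxt.start_time:
            current = replace(current, end_time=nxt.start_time)
        result.append(current)
    return result


spans = [Span("a", 0.0, 3.0), Span("b", 2.0, 5.0), Span("c", 5.0, 6.0)]
overlaps = find_overlaps(spans)
trimmed = trim_overlaps(spans)
```

Segments that merely touch (one ends exactly when the next begins) are not reported as overlapping. Note that trimming gives a zero-length result when two segments share the same start time, so a real editor would need an extra rule for that case.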

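SubtitleRenderer (subtitle renderer)

  • High-performance subtitle rendering engine
  • Supports multithreaded rendering and cache optimization
  • Reports detailed performance statistics
  • Supports multiple rendering modes and effects

SubtitleManager (subtitle manager)

  • Manages all subtitle functionality in one place
  • Saves and loads projects
  • Supports an event system and auto-save
  • Exposes a complete API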
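
The auto-save support mentioned above pairs a background thread with the shutdown() method, which sets a stop flag and joins the thread with a one-second timeout. A minimal sketch of that pattern follows, assuming a hypothetical AutoSaver class; it uses threading.Event instead of a plain boolean flag so the worker wakes up immediately on shutdown rather than sleeping through the rest of its interval.

```python
import threading
from typing import Callable


class AutoSaver:
    """Hypothetical helper: calls `save` every `interval` seconds until stopped."""

    def __init__(self, save: Callable[[], None], interval: float) -> None:
        self._save = save
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def _run(self) -> None:
        # wait() returns False on timeout (time to save) and True once stop() is called.
        while not self._stop.wait(self._interval):
            try:
                self._save()
            except Exception as exc:
                print(f"auto-save failed: {exc}")

    def stop(self, timeout: float = 1.0) -> bool:
        """Signal the worker and wait for it; return True if it exited in time."""
        self._stop.set()
        self._thread.join(timeout)
        return not self._thread.is_alive()


saved = threading.Event()                 # set by the fake save callback
saver = AutoSaver(saved.set, interval=0.01)
saver.start()
fired = saved.wait(timeout=2.0)           # True once the first auto-save has run
stopped = saver.stop()
```

In a manager like the one above, the save callback would be something like a call to save_project with the current project path.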

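11.5.2 Technical Characteristics

High-performance design

  • Multi-level caching (render cache, font cache)
  • Multithreaded rendering
  • Cache eviction that keeps memory use bounded
  • Index-based data lookup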
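
To make the caching idea concrete, the sketch below shows a small least-recently-used (LRU) cache of the kind a render cache could use. It is an assumption for illustration only: the chapter does not state which eviction policy the renderer applies, and LRUCache, get_or_render and the string keys are hypothetical names.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable


class LRUCache:
    """Hypothetical bounded cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_render(self, key: Hashable, render: Callable[[], Any]) -> Any:
        """Return the cached value for `key`, calling `render` only on a miss."""
        if key in self._items:
            self._items.move_to_end(key)      # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        value = render()
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # drop the oldest entry
        return value

    def keys(self) -> list:
        return list(self._items)


cache = LRUCache(capacity=2)
for key in ["a", "b", "a", "c", "b"]:
    cache.get_or_render(key, lambda k=key: f"rendered:{k}")
```

A realistic cache key would combine everything that affects the rendered pixels, for example the segment id, its content and its style attributes, so that editing a segment naturally produces a miss.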

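Modular architecture

  • Clear separation of components
  • Plugin-based extension mechanism
  • Uniform interface design
  • Loosely coupled dependencies

Rich functionality

  • Multilingual and complex-script text support
  • Rich visual effects (stroke, shadow, gradient)
  • Multiple animation effects
  • Flexible style system

Reliability

  • Thorough error handling
  • Data validation and bounds checking
  • Auto-save and recovery
  • Undo and redo support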
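
The max_undo_steps setting suggests a bounded undo history. One common way to build that is sketched below, assuming a hypothetical UndoHistory class that stores a pair of callables per edit; the chapter does not show the manager's actual undo implementation, so treat this as one possible shape rather than the real one.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Action = Tuple[Callable[[], None], Callable[[], None]]  # (undo, redo)


class UndoHistory:
    """Hypothetical bounded undo/redo history."""

    def __init__(self, max_steps: int = 50) -> None:
        # A deque with maxlen silently drops the oldest entry when full.
        self._undo: Deque[Action] = deque(maxlen=max_steps)
        self._redo: List[Action] = []

    def record(self, undo: Callable[[], None], redo: Callable[[], None]) -> None:
        self._undo.append((undo, redo))
        self._redo.clear()  # a new edit invalidates the redo branch

    def undo(self) -> bool:
        if not self._undo:
            return False
        action = self._undo.pop()
        action[0]()
        self._redo.append(action)
        return True

    def redo(self) -> bool:
        if not self._redo:
            return False
        action = self._redo.pop()
        action[1]()
        self._undo.append(action)
        return True


state = {"content": "a"}
history = UndoHistory(max_steps=2)


def set_content(new: str) -> None:
    """Apply an edit and record how to reverse and replay it."""
    old = state["content"]
    state["content"] = new
    history.record(lambda: state.update(content=old),
                   lambda: state.update(content=new))


for text in ["b", "c", "d"]:
    set_content(text)

results = [history.undo(), history.undo(), history.undo()]
after_undo = state["content"]
history.redo()
after_redo = state["content"]
```

With max_steps=2 the first edit falls out of the history, so the third undo reports False and the content stops at "b" instead of returning to "a".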

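11.5.3 Application Scenarios

Video production

  • Subtitles for films and TV series
  • Short-video content creation
  • Educational video production
  • Corporate promotional videos

Live streaming and real-time applications

  • Real-time subtitle generation
  • Bullet comments (danmaku)
  • Interactive on-screen text for live streams
  • Real-time translated subtitles

Post-production

  • Batch subtitle processing
  • Subtitle format conversion
  • Unifying subtitle styles
  • Multilingual subtitle management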
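
As an example of subtitle format conversion, the sketch below exports timed text to the SubRip (.srt) format, which numbers each cue and writes timestamps as HH:MM:SS,mmm. The (start_time, end_time, content) tuples are a simplification assumed here for illustration; format_timestamp and to_srt are hypothetical helpers, not functions from the project.

```python
from typing import Iterable, Tuple

Cue = Tuple[float, float, str]  # (start_time, end_time, content), times in seconds


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, e.g. 3661.5 -> 01:01:01,500."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: Iterable[Cue]) -> str:
    """Render cues as SRT text, ordered by start time and numbered from 1."""
    blocks = []
    ordered = sorted(cues, key=lambda cue: cue[0])
    for index, (start, end, content) in enumerate(ordered, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{content}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


srt_text = to_srt([(5.0, 7.25, "Second"), (0.0, 5.0, "First")])
```

SRT carries only timing and text, so style attributes such as font, stroke and animation are lost in this direction; a styled format would be needed to preserve them.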

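This text and subtitle system gives 剪映小助手 its text-processing capabilities, covering everything from adding a simple subtitle to building complex animated effects. Its modular design and API keep the feature set complete while leaving the system easy to extend and maintain.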