27 - Project Integration and LangChain4J Best Practices Summary

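Time: 45 minutes | Difficulty: ⭐⭐⭐⭐ | Week 4 Day 27

📋 Learning Objectives

Master production-grade LangChain4J project architecture design
Understand security best practices and API key management
Learn a complete testing strategy (unit, integration, and E2E tests)
Master Docker and Kubernetes deployment
Become familiar with 50 production best practices
Understand the migration path from demo to production
Build a Production Readiness Checklist

🏗️ Production-Grade Project Architecture

Complete Project Structure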

langchain4j-production/
├── src/
│   ├── main/
│   │   ├── java/
│   │   │   └── com/
│   │   │       └── example/
│   │   │           └── ai/
│   │   │               ├── config/              # Configuration layer
│   │   │               │   ├── LangChainConfig.java
│   │   │               │   ├── VectorStoreConfig.java
│   │   │               │   ├── SecurityConfig.java
│   │   │               │   └── MonitoringConfig.java
│   │   │               ├── domain/              # Domain layer
│   │   │               │   ├── model/
│   │   │               │   │   ├── ChatMessage.java
│   │   │               │   │   ├── Document.java
│   │   │               │   │   └── SearchResult.java
│   │   │               │   ├── service/
│   │   │               │   │   ├── ChatService.java
│   │   │               │   │   ├── EmbeddingService.java
│   │   │               │   │   └── DocumentService.java
│   │   │               │   └── repository/
│   │   │               │       ├── VectorStoreRepository.java
│   │   │               │       └── ConversationRepository.java
│   │   │               ├── infrastructure/      # Infrastructure layer
│   │   │               │   ├── ai/
│   │   │               │   │   ├── LLMClient.java
│   │   │               │   │   ├── EmbeddingClient.java
│   │   │               │   │   └── VectorStoreClient.java
│   │   │               │   ├── cache/
│   │   │               │   │   ├── CacheManager.java
│   │   │               │   │   └── EmbeddingCache.java
│   │   │               │   ├── security/
│   │   │               │   │   ├── InputValidator.java
│   │   │               │   │   ├── OutputSanitizer.java
│   │   │               │   │   └── RateLimiter.java
│   │   │               │   └── monitoring/
│   │   │               │       ├── MetricsCollector.java
│   │   │               │       └── HealthIndicator.java
│   │   │               ├── api/                 # API layer
│   │   │               │   ├── controller/
│   │   │               │   │   ├── ChatController.java
│   │   │               │   │   ├── DocumentController.java
│   │   │               │   │   └── HealthController.java
│   │   │               │   ├── dto/
│   │   │               │   │   ├── ChatRequest.java
│   │   │               │   │   ├── ChatResponse.java
│   │   │               │   │   └── ErrorResponse.java
│   │   │               │   └── exception/
│   │   │               │       ├── GlobalExceptionHandler.java
│   │   │               │       └── RateLimitException.java
│   │   │               └── Application.java
│   │   └── resources/
│   │       ├── application.yml
│   │       ├── application-dev.yml
│   │       ├── application-prod.yml
│   │       ├── logback-spring.xml
│   │       └── db/
│   │           └── migration/
│   │               └── V1__init_schema.sql
│   └── test/
│       ├── java/
│       │   └── com/
│       │       └── example/
│       │           └── ai/
│       │               ├── integration/         # Integration tests
│       │               │   ├── ChatServiceIntegrationTest.java
│       │               │   └── VectorStoreIntegrationTest.java
│       │               ├── e2e/                 # E2E tests
│       │               │   └── ChatFlowE2ETest.java
│       │               └── unit/                # Unit tests
│       │                   ├── ChatServiceTest.java
│       │                   ├── RateLimiterTest.java
│       │                   └── InputValidatorTest.java
│       └── resources/
│           ├── application-test.yml
│           └── test-data/
├── docker/
│   ├── Dockerfile
│   ├── docker-compose.yml
│   └── init-scripts/
├── k8s/
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── configmap.yaml
│   ├── secret.yaml
│   └── ingress.yaml
├── scripts/
│   ├── deploy.sh
│   ├── rollback.sh
│   └── health-check.sh
├── docs/
│   ├── API.md
│   ├── DEPLOYMENT.md
│   └── TROUBLESHOOTING.md
├── build.gradle.kts
├── settings.gradle.kts
├── .env.example
├── .gitignore
└── README.md

Dependency Management (build.gradle.kts)

plugins {
    id("org.springframework.boot") version "3.2.0"
    id("io.spring.dependency-management") version "1.1.4"
    kotlin("jvm") version "1.9.20"
    kotlin("plugin.spring") version "1.9.20"
}

group = "com.example.ai"
version = "1.0.0"
java.sourceCompatibility = JavaVersion.VERSION_21

repositories {
    mavenCentral()
}

dependencies {
    // Spring Boot core
    implementation("org.springframework.boot:spring-boot-starter-web")
    implementation("org.springframework.boot:spring-boot-starter-actuator")
    implementation("org.springframework.boot:spring-boot-starter-validation")
    implementation("org.springframework.boot:spring-boot-starter-cache")
    implementation("org.springframework.boot:spring-boot-starter-data-jpa")

    // LangChain4J core
    implementation("dev.langchain4j:langchain4j:0.36.2")
    implementation("dev.langchain4j:langchain4j-spring-boot-starter:0.36.2")
    implementation("dev.langchain4j:langchain4j-open-ai:0.36.2")
    implementation("dev.langchain4j:langchain4j-embeddings-all-minilm-l6-v2:0.36.2")

    // Vector stores
    implementation("dev.langchain4j:langchain4j-pgvector:0.36.2")
    implementation("dev.langchain4j:langchain4j-qdrant:0.36.2")

    // Caching
    implementation("com.github.ben-manes.caffeine:caffeine:3.1.8")

    // Monitoring and metrics
    implementation("io.micrometer:micrometer-registry-prometheus")
    implementation("io.micrometer:micrometer-tracing-bridge-brave")

    // Security
    implementation("org.springframework.boot:spring-boot-starter-security")
    implementation("io.jsonwebtoken:jjwt-api:0.12.3")
    runtimeOnly("io.jsonwebtoken:jjwt-impl:0.12.3")
    runtimeOnly("io.jsonwebtoken:jjwt-jackson:0.12.3")

    // Rate limiting
    implementation("com.bucket4j:bucket4j-core:8.7.0")
    implementation("com.bucket4j:bucket4j-redis:8.7.0")

    // Database
    implementation("org.postgresql:postgresql")
    implementation("org.flywaydb:flyway-core")

    // Utilities (Lombok is only needed at compile time)
    compileOnly("org.projectlombok:lombok")
    annotationProcessor("org.projectlombok:lombok")

    // Testing
    testImplementation("org.springframework.boot:spring-boot-starter-test")
    testImplementation("org.springframework.security:spring-security-test")
    testImplementation("org.testcontainers:testcontainers:1.19.3")
    testImplementation("org.testcontainers:postgresql:1.19.3")
    testImplementation("org.testcontainers:junit-jupiter:1.19.3")
    testImplementation("org.mockito:mockito-core")
    testImplementation("org.mockito:mockito-junit-jupiter")
    testImplementation("com.github.tomakehurst:wiremock-jre8:2.35.0")
}

tasks.withType<Test> {
    useJUnitPlatform()
}

Core Configuration Class

@Configuration
@EnableConfigurationProperties(LangChainProperties.class)
public class LangChainConfig {

    private final LangChainProperties properties;

    public LangChainConfig(LangChainProperties properties) {
        this.properties = properties;
    }

    @Bean
    public ChatLanguageModel chatLanguageModel() {
        return OpenAiChatModel.builder()
            .apiKey(properties.getApiKey())
            .modelName(properties.getModelName())
            .temperature(properties.getTemperature())
            .timeout(Duration.ofSeconds(properties.getTimeout()))
            .maxRetries(properties.getMaxRetries())
            .logRequests(properties.isLogRequests())
            .logResponses(properties.isLogResponses())
            .build();
    }

    @Bean
    public EmbeddingModel embeddingModel() {
        return OpenAiEmbeddingModel.builder()
            .apiKey(properties.getApiKey())
            .modelName(properties.getEmbeddingModelName())
            .timeout(Duration.ofSeconds(properties.getTimeout()))
            .build();
    }

    @Bean
    public EmbeddingStore<TextSegment> embeddingStore(DataSource dataSource) {
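        // The DataSource-based builder is datasourceBuilder(); the plain builder()
        // takes host/port/user/password/database instead.
        return PgVectorEmbeddingStore.datasourceBuilder()
            .datasource(dataSource)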
            .table(properties.getVectorStore().getTable())
            .dimension(properties.getVectorStore().getDimension())
            .build();
    }

    @Bean
    public ContentRetriever contentRetriever(
            EmbeddingStore<TextSegment> embeddingStore,
            EmbeddingModel embeddingModel) {
        return EmbeddingStoreContentRetriever.builder()
            .embeddingStore(embeddingStore)
            .embeddingModel(embeddingModel)
            .maxResults(properties.getRetrieval().getMaxResults())
            .minScore(properties.getRetrieval().getMinScore())
            .build();
    }
}
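The configuration class above reads every setting from LangChainProperties, which this section does not show. The following is a minimal sketch of what such a holder could look like, written as plain Java so it runs on its own. The class name, the fields, and the validation rules are assumptions inferred from the getters used above; in the Spring Boot project the real class would presumably be bound with @ConfigurationProperties(prefix = "langchain4j").

```java
import java.time.Duration;

/**
 * Hypothetical settings holder (not part of LangChain4J). Validating in the
 * constructor makes a bad configuration fail at startup, not on the first LLM call.
 */
final class LangChainPropertiesSketch {

    private final String apiKey;
    private final String modelName;
    private final double temperature;
    private final long timeoutSeconds;
    private final int maxRetries;

    LangChainPropertiesSketch(String apiKey, String modelName, double temperature,
                              long timeoutSeconds, int maxRetries) {
        if (apiKey == null || apiKey.isBlank()) {
            throw new IllegalArgumentException("apiKey must be set");
        }
        if (modelName == null || modelName.isBlank()) {
            throw new IllegalArgumentException("modelName must be set");
        }
        // Assumed range; adjust it to what your provider accepts
        if (temperature < 0.0 || temperature > 2.0) {
            throw new IllegalArgumentException("temperature must be between 0.0 and 2.0");
        }
        if (timeoutSeconds <= 0) {
            throw new IllegalArgumentException("timeoutSeconds must be positive");
        }
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must not be negative");
        }
        this.apiKey = apiKey;
        this.modelName = modelName;
        this.temperature = temperature;
        this.timeoutSeconds = timeoutSeconds;
        this.maxRetries = maxRetries;
    }

    String getApiKey() { return apiKey; }
    String getModelName() { return modelName; }
    double getTemperature() { return temperature; }
    int getMaxRetries() { return maxRetries; }
    Duration timeout() { return Duration.ofSeconds(timeoutSeconds); }

    /** Masks the key so the object can be logged safely. */
    @Override
    public String toString() {
        return "LangChainProperties{modelName=" + modelName
            + ", temperature=" + temperature
            + ", timeoutSeconds=" + timeoutSeconds
            + ", maxRetries=" + maxRetries
            + ", apiKey=****}";
    }

    public static void main(String[] args) {
        LangChainPropertiesSketch props =
            new LangChainPropertiesSketch("sk-test", "gpt-4", 0.7, 60, 3);
        System.out.println(props);
    }
}
```

Keeping the key out of toString() matters because configuration objects tend to end up in startup logs.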

🔒 Security Best Practices

1. API Key Management

Environment variables (development)

# application.yml
langchain4j:
  api-key: ${OPENAI_API_KEY}
  model-name: ${OPENAI_MODEL_NAME:gpt-4}

spring:
  datasource:
    url: ${DATABASE_URL}
    username: ${DATABASE_USERNAME}
    password: ${DATABASE_PASSWORD}

# .env file (do not commit to Git)
OPENAI_API_KEY=sk-proj-xxxxxxxxxxxxx
OPENAI_MODEL_NAME=gpt-4
DATABASE_URL=jdbc:postgresql://localhost:5432/langchain4j
DATABASE_USERNAME=postgres
DATABASE_PASSWORD=secret

Vault integration (production)

@Configuration
public class VaultConfig {

    @Bean
    public VaultTemplate vaultTemplate() {
        VaultEndpoint endpoint = VaultEndpoint.create("vault.example.com", 8200);

        VaultToken token = VaultToken.of(System.getenv("VAULT_TOKEN"));

        return new VaultTemplate(
            endpoint,
            new TokenAuthentication(token)
        );
    }

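    @Bean
    @SuppressWarnings("unchecked")
    public String openAiApiKey(VaultTemplate vaultTemplate) {
        VaultResponse response = vaultTemplate
            .read("secret/data/langchain4j/openai");

        if (response == null) {
            throw new IllegalStateException("Vault secret not found: langchain4j/openai");
        }

        // With the KV v2 engine, the payload read from "secret/data/..." nests
        // the actual key/value pairs under a "data" entry.
        Map<String, Object> secret =
            (Map<String, Object>) response.getRequiredData().get("data");

        return (String) secret.get("api-key");
    }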
}

2. Input Validation

@Component
public class InputValidator {

    private static final int MAX_INPUT_LENGTH = 4000;
    private static final Pattern INJECTION_PATTERN =
        Pattern.compile("\\b(DROP|DELETE|UPDATE|INSERT|EXEC|SCRIPT)\\b",
                       Pattern.CASE_INSENSITIVE);

    public void validateChatInput(String input) {
        // Reject null or blank input
        if (input == null || input.trim().isEmpty()) {
            throw new ValidationException("Input must not be empty");
        }

        // Enforce the length limit
        if (input.length() > MAX_INPUT_LENGTH) {
            throw new ValidationException(
                "Input exceeds the maximum length: " + MAX_INPUT_LENGTH
            );
        }

        // Check for injection keywords (whole words only, so that ordinary
        // words such as "description" or "updated" are not rejected)
        if (INJECTION_PATTERN.matcher(input).find()) {
            throw new SecurityException("Potential injection attack detected");
        }

        // Check for disallowed characters
        if (containsMaliciousCharacters(input)) {
            throw new SecurityException("Input contains disallowed special characters");
        }
    }

    private boolean containsMaliciousCharacters(String input) {
        // Reject control characters other than newline, carriage return, and tab
        return input.codePoints().anyMatch(cp ->
            Character.isISOControl(cp) && cp != '\n' && cp != '\r' && cp != '\t'
        );
    }

    public void validateDocumentUpload(MultipartFile file) {
        // Check the file size
        if (file.getSize() > 10 * 1024 * 1024) { // 10MB
            throw new ValidationException("File size exceeds the 10MB limit");
        }

        // Check the file type
        String contentType = file.getContentType();
        List<String> allowedTypes = Arrays.asList(
            "application/pdf",
            "text/plain",
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
        );

        if (!allowedTypes.contains(contentType)) {
            throw new ValidationException("Unsupported file type: " + contentType);
        }

        // Check the file name
        String filename = file.getOriginalFilename();
        if (filename == null || filename.contains("..")) {
            throw new SecurityException("Illegal file name");
        }
    }
}

3. Output Filtering

@Component
public class OutputSanitizer {

    private static final Pattern PII_PATTERN = Pattern.compile(
        // Email addresses
        "\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}\\b|" +
        // Mobile numbers (mainland China)
        "\\b1[3-9]\\d{9}\\b|" +
        // National ID card numbers (18 characters)
        "\\b\\d{17}[\\dXx]\\b"
    );

    private static final Pattern API_KEY_PATTERN = Pattern.compile(
        "(?i)(api[_-]?key|secret|token|password)\\s*[:=]\\s*['\"]?([\\w\\-]+)['\"]?"
    );

    public String sanitizeOutput(String output) {
        if (output == null) {
            return null;
        }

        String sanitized = output;

        // Redact PII
        sanitized = removePII(sanitized);

        // Redact API keys
        sanitized = removeApiKeys(sanitized);

        // HTML-escape the result (XSS protection)
        sanitized = HtmlUtils.htmlEscape(sanitized);

        return sanitized;
    }

    private String removePII(String text) {
        Matcher matcher = PII_PATTERN.matcher(text);
        StringBuffer result = new StringBuffer();

        while (matcher.find()) {
            matcher.appendReplacement(result, "[hidden]");
        }
        matcher.appendTail(result);

        return result.toString();
    }

    private String removeApiKeys(String text) {
        Matcher matcher = API_KEY_PATTERN.matcher(text);
        StringBuffer result = new StringBuffer();

        while (matcher.find()) {
            String key = matcher.group(1);
            matcher.appendReplacement(result, key + ": [REDACTED]");
        }
        matcher.appendTail(result);

        return result.toString();
    }
}

4. Per-User Rate Limiting

@Component
public class UserRateLimiter {

    private final Map<String, Bucket> buckets = new ConcurrentHashMap<>();
    private final RateLimitConfig config;

    public UserRateLimiter(RateLimitConfig config) {
        this.config = config;
    }

    public boolean allowRequest(String userId) {
        Bucket bucket = buckets.computeIfAbsent(userId, this::createBucket);
        return bucket.tryConsume(1);
    }

    private Bucket createBucket(String userId) {
        // Each user tier gets its own rate-limit policy
        UserTier tier = getUserTier(userId);

        Bandwidth limit = switch (tier) {
            case FREE -> Bandwidth.builder()
                .capacity(10)  // 10 requests
                .refillGreedy(10, Duration.ofMinutes(1))  // per minute
                .build();
            case PRO -> Bandwidth.builder()
                .capacity(100)
                .refillGreedy(100, Duration.ofMinutes(1))
                .build();
            case ENTERPRISE -> Bandwidth.builder()
                .capacity(1000)
                .refillGreedy(1000, Duration.ofMinutes(1))
                .build();
        };

        return Bucket.builder()
            .addLimit(limit)
            .build();
    }

    private UserTier getUserTier(String userId) {
        // Look up the user's tier from the database or a cache
        return config.getUserTier(userId);
    }

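    // Periodically evict buckets
    @Scheduled(fixedRate = 3600000) // every hour
    public void cleanup() {
        // Simplified example: no last-access time is tracked here, so this drops
        // every bucket, which also resets each user's limit. In production,
        // record the last access per user and evict only idle entries.
        buckets.clear();
    }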
}

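// Requires spring-boot-starter-aop on the classpath
@Aspect
@Component
public class RateLimitInterceptor {

    private final UserRateLimiter rateLimiter;

    public RateLimitInterceptor(UserRateLimiter rateLimiter) {
        this.rateLimiter = rateLimiter;
    }

    // Use the annotation's fully qualified name if it lives in another package
    @Around("@annotation(RateLimited)")
    public Object checkRateLimit(ProceedingJoinPoint joinPoint) throws Throwable {
        String userId = SecurityContextHolder.getContext()
            .getAuthentication()
            .getName();

        if (!rateLimiter.allowRequest(userId)) {
            throw new RateLimitExceededException(
                "Rate limit exceeded, please try again later"
            );
        }

        return joinPoint.proceed();
    }
}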

@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface RateLimited {
}
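The limiter above delegates the actual accounting to Bucket4j. To make the behaviour of a greedy refill concrete (10 requests per minute means one token comes back every 6 seconds, not 10 at once at the end of the minute), here is a minimal token bucket sketch in plain Java. It is an illustration of the idea only, not Bucket4j's implementation; the class name and the explicit clock parameter are assumptions made so the example is deterministic.

```java
import java.time.Duration;

/** Minimal token bucket with greedy refill; the caller passes the current time in nanoseconds. */
final class TokenBucketSketch {

    private final long capacity;
    private final long nanosPerToken;
    private long tokens;
    private long lastRefillNanos;

    TokenBucketSketch(long capacity, Duration period, long nowNanos) {
        if (capacity <= 0 || period.toNanos() < capacity) {
            throw new IllegalArgumentException("need capacity > 0 and at least 1 ns per token");
        }
        this.capacity = capacity;
        this.nanosPerToken = period.toNanos() / capacity;
        this.tokens = capacity;            // start with a full bucket
        this.lastRefillNanos = nowNanos;
    }

    /** Takes one token if available. */
    synchronized boolean tryConsume(long nowNanos) {
        refill(nowNanos);
        if (tokens == 0) {
            return false;
        }
        tokens--;
        return true;
    }

    private void refill(long nowNanos) {
        long elapsed = nowNanos - lastRefillNanos;
        if (elapsed <= 0) {
            return;
        }
        long earned = elapsed / nanosPerToken;
        if (earned == 0) {
            return;
        }
        // Never exceed capacity; advance the clock only by the time actually converted
        tokens += Math.min(earned, capacity - tokens);
        lastRefillNanos += earned * nanosPerToken;
    }

    public static void main(String[] args) {
        TokenBucketSketch bucket =
            new TokenBucketSketch(10, Duration.ofMinutes(1), System.nanoTime());
        for (int i = 1; i <= 12; i++) {
            System.out.println("request " + i + " allowed: " + bucket.tryConsume(System.nanoTime()));
        }
    }
}
```

With these settings a burst of 10 requests passes immediately, and further requests are admitted roughly one every 6 seconds.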

🧪 Testing Strategy

1. Unit Tests (Mock LLM)

@ExtendWith(MockitoExtension.class)
class ChatServiceTest {

    @Mock
    private ChatLanguageModel chatModel;

    @Mock
    private InputValidator inputValidator;

    @Mock
    private OutputSanitizer outputSanitizer;

    @InjectMocks
    private ChatService chatService;

    @Test
    @DisplayName("Should process a valid chat request")
    void shouldProcessValidChatRequest() {
        // Given
        String userMessage = "What is LangChain4J?";
        String expectedResponse = "LangChain4J is an AI framework for Java...";

        // ChatLanguageModel.generate(String) returns the reply as a plain String
        when(chatModel.generate(userMessage))
            .thenReturn(expectedResponse);
        when(outputSanitizer.sanitizeOutput(expectedResponse))
            .thenReturn(expectedResponse);

        // When
        String result = chatService.chat(userMessage);

        // Then
        assertEquals(expectedResponse, result);
        verify(inputValidator).validateChatInput(userMessage);
        verify(chatModel).generate(userMessage);
        verify(outputSanitizer).sanitizeOutput(expectedResponse);
    }

    @Test
    @DisplayName("Should reject overly long input")
    void shouldRejectTooLongInput() {
        // Given
        String longMessage = "x".repeat(5000);
        doThrow(new ValidationException("Input too long"))
            .when(inputValidator).validateChatInput(longMessage);

        // When & Then
        assertThrows(ValidationException.class, () -> {
            chatService.chat(longMessage);
        });

        // anyString() avoids the overload ambiguity that a bare any() would cause
        verify(chatModel, never()).generate(anyString());
    }

    @Test
    @DisplayName("Should handle LLM timeouts")
    void shouldHandleLLMTimeout() {
        // Given
        String message = "Test message";
        when(chatModel.generate(message))
            .thenThrow(new RuntimeException("Timeout"));

        // When & Then
        assertThrows(ChatServiceException.class, () -> {
            chatService.chat(message);
        });
    }

    @Test
    @DisplayName("Should filter sensitive information from the output")
    void shouldFilterSensitiveInfoInOutput() {
        // Given
        String message = "My contact details";
        String rawResponse = "Your email is test@example.com and your phone number is 13812345678";
        String sanitizedResponse = "Your email is [hidden] and your phone number is [hidden]";

        when(chatModel.generate(message))
            .thenReturn(rawResponse);
        when(outputSanitizer.sanitizeOutput(rawResponse))
            .thenReturn(sanitizedResponse);

        // When
        String result = chatService.chat(message);

        // Then
        assertEquals(sanitizedResponse, result);
        assertFalse(result.contains("test@example.com"));
        assertFalse(result.contains("13812345678"));
    }
}

2. Integration Tests (Real LLM)

@SpringBootTest
@Testcontainers
@ActiveProfiles("test")
class ChatServiceIntegrationTest {

    @Container
    static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:16")
        .withDatabaseName("testdb")
        .withUsername("test")
        .withPassword("test");

    @DynamicPropertySource
    static void configureProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.url", postgres::getJdbcUrl);
        registry.add("spring.datasource.username", postgres::getUsername);
        registry.add("spring.datasource.password", postgres::getPassword);
    }

    @Autowired
    private ChatService chatService;

    @Autowired
    private ConversationRepository conversationRepository;

    // Used by the RAG test below
    @Autowired
    private DocumentService documentService;

    @Test
    @DisplayName("The end-to-end chat flow should work")
    void endToEndChatFlowShouldWork() {
        // Given
        String conversationId = UUID.randomUUID().toString();
        String message = "What is a vector database?";

        // When
        ChatResponse response = chatService.chat(conversationId, message);

        // Then
        assertNotNull(response);
        assertNotNull(response.getMessage());
        assertTrue(response.getMessage().length() > 0);

        // Verify that the conversation was saved
        Optional<Conversation> saved = conversationRepository.findById(conversationId);
        assertTrue(saved.isPresent());
        assertEquals(2, saved.get().getMessages().size()); // user message + AI reply
    }

    @Test
    @DisplayName("Should maintain conversation context")
    void shouldMaintainConversationContext() {
        // Given
        String conversationId = UUID.randomUUID().toString();

        // When - first turn
        chatService.chat(conversationId, "My name is Zhang San");

        // When - second turn
        ChatResponse response = chatService.chat(conversationId, "What is my name?");

        // Then
        assertTrue(
            response.getMessage().contains("Zhang San"),
            "The AI should remember the name from earlier in the conversation"
        );
    }

    @Test
    @DisplayName("The RAG flow should retrieve relevant documents")
    void ragFlowShouldRetrieveRelevantDocuments() {
        // Given - add some documents first
        documentService.addDocument(
            "LangChain4J is a Java framework for building AI applications"
        );
        documentService.addDocument(
            "Vector databases store and retrieve embedding vectors"
        );

        // When
        ChatResponse response = chatService.chatWithRAG("What is LangChain4J?");

        // Then
        assertNotNull(response);
        assertNotNull(response.getSources());
        assertFalse(response.getSources().isEmpty());
        assertTrue(
            response.getMessage().contains("Java framework"),
            "The answer should be grounded in the retrieved documents"
        );
    }
}

3. E2E Tests (Testcontainers)

@SpringBootTest(webEnvironment = WebEnvironment.RANDOM_PORT)
@Testcontainers
@AutoConfigureMockMvc
class ChatFlowE2ETest {

    @Container
    static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:16")
        .withDatabaseName("testdb");

    @Container
    static GenericContainer<?> qdrant = new GenericContainer<>("qdrant/qdrant:v1.7.4")
        .withExposedPorts(6333)
        .waitingFor(Wait.forHttp("/healthz").forStatusCode(200));

    @Autowired
    private MockMvc mockMvc;

    @Autowired
    private ObjectMapper objectMapper;

    @DynamicPropertySource
    static void configureProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.url", postgres::getJdbcUrl);
        registry.add("spring.datasource.username", postgres::getUsername);
        registry.add("spring.datasource.password", postgres::getPassword);

        registry.add("langchain4j.qdrant.host", qdrant::getHost);
        registry.add("langchain4j.qdrant.port", qdrant::getFirstMappedPort);
    }

    @Test
    @DisplayName("Complete RAG flow E2E test")
    void completeRAGFlowE2E() throws Exception {
        // Step 1: upload a document
        MockMultipartFile file = new MockMultipartFile(
            "file",
            "test.txt",
            "text/plain",
            "LangChain4J is a Java AI framework".getBytes()
        );

        mockMvc.perform(multipart("/api/documents")
                .file(file))
            .andExpect(status().isOk())
            .andExpect(jsonPath("$.documentId").exists());

        // Step 2: wait for the document to be indexed
        Thread.sleep(2000);

        // Step 3: send the query
        ChatRequest request = new ChatRequest();
        request.setMessage("What is LangChain4J?");
        request.setUseRAG(true);

        MvcResult result = mockMvc.perform(post("/api/chat")
                .contentType(MediaType.APPLICATION_JSON)
                .content(objectMapper.writeValueAsString(request)))
            .andExpect(status().isOk())
            .andExpect(jsonPath("$.message").exists())
            .andExpect(jsonPath("$.sources").isArray())
            .andReturn();

        // Step 4: verify the response
        ChatResponse response = objectMapper.readValue(
            result.getResponse().getContentAsString(),
            ChatResponse.class
        );

        assertTrue(response.getSources().size() > 0);
        assertTrue(response.getMessage().contains("Java"));
    }

    @Test
    @DisplayName("Rate limiting E2E test")
    void rateLimitingE2E() throws Exception {
        ChatRequest request = new ChatRequest();
        request.setMessage("Test message");

        // Send requests until the rate limit kicks in
        for (int i = 0; i < 15; i++) {
            ResultActions result = mockMvc.perform(post("/api/chat")
                .contentType(MediaType.APPLICATION_JSON)
                .content(objectMapper.writeValueAsString(request)));

            if (i < 10) {
                result.andExpect(status().isOk());
            } else {
                result.andExpect(status().isTooManyRequests());
            }
        }
    }

    @Test
    @DisplayName("Health check E2E test")
    void healthCheckE2E() throws Exception {
        mockMvc.perform(get("/actuator/health"))
            .andExpect(status().isOk())
            .andExpect(jsonPath("$.status").value("UP"))
            .andExpect(jsonPath("$.components.db.status").value("UP"))
            .andExpect(jsonPath("$.components.vectorStore.status").value("UP"));
    }
}
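The RAG test above waits for indexing with a fixed Thread.sleep(2000), which is either too long on a fast machine or too short on a slow one. A common alternative is to poll for the condition with a timeout. The helper below is a minimal sketch of that idea in plain Java; the class and method names are hypothetical, and a library such as Awaitility offers the same pattern with a richer API.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

final class AwaitSketch {

    /** Polls the condition until it holds or the timeout elapses; returns whether it held. */
    static boolean awaitTrue(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long readyAt = System.nanoTime() + Duration.ofMillis(200).toNanos();
        // Stand-in for "the uploaded document is searchable"
        boolean indexed = awaitTrue(() -> System.nanoTime() - readyAt >= 0,
                Duration.ofSeconds(5), Duration.ofMillis(50));
        System.out.println("indexed: " + indexed);
    }
}
```

In the E2E test, the condition would be a check that the uploaded document can actually be retrieved, for example a search call that returns at least one source.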

🐳 Deployment and Operations

1. Dockerfile

# Multi-stage build
FROM gradle:8.5-jdk21 AS builder

WORKDIR /app

# Copy the build files
COPY build.gradle.kts settings.gradle.kts ./
COPY src ./src

# Build only the executable Spring Boot jar (bootJar does not run tests),
# so that exactly one jar ends up in build/libs
RUN gradle clean bootJar --no-daemon

# Runtime image
FROM eclipse-temurin:21-jre-alpine

# Add a non-root user
RUN addgroup -S spring && adduser -S spring -G spring

WORKDIR /app

# Copy the build artifact
COPY --from=builder /app/build/libs/*.jar app.jar

# Change ownership
RUN chown -R spring:spring /app

# Switch to the non-root user
USER spring:spring

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=60s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:8080/actuator/health || exit 1

# Expose the port
EXPOSE 8080

# JVM tuning
ENV JAVA_OPTS="-XX:+UseContainerSupport \
               -XX:MaxRAMPercentage=75.0 \
               -XX:+UseG1GC \
               -XX:+ExitOnOutOfMemoryError \
               -Djava.security.egd=file:/dev/./urandom"

# Start the application; exec replaces the shell so the JVM receives SIGTERM directly
ENTRYPOINT ["sh", "-c", "exec java $JAVA_OPTS -jar app.jar"]

2. docker-compose.yml

version: '3.8'

services:
  app:
    build: .
    ports:
      - "8080:8080"
    environment:
      - SPRING_PROFILES_ACTIVE=prod
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - DATABASE_URL=jdbc:postgresql://postgres:5432/langchain4j
      - DATABASE_USERNAME=langchain4j
      - DATABASE_PASSWORD=${DB_PASSWORD}
      - QDRANT_HOST=qdrant
      - QDRANT_PORT=6333
    depends_on:
      postgres:
        condition: service_healthy
      qdrant:
        condition: service_started
    restart: unless-stopped
    networks:
      - langchain4j-network
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
        reservations:
          cpus: '1'
          memory: 1G

  postgres:
    image: pgvector/pgvector:pg16
    environment:
      - POSTGRES_DB=langchain4j
      - POSTGRES_USER=langchain4j
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    ports:
      - "5432:5432"
    volumes:
      - postgres-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U langchain4j"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - langchain4j-network

  qdrant:
    image: qdrant/qdrant:v1.7.4
    ports:
      - "6333:6333"
      - "6334:6334"
    volumes:
      - qdrant-data:/qdrant/storage
    networks:
      - langchain4j-network

  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    networks:
      - langchain4j-network

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    volumes:
      - grafana-data:/var/lib/grafana
    networks:
      - langchain4j-network

volumes:
  postgres-data:
  qdrant-data:
  prometheus-data:
  grafana-data:

networks:
  langchain4j-network:
    driver: bridge

3. Kubernetes Deployment

deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: langchain4j-app
  namespace: production
  labels:
    app: langchain4j
    version: v1.0.0
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: langchain4j
  template:
    metadata:
      labels:
        app: langchain4j
        version: v1.0.0
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/actuator/prometheus"
    spec:
      serviceAccountName: langchain4j-sa

      # Init container: wait for the database to be ready
      initContainers:
      - name: wait-for-postgres
        image: busybox:1.36
        command:
        - sh
        - -c
        - |
          until nc -z postgres-service 5432; do
            echo "Waiting for PostgreSQL..."
            sleep 2
          done

      containers:
      - name: langchain4j
        image: your-registry.com/langchain4j:v1.0.0
        imagePullPolicy: Always

        ports:
        - name: http
          containerPort: 8080
          protocol: TCP

        env:
        - name: SPRING_PROFILES_ACTIVE
          value: "prod"
        - name: OPENAI_API_KEY
          valueFrom:
            secretKeyRef:
              name: langchain4j-secrets
              key: openai-api-key
        - name: DATABASE_URL
          valueFrom:
            configMapKeyRef:
              name: langchain4j-config
              key: database-url
        - name: DATABASE_USERNAME
          valueFrom:
            secretKeyRef:
              name: langchain4j-secrets
              key: db-username
        - name: DATABASE_PASSWORD
          valueFrom:
            secretKeyRef:
              name: langchain4j-secrets
              key: db-password

        # Resource limits
        resources:
          requests:
            cpu: "500m"
            memory: "1Gi"
          limits:
            cpu: "2000m"
            memory: "2Gi"

        # Liveness probe
        livenessProbe:
          httpGet:
            path: /actuator/health/liveness
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 10
          timeoutSeconds: 3
          failureThreshold: 3

        # Readiness probe
        readinessProbe:
          httpGet:
            path: /actuator/health/readiness
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3

        # Startup probe
        startupProbe:
          httpGet:
            path: /actuator/health
            port: 8080
          initialDelaySeconds: 0
          periodSeconds: 10
          timeoutSeconds: 3
          failureThreshold: 30

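        # Graceful shutdown
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 15"]

      # Grace period: must cover the 15s preStop sleep plus the application's own shutdown phase
      terminationGracePeriodSeconds: 45

      # Pod anti-affinity: spread replicas across nodes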
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - langchain4j
              topologyKey: kubernetes.io/hostname

service.yaml

apiVersion: v1
kind: Service
metadata:
  name: langchain4j-service
  namespace: production
  labels:
    app: langchain4j
spec:
  type: ClusterIP
  selector:
    app: langchain4j
  ports:
  - name: http
    port: 80
    targetPort: 8080
    protocol: TCP
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800

configmap.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: langchain4j-config
  namespace: production
data:
  database-url: "jdbc:postgresql://postgres-service:5432/langchain4j"
  qdrant-host: "qdrant-service"
  qdrant-port: "6333"

  application.yml: |
    server:
      port: 8080
      shutdown: graceful

    spring:
      application:
        name: langchain4j-app

      lifecycle:
        timeout-per-shutdown-phase: 20s

    management:
      endpoints:
        web:
          exposure:
            include: health,prometheus,info,metrics
      health:
        livenessState:
          enabled: true
        readinessState:
          enabled: true
      prometheus:
        metrics:
          export:
            enabled: true

    langchain4j:
      model-name: gpt-4
      temperature: 0.7
      max-tokens: 2000
      timeout: 60
      max-retries: 3

secret.yaml

apiVersion: v1
kind: Secret
metadata:
  name: langchain4j-secrets
  namespace: production
type: Opaque
stringData:
  openai-api-key: "sk-proj-xxxxxxxxxxxxx"
  db-username: "langchain4j"
  db-password: "your-secure-password"

ingress.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: langchain4j-ingress
  namespace: production
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    # Requests per second allowed from each client IP
    nginx.ingress.kubernetes.io/limit-rps: "100"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - api.example.com
    secretName: langchain4j-tls
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: langchain4j-service
            port:
              number: 80

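📋 50 Best Practices Checklist

Code Quality (10 items)

Use the builder pattern: construct LangChain4J objects through their .builder() APIs for readability
Interface segregation: define clear interfaces for LLM interactions so they are easy to mock and test
Layered exception handling: handle business, technical, and external-service exceptions separately
Logging conventions: use structured logging that includes the request ID, user ID, and operation type
Externalized configuration: inject every setting through configuration files or environment variables
Versioning: version the API (/v1/chat) and keep it backward compatible
Code reuse: extract common prompt templates and utility functions into shared modules
Type safety: prefer strong types over Map<String, Object>
Immutable objects: make domain models immutable so they are thread-safe
Documentation comments: document complex prompt engineering and RAG configuration in detail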
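
The type-safety and immutability items can be illustrated with a small value type. The following is a minimal sketch, assuming Java 16+ for records; `ChatQuery` and its fields are hypothetical names chosen for illustration and are not part of LangChain4J or of this project.

```java
import java.util.List;

/** Hypothetical immutable request type; the name and fields are illustrative. */
record ChatQuery(String userId, String message, List<String> contextDocs) {

    // Compact constructor: validate once, then freeze the list.
    ChatQuery {
        if (userId == null || userId.isBlank()) {
            throw new IllegalArgumentException("userId must not be blank");
        }
        if (message == null || message.isBlank()) {
            throw new IllegalArgumentException("message must not be blank");
        }
        // List.copyOf returns an unmodifiable copy, so callers cannot mutate our state.
        contextDocs = List.copyOf(contextDocs);
    }
}
```

Compared with a Map<String, Object>, a misspelled field name fails at compile time, and an instance can be shared across threads without defensive locking.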

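Performance Optimization (10 items)

Embedding cache: cache the embeddings of identical text in Redis
Batch processing: use batch APIs for document processing to reduce network round trips
Asynchronous processing: run long operations asynchronously and return a task ID
Connection pool configuration: size database and HTTP client connection pools sensibly
Paginated queries: paginate vector search results to avoid loading too much data at once
Lazy loading: load conversation history on demand rather than all at once
Compressed transfer: enable GZIP compression to reduce network traffic
Index tuning: choose the vector database index type deliberately (HNSW vs. IVF)
Hot data warm-up: preload frequently used embeddings at startup
Timeouts: set a reasonable timeout on every external call (for example, LLM 60s, DB 5s)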
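
The embedding-cache item can be sketched without any infrastructure. The example below is an in-memory stand-in for the Redis cache described above, not a Redis client: `CachingEmbedder` is a hypothetical name, and the `Function<String, float[]>` delegate is an assumption standing in for whatever embedding client the project uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical LRU cache in front of an embedding function. */
final class CachingEmbedder {
    private final Function<String, float[]> delegate;
    private final Map<String, float[]> cache;
    private int misses = 0;

    CachingEmbedder(Function<String, float[]> delegate, int maxEntries) {
        this.delegate = delegate;
        // accessOrder = true turns LinkedHashMap into an LRU structure.
        this.cache = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, float[]> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached vector for identical text, calling the delegate only on a miss. */
    synchronized float[] embed(String text) {
        float[] cached = cache.get(text);
        if (cached != null) {
            return cached.clone(); // copy so callers cannot corrupt the cache
        }
        misses++;
        float[] vector = delegate.apply(text);
        cache.put(text, vector.clone());
        return vector;
    }

    synchronized int misses() {
        return misses;
    }
}
```

In a multi-instance deployment the same idea would sit behind a shared store such as Redis, with the text (or a hash of it) plus the embedding model name as the key.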

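Cost Control (10 items)

Token counting: estimate the token count before each request to avoid exceeding limits
Model selection: choose the model by task complexity (for example, GPT-3.5 for simple tasks)
Prompt optimization: trim the system prompt to remove unnecessary tokens
Result caching: cache answers to similar questions for 30 minutes
Tiered rate limits: assign different quotas to different user tiers
Batch embedding: use the batch embedding API, which can lower cost
Vector dimensions: choose the embedding dimension to fit the need (768 vs. 1536)
Monitoring and alerting: alert when API spend exceeds a threshold
Degradation strategy: fall back to a cheaper model when spend exceeds the budget
Regular audits: audit API usage weekly and optimize high-cost queries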
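
Token counting, model selection, and the degradation strategy can be combined in one small routing decision. This is a rough sketch under stated assumptions: `ModelRouter` is a hypothetical class, prompt length is used as a crude proxy for task complexity, and the "about 4 characters per token" rule is only a rule of thumb for English text (a real tokenizer should be used where accuracy matters, and Chinese text tokenizes differently).

```java
/** Hypothetical router that picks a model name from prompt size and spend so far. */
final class ModelRouter {
    private final String cheapModel;
    private final String strongModel;
    private final int simpleTaskTokenLimit;
    private final double dailyBudgetUsd;

    ModelRouter(String cheapModel, String strongModel,
                int simpleTaskTokenLimit, double dailyBudgetUsd) {
        this.cheapModel = cheapModel;
        this.strongModel = strongModel;
        this.simpleTaskTokenLimit = simpleTaskTokenLimit;
        this.dailyBudgetUsd = dailyBudgetUsd;
    }

    /** Rough estimate only: about 4 characters per token, rounded up. */
    static int estimateTokens(String text) {
        return (text.length() + 3) / 4;
    }

    /** Degrades to the cheap model once the budget is spent; otherwise routes by size. */
    String choose(String prompt, double spentTodayUsd) {
        if (spentTodayUsd >= dailyBudgetUsd) {
            return cheapModel;
        }
        return estimateTokens(prompt) <= simpleTaskTokenLimit ? cheapModel : strongModel;
    }
}
```

For example, `new ModelRouter("gpt-3.5-turbo", "gpt-4", 50, 10.0).choose(prompt, spentToday)` returns the model name to pass to the model builder; the thresholds are placeholders to tune against real usage data.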

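Reliability (10 items)

Retries: retry failed API calls automatically with exponential backoff
Circuit breaker: trip the breaker on consecutive failures to protect downstream services
Fallback: return a predefined answer when the LLM is unavailable
Idempotency: make every write operation idempotent to prevent duplicate submissions
Transaction management: keep the vector store and the metadata store consistent
Health checks: check the health of the LLM, database, and vector store
Graceful shutdown: on SIGTERM, finish in-flight requests before shutting down
Data backup: back up vector data and conversation history regularly
Failure recovery: resume unfinished tasks automatically after a restart
Multi-zone deployment: deploy critical services across multiple availability zones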
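
The retry item is easy to get subtly wrong, so here is a minimal sketch of retry with exponential backoff in plain Java. `Retry.withBackoff` is a hypothetical helper, not a LangChain4J or Resilience4j API; a production version would usually also add jitter and retry only on errors known to be transient.

```java
import java.util.function.Supplier;

/** Hypothetical retry helper: the delay doubles after every failed attempt. */
final class Retry {
    private Retry() {}

    static <T> T withBackoff(Supplier<T> call, int maxAttempts, long initialDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delayMs = initialDelayMs;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // out of attempts: rethrow below
                }
                try {
                    Thread.sleep(delayMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    throw new IllegalStateException("Interrupted during backoff", ie);
                }
                delayMs *= 2; // exponential backoff
            }
        }
        throw last;
    }
}
```

With `maxAttempts = 3` and `initialDelayMs = 500`, a call that keeps failing waits 500 ms and then 1000 ms before the last exception is rethrown.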

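Security (10 items)

API key protection: use a secrets manager such as Vault in production
Input validation: validate user input strictly to prevent injection attacks
Output filtering: filter sensitive information (PII, API keys)
Access control: role-based access control (RBAC)
Audit logging: record all sensitive operations (create, delete, modify)
Enforce HTTPS: require HTTPS in production
CORS configuration: restrict the allowed cross-origin request origins
SQL injection protection: use parameterized queries; never concatenate SQL
XSS protection: HTML-escape output to prevent cross-site scripting
Regular updates: update dependencies promptly to fix security vulnerabilities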
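
Output filtering can start as a handful of redaction rules. The sketch below is illustrative only: `SimpleOutputSanitizer` is a hypothetical class, and the two regular expressions (an `sk-` style key and a basic email address) are assumptions that cover far less than a real PII filter would need.

```java
import java.util.regex.Pattern;

/** Hypothetical sanitizer that redacts API-key-like tokens and email addresses. */
final class SimpleOutputSanitizer {
    // Assumed key shape: "sk-" followed by at least 8 key characters.
    private static final Pattern API_KEY = Pattern.compile("sk-[A-Za-z0-9_-]{8,}");
    // Deliberately simple email pattern; not a full RFC 5322 matcher.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    private SimpleOutputSanitizer() {}

    static String sanitizeOutput(String text) {
        String withoutKeys = API_KEY.matcher(text).replaceAll("[REDACTED_KEY]");
        return EMAIL.matcher(withoutKeys).replaceAll("[REDACTED_EMAIL]");
    }
}
```

Running the rules on model output before it is logged or returned keeps an accidental leak from spreading to clients and log storage.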

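🎯 Project Checklist

Production Readiness Checklist

Security

API keys are not hardcoded in the code
Production uses Vault or a similar service to manage secrets
HTTPS and TLS 1.2+ are enabled
Input validation and output filtering are implemented
A CORS allowlist is configured
Rate limiting is enabled
Audit logs record all sensitive operations

Reliability

Health check endpoints work correctly
Retries and circuit breaking are configured
Graceful shutdown is implemented
The database connection pool is sized sensibly
Reasonable timeouts are set
Resource limits (CPU, memory) are configured

Monitoring and Observability

Testing

Documentation

Cost Optimization

Result caching is configured
The model is chosen to fit the task
Cost alerts are set
API usage is audited regularly

Compliance

GDPR/privacy compliance
Data retention policy
User data export/deletion
Terms of service and privacy policy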

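🚀 From Demo to Production

Demo-Stage Characteristics

// Demo code example
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.openai.OpenAiChatModel;

public class DemoChat {
    public static void main(String[] args) {
        // ❌ Hardcoded API key
        String apiKey = "sk-proj-xxxxx";

        // ❌ Used directly, with no error handling
        ChatLanguageModel model = OpenAiChatModel.builder()
            .apiKey(apiKey)
            .modelName("gpt-4")
            .build();

        // ❌ No input validation
        String response = model.generate("user input");

        // ❌ Printed directly, with no output filtering
        System.out.println(response);
    }
}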

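Production Refactoring

// Production-grade code
// InputValidator, OutputSanitizer, MetricsCollector, ChatRequest, ChatResponse,
// ChatServiceException, and @RateLimited are assumed to be defined elsewhere in this project.
import dev.langchain4j.model.chat.ChatLanguageModel;
import io.github.resilience4j.circuitbreaker.CallNotPermittedException;
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import io.micrometer.core.instrument.Timer;
import java.time.Instant;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;

@Service
@Slf4j
public class ProductionChatService {

    private final ChatLanguageModel model;
    private final InputValidator validator;
    private final OutputSanitizer sanitizer;
    private final MetricsCollector metrics;
    private final CircuitBreaker circuitBreaker;

    // ✅ Dependency injection, externalized configuration
    public ProductionChatService(
            ChatLanguageModel model,
            InputValidator validator,
            OutputSanitizer sanitizer,
            MetricsCollector metrics,
            CircuitBreakerRegistry circuitBreakerRegistry) {
        this.model = model;
        this.validator = validator;
        this.sanitizer = sanitizer;
        this.metrics = metrics;
        this.circuitBreaker = circuitBreakerRegistry.circuitBreaker("chat-service");
    }

    // No @Transactional here: the method does no database work, and a
    // transaction should not stay open across a slow LLM call.
    @RateLimited
    public ChatResponse chat(ChatRequest request) {
        // ✅ Validate input
        validator.validateChatInput(request.getMessage());

        // ✅ Record metrics
        Timer.Sample sample = Timer.start(metrics.getRegistry());

        try {
            // ✅ Protect the call with a circuit breaker; failures propagate
            //    to the catch blocks below, where they are logged and wrapped once
            String response = circuitBreaker.executeSupplier(
                () -> model.generate(request.getMessage()));

            // ✅ Filter output
            String sanitized = sanitizer.sanitizeOutput(response);

            // ✅ Record success
            sample.stop(metrics.timer("chat.success"));

            return ChatResponse.builder()
                .message(sanitized)
                .timestamp(Instant.now())
                .build();

        } catch (CallNotPermittedException e) {
            // ✅ Circuit open: degrade gracefully
            log.warn("Circuit breaker is open; returning fallback response");
            sample.stop(metrics.timer("chat.circuit_open"));
            return getFallbackResponse();

        } catch (Exception e) {
            // ✅ Error handling
            log.error("Chat service error", e);
            sample.stop(metrics.timer("chat.error"));
            throw new ChatServiceException("An error occurred while processing the request", e);
        }
    }

    private ChatResponse getFallbackResponse() {
        return ChatResponse.builder()
            .message("Sorry, the AI service is busy right now. Please try again later.")
            .isFallback(true)
            .timestamp(Instant.now())
            .build();
    }
}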
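
To make the circuit breaker's role in the service above concrete, here is a dependency-free sketch of the underlying state machine. It is not Resilience4j and behaves more simply: `SimpleCircuitBreaker` is a hypothetical class that counts consecutive failures, returns a fallback value instead of throwing `CallNotPermittedException`, and takes an injectable clock (an assumption made so the timing can be tested).

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical count-based circuit breaker: closed -> open -> trial call -> closed. */
final class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clockMillis;
    private int consecutiveFailures = 0;
    private long openedAt = -1; // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clockMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clockMillis = clockMillis;
    }

    // Synchronized for simplicity; this serializes calls, which a real breaker avoids.
    synchronized <T> T call(Supplier<T> primary, Supplier<T> fallback) {
        boolean open = openedAt >= 0 && clockMillis.getAsLong() - openedAt < openMillis;
        if (open) {
            return fallback.get(); // short-circuit: do not touch the failing service
        }
        try {
            T result = primary.get(); // normal call, or a trial call after the cool-down
            consecutiveFailures = 0;
            openedAt = -1;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = clockMillis.getAsLong(); // open (or re-open) the circuit
            }
            return fallback.get();
        }
    }
}
```

The point of the open state is that a struggling LLM provider stops receiving traffic for a while, so callers get a fast fallback instead of waiting for a timeout on every request.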

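Migration Checklist

1. Configuration Management

Demo	Production
Hardcoded configuration	Environment variables / configuration files
Single configuration	Per-environment configuration (dev/test/prod)
Plaintext keys	Vault / encryption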
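
The first row of this table, moving from hardcoded values to environment variables, might look like the following minimal sketch. `AppSecrets` is a hypothetical helper, and `OPENAI_API_KEY` is an assumed variable name; the environment is passed in as a map so the lookup can be tested.

```java
import java.util.Map;

/** Hypothetical helper that fails fast when a required setting is missing. */
final class AppSecrets {
    private AppSecrets() {}

    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            // Report only the variable name; never log or echo secret values.
            throw new IllegalStateException("Missing required environment variable: " + name);
        }
        return value;
    }
}
```

At startup the call would be `AppSecrets.require(System.getenv(), "OPENAI_API_KEY")`, so a misconfigured deployment stops immediately rather than failing on the first user request.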

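2. Error Handling

Demo	Production
No exception handling	Systematic exception handling
Printed stack traces	Structured logging
Application crashes	Graceful degradation

3. Reliability

Demo	Production
Single call	Retries + circuit breaker
Synchronous blocking	Asynchronous + timeouts
No monitoring	Full monitoring and alerting

4. Performance

Demo	Production
No caching	Multi-level caching
Serial processing	Batching + asynchronous processing
No rate limiting	Rate limiting + degradation

5. Security

Demo	Production
No validation	Input validation + output filtering
No authentication	JWT/OAuth2
HTTP	HTTPS + certificates

6. Testing

Demo	Production
Manual testing	Automated test suite
No tests	80%+ coverage
Local verification	CI/CD pipeline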

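💡 Hands-On Exercises

Exercise 1: Build a Production-Grade RAG System

Requirements:

Support uploading PDF, Word, and TXT documents
Store vectors in PGVector
Implement caching and rate limiting
Provide full test coverage

Hints:

Define the interfaces and tests first (TDD)
Implement document parsing and vectorization
Add a caching layer (Caffeine + Redis)
Implement rate limiting and monitoring
Write integration tests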
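
For the rate-limiting hint, a token bucket is one common starting point. The sketch below is a single-process illustration only: `TokenBucket` is a hypothetical class, the clock is injected (an assumption made for testability), and a multi-instance deployment would need shared state, for example in Redis.

```java
import java.util.function.LongSupplier;

/** Hypothetical token bucket: allows bursts up to capacity, then a steady refill rate. */
final class TokenBucket {
    private final long capacity;
    private final double refillPerSecond;
    private final LongSupplier clockMillis;
    private double tokens;
    private long lastRefillMillis;

    TokenBucket(long capacity, double refillPerSecond, LongSupplier clockMillis) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.clockMillis = clockMillis;
        this.tokens = capacity; // start full so an initial burst is allowed
        this.lastRefillMillis = clockMillis.getAsLong();
    }

    /** Returns true and consumes one token if the request is allowed. */
    synchronized boolean tryAcquire() {
        long now = clockMillis.getAsLong();
        double refill = (now - lastRefillMillis) * refillPerSecond / 1000.0;
        tokens = Math.min(capacity, tokens + refill);
        lastRefillMillis = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

Writing the limiter against an injected clock fits the TDD hint: the test can advance time explicitly instead of sleeping.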

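Exercise 2: Implement Conversation Context Management

Requirements:

Maintain each user's conversation history
Support session summarization (for long conversations)
Clean up expired sessions automatically
Persist to the database

Hints:

Design the Conversation and Message entities
Implement a sliding-window strategy
Integrate ChatMemory
Add a scheduled cleanup task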
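
The sliding-window hint can be prototyped in a few lines before wiring it into ChatMemory. This is a plain-Java sketch: `SlidingWindowMemory` is a hypothetical class that stores messages as strings, whereas a real implementation would store the Message entities from the first hint. LangChain4J also provides `MessageWindowChatMemory`, which is worth evaluating before writing a custom one.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical sliding window that keeps only the most recent messages. */
final class SlidingWindowMemory {
    private final int maxMessages;
    private final Deque<String> messages = new ArrayDeque<>();

    SlidingWindowMemory(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be at least 1");
        }
        this.maxMessages = maxMessages;
    }

    /** Appends a message and evicts the oldest ones beyond the window size. */
    synchronized void add(String message) {
        messages.addLast(message);
        while (messages.size() > maxMessages) {
            messages.removeFirst();
        }
    }

    /** Returns an unmodifiable copy, oldest message first. */
    synchronized List<String> snapshot() {
        return List.copyOf(messages);
    }
}
```

A summarizing variant would, instead of dropping the evicted messages, fold them into a running summary that is kept as the first entry of the window.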

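Exercise 3: Deploy to Kubernetes

Requirements:

Write a complete set of K8s manifests
Configure an HPA (horizontal scaling)
Implement rolling updates
Configure monitoring and alerting

Hints:

Start from the Dockerfile
Write the deployment and service
Configure the ConfigMap and Secret
Use Helm to manage configuration
Configure Prometheus monitoring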

Last updated: 2026-03-09 | Word count: 5,500 | Estimated reading time: 45 minutes