27 - Project Integration and LangChain4J Best Practices Summary


Time: 45 minutes | Difficulty: ⭐⭐⭐⭐ | Week 4 Day 27


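📋 Learning Objectives

  • Design a production-grade LangChain4J project architecture
  • Understand security best practices and API key management
  • Build a complete testing strategy (unit, integration, and E2E tests)
  • Deploy with Docker and Kubernetes
  • Become familiar with 50 production best practices
  • Understand the migration path from demo to production
  • Establish a Production Readiness Checklist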

🏗️ Production-Grade Project Architecture

Complete Project Structure

langchain4j-production/
├── src/
│   ├── main/
│   │   ├── java/
│   │   │   └── com/
│   │   │       └── example/
│   │   │           └── ai/
│   │   │               ├── config/              # Configuration layer
│   │   │               │   ├── LangChainConfig.java
│   │   │               │   ├── VectorStoreConfig.java
│   │   │               │   ├── SecurityConfig.java
│   │   │               │   └── MonitoringConfig.java
│   │   │               ├── domain/              # Domain layer
│   │   │               │   ├── model/
│   │   │               │   │   ├── ChatMessage.java
│   │   │               │   │   ├── Document.java
│   │   │               │   │   └── SearchResult.java
│   │   │               │   ├── service/
│   │   │               │   │   ├── ChatService.java
│   │   │               │   │   ├── EmbeddingService.java
│   │   │               │   │   └── DocumentService.java
│   │   │               │   └── repository/
│   │   │               │       ├── VectorStoreRepository.java
│   │   │               │       └── ConversationRepository.java
│   │   │               ├── infrastructure/      # Infrastructure layer
│   │   │               │   ├── ai/
│   │   │               │   │   ├── LLMClient.java
│   │   │               │   │   ├── EmbeddingClient.java
│   │   │               │   │   └── VectorStoreClient.java
│   │   │               │   ├── cache/
│   │   │               │   │   ├── CacheManager.java
│   │   │               │   │   └── EmbeddingCache.java
│   │   │               │   ├── security/
│   │   │               │   │   ├── InputValidator.java
│   │   │               │   │   ├── OutputSanitizer.java
│   │   │               │   │   └── RateLimiter.java
│   │   │               │   └── monitoring/
│   │   │               │       ├── MetricsCollector.java
│   │   │               │       └── HealthIndicator.java
│   │   │               ├── api/                 # API layer
│   │   │               │   ├── controller/
│   │   │               │   │   ├── ChatController.java
│   │   │               │   │   ├── DocumentController.java
│   │   │               │   │   └── HealthController.java
│   │   │               │   ├── dto/
│   │   │               │   │   ├── ChatRequest.java
│   │   │               │   │   ├── ChatResponse.java
│   │   │               │   │   └── ErrorResponse.java
│   │   │               │   └── exception/
│   │   │               │       ├── GlobalExceptionHandler.java
│   │   │               │       └── RateLimitException.java
│   │   │               └── Application.java
│   │   └── resources/
│   │       ├── application.yml
│   │       ├── application-dev.yml
│   │       ├── application-prod.yml
│   │       ├── logback-spring.xml
│   │       └── db/
│   │           └── migration/
│   │               └── V1__init_schema.sql
│   └── test/
│       ├── java/
│       │   └── com/
│       │       └── example/
│       │           └── ai/
│       │               ├── integration/         # Integration tests
│       │               │   ├── ChatServiceIntegrationTest.java
│       │               │   └── VectorStoreIntegrationTest.java
│       │               ├── e2e/                 # E2E tests
│       │               │   └── ChatFlowE2ETest.java
│       │               └── unit/                # Unit tests
│       │                   ├── ChatServiceTest.java
│       │                   ├── RateLimiterTest.java
│       │                   └── InputValidatorTest.java
│       └── resources/
│           ├── application-test.yml
│           └── test-data/
├── docker/
│   ├── Dockerfile
│   ├── docker-compose.yml
│   └── init-scripts/
├── k8s/
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── configmap.yaml
│   ├── secret.yaml
│   └── ingress.yaml
├── scripts/
│   ├── deploy.sh
│   ├── rollback.sh
│   └── health-check.sh
├── docs/
│   ├── API.md
│   ├── DEPLOYMENT.md
│   └── TROUBLESHOOTING.md
├── build.gradle.kts
├── settings.gradle.kts
├── .env.example
├── .gitignore
└── README.md

Dependency Management (build.gradle.kts)

plugins {
    id("org.springframework.boot") version "3.2.0"
    id("io.spring.dependency-management") version "1.1.4"
    kotlin("jvm") version "1.9.20"
    kotlin("plugin.spring") version "1.9.20"
}

group = "com.example.ai"
version = "1.0.0"
java.sourceCompatibility = JavaVersion.VERSION_21

repositories {
    mavenCentral()
}

dependencies {
    // Spring Boot core
    implementation("org.springframework.boot:spring-boot-starter-web")
    implementation("org.springframework.boot:spring-boot-starter-actuator")
    implementation("org.springframework.boot:spring-boot-starter-validation")
    implementation("org.springframework.boot:spring-boot-starter-cache")
    implementation("org.springframework.boot:spring-boot-starter-data-jpa")

    // LangChain4J core
    implementation("dev.langchain4j:langchain4j:0.36.2")
    implementation("dev.langchain4j:langchain4j-spring-boot-starter:0.36.2")
    implementation("dev.langchain4j:langchain4j-open-ai:0.36.2")
    implementation("dev.langchain4j:langchain4j-embeddings-all-minilm-l6-v2:0.36.2")

    // Vector stores
    implementation("dev.langchain4j:langchain4j-pgvector:0.36.2")
    implementation("dev.langchain4j:langchain4j-qdrant:0.36.2")

    // Caching
    implementation("com.github.ben-manes.caffeine:caffeine:3.1.8")

    // Monitoring and metrics
    implementation("io.micrometer:micrometer-registry-prometheus")
    implementation("io.micrometer:micrometer-tracing-bridge-brave")

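    // Security (spring-vault-core provides the VaultTemplate used in the Vault example)
    implementation("org.springframework.boot:spring-boot-starter-security")
    implementation("org.springframework.vault:spring-vault-core:3.1.0")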
    implementation("io.jsonwebtoken:jjwt-api:0.12.3")
    runtimeOnly("io.jsonwebtoken:jjwt-impl:0.12.3")
    runtimeOnly("io.jsonwebtoken:jjwt-jackson:0.12.3")

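    // Rate limiting (the AOP starter is needed for the @Around aspect)
    implementation("org.springframework.boot:spring-boot-starter-aop")
    implementation("com.bucket4j:bucket4j-core:8.7.0")
    implementation("com.bucket4j:bucket4j-redis:8.7.0")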

    // Database
    implementation("org.postgresql:postgresql")
    implementation("org.flywaydb:flyway-core")

    // Utilities (Lombok is only needed at compile time)
    compileOnly("org.projectlombok:lombok")
    annotationProcessor("org.projectlombok:lombok")

    // Testing
    testImplementation("org.springframework.boot:spring-boot-starter-test")
    testImplementation("org.springframework.security:spring-security-test")
    testImplementation("org.testcontainers:testcontainers:1.19.3")
    testImplementation("org.testcontainers:postgresql:1.19.3")
    testImplementation("org.testcontainers:junit-jupiter:1.19.3")
    testImplementation("org.mockito:mockito-core")
    testImplementation("org.mockito:mockito-junit-jupiter")
    testImplementation("com.github.tomakehurst:wiremock-jre8:2.35.0")
}

tasks.withType<Test> {
    useJUnitPlatform()
}

Core Configuration Class

@Configuration
@EnableConfigurationProperties(LangChainProperties.class)
public class LangChainConfig {

    private final LangChainProperties properties;

    public LangChainConfig(LangChainProperties properties) {
        this.properties = properties;
    }

    @Bean
    public ChatLanguageModel chatLanguageModel() {
        return OpenAiChatModel.builder()
            .apiKey(properties.getApiKey())
            .modelName(properties.getModelName())
            .temperature(properties.getTemperature())
            .timeout(Duration.ofSeconds(properties.getTimeout()))
            .maxRetries(properties.getMaxRetries())
            .logRequests(properties.isLogRequests())
            .logResponses(properties.isLogResponses())
            .build();
    }

    @Bean
    public EmbeddingModel embeddingModel() {
        return OpenAiEmbeddingModel.builder()
            .apiKey(properties.getApiKey())
            .modelName(properties.getEmbeddingModelName())
            .timeout(Duration.ofSeconds(properties.getTimeout()))
            .build();
    }

    @Bean
    public EmbeddingStore<TextSegment> embeddingStore(DataSource dataSource) {
        // datasourceBuilder() is the builder variant that accepts an existing DataSource
        return PgVectorEmbeddingStore.datasourceBuilder()
            .datasource(dataSource)
            .table(properties.getVectorStore().getTable())
            .dimension(properties.getVectorStore().getDimension())
            .build();
    }

    @Bean
    public ContentRetriever contentRetriever(
            EmbeddingStore<TextSegment> embeddingStore,
            EmbeddingModel embeddingModel) {
        return EmbeddingStoreContentRetriever.builder()
            .embeddingStore(embeddingStore)
            .embeddingModel(embeddingModel)
            .maxResults(properties.getRetrieval().getMaxResults())
            .minScore(properties.getRetrieval().getMinScore())
            .build();
    }
}
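
LangChainConfig depends on a LangChainProperties class that is not shown. The sketch below is one way it might look, inferred only from the getters called above. The defaults for model name, temperature, timeout, and retries mirror the ConfigMap shown in the Kubernetes section; the embedding model name, table name, dimension, maxResults, and minScore defaults are assumptions for illustration. In the Spring project the class would also carry @ConfigurationProperties(prefix = "langchain4j"), which is left out here so the snippet compiles without Spring.

```java
// Hypothetical properties holder matching the getters used by LangChainConfig.
// In the Spring project, annotate it with
// @ConfigurationProperties(prefix = "langchain4j").
class LangChainProperties {

    private String apiKey;
    private String modelName = "gpt-4";
    private String embeddingModelName = "text-embedding-3-small"; // assumed default
    private double temperature = 0.7;
    private long timeout = 60;            // seconds
    private int maxRetries = 3;
    private boolean logRequests = false;  // keep off in production to avoid logging prompts
    private boolean logResponses = false;
    private final VectorStore vectorStore = new VectorStore();
    private final Retrieval retrieval = new Retrieval();

    public String getApiKey() { return apiKey; }
    public void setApiKey(String apiKey) { this.apiKey = apiKey; }
    public String getModelName() { return modelName; }
    public void setModelName(String modelName) { this.modelName = modelName; }
    public String getEmbeddingModelName() { return embeddingModelName; }
    public void setEmbeddingModelName(String name) { this.embeddingModelName = name; }
    public double getTemperature() { return temperature; }
    public void setTemperature(double temperature) { this.temperature = temperature; }
    public long getTimeout() { return timeout; }
    public void setTimeout(long timeout) { this.timeout = timeout; }
    public int getMaxRetries() { return maxRetries; }
    public void setMaxRetries(int maxRetries) { this.maxRetries = maxRetries; }
    public boolean isLogRequests() { return logRequests; }
    public void setLogRequests(boolean logRequests) { this.logRequests = logRequests; }
    public boolean isLogResponses() { return logResponses; }
    public void setLogResponses(boolean logResponses) { this.logResponses = logResponses; }
    public VectorStore getVectorStore() { return vectorStore; }
    public Retrieval getRetrieval() { return retrieval; }

    // Binds to langchain4j.vector-store.*
    public static class VectorStore {
        private String table = "embeddings"; // assumed table name
        private int dimension = 1536;        // assumed; must match the embedding model

        public String getTable() { return table; }
        public void setTable(String table) { this.table = table; }
        public int getDimension() { return dimension; }
        public void setDimension(int dimension) { this.dimension = dimension; }
    }

    // Binds to langchain4j.retrieval.*
    public static class Retrieval {
        private int maxResults = 5;    // assumed
        private double minScore = 0.7; // assumed

        public int getMaxResults() { return maxResults; }
        public void setMaxResults(int maxResults) { this.maxResults = maxResults; }
        public double getMinScore() { return minScore; }
        public void setMinScore(double minScore) { this.minScore = minScore; }
    }
}
```

Nesting the vector-store and retrieval settings keeps related keys together in application.yml and lets each group be validated on its own.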

🔒 Security Best Practices

1. API Key Management

Environment variables (development)

# application.yml
langchain4j:
  api-key: ${OPENAI_API_KEY}
  model-name: ${OPENAI_MODEL_NAME:gpt-4}

spring:
  datasource:
    url: ${DATABASE_URL}
    username: ${DATABASE_USERNAME}
    password: ${DATABASE_PASSWORD}

# .env file (do not commit to Git)
OPENAI_API_KEY=sk-proj-xxxxxxxxxxxxx
OPENAI_MODEL_NAME=gpt-4
DATABASE_URL=jdbc:postgresql://localhost:5432/langchain4j
DATABASE_USERNAME=postgres
DATABASE_PASSWORD=secret

Vault integration (production)

@Configuration
public class VaultConfig {

    @Bean
    public VaultTemplate vaultTemplate() {
        VaultEndpoint endpoint = VaultEndpoint.create("vault.example.com", 8200);

        VaultToken token = VaultToken.of(System.getenv("VAULT_TOKEN"));

        return new VaultTemplate(
            endpoint,
            new TokenAuthentication(token)
        );
    }

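    @Bean
    public String openAiApiKey(VaultTemplate vaultTemplate) {
        // KV v2 engine: the raw read() path contains "data/", and the secret
        // values are nested under a "data" key inside the response body.
        VaultResponse response = vaultTemplate
            .read("secret/data/langchain4j/openai");
        if (response == null) {
            throw new IllegalStateException("Vault secret not found: langchain4j/openai");
        }

        @SuppressWarnings("unchecked")
        Map<String, Object> secret =
            (Map<String, Object>) response.getRequiredData().get("data");
        return (String) secret.get("api-key");
    }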
}

2. Input Validation

@Component
public class InputValidator {

    private static final int MAX_INPUT_LENGTH = 4000;
    // Coarse keyword filter. The word boundaries stop it from firing on
    // substrings such as "updated", but it still rejects ordinary sentences
    // that use these words, so tune the list for your own use case.
    private static final Pattern INJECTION_PATTERN =
        Pattern.compile("\\b(DROP|DELETE|UPDATE|INSERT|EXEC|SCRIPT)\\b",
                       Pattern.CASE_INSENSITIVE);

    public void validateChatInput(String input) {
        // Reject null or blank input
        if (input == null || input.trim().isEmpty()) {
            throw new ValidationException("Input must not be empty");
        }

        // Enforce the length limit
        if (input.length() > MAX_INPUT_LENGTH) {
            throw new ValidationException(
                "Input exceeds the maximum length: " + MAX_INPUT_LENGTH
            );
        }

        // Check for injection attempts
        if (INJECTION_PATTERN.matcher(input).find()) {
            throw new SecurityException("Potential injection attack detected");
        }

        // Check for disallowed characters
        if (containsMaliciousCharacters(input)) {
            throw new SecurityException("Input contains disallowed special characters");
        }
    }

    private boolean containsMaliciousCharacters(String input) {
        // Flags control characters other than newline, carriage return, and tab.
        // Zero-width and other invisible format characters are not covered here.
        return input.codePoints().anyMatch(cp ->
            Character.isISOControl(cp) && cp != '\n' && cp != '\r' && cp != '\t'
        );
    }

    public void validateDocumentUpload(MultipartFile file) {
        // Check the file size
        if (file.getSize() > 10 * 1024 * 1024) { // 10 MB
            throw new ValidationException("File size exceeds the 10 MB limit");
        }

        // Check the declared content type (client-supplied, so treat it as a hint)
        String contentType = file.getContentType();
        List<String> allowedTypes = Arrays.asList(
            "application/pdf",
            "text/plain",
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
        );

        if (!allowedTypes.contains(contentType)) {
            throw new ValidationException("Unsupported file type: " + contentType);
        }

        // Check the file name for path traversal
        String filename = file.getOriginalFilename();
        if (filename == null || filename.contains("..")) {
            throw new SecurityException("Illegal file name");
        }
    }
}

3. Output Filtering

@Component
public class OutputSanitizer {

    // Lookarounds are used instead of \b: how \b treats non-ASCII letters
    // changed in JDK 19, so a number that directly follows Chinese text
    // could be missed on older runtimes.
    private static final Pattern PII_PATTERN = Pattern.compile(
        // Email addresses
        "(?<![A-Za-z0-9._%+-])[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}|" +
        // Mobile numbers (mainland China)
        "(?<!\\d)1[3-9]\\d{9}(?!\\d)|" +
        // Resident ID numbers (18 characters)
        "(?<!\\d)\\d{17}[\\dXx](?!\\d)"
    );

    private static final Pattern API_KEY_PATTERN = Pattern.compile(
        "(?i)(api[_-]?key|secret|token|password)\\s*[:=]\\s*['\"]?([\\w\\-]+)['\"]?"
    );

    public String sanitizeOutput(String output) {
        if (output == null) {
            return null;
        }

        String sanitized = output;

        // Mask PII
        sanitized = removePII(sanitized);

        // Mask API keys and other credentials
        sanitized = removeApiKeys(sanitized);

        // Escape HTML (XSS protection)
        sanitized = HtmlUtils.htmlEscape(sanitized);

        return sanitized;
    }

    private String removePII(String text) {
        return PII_PATTERN.matcher(text).replaceAll("[hidden]");
    }

    private String removeApiKeys(String text) {
        // $1 keeps the key name (e.g. "api_key") and drops the value
        return API_KEY_PATTERN.matcher(text).replaceAll("$1: [REDACTED]");
    }
}
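
As a standalone illustration of the masking step, the sketch below applies a PII pattern with `Matcher.replaceAll`, with no Spring dependency. It uses character and digit lookarounds for the boundaries, which is a choice made for this sketch; the class name PiiRedactionDemo and the `[hidden]` placeholder are illustrative only.

```java
import java.util.regex.Pattern;

// Standalone sketch of PII masking; names are hypothetical.
class PiiRedactionDemo {

    private static final Pattern PII = Pattern.compile(
        // Email addresses
        "(?<![A-Za-z0-9._%+-])[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}|"
        // Mobile numbers (mainland China): 11 digits, not part of a longer number
        + "(?<!\\d)1[3-9]\\d{9}(?!\\d)|"
        // Resident ID numbers: 17 digits plus a digit or X
        + "(?<!\\d)\\d{17}[\\dXx](?!\\d)");

    static String redact(String text) {
        return PII.matcher(text).replaceAll("[hidden]");
    }

    public static void main(String[] args) {
        // Both the email address and the phone number are masked
        System.out.println(redact("Email test@example.com, phone 13812345678"));
        // A 20-digit order number is left alone: no alternative matches a
        // digit run that is longer than the pattern allows
        System.out.println(redact("Order 20240101138123456789 shipped"));
    }
}
```

The digit lookarounds keep the phone and ID patterns from masking a slice of a longer number, which is a common source of false positives with order or tracking IDs.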

4. Per-User Rate Limiting

@Component
public class UserRateLimiter {

    private static final Duration IDLE_TIMEOUT = Duration.ofHours(1);

    private final Map<String, Bucket> buckets = new ConcurrentHashMap<>();
    private final Map<String, Instant> lastAccess = new ConcurrentHashMap<>();
    private final RateLimitConfig config;

    public UserRateLimiter(RateLimitConfig config) {
        this.config = config;
    }

    public boolean allowRequest(String userId) {
        lastAccess.put(userId, Instant.now());
        Bucket bucket = buckets.computeIfAbsent(userId, this::createBucket);
        return bucket.tryConsume(1);
    }

    private Bucket createBucket(String userId) {
        // Each user tier gets its own rate-limit policy
        UserTier tier = getUserTier(userId);

        Bandwidth limit = switch (tier) {
            case FREE -> Bandwidth.builder()
                .capacity(10)  // 10 requests
                .refillGreedy(10, Duration.ofMinutes(1))  // per minute
                .build();
            case PRO -> Bandwidth.builder()
                .capacity(100)
                .refillGreedy(100, Duration.ofMinutes(1))
                .build();
            case ENTERPRISE -> Bandwidth.builder()
                .capacity(1000)
                .refillGreedy(1000, Duration.ofMinutes(1))
                .build();
        };

        return Bucket.builder()
            .addLimit(limit)
            .build();
    }

    private UserTier getUserTier(String userId) {
        // Look up the user tier from the database or a cache
        return config.getUserTier(userId);
    }

    // Periodically evict buckets of users who have been idle for an hour.
    // An idle bucket has long since refilled, so dropping it loses no state.
    // Requires @EnableScheduling on a configuration class.
    @Scheduled(fixedRate = 3600000) // every hour
    public void cleanup() {
        Instant cutoff = Instant.now().minus(IDLE_TIMEOUT);
        lastAccess.entrySet().removeIf(entry -> {
            if (entry.getValue().isBefore(cutoff)) {
                buckets.remove(entry.getKey());
                return true;
            }
            return false;
        });
    }
}

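@Aspect
@Component
public class RateLimitInterceptor {

    private final UserRateLimiter rateLimiter;

    public RateLimitInterceptor(UserRateLimiter rateLimiter) {
        this.rateLimiter = rateLimiter;
    }

    // Binding the annotation as a method parameter avoids spelling out its
    // fully qualified name in the pointcut expression.
    @Around("@annotation(rateLimited)")
    public Object checkRateLimit(ProceedingJoinPoint joinPoint,
                                 RateLimited rateLimited) throws Throwable {
        String userId = SecurityContextHolder.getContext()
            .getAuthentication()
            .getName();

        if (!rateLimiter.allowRequest(userId)) {
            throw new RateLimitExceededException(
                "Rate limit exceeded, please try again later"
            );
        }

        return joinPoint.proceed();
    }
}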

@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface RateLimited {
}
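
To make the `capacity(10)` and `refillGreedy(10, Duration.ofMinutes(1))` settings concrete: a token bucket holds at most `capacity` tokens, every request consumes one, and tokens trickle back evenly over the refill period instead of all at once. The following is a simplified standard-library sketch of that idea, not Bucket4j's actual implementation; SimpleTokenBucket and its injectable clock are hypothetical names introduced for illustration.

```java
import java.util.function.LongSupplier;

// Simplified token bucket with evenly spread ("greedy") refill.
class SimpleTokenBucket {

    private final long capacity;
    private final long nanosPerToken;
    private final LongSupplier nanoClock; // injectable so tests can control time
    private long availableTokens;
    private long lastRefillNanos;

    SimpleTokenBucket(long capacity, long refillTokens,
                      long refillPeriodNanos, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.nanosPerToken = refillPeriodNanos / refillTokens;
        this.nanoClock = nanoClock;
        this.availableTokens = capacity;          // start full
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    synchronized boolean tryConsume() {
        refill();
        if (availableTokens == 0) {
            return false;
        }
        availableTokens--;
        return true;
    }

    private void refill() {
        long elapsed = nanoClock.getAsLong() - lastRefillNanos;
        long newTokens = elapsed / nanosPerToken;
        if (newTokens > 0) {
            availableTokens = Math.min(capacity, availableTokens + newTokens);
            // Advance only by the time that produced whole tokens, so the
            // remainder still counts toward the next token.
            lastRefillNanos += newTokens * nanosPerToken;
        }
    }
}
```

With 10 tokens per minute, a client that drains the bucket gets one more request every 6 seconds rather than waiting a full minute for the next burst.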

🧪 Testing Strategy

1. Unit Tests (Mock LLM)

@ExtendWith(MockitoExtension.class)
class ChatServiceTest {

    @Mock
    private ChatLanguageModel chatModel;

    @Mock
    private InputValidator inputValidator;

    @Mock
    private OutputSanitizer outputSanitizer;

    @InjectMocks
    private ChatService chatService;

    @Test
    @DisplayName("should process a valid chat request")
    void shouldProcessValidChatRequest() {
        // Given
        String userMessage = "What is LangChain4J?";
        String expectedResponse = "LangChain4J is an AI framework for Java...";

        // generate(String) returns the reply text directly
        when(chatModel.generate(userMessage))
            .thenReturn(expectedResponse);
        when(outputSanitizer.sanitizeOutput(expectedResponse))
            .thenReturn(expectedResponse);

        // When
        String result = chatService.chat(userMessage);

        // Then
        assertEquals(expectedResponse, result);
        verify(inputValidator).validateChatInput(userMessage);
        verify(chatModel).generate(userMessage);
        verify(outputSanitizer).sanitizeOutput(expectedResponse);
    }

    @Test
    @DisplayName("should reject overly long input")
    void shouldRejectTooLongInput() {
        // Given
        String longMessage = "x".repeat(5000);
        doThrow(new ValidationException("Input too long"))
            .when(inputValidator).validateChatInput(longMessage);

        // When & Then
        assertThrows(ValidationException.class, () -> {
            chatService.chat(longMessage);
        });

        // anyString() selects the generate(String) overload unambiguously
        verify(chatModel, never()).generate(anyString());
    }

    @Test
    @DisplayName("should handle LLM timeouts")
    void shouldHandleLLMTimeout() {
        // Given
        String message = "test message";
        when(chatModel.generate(message))
            .thenThrow(new RuntimeException("Timeout"));

        // When & Then
        assertThrows(ChatServiceException.class, () -> {
            chatService.chat(message);
        });
    }

    @Test
    @DisplayName("should filter sensitive information from the output")
    void shouldFilterSensitiveInfoInOutput() {
        // Given
        String message = "My contact details";
        String rawResponse = "Your email is test@example.com and your phone number is 13812345678";
        String sanitizedResponse = "Your email is [hidden] and your phone number is [hidden]";

        when(chatModel.generate(message))
            .thenReturn(rawResponse);
        when(outputSanitizer.sanitizeOutput(rawResponse))
            .thenReturn(sanitizedResponse);

        // When
        String result = chatService.chat(message);

        // Then
        assertEquals(sanitizedResponse, result);
        assertFalse(result.contains("test@example.com"));
        assertFalse(result.contains("13812345678"));
    }
}

2. Integration Tests (Real LLM)

@SpringBootTest
@Testcontainers
@ActiveProfiles("test")
class ChatServiceIntegrationTest {

    @Container
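    // pgvector image: the embedding store needs the vector extension
    static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>(
            DockerImageName.parse("pgvector/pgvector:pg16")
                .asCompatibleSubstituteFor("postgres"))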
        .withDatabaseName("testdb")
        .withUsername("test")
        .withPassword("test");

    @DynamicPropertySource
    static void configureProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.url", postgres::getJdbcUrl);
        registry.add("spring.datasource.username", postgres::getUsername);
        registry.add("spring.datasource.password", postgres::getPassword);
    }

    @Autowired
    private ChatService chatService;

    @Autowired
    private ConversationRepository conversationRepository;

    @Autowired
    private DocumentService documentService;

    @Test
    @DisplayName("end-to-end chat flow should work")
    void endToEndChatFlowShouldWork() {
        // Given
        String conversationId = UUID.randomUUID().toString();
        String message = "What is a vector database?";

        // When
        ChatResponse response = chatService.chat(conversationId, message);

        // Then
        assertNotNull(response);
        assertNotNull(response.getMessage());
        assertTrue(response.getMessage().length() > 0);

        // Verify the conversation was saved
        Optional<Conversation> saved = conversationRepository.findById(conversationId);
        assertTrue(saved.isPresent());
        assertEquals(2, saved.get().getMessages().size()); // user message + AI reply
    }

    @Test
    @DisplayName("should maintain conversation context")
    void shouldMaintainConversationContext() {
        // Given
        String conversationId = UUID.randomUUID().toString();

        // When: first turn
        chatService.chat(conversationId, "My name is Zhang San");

        // When: second turn
        ChatResponse response = chatService.chat(conversationId, "What is my name?");

        // Then
        assertTrue(
            response.getMessage().contains("Zhang San"),
            "The AI should remember the name from earlier in the conversation"
        );
    }

    @Test
    @DisplayName("RAG flow should retrieve relevant documents")
    void ragFlowShouldRetrieveRelevantDocuments() {
        // Given: add some documents first
        documentService.addDocument(
            "LangChain4J is a Java framework for building AI applications"
        );
        documentService.addDocument(
            "Vector databases store and retrieve embedding vectors"
        );

        // When
        ChatResponse response = chatService.chatWithRAG("What is LangChain4J?");

        // Then
        assertNotNull(response);
        assertNotNull(response.getSources());
        assertFalse(response.getSources().isEmpty());
        assertTrue(
            response.getMessage().contains("Java"),
            "The answer should be grounded in the retrieved documents"
        );
    }
}

3. E2E Tests (Testcontainers)

@SpringBootTest(webEnvironment = WebEnvironment.RANDOM_PORT)
@Testcontainers
@AutoConfigureMockMvc
class ChatFlowE2ETest {

    @Container
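    // pgvector image: the embedding store needs the vector extension
    static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>(
            DockerImageName.parse("pgvector/pgvector:pg16")
                .asCompatibleSubstituteFor("postgres"))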
        .withDatabaseName("testdb");

    @Container
    static GenericContainer<?> qdrant = new GenericContainer<>("qdrant/qdrant:v1.7.4")
        .withExposedPorts(6333, 6334) // 6333 = REST, 6334 = gRPC
        .waitingFor(Wait.forHttp("/readyz").forPort(6333).forStatusCode(200));

    @Autowired
    private MockMvc mockMvc;

    @Autowired
    private ObjectMapper objectMapper;

    @DynamicPropertySource
    static void configureProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.url", postgres::getJdbcUrl);
        registry.add("spring.datasource.username", postgres::getUsername);
        registry.add("spring.datasource.password", postgres::getPassword);

        registry.add("langchain4j.qdrant.host", qdrant::getHost);
        // The LangChain4J Qdrant store talks gRPC, so map 6334 rather than the REST port
        registry.add("langchain4j.qdrant.port", () -> qdrant.getMappedPort(6334));
    }

    @Test
    @DisplayName("complete RAG flow E2E test")
    void completeRAGFlowE2E() throws Exception {
        // Step 1: upload a document
        MockMultipartFile file = new MockMultipartFile(
            "file",
            "test.txt",
            "text/plain",
            "LangChain4J is a Java AI framework".getBytes(StandardCharsets.UTF_8)
        );

        mockMvc.perform(multipart("/api/documents")
                .file(file))
            .andExpect(status().isOk())
            .andExpect(jsonPath("$.documentId").exists());

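        // Step 2: wait for the document to be indexed
        // (a fixed sleep keeps the example short; polling with a timeout is less flaky)
        Thread.sleep(2000);

        // Step 3: send a query
        ChatRequest request = new ChatRequest();
        request.setMessage("What is LangChain4J?");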
        request.setUseRAG(true);

        MvcResult result = mockMvc.perform(post("/api/chat")
                .contentType(MediaType.APPLICATION_JSON)
                .content(objectMapper.writeValueAsString(request)))
            .andExpect(status().isOk())
            .andExpect(jsonPath("$.message").exists())
            .andExpect(jsonPath("$.sources").isArray())
            .andReturn();

        // Step 4: verify the response
        ChatResponse response = objectMapper.readValue(
            result.getResponse().getContentAsString(),
            ChatResponse.class
        );

        assertTrue(response.getSources().size() > 0);
        assertTrue(response.getMessage().contains("Java"));
    }

    @Test
    @DisplayName("rate limiting E2E test")
    void rateLimitingE2E() throws Exception {
        ChatRequest request = new ChatRequest();
        request.setMessage("test message");

        // Send requests until the limit is hit (the FREE tier allows 10 per minute)
        for (int i = 0; i < 15; i++) {
            ResultActions result = mockMvc.perform(post("/api/chat")
                .contentType(MediaType.APPLICATION_JSON)
                .content(objectMapper.writeValueAsString(request)));

            if (i < 10) {
                result.andExpect(status().isOk());
            } else {
                result.andExpect(status().isTooManyRequests());
            }
        }
    }

    @Test
    @DisplayName("health check E2E test")
    void healthCheckE2E() throws Exception {
        mockMvc.perform(get("/actuator/health"))
            .andExpect(status().isOk())
            .andExpect(jsonPath("$.status").value("UP"))
            .andExpect(jsonPath("$.components.db.status").value("UP"))
            .andExpect(jsonPath("$.components.vectorStore.status").value("UP"));
    }
}
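
The `Thread.sleep(2000)` in step 2 of the RAG test either wastes time when indexing is fast or fails when it is slow. One alternative is to poll for the expected state with a timeout. The helper below is a minimal standard-library sketch of that idea; Poller and awaitTrue are hypothetical names, and libraries such as Awaitility offer a more complete version of the same pattern.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Minimal polling helper for tests; names are illustrative.
class Poller {

    /**
     * Re-checks the condition every `interval` until it holds or `timeout`
     * has elapsed. Returns true if the condition held, false on timeout or
     * if the waiting thread was interrupted.
     */
    static boolean awaitTrue(BooleanSupplier condition,
                             Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }
}
```

In the E2E test, the condition could be a search request that returns the uploaded document, for example `assertTrue(Poller.awaitTrue(() -> documentIsSearchable(), Duration.ofSeconds(10), Duration.ofMillis(200)))`, where `documentIsSearchable()` is a hypothetical helper you would write against your own API.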

🐳 Deployment and Operations

1. Dockerfile

# Multi-stage build
FROM gradle:8.5-jdk21 AS builder

WORKDIR /app

# Copy build files
COPY build.gradle.kts settings.gradle.kts ./
COPY src ./src

# Build only the executable Spring Boot jar, so the wildcard COPY below
# matches exactly one file (a full build also emits a *-plain.jar)
RUN gradle clean bootJar --no-daemon

# Runtime image
FROM eclipse-temurin:21-jre-alpine

# Add a non-root user
RUN addgroup -S spring && adduser -S spring -G spring

WORKDIR /app

# Copy the build artifact
COPY --from=builder /app/build/libs/*.jar app.jar

# Change ownership
RUN chown -R spring:spring /app

# Switch to the non-root user
USER spring:spring

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=60s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:8080/actuator/health || exit 1

# Expose the port
EXPOSE 8080

# JVM tuning
ENV JAVA_OPTS="-XX:+UseContainerSupport \
               -XX:MaxRAMPercentage=75.0 \
               -XX:+UseG1GC \
               -XX:+ExitOnOutOfMemoryError \
               -Djava.security.egd=file:/dev/./urandom"

# Start the application; exec replaces the shell so the JVM receives SIGTERM
ENTRYPOINT ["sh", "-c", "exec java $JAVA_OPTS -jar app.jar"]

2. docker-compose.yml

version: '3.8'

services:
  app:
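    # The compose file lives in docker/, so build from the project root
    build:
      context: ..
      dockerfile: docker/Dockerfile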
    ports:
      - "8080:8080"
    environment:
      - SPRING_PROFILES_ACTIVE=prod
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - DATABASE_URL=jdbc:postgresql://postgres:5432/langchain4j
      - DATABASE_USERNAME=langchain4j
      - DATABASE_PASSWORD=${DB_PASSWORD}
      - QDRANT_HOST=qdrant
      - QDRANT_PORT=6334  # gRPC port, which the LangChain4J Qdrant client uses
    depends_on:
      postgres:
        condition: service_healthy
      qdrant:
        condition: service_started
    restart: unless-stopped
    networks:
      - langchain4j-network
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
        reservations:
          cpus: '1'
          memory: 1G

  postgres:
    image: pgvector/pgvector:pg16
    environment:
      - POSTGRES_DB=langchain4j
      - POSTGRES_USER=langchain4j
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    ports:
      - "5432:5432"
    volumes:
      - postgres-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U langchain4j"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - langchain4j-network

  qdrant:
    image: qdrant/qdrant:v1.7.4
    ports:
      - "6333:6333"
      - "6334:6334"
    volumes:
      - qdrant-data:/qdrant/storage
    networks:
      - langchain4j-network

  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    networks:
      - langchain4j-network

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    volumes:
      - grafana-data:/var/lib/grafana
    networks:
      - langchain4j-network

volumes:
  postgres-data:
  qdrant-data:
  prometheus-data:
  grafana-data:

networks:
  langchain4j-network:
    driver: bridge

3. Kubernetes Deployment

deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: langchain4j-app
  namespace: production
  labels:
    app: langchain4j
    version: v1.0.0
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: langchain4j
  template:
    metadata:
      labels:
        app: langchain4j
        version: v1.0.0
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/actuator/prometheus"
    spec:
      serviceAccountName: langchain4j-sa

      # Init container: wait for the database to be ready
      initContainers:
      - name: wait-for-postgres
        image: busybox:1.36
        command:
        - sh
        - -c
        - |
          until nc -z postgres-service 5432; do
            echo "Waiting for PostgreSQL..."
            sleep 2
          done

      containers:
      - name: langchain4j
        image: your-registry.com/langchain4j:v1.0.0
        imagePullPolicy: Always

        ports:
        - name: http
          containerPort: 8080
          protocol: TCP

        env:
        - name: SPRING_PROFILES_ACTIVE
          value: "prod"
        - name: OPENAI_API_KEY
          valueFrom:
            secretKeyRef:
              name: langchain4j-secrets
              key: openai-api-key
        - name: DATABASE_URL
          valueFrom:
            configMapKeyRef:
              name: langchain4j-config
              key: database-url
        - name: DATABASE_USERNAME
          valueFrom:
            secretKeyRef:
              name: langchain4j-secrets
              key: db-username
        - name: DATABASE_PASSWORD
          valueFrom:
            secretKeyRef:
              name: langchain4j-secrets
              key: db-password

        # Resource limits
        resources:
          requests:
            cpu: "500m"
            memory: "1Gi"
          limits:
            cpu: "2000m"
            memory: "2Gi"

        # Liveness probe
        livenessProbe:
          httpGet:
            path: /actuator/health/liveness
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 10
          timeoutSeconds: 3
          failureThreshold: 3

        # Readiness probe
        readinessProbe:
          httpGet:
            path: /actuator/health/readiness
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3

        # Startup probe
        startupProbe:
          httpGet:
            path: /actuator/health
            port: 8080
          initialDelaySeconds: 0
          periodSeconds: 10
          timeoutSeconds: 3
          failureThreshold: 30

        # Graceful shutdown
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 15"]

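      # Termination grace period: must cover the 15s preStop sleep plus the
      # 20s graceful-shutdown phase configured in application.yml
      terminationGracePeriodSeconds: 45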

      # Pod anti-affinity: spread replicas across nodes
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - langchain4j
              topologyKey: kubernetes.io/hostname

service.yaml

apiVersion: v1
kind: Service
metadata:
  name: langchain4j-service
  namespace: production
  labels:
    app: langchain4j
spec:
  type: ClusterIP
  selector:
    app: langchain4j
  ports:
  - name: http
    port: 80
    targetPort: 8080
    protocol: TCP
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800

configmap.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: langchain4j-config
  namespace: production
data:
  database-url: "jdbc:postgresql://postgres-service:5432/langchain4j"
  qdrant-host: "qdrant-service"
  qdrant-port: "6334"  # gRPC port, which the LangChain4J Qdrant client uses

  application.yml: |
    server:
      port: 8080
      shutdown: graceful

    spring:
      application:
        name: langchain4j-app

      lifecycle:
        timeout-per-shutdown-phase: 20s

    management:
      endpoints:
        web:
          exposure:
            include: health,prometheus,info,metrics
      health:
        livenessState:
          enabled: true
        readinessState:
          enabled: true
      # Spring Boot 3.x property path for the Prometheus exporter
      prometheus:
        metrics:
          export:
            enabled: true

    langchain4j:
      model-name: gpt-4
      temperature: 0.7
      max-tokens: 2000
      timeout: 60
      max-retries: 3

secret.yaml

apiVersion: v1
kind: Secret
metadata:
  name: langchain4j-secrets
  namespace: production
type: Opaque
# Placeholder values only; never commit real secrets to version control
stringData:
  openai-api-key: "sk-proj-xxxxxxxxxxxxx"
  db-username: "langchain4j"
  db-password: "your-secure-password"

ingress.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: langchain4j-ingress
  namespace: production
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
    # ingress-nginx rate limiting: requests per second allowed from each client IP
    nginx.ingress.kubernetes.io/limit-rps: "100"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  tls:
  - hosts:
    - api.example.com
    secretName: langchain4j-tls
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: langchain4j-service
            port:
              number: 80

📋 Checklist of 50 Best Practices

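Code Quality (10 items)

  1. Use the builder pattern: construct LangChain4J models and components through their .builder() APIs for readability.
  2. Interface segregation: define clear interfaces for LLM interactions so they are easy to mock and test.
  3. Layered exception handling: handle business, technical, and external-service exceptions separately.
  4. Logging conventions: use structured logs that include the request ID, user ID, and operation type.
  5. Externalized configuration: inject every setting through configuration files or environment variables.
  6. Versioning: version the API (/v1/chat) and keep it backward compatible.
  7. Code reuse: extract common prompt templates and utility functions into shared modules.
  8. Type safety: use strong types rather than Map<String, Object>.
  9. Immutable objects: make domain models immutable so they are thread-safe.
  10. Documentation comments: document complex prompt engineering and RAG configuration in detail.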
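
Items 8 and 9 can be combined in one small type. The following is a sketch only, assuming Java 16 or later for records; `ChatRequestDto` and its fields are hypothetical names chosen for illustration.

```java
import java.util.Objects;

/** Hypothetical immutable request type that replaces a loosely typed Map<String, Object>. */
record ChatRequestDto(String userId, String message, int maxTokens) {

    // Compact constructor: validation runs on every construction, so each instance is valid.
    ChatRequestDto {
        Objects.requireNonNull(userId, "userId");
        Objects.requireNonNull(message, "message");
        if (message.isBlank()) {
            throw new IllegalArgumentException("message must not be blank");
        }
        if (maxTokens <= 0) {
            throw new IllegalArgumentException("maxTokens must be positive");
        }
    }

    /** Returns a modified copy instead of mutating state shared between threads. */
    ChatRequestDto withMaxTokens(int newMaxTokens) {
        return new ChatRequestDto(userId, message, newMaxTokens);
    }
}
```

Because the record has no setters, an instance can be passed between threads or cached without defensive copying.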

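Performance Optimization (10 items)

  1. Embedding cache: cache embedding results for identical text in Redis.
  2. Batch processing: use batch APIs for document processing to cut network round trips.
  3. Asynchronous processing: run long operations asynchronously and return a task ID.
  4. Connection pool sizing: give database and HTTP clients reasonably sized connection pools.
  5. Paginated queries: paginate vector search results to avoid loading too much data at once.
  6. Lazy loading: load conversation history on demand rather than all at once.
  7. Compressed transfer: enable GZIP compression to reduce network traffic.
  8. Index tuning: choose the vector database index type deliberately (HNSW vs. IVF).
  9. Cache warm-up: preload frequently used embeddings at startup.
  10. Timeouts: set a reasonable timeout on every external call (for example, 60s for the LLM and 5s for the database).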
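
The embedding cache from item 1 can be illustrated without Redis. The following is a minimal in-memory sketch, assuming an LRU eviction policy is acceptable; `EmbeddingLruCache` is a hypothetical class, and the `embedder` function stands in for whatever embedding client the project uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** In-memory LRU cache for embeddings; a stand-in for the Redis cache described above. */
final class EmbeddingLruCache {

    private final Map<String, float[]> cache;
    private final Function<String, float[]> embedder;
    private int misses = 0;

    EmbeddingLruCache(int capacity, Function<String, float[]> embedder) {
        this.embedder = embedder;
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, float[]> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns the cached vector and calls the embedder only on a cache miss. */
    synchronized float[] embed(String text) {
        float[] vector = cache.get(text);
        if (vector == null) {
            misses++;
            vector = embedder.apply(text);
            cache.put(text, vector);
        }
        return vector;
    }

    /** Number of times the underlying embedder was actually invoked. */
    synchronized int misses() {
        return misses;
    }
}
```

A shared cache such as Redis is still needed when several replicas should reuse each other's results; this sketch only saves repeated calls within one JVM.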

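Cost Control (10 items)

  1. Token counting: estimate the token count before sending a request to avoid exceeding limits.
  2. Model selection: match the model to task complexity (use GPT-3.5 for simple tasks).
  3. Prompt optimization: trim the system prompt to remove unnecessary tokens.
  4. Result caching: cache answers to similar questions for 30 minutes.
  5. Tiered rate limits: assign different quotas to different user tiers.
  6. Batch embedding: use the batch embedding API to lower cost.
  7. Vector dimensions: choose the embedding dimension based on need (768 vs. 1536).
  8. Monitoring and alerting: alert when API spend exceeds a threshold.
  9. Degradation strategy: fall back to a cheaper model when spend exceeds budget.
  10. Regular audits: audit API usage weekly and optimize high-cost queries.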
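
Items 1, 2, and 9 can be sketched together as a small routing helper. This is illustrative only: `ModelRouter` is a hypothetical class, the model names are placeholders, and the four-characters-per-token estimate is a rough rule of thumb for English text rather than a real tokenizer (it will be less accurate for Chinese and other scripts).

```java
/** Hypothetical router that picks a model from task complexity and remaining budget. */
final class ModelRouter {

    // Model names are illustrative; substitute whatever your provider offers.
    static final String CHEAP_MODEL = "gpt-3.5-turbo";
    static final String STRONG_MODEL = "gpt-4";

    private ModelRouter() {}

    /** Crude estimate assuming about 4 characters per token (an assumption, not a tokenizer). */
    static int estimateTokens(String text) {
        return (text.length() + 3) / 4; // integer division rounded up
    }

    /** Rejects prompts whose estimated size exceeds the limit before any paid call is made. */
    static void requireWithinLimit(String prompt, int maxPromptTokens) {
        int estimated = estimateTokens(prompt);
        if (estimated > maxPromptTokens) {
            throw new IllegalArgumentException(
                "Estimated " + estimated + " tokens exceeds limit of " + maxPromptTokens);
        }
    }

    /** Uses the strong model only for complex tasks while spend is still under budget. */
    static String chooseModel(boolean complexTask, double spentUsd, double budgetUsd) {
        boolean overBudget = spentUsd >= budgetUsd;
        return (complexTask && !overBudget) ? STRONG_MODEL : CHEAP_MODEL;
    }
}
```

For example, `ModelRouter.chooseModel(true, 12.0, 10.0)` selects the cheaper model because the budget is already spent, which is the degradation behavior described in item 9.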

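Reliability (10 items)

  1. Retries: retry failed API calls automatically with exponential backoff.
  2. Circuit breaker: trip after consecutive failures to protect downstream services.
  3. Fallback: return a preset answer when the LLM is unavailable.
  4. Idempotency: make all write operations idempotent to prevent duplicate submissions.
  5. Transaction management: keep the vector store and the metadata store consistent.
  6. Health checks: check the health of the LLM, the database, and the vector store.
  7. Graceful shutdown: on SIGTERM, finish in-flight requests before exiting.
  8. Data backups: back up vector data and conversation history regularly.
  9. Failure recovery: resume unfinished tasks automatically after a restart.
  10. Multi-zone deployment: deploy critical services across multiple availability zones.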
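
The retry policy in item 1 can be sketched with the standard library alone. `Retry` is a hypothetical helper; a production version would normally retry only transient errors (timeouts, HTTP 429/5xx) and add random jitter, both of which are omitted here for brevity.

```java
import java.util.function.Supplier;

/** Hypothetical helper: retries a call, doubling the wait after each failure. */
final class Retry {

    private Retry() {}

    static <T> T withBackoff(Supplier<T> call, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts) {
                    throw e; // attempts exhausted: surface the last failure
                }
            }
            try {
                Thread.sleep(delay);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new IllegalStateException("Interrupted while waiting to retry", ie);
            }
            delay *= 2; // exponential backoff: d, 2d, 4d, ...
        }
    }
}
```

With `maxAttempts = 3` and `initialDelayMillis = 200`, the waits between attempts are 200 ms and 400 ms before the final failure is rethrown.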

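Security (10 items)

  1. API key encryption: use a secrets manager such as Vault in production.
  2. Input validation: validate user input strictly to prevent injection attacks.
  3. Output filtering: filter sensitive information (PII, API keys) from responses.
  4. Access control: apply role-based access control (RBAC).
  5. Audit logging: record all sensitive operations (create, delete, update).
  6. Enforce HTTPS: require HTTPS in production.
  7. CORS configuration: strictly restrict the allowed cross-origin sources.
  8. SQL injection protection: use parameterized queries and never concatenate SQL.
  9. XSS protection: HTML-escape output to prevent cross-site scripting.
  10. Regular updates: update dependencies promptly to patch security vulnerabilities.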
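
Output filtering (item 3) often starts with pattern-based redaction. The following is a minimal sketch: `SecretRedactor` is a hypothetical class, and both regular expressions are assumptions about what a key or an email address looks like, so they will miss other secret formats and other kinds of PII.

```java
import java.util.regex.Pattern;

/** Hypothetical redactor that masks API-key-like strings and email addresses. */
final class SecretRedactor {

    // Assumed key shape: "sk-" followed by at least 8 letters, digits, '-' or '_'.
    private static final Pattern API_KEY = Pattern.compile("\\bsk-[A-Za-z0-9_-]{8,}");

    // Deliberately simple email pattern; not a full RFC 5322 matcher.
    private static final Pattern EMAIL =
        Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    private SecretRedactor() {}

    /** Replaces every match with a fixed placeholder so the original value cannot leak. */
    static String redact(String text) {
        String withoutKeys = API_KEY.matcher(text).replaceAll("[REDACTED_KEY]");
        return EMAIL.matcher(withoutKeys).replaceAll("[REDACTED_EMAIL]");
    }
}
```

Running the same redactor over log messages as well as model output keeps both channels consistent.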

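🎯 Project Checklist

Production Readiness Checklist

Security

  • API keys are not hard-coded in source code
  • Production secrets are managed by Vault or a similar service
  • HTTPS is enabled with TLS 1.2 or later
  • Input validation and output filtering are implemented
  • A CORS allowlist is configured
  • Rate limiting is enabled
  • Audit logs record all sensitive operations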

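Reliability

  • Health check endpoints work correctly
  • Retry and circuit-breaker mechanisms are configured
  • Graceful shutdown is implemented
  • The database connection pool is sized sensibly
  • Reasonable timeouts are set
  • Resource limits (CPU, memory) are configured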

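Monitoring and Observability

  • Prometheus metrics are exposed
  • Key operations are logged
  • Alerting rules are configured
  • Distributed tracing is set up
  • Error rates are monitored
  • Performance metrics (latency, throughput) are monitored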

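Testing

  • Unit test coverage > 80%
  • Integration tests cover the main flows
  • E2E tests verify the critical scenarios
  • Load tests verify performance
  • Security testing (penetration testing) is performed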

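Documentation

  • Complete API documentation (OpenAPI/Swagger)
  • Clear deployment documentation
  • Troubleshooting guide
  • Architecture decision records (ADRs)
  • Operations runbook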

Cost Optimization

  • Result caching is configured
  • Models are chosen to fit the task
  • Cost alerts are set
  • API usage is audited regularly

Compliance

  • GDPR / privacy compliance
  • Data retention policy
  • User data export and deletion
  • Terms of service and privacy policy

🚀 From Demo to Production

Characteristics of the Demo Stage

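// Demo code example
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.openai.OpenAiChatModel;

public class DemoChat {
    public static void main(String[] args) {
        // ❌ Hard-coded API key
        String apiKey = "sk-proj-xxxxx";

        // ❌ Used directly, with no error handling
        ChatLanguageModel model = OpenAiChatModel.builder()
            .apiKey(apiKey)
            .modelName("gpt-4")
            .build();

        // ❌ No input validation
        String response = model.generate("user input");

        // ❌ Printed directly, with no output filtering
        System.out.println(response);
    }
}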

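Production Refactoring

// Production-grade code
// Project-local types (InputValidator, OutputSanitizer, MetricsCollector, ChatRequest,
// ChatResponse, ChatServiceException, @RateLimited) come from the project's own packages.
import dev.langchain4j.model.chat.ChatLanguageModel;
import io.github.resilience4j.circuitbreaker.CallNotPermittedException;
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import io.micrometer.core.instrument.Timer;
import java.time.Instant;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;
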
@Service
@Slf4j
public class ProductionChatService {

    private final ChatLanguageModel model;
    private final InputValidator validator;
    private final OutputSanitizer sanitizer;
    private final MetricsCollector metrics;
    private final CircuitBreaker circuitBreaker;

    // ✅ Dependency injection with externalized configuration
    public ProductionChatService(
            ChatLanguageModel model,
            InputValidator validator,
            OutputSanitizer sanitizer,
            MetricsCollector metrics,
            CircuitBreakerRegistry circuitBreakerRegistry) {
        this.model = model;
        this.validator = validator;
        this.sanitizer = sanitizer;
        this.metrics = metrics;
        this.circuitBreaker = circuitBreakerRegistry.circuitBreaker("chat-service");
    }

    @Transactional
    @RateLimited
    public ChatResponse chat(ChatRequest request) {
        // ✅ Validate input
        validator.validateChatInput(request.getMessage());

        // ✅ Start a latency timer for metrics
        Timer.Sample sample = Timer.start(metrics.getRegistry());

        try {
            // ✅ Protect the call with the circuit breaker; failures propagate to the
            // catch blocks below, so each error is logged and wrapped exactly once
            String response = circuitBreaker.executeSupplier(
                () -> model.generate(request.getMessage()));

            // ✅ Sanitize output
            String sanitized = sanitizer.sanitizeOutput(response);

            // ✅ Record success
            sample.stop(metrics.timer("chat.success"));

            return ChatResponse.builder()
                .message(sanitized)
                .timestamp(Instant.now())
                .build();

        } catch (CallNotPermittedException e) {
            // ✅ Circuit open: degrade to a fallback response
            log.warn("Circuit breaker is open; returning fallback response");
            sample.stop(metrics.timer("chat.circuit_open"));
            return getFallbackResponse();

        } catch (Exception e) {
            // ✅ Error handling
            log.error("Chat service error", e);
            sample.stop(metrics.timer("chat.error"));
            throw new ChatServiceException("An error occurred while processing the request", e);
        }
    }

    private ChatResponse getFallbackResponse() {
        return ChatResponse.builder()
            .message("Sorry, the AI service is busy right now. Please try again later.")
            .isFallback(true)
            .timestamp(Instant.now())
            .build();
    }
}
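
To make the circuit-breaker behavior in the service above less of a black box, here is a deliberately simplified sketch of the closed, open, and half-open states. It is not the Resilience4j implementation: `SimpleCircuitBreaker` is a hypothetical class, it counts consecutive failures instead of using a sliding-window failure rate, and it holds a lock during the call, which serializes callers.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Simplified illustration of circuit-breaker states; not the Resilience4j implementation. */
final class SimpleCircuitBreaker {

    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // injectable time source, in milliseconds
    private int consecutiveFailures = 0;
    private long openedAt = -1; // -1 means the breaker is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (openedAt >= 0 && clock.getAsLong() - openedAt < openMillis) {
            return fallback.get(); // open: skip the dependency entirely
        }
        try {
            T result = action.get(); // closed, or a half-open trial call
            consecutiveFailures = 0;
            openedAt = -1; // success closes the breaker
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = clock.getAsLong(); // open (or re-open after a failed trial)
            }
            return fallback.get();
        }
    }
}
```

In the service above, `CallNotPermittedException` plays the role of the first branch: the call is rejected without reaching the LLM, and `getFallbackResponse()` supplies the answer.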

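Migration Checklist

1. Configuration Management

Demo | Production
Hard-coded configuration | Environment variables / configuration files
Single configuration | Per-environment configuration (dev/test/prod)
Plaintext secrets | Vault / encryption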

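2. Error Handling

Demo | Production
No exception handling | Complete try-catch handling
Printed stack traces | Structured logging
Application crashes | Graceful degradation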

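3. Reliability

Demo | Production
Single call | Retry + circuit breaker
Synchronous, blocking | Asynchronous + timeouts
No monitoring | Full monitoring and alerting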

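4. Performance

Demo | Production
No caching | Multi-level caching
Serial processing | Batch + asynchronous processing
No rate limiting | Rate limiting + degradation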

5. Security

Demo | Production
No validation | Input validation + output filtering
No authentication | JWT/OAuth2
HTTP | HTTPS + certificates

6. Testing

Demo | Production
Manual testing | Automated test suite
No tests | 80%+ coverage
Local verification | CI/CD pipeline

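💡 Hands-On Exercises

Exercise 1: Build a Production-Grade RAG System

Requirements

  • Support uploading PDF, Word, and TXT documents
  • Store vectors in PGVector
  • Implement caching and rate limiting
  • Provide complete test coverage

Hints

  1. Define the interfaces and tests first (TDD)
  2. Implement document parsing and vectorization
  3. Add a cache layer (Caffeine + Redis)
  4. Implement rate limiting and monitoring
  5. Write integration tests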
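
For the rate-limiting hint, a token bucket is one common starting point. The following is a minimal single-JVM sketch; `TokenBucket` is a hypothetical class, and the injectable clock exists only to make the behavior easy to test. A deployment with several replicas would need a shared limiter (for example, backed by Redis) instead.

```java
import java.util.function.LongSupplier;

/** Hypothetical token-bucket limiter: allows bursts up to capacity, then a steady refill rate. */
final class TokenBucket {

    private final long capacity;
    private final double refillPerMilli;
    private final LongSupplier clockMillis;
    private double tokens;
    private long lastRefill;

    TokenBucket(long capacity, double refillPerSecond, LongSupplier clockMillis) {
        this.capacity = capacity;
        this.refillPerMilli = refillPerSecond / 1000.0;
        this.clockMillis = clockMillis;
        this.tokens = capacity; // start full so an initial burst is allowed
        this.lastRefill = clockMillis.getAsLong();
    }

    /** Consumes one token if available; returns false when the caller should be throttled. */
    synchronized boolean tryAcquire() {
        long now = clockMillis.getAsLong();
        // Add the tokens earned since the last call, capped at the bucket capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerMilli);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

In production code the clock argument would typically be `System::currentTimeMillis`, with one bucket per user or API key.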

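Exercise 2: Implement Conversation Context Management

Requirements

  • Maintain each user's conversation history
  • Support session summaries for long conversations
  • Clean up expired sessions automatically
  • Persist conversations to a database

Hints

  1. Design the Conversation and Message entities
  2. Implement a sliding-window strategy
  3. Integrate ChatMemory
  4. Add a scheduled cleanup task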
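
The sliding-window strategy in hint 2 reduces to keeping only the most recent N messages. The following sketch illustrates just that eviction idea; `SlidingWindowMemory` is a hypothetical class that stores plain strings, whereas in practice LangChain4J's own `MessageWindowChatMemory` can fill this role.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sliding-window memory that retains only the latest maxMessages entries. */
final class SlidingWindowMemory {

    private final int maxMessages;
    private final Deque<String> window = new ArrayDeque<>();

    SlidingWindowMemory(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be >= 1");
        }
        this.maxMessages = maxMessages;
    }

    /** Appends a message and evicts the oldest ones that fall outside the window. */
    void add(String message) {
        window.addLast(message);
        while (window.size() > maxMessages) {
            window.removeFirst();
        }
    }

    /** Returns a snapshot in chronological order, suitable for building the next prompt. */
    List<String> messages() {
        return new ArrayList<>(window);
    }
}
```

A summary of the evicted messages can be kept as the first entry when long conversations must retain earlier context.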

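Exercise 3: Deploy to Kubernetes

Requirements

  • Write a complete set of Kubernetes manifests
  • Configure an HPA (horizontal scaling)
  • Implement rolling updates
  • Configure monitoring and alerting

Hints

  1. Start from the Dockerfile
  2. Write the Deployment and Service
  3. Configure the ConfigMap and Secret
  4. Manage configuration with Helm
  5. Configure Prometheus monitoring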

Last updated: 2026-03-09 | Word count: 5,500 | Estimated reading time: 45 minutes