Apache Polaris: The Definitive Guide: Polaris REST API

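This chapter takes a close look at the REST API that Apache Polaris provides for managing catalogs, roles, namespaces, tables, and views. The Polaris REST API lets you interact with the catalog layer of your lakehouse programmatically, which makes it easier to orchestrate complex operations across distributed environments.

Whether you are adding a catalog, defining access control through roles and principals, or performing fine-grained operations on namespaces and tables, the Polaris REST API gives you the flexibility to handle these tasks efficiently. With it, teams can automate catalog management to keep the platform scalable, consistent, and well governed.

We will walk through the endpoints one by one, explain what each is for, and show how to call it with cURL and Python requests. By the end of the chapter, you should understand how to manage a Polaris deployment programmatically and fit it into your existing workflows.

The following sections are grouped by function and cover the endpoints of the Polaris Management REST API (which is distinct from the Apache Iceberg REST catalog specification and its endpoints). The Polaris repository also ships a Python CLI that can perform most of these tasks without custom scripts or HTTP clients such as cURL, Postman, or Insomnia.

The latest version of this REST specification is always available at: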
github.com/apache/pola…

All endpoints in Sections 5.1–5.3 are prefixed with /api/management/v1. For example, /catalogs is actually /api/management/v1/catalogs.
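The prefix rule can be captured in a small helper so that the rest of a script only deals with short paths. This is a minimal sketch: the host polaris.example.com is the placeholder used throughout this chapter, and the function names management_url and catalog_url are hypothetical, not part of Polaris.

```python
from urllib.parse import quote

# Placeholder host used in this chapter; replace with your own deployment.
BASE_URL = "https://polaris.example.com"
MANAGEMENT_PREFIX = "/api/management/v1"


def management_url(path: str, base_url: str = BASE_URL) -> str:
    """Return the full URL for a management path such as '/catalogs'."""
    return base_url.rstrip("/") + MANAGEMENT_PREFIX + "/" + path.lstrip("/")


def catalog_url(catalog_name: str) -> str:
    """URL of a single catalog; the name is percent-encoded for safety."""
    return management_url("/catalogs/" + quote(catalog_name, safe=""))


print(management_url("/catalogs"))
print(catalog_url("example_catalog"))
```

Keeping the prefix in one place means a change of host or API version touches a single constant.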

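Catalog Operations

The Polaris REST API provides CRUD (create, read, update, delete) endpoints for catalogs. These operations are the foundation of catalog management and let you define, query, update, and remove catalogs programmatically.

List Catalogs

Retrieves the list of all catalogs in the Polaris deployment.
Endpoint:
GET /catalogs

Example cURL: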

curl -X GET https://polaris.example.com/api/management/v1/catalogs \
     -H "Authorization: Bearer <ACCESS_TOKEN>" \
     -H "Content-Type: application/json"

Example Python:

import requests
url = "https://polaris.example.com/api/management/v1/catalogs"
headers = {
    "Authorization": "Bearer <ACCESS_TOKEN>",
    "Content-Type": "application/json"
}
response = requests.get(url, headers=headers)
print(response.json())

Example response:

{
  "catalogs": [
    {
      "type": "INTERNAL",
      "name": "example_catalog",
      "properties": {
        "default-base-location": "s3://bucket/path"
      },
      "createTimestamp": 1622547800000,
      "lastUpdateTimestamp": 1622547900000,
      "entityVersion": 1,
      "storageConfigInfo": {
        "storageType": "S3",
        "allowedLocations": "For AWS [s3://bucketname/prefix/], for AZURE [abfss://container@storageaccount.blob.core.windows.net/prefix/], for GCP [gs://bucketname/prefix/]"
      }
    }
  ]
}

Create a Catalog

Adds a new catalog to the Polaris deployment. A catalog can be internal or external.
Endpoint:
POST /catalogs

Example cURL:

curl -X POST https://polaris.example.com/api/management/v1/catalogs \
     -H "Authorization: Bearer <ACCESS_TOKEN>" \
     -H "Content-Type: application/json" \
     -d '{
           "catalog": {
               "type": "INTERNAL",
               "name": "example_catalog",
               "properties": {
                   "default-base-location": "s3://bucket/path"
               }
           }
         }'

Example Python:

import requests
url = "https://polaris.example.com/api/management/v1/catalogs"
headers = {
    "Authorization": "Bearer <ACCESS_TOKEN>",
    "Content-Type": "application/json"
}
payload = {
    "catalog": {
        "type": "INTERNAL",
        "name": "example_catalog",
        "properties": {
            "default-base-location": "s3://bucket/path"
        }
    }
}
response = requests.post(url, headers=headers, json=payload)
print(response.status_code, response.json())

Example response:

{
  "catalog": {
    "type": "INTERNAL",
    "name": "example_catalog",
    "properties": {
      "default-base-location": "s3://bucket/path"
    },
    "createTimestamp": 1533547800000,
    "lastUpdateTimestamp": 1627647800000,
    "entityVersion": 1
  }
}

Get Catalog Details

Retrieves the details of a specific catalog by name.
Endpoint:
GET /catalogs/{catalogName}

Example cURL:

curl -X GET https://polaris.example.com/api/management/v1/catalogs/example_catalog \
     -H "Authorization: Bearer <ACCESS_TOKEN>" \
     -H "Content-Type: application/json"

Example Python:

import requests
url = "https://polaris.example.com/api/management/v1/catalogs/example_catalog"
headers = {
    "Authorization": "Bearer <ACCESS_TOKEN>",
    "Content-Type": "application/json"
}
response = requests.get(url, headers=headers)
print(response.json())

Example response:

{
  "type": "INTERNAL",
  "name": "example_catalog",
  "properties": {
    "default-base-location": "s3://bucket/path",
    "property1": "value1",
    "property2": "value2"
  },
  "createTimestamp": 143547800000,
  "lastUpdateTimestamp": 1538547900000,
  "entityVersion": 1
}

Update a Catalog

Updates the details of an existing catalog. The request body must include the catalog's current entityVersion, passed as currentEntityVersion.
Endpoint:
PUT /catalogs/{catalogName}

Example cURL:

curl -X PUT https://polaris.example.com/api/management/v1/catalogs/example_catalog \
     -H "Authorization: Bearer <ACCESS_TOKEN>" \
     -H "Content-Type: application/json" \
     -d '{
           "currentEntityVersion": 1,
           "properties": {
               "default-base-location": "s3://new_bucket/path"
           }
         }'

Example Python:

import requests
url = "https://polaris.example.com/api/management/v1/catalogs/example_catalog"
headers = {
    "Authorization": "Bearer <ACCESS_TOKEN>",
    "Content-Type": "application/json"
}
payload = {
    "currentEntityVersion": 1,
    "properties": {
        "default-base-location": "s3://new_bucket/path"
    }
}
response = requests.put(url, headers=headers, json=payload)
print(response.status_code, response.json())

Example response:

{
  "type": "INTERNAL",
  "name": "example_catalog",
  "properties": {
    "default-base-location": "s3://new_bucket/path",
    "property1": "value1",
    "property2": "value2"
  },
  "createTimestamp": 1622547800000,
  "lastUpdateTimestamp": 1622548000000,
  "entityVersion": 2
}
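Because the update must carry the catalog's current entity version, a common pattern is to read the catalog first and derive the PUT body from that response. The sketch below shows only the body construction; build_catalog_update is a hypothetical helper. Sending the merged property map rather than only the changed keys is an assumption made here to stay safe whether the server merges or replaces properties.

```python
def build_catalog_update(current: dict, new_properties: dict) -> dict:
    """Build a PUT body from the latest GET /catalogs/{catalogName} response."""
    # Start from the properties the server reported, then apply the changes.
    merged = dict(current.get("properties", {}))
    merged.update(new_properties)
    return {
        # Echo the version we just read so a stale update can be rejected.
        "currentEntityVersion": current["entityVersion"],
        "properties": merged,
    }


# Shape taken from the "Get Catalog Details" example response.
current = {
    "type": "INTERNAL",
    "name": "example_catalog",
    "properties": {
        "default-base-location": "s3://bucket/path",
        "property1": "value1",
    },
    "entityVersion": 1,
}
payload = build_catalog_update(
    current, {"default-base-location": "s3://new_bucket/path"}
)
print(payload)
```

If another client updates the catalog between the read and the write, the version in the body no longer matches, and the script should re-read the catalog before trying again.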

Delete a Catalog

Deletes an existing catalog. The catalog must be empty before it can be deleted.
Endpoint:
DELETE /catalogs/{catalogName}

Example cURL:

curl -X DELETE https://polaris.example.com/api/management/v1/catalogs/example_catalog \
     -H "Authorization: Bearer <ACCESS_TOKEN>" \
     -H "Content-Type: application/json"

Example Python:

import requests

url = "https://polaris.example.com/api/management/v1/catalogs/example_catalog"
headers = {
    "Authorization": "Bearer <ACCESS_TOKEN>",
    "Content-Type": "application/json"
}
response = requests.delete(url, headers=headers)
print(response.status_code)

Example response:
The response body is empty on success.

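Principal Operations

In Apache Polaris, principals represent the entities that interact with the system, such as users or services. The REST API provides CRUD (create, read, update, delete) operations for managing principals, along with the ability to rotate their credentials. This section covers each endpoint with example requests and responses.

List Principals

Retrieves all principals currently available in the Polaris catalog.
Endpoint:
GET /principals

Example cURL request: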

curl -X GET \
  https://polaris.example.com/api/management/v1/principals \
  -H "Authorization: Bearer <ACCESS_TOKEN>"

Example Python request:

import requests
url = "https://polaris.example.com/api/management/v1/principals"
headers = {"Authorization": "Bearer <ACCESS_TOKEN>"}

response = requests.get(url, headers=headers)
print(response.json())

Example response:

{
  "principals": [
    {
      "name": "string",
      "clientId": "string",
      "properties": {
        "property1": "string",
        "property2": "string"
      },
      "createTimestamp": 0,
      "lastUpdateTimestamp": 0,
      "entityVersion": 0
    }
  ]
}

Create a Principal

Creates a new principal that can interact with the Polaris system.
Endpoint:
POST /principals

Example cURL request:

curl -X POST \
  https://polaris.example.com/api/management/v1/principals \
  -H "Authorization: Bearer <ACCESS_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
        "principal": {
          "name": "data_analyst",
          "clientId": "analyst321",
          "properties": {
            "team": "analytics",
            "region": "us-west"
          }
        },
        "credentialRotationRequired": true
      }'

Example Python request:

import requests
url = "https://polaris.example.com/api/management/v1/principals"
headers = {
    "Authorization": "Bearer <ACCESS_TOKEN>",
    "Content-Type": "application/json"
}
data = {
    "principal": {
        "name": "data_analyst",
        "clientId": "analyst321",
        "properties": {
            "team": "analytics",
            "region": "us-west"
        }
    },
    "credentialRotationRequired": True
}

response = requests.post(url, headers=headers, json=data)
print(response.json())

Example response:

{
  "principal": {
    "name": "data_analyst",
    "clientId": "analyst321",
    "properties": {
      "team": "analytics",
      "region": "us-west"
    },
    "createTimestamp": 1694372200000,
    "lastUpdateTimestamp": 1694372200000,
    "entityVersion": 1
  },
  "credentials": {
    "clientId": "analyst321",
    "clientSecret": "secure$Password123"
  }
}

Get Principal Details

Retrieves the details of a specific principal by name.
Endpoint:
GET /principals/{principalName}

Example cURL request:

curl -X GET \
  https://polaris.example.com/api/management/v1/principals/data_analyst \
  -H "Authorization: Bearer <ACCESS_TOKEN>"

Example Python request:

import requests
url = "https://polaris.example.com/api/management/v1/principals/data_analyst"
headers = {"Authorization": "Bearer <ACCESS_TOKEN>"}

response = requests.get(url, headers=headers)
print(response.json())

Example response:

{
  "name": "data_analyst",
  "clientId": "analyst321",
  "properties": {
    "team": "analytics",
    "region": "us-west"
  },
  "createTimestamp": 1694372200000,
  "lastUpdateTimestamp": 1694378400000,
  "entityVersion": 1
}

Update a Principal

Updates the details of an existing principal, such as its properties or version.
Endpoint:
PUT /principals/{principalName}

Example cURL request:

curl -X PUT \
  https://polaris.example.com/api/management/v1/principals/data_analyst \
  -H "Authorization: Bearer <ACCESS_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
        "currentEntityVersion": 1,
        "properties": {
          "team": "analytics",
          "region": "us-east",
          "project": "forecasting"
        }
      }'

Example Python request:

import requests
url = "https://polaris.example.com/api/management/v1/principals/data_analyst"
headers = {
    "Authorization": "Bearer <ACCESS_TOKEN>",
    "Content-Type": "application/json"
}
data = {
    "currentEntityVersion": 1,
    "properties": {
        "team": "analytics",
        "region": "us-east",
        "project": "forecasting"
    }
}

response = requests.put(url, headers=headers, json=data)
print(response.json())

Example response:

{
  "name": "data_analyst",
  "clientId": "analyst321",
  "properties": {
    "team": "analytics",
    "region": "us-east",
    "project": "forecasting"
  },
  "createTimestamp": 1694372200000,
  "lastUpdateTimestamp": 1694378400000,
  "entityVersion": 2
}

Delete a Principal

Deletes an existing principal from the system.
Endpoint:
DELETE /principals/{principalName}

Example cURL request:

curl -X DELETE \
  https://polaris.example.com/api/management/v1/principals/data_analyst \
  -H "Authorization: Bearer <ACCESS_TOKEN>"

Example Python request:

import requests
url = "https://polaris.example.com/api/management/v1/principals/data_analyst"
headers = {"Authorization": "Bearer <ACCESS_TOKEN>"}

response = requests.delete(url, headers=headers)
print(response.status_code)

On success, the response body is empty.

Rotate Principal Credentials

Rotates the credentials of the specified principal and returns the new credentials.
Endpoint:
POST /principals/{principalName}/rotate

Example cURL request:

curl -X POST \
  https://polaris.example.com/api/management/v1/principals/data_analyst/rotate \
  -H "Authorization: Bearer <YOUR_ACCESS_TOKEN>"

Example Python request:

import requests
url = "https://polaris.example.com/api/management/v1/principals/data_analyst/rotate"
headers = {"Authorization": "Bearer <ACCESS_TOKEN>"}

response = requests.post(url, headers=headers)
print(response.json())

Example response:

{
  "principal": {
    "name": "data_analyst",
    "clientId": "analyst321",
    "properties": {
      "team": "analytics",
      "region": "us-east",
      "project": "forecasting"
    },
    "createTimestamp": 1694372200000,
    "lastUpdateTimestamp": 1694378400000,
    "entityVersion": 2
  },
  "credentials": {
    "clientId": "analyst321",
    "clientSecret": "new$Password456"
  }
}

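The endpoints above cover everything needed to create, update, delete, and list principals in Apache Polaris.

Managing Roles

Apache Polaris provides a set of APIs for managing permissions across catalogs and principals. With these APIs you can grant, list, and revoke privileges, which enables secure and efficient access management across the data lakehouse. This section goes through the endpoints one by one, with example requests.

Create a Catalog Role

Creates a new role within a catalog.
Endpoint:
POST /catalogs/{catalogName}/catalog-roles

Example cURL: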

curl -X POST \
  https://polaris.example.com/api/management/v1/catalogs/finance_catalog/catalog-roles \
  -H "Authorization: Bearer <ACCESS_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
        "catalogRole": {
          "name": "viewer",
          "properties": {
            "permissions": "read_only"
          }
        }
      }'

Create a Principal Role

Creates a new principal role.
Endpoint:
POST /principal-roles

Example cURL:

curl -X POST \
  https://polaris.example.com/api/management/v1/principal-roles \
  -H "Authorization: Bearer <ACCESS_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
        "principalRole": {
          "name": "report_viewer",
          "properties": {
            "access_scope": "read_only"
          }
        }
      }'

List Catalog Roles

Lists all roles in a catalog.
Endpoint:
GET /catalogs/{catalogName}/catalog-roles

Example cURL:

curl -X GET \
  https://polaris.example.com/api/management/v1/catalogs/finance_catalog/catalog-roles \
  -H "Authorization: Bearer <ACCESS_TOKEN>"

List Roles Assigned to a Principal

Retrieves all principal roles assigned to the specified principal.
Endpoint:
GET /principals/{principalName}/principal-roles

Example cURL:

curl -X GET \
  https://polaris.example.com/api/management/v1/principals/john_doe/principal-roles \
  -H "Authorization: Bearer <ACCESS_TOKEN>"

List All Principal Roles

Retrieves the list of all principal roles in the system.
Endpoint:
GET /principal-roles

Example cURL:

curl -X GET \
  https://polaris.example.com/api/management/v1/principal-roles \
  -H "Authorization: Bearer <ACCESS_TOKEN>"

List Principals Assigned to a Principal Role

Retrieves the list of principals that have been assigned a specific principal role.
Endpoint:
GET /principal-roles/{principalRoleName}/principals

Example cURL:

curl -X GET \
  https://polaris.example.com/api/management/v1/principal-roles/report_viewer/principals \
  -H "Authorization: Bearer <ACCESS_TOKEN>"

Get Catalog Roles Mapped to a Principal Role

Retrieves the catalog roles in the specified catalog that are mapped to a specific principal role.
Endpoint:
GET /principal-roles/{principalRoleName}/catalog-roles/{catalogName}

Example cURL:

curl -X GET \
  https://polaris.example.com/api/management/v1/principal-roles/report_viewer/catalog-roles/finance_catalog \
  -H "Authorization: Bearer <ACCESS_TOKEN>"

Get Details of a Principal Role

Retrieves the details of a specific principal role.
Endpoint:
GET /principal-roles/{principalRoleName}

Example cURL:

curl -X GET \
  https://polaris.example.com/api/management/v1/principal-roles/report_viewer \
  -H "Authorization: Bearer <ACCESS_TOKEN>"

Add a Grant to a Catalog Role

Adds a grant to a catalog role, which determines the kind of access the role has to the catalog's assets.
Endpoint:
PUT /catalogs/{catalogName}/catalog-roles/{catalogRoleName}/grants

Example cURL:

curl -X PUT "https://polaris.example.com/api/management/v1/catalogs/finance/catalog-roles/analyst/grants" \
-H "Authorization: Bearer <ACCESS_TOKEN>" \
-H "Content-Type: application/json" \
-d '{
  "grant": {
    "type": "catalog",
    "privilege": "CATALOG_MANAGE_CONTENT"
  }
}'

Revoke a Grant from a Catalog Role

Revokes a privilege from a catalog role.
Endpoint:
DELETE /catalogs/{catalogName}/catalog-roles/{catalogRoleName}/grants

Example cURL:

curl -X DELETE "https://polaris.example.com/api/management/v1/catalogs/finance/catalog-roles/analyst/grants?cascade=true" \
-H "Authorization: Bearer <ACCESS_TOKEN>" \
-H "Content-Type: application/json" \
-d '{
  "grant": {
    "type": "catalog",
    "privilege": "CATALOG_MANAGE_CONTENT"
  }
}'

Assign a Catalog Role to a Principal Role

Assigns a catalog role in the specified catalog to a principal role.
Endpoint:
PUT /principal-roles/{principalRoleName}/catalog-roles/{catalogName}

Example cURL:

curl -X PUT \
  https://polaris.example.com/api/management/v1/principal-roles/report_viewer/catalog-roles/finance_catalog \
  -H "Authorization: Bearer <ACCESS_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
        "catalogRole": {
          "name": "editor",
          "properties": {
            "edit_scope": "limited"
          }
        }
      }'

Assign a Role to a Principal

Assigns a principal role to a principal.
Endpoint:
PUT /principals/{principalName}/principal-roles

Example cURL:

curl -X PUT \
  https://polaris.example.com/api/management/v1/principals/john_doe/principal-roles \
  -H "Authorization: Bearer <ACCESS_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
        "principalRole": {
          "name": "data_admin",
          "properties": {
            "access_level": "full"
          }
        }
      }'

Update a Principal Role

Updates the details of an existing principal role.
Endpoint:
PUT /principal-roles/{principalRoleName}

Example cURL:

curl -X PUT \
  https://polaris.example.com/api/management/v1/principal-roles/report_viewer \
  -H "Authorization: Bearer <ACCESS_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
        "currentEntityVersion": 2,
        "properties": {
          "access_scope": "write_enabled"
        }
      }'

Revoke a Role from a Principal

Removes a principal role from a principal.
Endpoint:
DELETE /principals/{principalName}/principal-roles/{principalRoleName}

Example cURL:

curl -X DELETE \
  https://polaris.example.com/api/management/v1/principals/john_doe/principal-roles/data_admin \
  -H "Authorization: Bearer <ACCESS_TOKEN>"

Revoke a Catalog Role from a Principal Role

Removes a catalog role in the specified catalog from a principal role.
Endpoint:
DELETE /principal-roles/{principalRoleName}/catalog-roles/{catalogName}/{catalogRoleName}

Example cURL:

curl -X DELETE \
  https://polaris.example.com/api/management/v1/principal-roles/report_viewer/catalog-roles/finance_catalog/editor \
  -H "Authorization: Bearer <ACCESS_TOKEN>"

Delete a Principal Role

Deletes an existing principal role.
Endpoint:
DELETE /principal-roles/{principalRoleName}

Example cURL:

curl -X DELETE \
  https://polaris.example.com/api/management/v1/principal-roles/report_viewer \
  -H "Authorization: Bearer <ACCESS_TOKEN>"

Delete a Catalog Role

Deletes an existing catalog role.
Endpoint:
DELETE /catalogs/{catalogName}/catalog-roles/{catalogRoleName}

Example cURL:

curl -X DELETE \
 https://polaris.example.com/api/management/v1/catalogs/finance_catalog/catalog-roles/viewer \
  -H "Authorization: Bearer <ACCESS_TOKEN>"

The endpoints above cover the management of catalog roles and principal roles, helping you enforce access control effectively in Apache Polaris.
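Taken together, these endpoints form a chain: a privilege is granted to a catalog role, the catalog role is attached to a principal role, and the principal role is assigned to a principal. The sketch below lays out one plausible order of calls as plain data, using only the endpoints and body shapes shown in this section. The helper access_setup_plan and the names passed to it are hypothetical, and the ordering is an assumption rather than a requirement stated by Polaris.

```python
BASE = "/api/management/v1"


def access_setup_plan(catalog, catalog_role, principal_role, principal, privilege):
    """Return (method, path, body) tuples in the order they would be sent."""
    return [
        # 1. Create the catalog role inside the catalog.
        ("POST", f"{BASE}/catalogs/{catalog}/catalog-roles",
         {"catalogRole": {"name": catalog_role}}),
        # 2. Grant a privilege to that catalog role.
        ("PUT", f"{BASE}/catalogs/{catalog}/catalog-roles/{catalog_role}/grants",
         {"grant": {"type": "catalog", "privilege": privilege}}),
        # 3. Create the principal role.
        ("POST", f"{BASE}/principal-roles",
         {"principalRole": {"name": principal_role}}),
        # 4. Attach the catalog role to the principal role.
        ("PUT", f"{BASE}/principal-roles/{principal_role}/catalog-roles/{catalog}",
         {"catalogRole": {"name": catalog_role}}),
        # 5. Assign the principal role to the principal.
        ("PUT", f"{BASE}/principals/{principal}/principal-roles",
         {"principalRole": {"name": principal_role}}),
    ]


plan = access_setup_plan(
    "finance_catalog", "viewer", "report_viewer", "john_doe",
    "CATALOG_MANAGE_CONTENT",
)
for method, path, _body in plan:
    print(method, path)
```

Expressing the plan as data keeps it easy to review or log before any request is sent; each tuple can then be passed to an HTTP client of your choice.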

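Apache Iceberg REST Catalog Endpoints

This section covers the endpoints that Apache Polaris exposes as its implementation of the Apache Iceberg REST Catalog specification. These endpoints let clients interact with catalogs, namespaces, tables, and views in a standardized, interoperable way.

In Polaris, each of these endpoints is prefixed with /api/catalog/. For example, /v1/config in the Iceberg specification is reached at /api/catalog/v1/config in Polaris.

These APIs are designed to be compatible with tools and engines that follow the Iceberg REST standard, and they support advanced catalog operations such as schema management, table creation, and view control. Although Polaris also provides its own management API (see Sections 5.1–5.3), the Iceberg REST Catalog endpoints bring broader ecosystem interoperability and integration with external engines.

The endpoints are described in groups below: first an overview of their purpose, then the corresponding API definitions.

The latest version of the Iceberg REST Catalog specification is available at: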
github.com/apache/iceb…

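Configuration API

Retrieves the default and override configuration properties supplied by the server, which are used to initialize and manage an Iceberg catalog client.
Endpoint: GET /v1/config

Clients should call this endpoint first to obtain the default and override properties that determine catalog client behavior. The response contains:

  • Defaults: settings applied before the client's own configuration
  • Overrides: settings applied after the client's own configuration
  • Optional endpoints list: the Iceberg REST routes the server supports

These values guide client behavior such as connection pool size, endpoint routing, and warehouse location. For more information, see the Iceberg documentation on catalog properties.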
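The precedence described above (defaults first, then the client's own settings, then overrides) can be illustrated with three dictionary merges. This is a minimal sketch of the ordering only; effective_config is a hypothetical helper, and real Iceberg clients perform this merge internally.

```python
def effective_config(defaults: dict, client_config: dict, overrides: dict) -> dict:
    """Merge catalog properties in the order defaults < client < overrides."""
    merged = dict(defaults)        # server defaults are the starting point
    merged.update(client_config)   # the client's own settings win over defaults
    merged.update(overrides)       # server overrides win over everything
    return merged


# Values taken from the example response for GET /v1/config.
server_defaults = {"client.pool.size": "4", "catalog.name": "main"}
server_overrides = {"warehouse": "s3://bucket/warehouse/"}

# Hypothetical client-side settings.
client_settings = {"client.pool.size": "8", "warehouse": "s3://other/"}

config = effective_config(server_defaults, client_settings, server_overrides)
print(config)
```

In this example the client's pool size replaces the server default, while the client's warehouse setting is discarded in favor of the server override.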

Example cURL:

curl -X GET https://polaris.example.com/api/catalog/v1/config \
     -H "Authorization: Bearer <ACCESS_TOKEN>" \
     -H "Content-Type: application/json"

Example Python:

import requests

url = "https://polaris.example.com/api/catalog/v1/config"
headers = {
    "Authorization": "Bearer <ACCESS_TOKEN>",
    "Content-Type": "application/json"
}

response = requests.get(url, headers=headers)
print(response.json())

Example response:

{
  "defaults": {
    "client.pool.size": "4",
    "catalog.name": "main"
  },
  "overrides": {
    "warehouse": "s3://bucket/warehouse/"
  },
  "endpoints": [
    "GET /v1/{prefix}/namespaces",
    "POST /v1/{prefix}/namespaces",
    "GET /v1/{prefix}/namespaces/{namespace}",
    "DELETE /v1/{prefix}/namespaces/{namespace}",
    "GET /v1/{prefix}/namespaces/{namespace}/tables"
  ]
}

OAuth2 API

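Exchanges credentials or tokens using the OAuth 2.0 client credentials or token exchange flow.

Note
This endpoint is deprecated and will be removed; it is not recommended for new implementations. Use the oauth2-server-uri configuration property to integrate an external identity provider instead.

Endpoint: POST /v1/oauth/tokens

The endpoint was originally intended to support three flows:

  • Client Credentials Flow: exchanges a client_id and client_secret for an access token
  • Token Exchange Flow (Actor + Subject): exchanges a client token and an identity token for a new access token that carries user context
  • Token Refresh Flow: exchanges a token that is about to expire for a new token with a refreshed expiration

These capabilities are gradually being replaced by more secure, explicitly configured OAuth integrations.

Example cURL (client credentials flow):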

curl -X POST https://polaris.example.com/api/catalog/v1/oauth/tokens \
     -H "Content-Type: application/x-www-form-urlencoded" \
     -d 'grant_type=client_credentials&client_id=your-client-id&client_secret=your-client-secret'

Example Python:

import requests

url = "https://polaris.example.com/api/catalog/v1/oauth/tokens"
headers = {"Content-Type": "application/x-www-form-urlencoded"}
payload = {
    "grant_type": "client_credentials",
    "client_id": "your-client-id",
    "client_secret": "your-client-secret"
}

response = requests.post(url, headers=headers, data=payload)
print(response.json())

Example response:

{
  "access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "token_type": "Bearer",
  "expires_in": 3600
}

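Table API

Creates, registers, updates, loads, drops, and renames Iceberg tables.
Base path prefix: /v1/{prefix}/namespaces/{namespace}/tables

The Table API supports programmatic management of the full Iceberg table lifecycle: creating new tables, registering existing tables by metadata location, committing schema and snapshot changes, retrieving table metadata, and more. It is useful for automating ingestion pipelines, CI/CD workflows, and schema governance.

List Table Identifiers

Retrieves all table identifiers under a namespace.
Endpoint: GET /v1/{prefix}/namespaces/{namespace}/tables
Pagination is supported through the pageSize and pageToken query parameters.

Example cURL: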

curl -X GET https://polaris.example.com/api/catalog/v1/dev_team/namespaces/analytics/tables \
     -H "Authorization: Bearer <ACCESS_TOKEN>" \
     -H "Content-Type: application/json"

Example Python:

import requests
url = "https://polaris.example.com/api/catalog/v1/dev_team/namespaces/analytics/tables"
headers = {
    "Authorization": "Bearer <ACCESS_TOKEN>",
    "Content-Type": "application/json"
}
response = requests.get(url, headers=headers)
print(response.json())

Example response:

{
  "identifiers": [
    {"namespace": ["analytics"], "name": "monthly_sales"},
    {"namespace": ["analytics"], "name": "user_events"},
    {"namespace": ["analytics"], "name": "product_catalog"}
  ]
}
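When a listing is paginated, the client keeps requesting pages until the server stops returning a continuation token. The sketch below shows that loop with the HTTP call abstracted behind a fetch_page callable, so it runs without a server. It assumes the continuation token arrives in a next-page-token field, as in the Iceberg REST specification; iter_identifiers and fake_fetch are hypothetical names, and how the first page is requested is left to fetch_page.

```python
from typing import Callable, Iterator, Optional


def iter_identifiers(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield table identifiers from every page until no token is returned."""
    token = None
    while True:
        page = fetch_page(token)
        yield from page.get("identifiers", [])
        token = page.get("next-page-token")  # assumed field name
        if not token:
            break


# Stand-in for the server: two pages keyed by the token that requests them.
_PAGES = {
    None: {
        "identifiers": [{"namespace": ["analytics"], "name": "monthly_sales"}],
        "next-page-token": "p2",
    },
    "p2": {
        "identifiers": [{"namespace": ["analytics"], "name": "user_events"}],
    },
}


def fake_fetch(token: Optional[str]) -> dict:
    return _PAGES[token]


names = [ident["name"] for ident in iter_identifiers(fake_fetch)]
print(names)
```

In a real client, fetch_page would issue the GET request and pass the token and page size as query parameters.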

Create a Namespace

Endpoint: POST /v1/{prefix}/namespaces
Creates a logical namespace, optionally with metadata properties. If the namespace already exists, the server returns a conflict.

Example cURL:

curl -X POST https://polaris.example.com/api/catalog/v1/dev_team/namespaces \
     -H "Authorization: Bearer <ACCESS_TOKEN>" \
     -H "Content-Type: application/json" \
     -d '{
           "namespace": ["analytics"],
           "properties": {
             "owner": "data.platform@company.com",
             "created_by": "automation_pipeline"
           }
         }'

Example Python:

import requests

url = "https://polaris.example.com/api/catalog/v1/dev_team/namespaces"
headers = {
    "Authorization": "Bearer <ACCESS_TOKEN>",
    "Content-Type": "application/json"
}
payload = {
    "namespace": ["analytics"],
    "properties": {
        "owner": "data.platform@company.com",
        "created_by": "automation_pipeline"
    }
}
response = requests.post(url, headers=headers, json=payload)
print(response.json())

Example response:

{
  "namespace": ["analytics"],
  "properties": {
    "owner": "data.platform@company.com",
    "created_by": "automation_pipeline",
    "last_modified_time": "2025-06-20T13:45:00Z"
  }
}

Load Namespace Properties

Endpoint: GET /v1/{prefix}/namespaces/{namespace}

Example cURL:

curl -X GET https://polaris.example.com/api/catalog/v1/dev_team/namespaces/analytics \
     -H "Authorization: Bearer <ACCESS_TOKEN>" \
     -H "Content-Type: application/json"

Example response:

{
  "namespace": ["analytics"],
  "properties": {
    "owner": "data.platform@company.com",
    "description": "Namespace for analytics tables and dashboards",
    "last_modified_time": "2025-06-20T14:10:00Z"
  }
}

Check Namespace Existence

Endpoint: HEAD /v1/{prefix}/namespaces/{namespace}
Returns 204 No Content if the namespace exists and 404 Not Found if it does not.

Example cURL:

curl -I -X HEAD https://polaris.example.com/api/catalog/v1/dev_team/namespaces/analytics \
     -H "Authorization: Bearer <ACCESS_TOKEN>"
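In a script, the two status codes map naturally onto a boolean, with anything else treated as an error rather than silently read as "missing". A minimal sketch, where exists_from_status is a hypothetical helper that takes the status code of the HEAD response:

```python
def exists_from_status(status_code: int) -> bool:
    """Interpret the status code of a HEAD existence check."""
    if status_code == 204:
        return True
    if status_code == 404:
        return False
    # Codes such as 401 or 403 say nothing about existence, so fail loudly.
    raise RuntimeError(f"unexpected status for existence check: {status_code}")


print(exists_from_status(204))
print(exists_from_status(404))
```

With the requests library, the status code would come from something like requests.head(url, headers=headers).status_code.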

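Drop a Namespace

Endpoint: DELETE /v1/{prefix}/namespaces/{namespace}
The namespace must be empty; otherwise the server returns a conflict.

Possible responses:

  • 204 No Content: the namespace was dropped
  • 404 Not Found: the namespace does not exist
  • 409 Conflict: the namespace is not empty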

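Set or Remove Namespace Properties

Endpoint: POST /v1/{prefix}/namespaces/{namespace}/properties
Only the properties named in the request body are changed; all others are left untouched. Implementations are not required to support namespace properties.

Example cURL: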

curl -X POST https://polaris.example.com/api/catalog/v1/dev_team/namespaces/analytics/properties \
     -H "Authorization: Bearer <ACCESS_TOKEN>" \
     -H "Content-Type: application/json" \
     -d '{
           "updates": {
             "owner": "dataops@company.com",
             "environment": "production"
           },
           "removals": ["created_by"]
         }'

Example response:

{
  "namespace": ["analytics"],
  "properties": {
    "owner": "dataops@company.com",
    "environment": "production"
  }
}
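Since a single request carries both updates and removals, it helps to build the body in one place and catch obvious mistakes before sending it. The following sketch rejects a key that appears in both lists, on the assumption that such a request is never intended; namespace_property_changes is a hypothetical helper.

```python
def namespace_property_changes(updates=None, removals=None) -> dict:
    """Build the body for POST .../namespaces/{namespace}/properties."""
    updates = dict(updates or {})
    removals = list(removals or [])
    # A key cannot sensibly be set and removed in the same request.
    overlap = sorted(set(updates) & set(removals))
    if overlap:
        raise ValueError(f"keys in both updates and removals: {overlap}")
    return {"updates": updates, "removals": removals}


body = namespace_property_changes(
    updates={"owner": "dataops@company.com", "environment": "production"},
    removals=["created_by"],
)
print(body)
```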

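Create a Table

Endpoint: POST /v1/{prefix}/namespaces/{namespace}/tables
A table can be created immediately (stage-create: false) or staged (stage-create: true), in which case the transaction is completed later through a commit to the table.

Example cURL (immediate creation):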

curl -X POST https://polaris.example.com/api/catalog/v1/dev/namespaces/analytics/tables \
     -H "Authorization: Bearer <ACCESS_TOKEN>" \
     -H "Content-Type: application/json" \
     -d '{
           "name": "customer_profiles",
           "schema": {
             "type": "struct",
             "fields": [
               {"id": 1, "name": "customer_id", "required": true, "type": "long"},
               {"id": 2, "name": "email", "required": false, "type": "string"},
               {"id": 3, "name": "created_at", "required": true, "type": "timestamp"}
             ]
           },
           "spec": { "fields": [ { "source-id": 3, "transform": "day", "name": "day_created" } ] },
           "stage-create": false
         }'

Example response:

{
  "metadata-location": "s3://warehouse/analytics/customer_profiles/metadata/00000.metadata.json",
  "table-uuid": "9bde37b3-fc9e-4b32-9b9e-abc123def456"
}
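The only difference between an immediate and a staged create is the stage-create flag in the request body. The sketch below builds both variants from the same field list; create_table_request is a hypothetical helper, and the body is limited to the fields used in the cURL example above.

```python
import json


def create_table_request(name: str, fields: list, stage: bool = False) -> dict:
    """Build a create-table body; stage=True requests a staged create."""
    return {
        "name": name,
        "schema": {"type": "struct", "fields": list(fields)},
        "stage-create": stage,
    }


fields = [
    {"id": 1, "name": "customer_id", "required": True, "type": "long"},
    {"id": 2, "name": "email", "required": False, "type": "string"},
]

immediate = create_table_request("customer_profiles", fields)
staged = create_table_request("customer_profiles", fields, stage=True)
print(json.dumps(staged, indent=2))
```

A staged table is not visible in the catalog until a later commit completes it, so the staged variant is typically followed by a commit request.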

Register a Table

Endpoint: POST /v1/{prefix}/namespaces/{namespace}/register
Registers an existing Iceberg table by the path of its metadata JSON file (common in migration and recovery scenarios).

Example response:

{
  "metadata-location": "s3://warehouse/analytics/recovered_events/metadata/00000.metadata.json",
  "table-uuid": "52f7ab2e-47d1-4d96-9f5c-d6e12a77de2a"
}
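A register request needs little more than the table name and the location of its metadata file. The sketch below builds that body using the name and metadata-location fields from the Iceberg REST specification as I understand it. The helper register_table_request is hypothetical, and its suffix check is a convenience assumption, not a rule enforced by the API.

```python
def register_table_request(name: str, metadata_location: str) -> dict:
    """Build the body for POST .../namespaces/{namespace}/register."""
    # Convenience check (an assumption): Iceberg metadata files
    # conventionally end in ".metadata.json".
    if not metadata_location.endswith(".metadata.json"):
        raise ValueError("metadata_location should point at a metadata JSON file")
    return {"name": name, "metadata-location": metadata_location}


body = register_table_request(
    "recovered_events",
    "s3://warehouse/analytics/recovered_events/metadata/00000.metadata.json",
)
print(body)
```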

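Load Table Metadata

Endpoint: GET /v1/{prefix}/namespaces/{namespace}/tables/{table}
An optional If-None-Match header lets the server answer with 304 Not Modified; the snapshots parameter controls which snapshots are returned.

Example response (excerpt):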

{
  "metadata-location": "s3://warehouse/analytics/customer_profiles/metadata/00001.metadata.json",
  "table-uuid": "9bde37b3-fc9e-4b32-9b9e-abc123def456",
  "schema": { ... },
  "spec": { ... },
  "last-sequence-number": 7,
  "last-snapshot-id": 7452913019,
  "snapshots": [ ... ]
}

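Commit Updates to a Table

Endpoint: POST /v1/{prefix}/namespaces/{namespace}/tables/{table}
Used to append or overwrite data, update the schema, or complete a staged create. Commits use optimistic locking: the base metadata location supplied with the commit must match the current one.

Example response: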

{
  "metadata-location": "s3://warehouse/analytics/customer_profiles/metadata/00002.metadata.json",
  "table-uuid": "9bde37b3-fc9e-4b32-9b9e-abc123def456"
}
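With optimistic locking, a rejected commit is not fatal: the client reloads the table, rebuilds its changes on top of the new state, and tries again. The sketch below shows that retry loop with the HTTP call and the request construction injected as callables, so it can run without a server. commit_with_retry is a hypothetical helper, and treating 200 as success and 409 as "retry" is an assumption about how a client would interpret the responses.

```python
from typing import Callable


def commit_with_retry(build_request: Callable[[], dict],
                      send: Callable[[dict], int],
                      max_attempts: int = 3) -> int:
    """Send a commit, rebuilding it after each 409 Conflict.

    Returns the number of attempts used. build_request is expected to
    reload the table state and produce a fresh request body every time.
    """
    for attempt in range(1, max_attempts + 1):
        status = send(build_request())
        if status == 200:
            return attempt
        if status != 409:
            raise RuntimeError(f"commit failed with HTTP {status}")
    raise RuntimeError(f"still conflicting after {max_attempts} attempts")


# Simulated server: the first commit conflicts, the second succeeds.
_responses = iter([409, 200])
attempts_used = commit_with_retry(
    build_request=lambda: {"requirements": [], "updates": []},
    send=lambda body: next(_responses),
)
print(attempts_used)
```

Bounding the number of attempts keeps a heavily contended table from causing an endless loop.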

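Drop a Table

Endpoint: DELETE /v1/{prefix}/namespaces/{namespace}/tables/{table}
By default only the catalog entry is removed; to delete the underlying data as well, add purgeRequested=true.

Possible responses:

  • 204 No Content: the table was dropped
  • 404 Not Found: the table does not exist
  • 409 Conflict: the table cannot be dropped (for example, it is in use or a write conflicts)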
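Because purging removes data permanently, it is worth making the flag an explicit, opt-in argument when building the URL. A minimal sketch, where drop_table_url is a hypothetical helper and the base URL is the placeholder host used throughout this chapter:

```python
from urllib.parse import quote, urlencode

CATALOG_BASE = "https://polaris.example.com/api/catalog"  # placeholder host


def drop_table_url(prefix: str, namespace: str, table: str,
                   purge: bool = False) -> str:
    """Build the DELETE URL for a table; purge adds purgeRequested=true."""
    url = (
        f"{CATALOG_BASE}/v1/{quote(prefix, safe='')}"
        f"/namespaces/{quote(namespace, safe='')}"
        f"/tables/{quote(table, safe='')}"
    )
    if purge:
        # Only add the flag when the caller explicitly asks for data removal.
        url += "?" + urlencode({"purgeRequested": "true"})
    return url


print(drop_table_url("dev_team", "analytics", "customer_profiles"))
print(drop_table_url("dev_team", "analytics", "customer_profiles", purge=True))
```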

Check Table Existence

Endpoint: HEAD /v1/{prefix}/namespaces/{namespace}/tables/{table}

  • 204 No Content: the table exists
  • 404 Not Found: the table does not exist

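Rename a Table

Endpoint: POST /v1/{prefix}/tables/rename
Tables can be moved across namespaces; if the destination already exists, the server returns 409 Conflict. The metadata and data files themselves are unchanged.

Possible responses:

  • 204 No Content: the rename succeeded
  • 404 Not Found: the source table does not exist
  • 409 Conflict: the destination table already exists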
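A rename request identifies the table twice: once by its current identifier and once by the one it should have afterwards. The sketch below builds such a body with source and destination objects, following the Iceberg REST specification as I understand it. rename_request is a hypothetical helper, and the destination names are made up for illustration.

```python
def rename_request(src_namespace, src_name, dst_namespace, dst_name) -> dict:
    """Build a rename body; namespaces are lists of name parts."""
    return {
        "source": {"namespace": list(src_namespace), "name": src_name},
        "destination": {"namespace": list(dst_namespace), "name": dst_name},
    }


# Move a table from the 'analytics' namespace to a hypothetical 'archive' one.
body = rename_request(
    ["analytics"], "user_events",
    ["archive"], "user_events_2024",
)
print(body)
```

Because the namespace is part of each identifier, the same request shape covers both a simple rename and a move across namespaces.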

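Submit Metrics for a Table

Endpoint: POST /v1/{prefix}/namespaces/{namespace}/tables/{table}/metrics
Submits performance, access, or operational metrics related to a table, which supports monitoring and observability integrations.

Possible responses:

  • 204 No Content: the metrics were accepted
  • 404 Not Found: the table does not exist
  • 400 Bad Request: the metrics format is invalid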

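Send Table Notifications

Endpoint: POST /v1/{prefix}/namespaces/{namespace}/tables/{table}/notifications
Sends a lightweight, implementation-specific notification event that can trigger downstream processes, convey lifecycle changes, or feed auditing and orchestration.

Possible responses:

  • 200 OK: the notification was accepted
  • 404 Not Found: the table does not exist
  • 400 Bad Request: the payload is invalid or unsupported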

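Commit Updates to Multiple Tables

Endpoint: POST /v1/{prefix}/transactions/commit
Commits metadata updates for multiple tables in a single atomic transaction, so that either all of them succeed or all are rolled back.

Possible responses:

  • 409 Conflict: one or more updates failed validation (for example, a base mismatch); the whole transaction is rolled back
  • 400 Bad Request: the request body is malformed
  • 204 No Content: all updates succeeded and were committed atomically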
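A multi-table commit bundles one change set per table into a single request body. The sketch below assembles such a body with a table-changes list whose entries carry an identifier, requirements, and updates, following the Iceberg REST specification as I understand it. The helpers table_change and multi_table_commit are hypothetical, and the specific requirement and update shown are illustrative only.

```python
import json


def table_change(namespace, name, requirements, updates) -> dict:
    """One table's part of the transaction."""
    return {
        "identifier": {"namespace": list(namespace), "name": name},
        "requirements": list(requirements),  # checks that must hold at commit
        "updates": list(updates),            # metadata changes to apply
    }


def multi_table_commit(changes) -> dict:
    """Wrap per-table changes into one transaction body."""
    return {"table-changes": list(changes)}


set_owner = {"action": "set-properties",
             "updates": {"owner": "dataops@company.com"}}

body = multi_table_commit([
    table_change(
        ["analytics"], "customer_profiles",
        requirements=[{"type": "assert-table-uuid",
                       "uuid": "9bde37b3-fc9e-4b32-9b9e-abc123def456"}],
        updates=[set_owner],
    ),
    table_change(["analytics"], "user_events",
                 requirements=[], updates=[set_owner]),
])
print(json.dumps(body, indent=2))
```

If any table's requirements fail, none of the changes in the list are applied.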

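View API

Manages SQL views within a namespace. A view in Iceberg is a named, versioned object that encapsulates SQL query logic; it can reference tables or other views and is stored in the catalog alongside physical datasets. The endpoints support listing, creating, replacing, loading metadata, dropping, and checking existence. A view version is immutable once created; replacing a view atomically produces a new version, which makes it safe to evolve business logic.

List View Identifiers

Endpoint: GET /v1/{prefix}/namespaces/{namespace}/views

Example response: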

{
  "identifiers": [
    {"namespace": ["analytics"], "name": "customer_summary"},
    {"namespace": ["analytics"], "name": "daily_revenue"}
  ]
}

Possible responses:

  • 404 Not Found: the namespace does not exist
  • 200 OK: the list was returned

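Create a View

Endpoint: POST /v1/{prefix}/namespaces/{namespace}/views
Supply the view identifier and its SQL definition. In the Iceberg REST specification, creating a view whose name already exists results in a conflict; an existing view is changed through the separate replace operation.

Possible responses:

  • 200 OK: the view was created
  • 400 Bad Request: the SQL is invalid or fields are missing
  • 409 Conflict: a view with the same identifier already exists

Load View Metadata

Endpoint: GET /v1/{prefix}/namespaces/{namespace}/views/{view}
Returns the SQL of the current version along with its schema-id, version-id, and related details.

Example response:

{
  "identifier": {"namespace": ["analytics"], "name": "customer_order_summary"},
  "view-version": {
    "version-id": "v1",
    "schema-id": 1,
    "sql": "SELECT customer_id, COUNT(*) AS orders FROM orders GROUP BY customer_id"
  }
}

Possible responses:

  • 404 Not Found: the view does not exist
  • 200 OK: the metadata was returned

Replace a View

Endpoint: POST /v1/{prefix}/namespaces/{namespace}/views/{view}
Replaces an existing view atomically, producing a new version (supplying a new version-id is recommended so that changes can be tracked). In the Iceberg REST specification this operation targets a view that already exists, so a missing view results in 404 Not Found.

Possible responses:

  • 200 OK: the view was replaced
  • 400 Bad Request: the SQL is invalid or fields are missing
  • 404 Not Found: the view does not exist
  • 409 Conflict: a version or schema constraint was violated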

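Drop a View

Endpoint: DELETE /v1/{prefix}/namespaces/{namespace}/views/{view}
A dropped view cannot be recovered, so use this operation with care.

Possible responses:

  • 204 No Content: the view was dropped
  • 404 Not Found: the view does not exist
  • 403 Forbidden: the caller is not allowed to drop the view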

Check View Existence

Endpoint: HEAD /v1/{prefix}/namespaces/{namespace}/views/{view}

  • 204 No Content: the view exists
  • 404 Not Found: the view does not exist
  • 403 Forbidden: the caller is not allowed to access the view

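Rename a View

Endpoint: POST /v1/{prefix}/views/rename
Views can be moved across namespaces; if the destination already exists, the server returns a conflict. The rename is atomic and preserves the underlying view versions and SQL definition.

Possible responses:

  • 204 No Content: the rename succeeded
  • 404 Not Found: the source view does not exist
  • 409 Conflict: the destination already exists
  • 400 Bad Request: the identifier or namespace is invalid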

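These endpoints provide a complete and robust way to manage catalogs, namespaces, tables, and views within Polaris, giving administrators flexible and effective control over their resources.

Conclusion

This chapter surveyed the full set of Polaris REST API endpoints, which support management of catalogs, namespaces, tables, views, and related resources and lay the groundwork for advanced catalog operations. Now that you understand what Polaris can do and how it simplifies data governance and access, it is time to put it into practice. The next section uses hands-on examples to show how to integrate Polaris with tools such as Apache Spark, Dremio, and Snowflake.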