Filecoin Spec Translation: [2.2] Files & Data (Part 1)

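Original: filecoin-project.github.io/specs/#syst…

Translated by Boni Liu of 星际联盟. Please credit the source when reposting.

星际联盟 provides a full translation of the Filecoin Spec to help the many Chinese participants in the Filecoin project understand the deeper principles behind Filecoin. New chapters are published regularly; follow the "IPFS星际联盟" and "星际联盟Filecoin" WeChat official accounts for updates.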

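2.2 Files & Data

Filecoin's primary aim is to store clients' files and data. This section details the data structures and tooling related to working with files, chunking, encoding, graph representations, Pieces, storage abstractions, and more.

2.2.1 File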

// Path is an opaque locator for a file (e.g. in a unix-style filesystem).
type Path string

// File is a variable length data container.
// The File interface is modeled after a unix-style file, but abstracts the
// underlying storage system.
type File interface {
    Path()   Path
    Size()   int
    Close()  error

    // Read reads from File into buf, starting at offset, and for size bytes.
    Read(offset int, size int, buf Bytes) struct {size int, e error}

    // Write writes from buf into File, starting at offset, and for size bytes.
    Write(offset int, size int, buf Bytes) struct {size int, e error}
}
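
The File interface above can be illustrated with a minimal in-memory sketch. MemFile is a hypothetical name, not part of the spec, and the spec's struct-style return values are mapped here onto Go's usual (int, error) pairs; that mapping is an assumption made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// MemFile is a hypothetical in-memory File, used only for illustration.
type MemFile struct {
	path string
	data []byte
}

func (f *MemFile) Path() string { return f.path }
func (f *MemFile) Size() int    { return len(f.data) }
func (f *MemFile) Close() error { return nil }

// Read copies up to size bytes, starting at offset, into buf and
// returns the number of bytes copied.
func (f *MemFile) Read(offset, size int, buf []byte) (int, error) {
	if offset < 0 || size < 0 || offset > len(f.data) {
		return 0, errors.New("read out of range")
	}
	end := offset + size
	if end > len(f.data) {
		end = len(f.data)
	}
	return copy(buf, f.data[offset:end]), nil
}

// Write copies size bytes from buf into the file at offset,
// growing the file with zero bytes when needed.
func (f *MemFile) Write(offset, size int, buf []byte) (int, error) {
	if offset < 0 || size < 0 || size > len(buf) {
		return 0, errors.New("write out of range")
	}
	if need := offset + size; need > len(f.data) {
		f.data = append(f.data, make([]byte, need-len(f.data))...)
	}
	return copy(f.data[offset:], buf[:size]), nil
}

func main() {
	f := &MemFile{path: "/tmp/example"}
	f.Write(0, 5, []byte("hello"))
	f.Write(5, 6, []byte(" world"))

	buf := make([]byte, 5)
	n, _ := f.Read(6, 5, buf)
	fmt.Println(f.Size(), n, string(buf[:n])) // prints "11 5 world"
}
```

Offset-based Read and Write keep the interface stateless with respect to a cursor, which is what lets the same contract sit on top of very different storage systems.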

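2.2.1.1 FileStore - Local Storage for Files

FileStore is an abstraction that refers to any underlying system or device that Filecoin stores its data to. It is based on Unix filesystem semantics and includes the notion of Paths. The abstraction exists so that Filecoin implementations make it easy for end users to replace the underlying storage system with whatever suits their needs. The simplest version of FileStore is just the host operating system's file system.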

// FileStore is an object that can store and retrieve files by path.
type FileStore struct {
    Open(p Path)           union {f File, e error}
    Create(p Path)         union {f File, e error}
    Store(p Path, f File)  error
    Delete(p Path)         error

    // maybe add:
    // Copy(SrcPath, DstPath)
}

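2.2.1.1.1 Varying User Needs

Filecoin user needs vary significantly, and many users (miners especially) will build complex storage architectures underneath and around Filecoin. The FileStore abstraction exists to make these varying needs easy to satisfy. All storage of file and sector-local data in the Filecoin protocol is defined in terms of this FileStore interface, which makes it easy for implementations to be swappable and for end users to swap in their system of choice.

2.2.1.1.2 Implementation Examples

The FileStore interface may be implemented by many kinds of backing data storage systems. For example:

The host operating system's file system
Any Unix/Posix file system
RAID-backed file systems
Networked or distributed file systems (NFS, HDFS, etc.)
IPFS
Databases
NAS systems
Raw serial or block devices
Raw hard drives (hdd sectors, etc.)

Implementations SHOULD implement support for the host OS file system. Implementations MAY implement support for other storage systems.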

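2.2.2 Piece - Part of a File

A Piece is an object that represents all or part of a File, and is used by Clients and Miners in Deals. Clients hire Miners to store Pieces.

The piece data structure is designed for proving storage of arbitrary IPLD graphs and client data. The diagram below shows the detailed composition of a piece and its proving tree, including both the full and the bandwidth-optimized piece data structures.

Figure: Pieces, Proving Trees, and Piece Data Structures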

import abi "github.com/filecoin-project/specs-actors/actors/abi"

// PieceInfo is an object that describes details about a piece, and allows
// decoupling storage of this information from the piece itself.
type PieceInfo struct {
    ID    PieceID
    Size  abi.PieceSize
    // TODO: store which algorithms were used to construct this piece.
}

// Piece represents the basic unit of tradeable data in Filecoin. Clients
// break files and data up into Pieces, maybe apply some transformations,
// and then hire Miners to store the Pieces.
//
// The kinds of transformations that may occur include erasure coding,
// encryption, and more.
//
// Note: pieces are well formed.
type Piece struct {
    Info       PieceInfo

    // tree is the internal representation of Piece. It is a tree
    // formed according to a sequence of algorithms, which make the
    // piece able to be verified.
    tree       PieceTree

    // Payload is the user's data.
    Payload()  Bytes

    // Data returns the serialized representation of the Piece.
    // It includes the payload data, and intermediate tree objects,
    // formed according to relevant storage algorithms.
    Data()     Bytes
}

// // LocalPieceRef is an object used to refer to pieces in local storage.
// // This is used by subsystems to store and locate pieces.
// type LocalPieceRef struct {
//   ID   PieceID
//   Path file.Path
// }

// PieceTree is a data structure used to form pieces. The algorithms involved
// in the storage proofs determine the shape of PieceTree and how it must be
// constructed.
//
// Usually, a node in PieceTree will include either Children or Data, but not
// both.
//
// TODO: move this into filproofs -- use a tree from there, as that's where
// the algorithms are defined. Or keep this as an interface, met by others.
type PieceTree struct {
    Children  [PieceTree]
    Data      Bytes
}

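2.2.2.1 PieceStore - Storing and Indexing Pieces

A PieceStore is an object that can store pieces in, and retrieve them from, some local storage. The PieceStore additionally keeps an index of pieces.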

import ipld "github.com/filecoin-project/specs/libraries/ipld"

type PieceID UVarint

// PieceStore is an object that stores pieces into some local storage.
// it is internally backed by an IpldStore.
type PieceStore struct {
    Store              ipld.GraphStore
    Index              {PieceID: Piece}

    Get(i PieceID)     struct {p Piece, e error}
    Put(p Piece)       error
    Delete(i PieceID)  error
}
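
The Get/Put/Delete contract can be sketched with a map-backed store. MemPieceStore, ErrNotFound, and the trimmed-down Piece type are hypothetical, and the ipld.GraphStore backing from the spec is left out, so this shows only the index behaviour.

```go
package main

import (
	"errors"
	"fmt"
)

type PieceID uint64

// Piece is reduced to an ID and payload for this sketch.
type Piece struct {
	ID      PieceID
	Payload []byte
}

// ErrNotFound is returned when no piece is indexed under an ID.
var ErrNotFound = errors.New("piece not found")

// MemPieceStore keeps pieces in a map keyed by PieceID.
type MemPieceStore struct {
	index map[PieceID]Piece
}

func NewMemPieceStore() *MemPieceStore {
	return &MemPieceStore{index: map[PieceID]Piece{}}
}

// Put stores the piece, replacing any piece with the same ID.
func (s *MemPieceStore) Put(p Piece) error {
	s.index[p.ID] = p
	return nil
}

// Get looks the piece up in the index.
func (s *MemPieceStore) Get(id PieceID) (Piece, error) {
	p, ok := s.index[id]
	if !ok {
		return Piece{}, ErrNotFound
	}
	return p, nil
}

// Delete removes the piece from the index.
func (s *MemPieceStore) Delete(id PieceID) error {
	if _, ok := s.index[id]; !ok {
		return ErrNotFound
	}
	delete(s.index, id)
	return nil
}

func main() {
	store := NewMemPieceStore()
	store.Put(Piece{ID: 1, Payload: []byte("client data")})

	p, err := store.Get(1)
	fmt.Println(string(p.Payload), err) // prints "client data <nil>"

	store.Delete(1)
	_, err = store.Get(1)
	fmt.Println(err) // prints "piece not found"
}
```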

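2.2.3 Data Transfer in Filecoin

Data Transfer is a system for transferring all or part of a Piece across the network when a deal is made.

2.2.3.1 Modules

The diagram below shows how Data Transfer and its modules fit in with the Storage and Retrieval Markets. In particular, note how the Data Transfer Request Validators from the markets are plugged into the Data Transfer module, while their code still belongs to the Markets system.

Figure: Data Transfer - Push Flow

2.2.3.2 Terminology

Push Request: A request to send data to the other party.
Pull Request: A request to have the other party send data.
Requestor: The party that initiates the data transfer request (whether Push or Pull).
Responder: The party that receives the data transfer request.
Data Transfer Voucher: A wrapper around storage or retrieval data that can identify and validate the transfer request to the other party.
Request Validator: The data transfer module only initiates a transfer when the responder can validate that the request is tied directly to an existing storage deal or retrieval deal. Validation is not performed by the data transfer module itself. Instead, a request validator inspects the data transfer voucher to determine whether to respond to the request.
Scheduler: Once a request has been negotiated and validated, the actual transfer is managed by a Scheduler on each side. The Scheduler is part of the data transfer module but is isolated from the negotiation process. It has access to an underlying verifiable transport protocol and uses it to send data and track progress.
Subscriber: An external component that monitors the progress of a data transfer by subscribing to data transfer events (such as in progress or completed).
GraphSync: The default underlying transport protocol used by the Scheduler. The full graphsync specification can be found at github.com/ipld/specs/…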
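
The split between the data transfer module and a market-supplied Request Validator can be sketched as an interface that the module calls without knowing how deals work. All names here (Voucher, RequestValidator, dealValidator, the string peer identifiers) are hypothetical and are not the API of any real library.

```go
package main

import (
	"errors"
	"fmt"
)

// Voucher is a hypothetical wrapper that ties a transfer request to a deal.
type Voucher struct {
	DealID string
}

// RequestValidator decides whether a transfer request may proceed.
// The data transfer module would only hold this interface.
type RequestValidator interface {
	ValidatePush(sender string, v Voucher) error
	ValidatePull(receiver string, v Voucher) error
}

// dealValidator stands in for market code: it accepts only vouchers
// that reference a known deal with the expected counterparty.
type dealValidator struct {
	deals map[string]string // deal ID -> counterparty
}

func (d dealValidator) check(peer string, v Voucher) error {
	want, ok := d.deals[v.DealID]
	if !ok {
		return errors.New("voucher does not match any known deal")
	}
	if want != peer {
		return errors.New("voucher belongs to a different counterparty")
	}
	return nil
}

func (d dealValidator) ValidatePush(sender string, v Voucher) error {
	return d.check(sender, v)
}

func (d dealValidator) ValidatePull(receiver string, v Voucher) error {
	return d.check(receiver, v)
}

func main() {
	var v RequestValidator = dealValidator{
		deals: map[string]string{"deal-1": "client-A"},
	}
	fmt.Println(v.ValidatePush("client-A", Voucher{DealID: "deal-1"}) == nil) // prints "true"
	fmt.Println(v.ValidatePush("client-B", Voucher{DealID: "deal-1"}) == nil) // prints "false"
}
```

Keeping validation behind an interface is one way to let the markets own the deal logic while the transfer module stays generic.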

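2.2.3.3 Request Phases

Any data transfer has two basic phases:

Negotiation - the requestor and responder agree to the transfer by validating it with the data transfer voucher.
Transfer - once the negotiation phase is complete, the data is actually transferred. The default protocol used for the transfer is Graphsync.

Note that the Negotiation and Transfer phases can occur in separate round trips, or potentially in the same round trip, in which case the requesting party implicitly agrees by sending the request, and the responding party can agree and immediately send or receive data.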

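2.2.3.4 Example Flows

2.2.3.4.1 Push Flow

Figure: Data Transfer - Push Flow

A requestor initiates a Push transfer when it wants to send data to another party.
The requestor's data transfer module sends a push request to the responder along with the data transfer voucher. It also puts the data transfer in its scheduler queue, meaning it expects the responder to initiate the transfer once the request has been validated.
The responder's data transfer module validates the data transfer request via the Validator, which the responder provides as a dependency.
The responder's data transfer module schedules the transfer.
The responder makes a GraphSync request for the data.
The requestor receives the Graphsync request, verifies that it is in the scheduler, and if so begins sending data.
The responder receives the data and can produce an indication of progress.
The responder finishes receiving the data and notifies all listeners.

The push flow is ideal for storage deals, where the client initiates the push as soon as it has verified that the deal is signed and on chain.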
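
The ordering of the push flow can be traced with a small in-process simulation. This is only a sketch under stated assumptions: party, push, and the validate callback are hypothetical names, a plain slice append stands in for the GraphSync transfer, and no networking is involved.

```go
package main

import "fmt"

// party is a hypothetical stand-in for one side's data transfer module.
type party struct {
	name      string
	scheduled map[string]bool // transfer IDs this side has scheduled
	received  []byte          // data this side has received
}

func newParty(name string) *party {
	return &party{name: name, scheduled: map[string]bool{}}
}

// push runs the push flow: the requestor sends data to the responder.
// validate stands in for the responder's request validator.
func push(req, resp *party, id string, data []byte, validate func(id string) bool) error {
	// The requestor queues the transfer when it sends the push request.
	req.scheduled[id] = true

	// The responder validates the voucher before scheduling anything.
	if !validate(id) {
		delete(req.scheduled, id)
		return fmt.Errorf("%s rejected transfer %s", resp.name, id)
	}
	resp.scheduled[id] = true

	// The responder requests the data; the requestor serves only
	// transfers that are in its scheduler.
	if !req.scheduled[id] {
		return fmt.Errorf("transfer %s not scheduled on %s", id, req.name)
	}

	// The responder receives the data and completes the transfer.
	resp.received = append(resp.received, data...)
	return nil
}

func main() {
	client, miner := newParty("client"), newParty("miner")
	accept := func(string) bool { return true }

	err := push(client, miner, "deal-1", []byte("piece bytes"), accept)
	fmt.Println(err, string(miner.received))
}
```

Note that although the requestor initiates the flow, it is the responder that asks for the data, so the requestor never sends anything that has not passed validation.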

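2.2.3.4.2 Pull Flow

Figure: Data Transfer - Pull Flow

A requestor initiates a Pull transfer when it wants to receive data from another party.
The requestor's data transfer module sends a pull request to the responder along with the data transfer voucher.
The responder's data transfer module validates the data transfer request via a PullValidator, which the responder provides as a dependency.
The responder's data transfer module schedules the transfer (meaning it expects the requestor to initiate the actual transfer).
The responder's data transfer module sends a response to the requestor saying that it has accepted the transfer request and is waiting for the requestor to initiate the transfer.
The requestor schedules the data transfer.
The requestor makes a GraphSync request for the data.
The responder receives the Graphsync request, verifies that it is in the scheduler, and if so begins sending data.
The requestor receives the data and can produce an indication of progress.
The requestor finishes receiving the data and notifies all listeners.

The pull flow is ideal for retrieval deals, where the client initiates the pull once the deal has been agreed upon.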
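
For comparison with the push flow, the pull flow can be traced the same way. Again this is a sketch, not an implementation: party, pull, and the validate callback are hypothetical, and a slice append stands in for the GraphSync transfer.

```go
package main

import "fmt"

// party is a hypothetical stand-in for one side's data transfer module.
type party struct {
	name      string
	scheduled map[string]bool // transfer IDs this side has scheduled
	data      []byte          // data this side can serve
	received  []byte          // data this side has received
}

func newParty(name string, data []byte) *party {
	return &party{name: name, scheduled: map[string]bool{}, data: data}
}

// pull runs the pull flow: the requestor asks the responder to send data.
// validate stands in for the responder's PullValidator.
func pull(req, resp *party, id string, validate func(id string) bool) error {
	// The responder validates the voucher before anything is scheduled.
	if !validate(id) {
		return fmt.Errorf("%s rejected transfer %s", resp.name, id)
	}

	// The responder schedules the transfer and waits for the requestor.
	resp.scheduled[id] = true

	// After the responder's acceptance, the requestor schedules its side.
	req.scheduled[id] = true

	// The requestor requests the data; the responder serves only
	// transfers that are in its scheduler.
	if !resp.scheduled[id] {
		return fmt.Errorf("transfer %s not scheduled on %s", id, resp.name)
	}

	// The requestor receives the data and completes the transfer.
	req.received = append(req.received, resp.data...)
	return nil
}

func main() {
	client := newParty("client", nil)
	miner := newParty("miner", []byte("retrieved piece"))
	accept := func(string) bool { return true }

	err := pull(client, miner, "retrieval-1", accept)
	fmt.Println(err, string(client.received))
}
```

The main difference from the push flow is the direction of the data and the fact that the requestor schedules its side only after the responder has accepted the request.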
