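Preface
This article is short. Before introducing Hollow in detail, it pins down some key terms that appear throughout Hollow. Some may be unfamiliar and others well known, but all of them are essential to understanding Hollow.
Original work by @空歌白石.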
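Terminology
The official documentation lists these terms alphabetically. I have regrouped them by the module they belong to, in the hope that this makes Hollow easier to understand.
The definitions below are kept in their original English, since each one is short and easy to follow. If there is demand, I can put together a translated version later.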
| Category | Term | Definition |
|---|---|---|
| model | data model | A data model defines the structure of a dataset. It is specified with a set of schemas. |
| | field | A single value encoded inside of a Hollow record. |
| | hash key | A user-defined specification of one or more fields used to hash elements into a set or entries into a map. |
| | inline | A field for which the value is encoded directly into a record, as opposed to referenced via another record. |
| | namespace (references) | The deliberate creation of a type to hold a specific referenced field's data in order to reduce the cardinality of the referenced records. |
| | primary key | A user-defined specification of one or more fields used to uniquely identify a record within a type. |
| | record | A strongly-typed collection of fields or references, the structure of which is specified by a schema. |
| | reference | A field type which indicates a pointer to another record. Can also refer to the technique of pulling out a specific field into a record type of its own to deliberately allow Hollow to deduplicate the values. |
| | schema | Metadata about a Hollow type which defines the structure of the records. |
| | type | A collection of records all conforming to a specific schema. |
| data | blob | A blob is a file used by consumers to update their dataset. A blob will be either a snapshot, delta, or reverse delta. |
| | blob store | A blob store is a file store to which blobs can be published by a producer and retrieved by a consumer. |
| | broken delta chain | When a blob namespace contains a state which is not adjacent to any prior states, the delta chain is said to be broken. In this scenario, consumers may need to load a double snapshot. |
| | deduplication | Two records which have identical data in Hollow will be consolidated into a single record. Any references to duplicate records will be mapped to the canonical one when a dataset is represented with Hollow. |
| | delta | A set of encoded instructions to transition from one data state to an adjacent state. Deltas are encoded as a set of ordinals to remove and a set of ordinals to add, along with the accompanying data to add. 'Delta' may refer specifically to a transition between an earlier state and a later state, contrasted with 'reverse delta', which specifically refers to a transition between a later state and an earlier state. |
| | delta chain | A series of states which are all connected via contiguous deltas. |
| | double snapshot | When a consumer already has an initialized state and an announcement signals to move to a new state for which a path of deltas is not available, the consumer may transition to that state via a snapshot. In this scenario two full copies of the dataset must be loaded in memory. |
| | namespace (blobs) | An addressable, logical separation of both published artifacts in a blob store and announcement location. Used to allow multiple publishers to communicate on separate channels to specific groups of consumers. |
| | patch (states) | Creating a series of two deltas between states in a delta chain. |
| | reverse delta | A delta from a later state to an earlier state. Generally used during pinning scenarios. |
| | snapshot | A blob type which contains a serialization of all of the records in a dataset. Consumed during initialization, and possibly in a broken delta chain scenario. |
| producer | producer | A single machine that retrieves all data from a source of truth and produces a delta chain. |
| | cycle | A producer runs in an infinite loop. Each execution of the loop is called a cycle. Each cycle produces a single data state. |
| | publish | Writing blobs to a blob store. |
| | restore | Initializing a HollowWriteStateEngine with data from a previously produced state so that a delta may be created during a producer's first cycle. |
| | write state engine | A HollowWriteStateEngine, the root handle to a Hollow dataset as a producer. |
| consumer | consumer | One of many machines on which a dataset is made accessible. Consumers are updated in lock-step based on the actions of the producer. |
| | read state engine | A HollowReadStateEngine, the root handle to a Hollow dataset as a consumer. |
| announce | announce | After the blobs for a state have been published to a blob store by a producer, the state must be announced to consumers. The announcement signals to consumers that they should transition to the announced state. |
| | pinning | Overriding the state version announcement from the producer, to force clients to go back to or stay at an older state. |
| state | data state | A dataset changes over time. The timeline for a changing dataset can be broken down into discrete data states, each of which is a complete snapshot of the data at a particular point in time. |
| | state | See data state. |
| | adjacent state | If state A is connected via a single delta to state B, then A and B are adjacent to each other. |
| | diff | A comprehensive accounting for the differences between two data states. |
| | ingestion | Gathering data from a source of truth and importing it into Hollow. |
| | state version | A unique identifier for a state. Should be monotonically increasing as time passes. |
| | state engine | Both the producer and consumers handle datasets with a state engine. A state engine can be transitioned between data states. A producer uses a write state engine and a consumer uses a read state engine. |
| memory | object longevity | A technique used to ensure that stale references to Hollow Objects always return the same data they did initially upon creation. Configured via the HollowObjectMemoryConfig. |
| | ordinal | An integer value uniquely identifying a record within a type. Because records are represented with a fixed-length number of bits, the only necessary information to locate a record in memory is the record's type and ordinal. Ordinals are automatically assigned by Hollow, and are recycled as records are removed and added. Consequently, they lie in the range of 0-n, where n is generally not much larger than the total number of records for the type. |
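
Several of these terms (cycle, data state, ordinal, deduplication, delta, reverse delta) are easier to see working together than in isolation. The following is a minimal toy sketch in plain Python, not the Hollow API: `ToyProducer`, `make_delta`, and `apply_delta` are hypothetical names invented for illustration, a state is modeled as a simple `ordinal -> record` dictionary for a single type, and the ordinal recycling rule is a simplification of whatever Hollow does internally.

```python
import heapq


class ToyProducer:
    """Toy stand-in for a producer of a single type (not the Hollow API)."""

    def __init__(self):
        self.ordinal_of = {}   # record -> ordinal in the current state
        self.free = []         # ordinals released by removed records
        self.next_ordinal = 0  # next never-used ordinal

    def run_cycle(self, source_records):
        """One cycle: read all source records, produce one data state."""
        previous, current = self.ordinal_of, {}
        # Records that survive from the previous state keep their ordinal.
        for record in source_records:
            if record in previous:
                current[record] = previous[record]
        # New records get a recycled ordinal if one is free. Identical
        # records are deduplicated because they map to the same dict key.
        for record in source_records:
            if record not in current:
                if self.free:
                    current[record] = heapq.heappop(self.free)
                else:
                    current[record] = self.next_ordinal
                    self.next_ordinal += 1
        # Ordinals of removed records become reusable from the next cycle on.
        for record, ordinal in previous.items():
            if record not in current:
                heapq.heappush(self.free, ordinal)
        self.ordinal_of = current
        return {ordinal: record for record, ordinal in current.items()}


def make_delta(from_state, to_state):
    """Encode a transition as (ordinals to remove, ordinals to add with data)."""
    removed = sorted(o for o, rec in from_state.items() if to_state.get(o) != rec)
    added = {o: rec for o, rec in to_state.items() if from_state.get(o) != rec}
    return removed, added


def apply_delta(state, delta):
    """Move a consumer-side state to the adjacent state described by delta."""
    removed, added = delta
    result = {o: rec for o, rec in state.items() if o not in removed}
    result.update(added)
    return result


producer = ToyProducer()
v1 = producer.run_cycle(["Alice", "Bob", "Alice"])  # duplicate "Alice" is stored once
v2 = producer.run_cycle(["Bob", "Carol"])           # "Alice" removed, "Carol" added
v3 = producer.run_cycle(["Bob", "Carol", "Dave"])   # "Dave" reuses the freed ordinal

delta = make_delta(v1, v2)          # earlier state -> later state
reverse_delta = make_delta(v2, v1)  # later state -> earlier state
print(v1, v2, v3)
print(delta, reverse_delta)
```

In this sketch, applying `delta` to `v1` yields `v2`, and applying `reverse_delta` to `v2` yields `v1` again, which is what makes the two states adjacent. The third cycle shows why ordinals stay in a compact range: the ordinal freed when "Alice" was removed is handed to "Dave" instead of growing the range.
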
Closing
Explaining Hollow properly takes a fair amount of space. All of the articles completed so far can be found in the Netflix Hollow series column.
Best wishes.