Distributed Systems: Logical Clocks


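Background:

When guaranteeing transactional consistency, ordering events along a timeline is often an important way to determine causality. In a single system this is usually very accurate, because a single processor and a multiprocessor alike read the same physical clock. In a distributed system, however, the physical clocks of different machines differ, so a timeline built from them rarely gives a sufficiently accurate ordering guarantee. We therefore need a new kind of "clock", which we call a logical clock.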

Logical Clocks refer to implementing a protocol on all machines within your distributed system, so that the machines are able to maintain consistent ordering of events within some virtual timespan. This is more formally specified as a way of placing events in some timespan so the following property will always be true:
Given two events (e1, e2) where one is caused by the other (e1 contributes to e2 occurring), the timestamp of the causing event (e1) is less than the timestamp of the other event (e2).
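
The property can also be written as a formula. The notation is an assumption introduced here, not taken from the quoted text: C(e) stands for the timestamp assigned to event e, and the arrow stands for "causally contributes to".

```latex
e_1 \rightarrow e_2 \;\Longrightarrow\; C(e_1) < C(e_2)
```

The implication runs in one direction only: a smaller timestamp does not by itself prove that one event caused the other.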

Everything discussed below refers to the logical clock as defined here.

scalar clock: (an eventual-consistency guarantee)

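As I understand it, a scalar clock works by having each node initialize a local clock to 0 and then update it as messages are sent and received:

Sending a message:

Before sending, execute local_clock = local_clock + 1, then carry this local_clock in the outgoing message.

Receiving a message: suppose the sender's local_clock carried in the received message is remote_clock; the receiver then computes its own local_clock as follows: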

local_clock = max(remote_clock, local_clock)

local_clock = local_clock + 1
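
The send and receive rules above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the class name `ScalarClock` and the method names `send` and `receive` are hypothetical, and the "network" is just a value passed by hand between two objects.

```python
class ScalarClock:
    """Minimal scalar logical clock for one node (names are illustrative)."""

    def __init__(self):
        self.local_clock = 0  # every node starts at 0

    def send(self):
        """Tick before sending; return the timestamp to attach to the message."""
        self.local_clock += 1
        return self.local_clock

    def receive(self, remote_clock):
        """Take the max of both clocks, then tick for the receive event."""
        self.local_clock = max(remote_clock, self.local_clock)
        self.local_clock += 1
        return self.local_clock


a, b = ScalarClock(), ScalarClock()
ts = a.send()        # a: 0 -> 1
b.receive(ts)        # b: max(1, 0) + 1 = 2
reply = b.send()     # b: 2 -> 3
a.receive(reply)     # a: max(3, 1) + 1 = 4
print(a.local_clock, b.local_clock)  # 4 3
```

Each receive ends with a timestamp strictly larger than the one carried by the message, which is what keeps a send ordered before its matching receive.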

As a result, a correct timeline is, by and large, maintained across the whole system, as shown in Figure 2.


                                                                   Figure 2

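However, there is still a correction step involved: when receiving a message, the receiver applies max() to synchronize its own local_clock with remote_clock. These messages normally travel over the network, so they can be delayed or never delivered at all. Until such a message arrives, the clocks on different nodes may disagree, and the state seen along the timeline may look inconsistent. What the scalar clock guarantees is therefore only eventual consistency.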
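
To make the delay effect concrete, here is a small sketch under simplifying assumptions: two nodes whose clocks are plain integers `p` and `q`, and a list `in_flight` standing in for messages held up in the network. The helper names `tick` and `on_receive` are hypothetical.

```python
def tick(clock):
    """Advance a scalar clock by one for a send event."""
    return clock + 1


def on_receive(local_clock, remote_clock):
    """Scalar-clock receive rule: take the max, then add 1."""
    return max(remote_clock, local_clock) + 1


p = q = 0
in_flight = []             # timestamps sent by p that q has not seen yet

for _ in range(3):         # p sends three messages; all of them are delayed
    p = tick(p)
    in_flight.append(p)

q = tick(q)                # meanwhile q sends a message of its own
before = (p, q)            # (3, 1): the two clocks disagree

for remote in in_flight:   # the delayed messages finally arrive, in order
    q = on_receive(q, remote)
after = (p, q)             # (3, 4): q has caught up with p and moved past it

print(before, after)       # (3, 1) (3, 4)
```

While the messages are still in flight, q's clock lags behind p's; only delivery brings the two back in line.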

vector clock: (a strong-consistency guarantee)

A vector clock is defined as follows:

Vector Clocks expand upon Scalar Time to provide a strongly consistent view of the world. With this approach, each process keeps a vector (a list of integers) with an integer for each local clock of every process within the system. If there are N processes, there will be a vector of N size maintained by each process. Given a process (Pi) with a vector (v), Vector Clocks implement the Logical Clock rules as follows:

Rule 1:
before executing an event (excluding the event of receiving a message) process Pi increments the value v[i] within its local vector by 1. This is the element in the vector that refers to Processor(i)’s local clock.

local_vector[i] = local_vector[i] + 1

Rule 2:
when receiving a message (the message must include the sender's vector), loop through each element in the vector sent and compare it to the local vector, updating the local vector to be the maximum of each element. Then increment your local clock within the vector by 1 [Figure 5].

1. for k = 1 to N: local_vector[k] = max(local_vector[k], sent_vector[k])

2. local_vector[i] = local_vector[i] + 1

3. message becomes available.
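
Rule 1 and Rule 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the class name `VectorClock` and the method names `event` and `receive` are hypothetical, process indices are 0-based (the rules above count from 1), and messages are passed by hand instead of over a network.

```python
class VectorClock:
    """Minimal vector clock for process i out of n (0-based, illustrative)."""

    def __init__(self, i, n):
        self.i = i
        self.local_vector = [0] * n

    def event(self):
        """Rule 1: tick own entry before a local or send event.

        Returns a copy of the vector, to be attached to an outgoing message.
        """
        self.local_vector[self.i] += 1
        return list(self.local_vector)

    def receive(self, sent_vector):
        """Rule 2: element-wise max with the sender's vector, then tick own entry."""
        self.local_vector = [
            max(mine, theirs)
            for mine, theirs in zip(self.local_vector, sent_vector)
        ]
        self.local_vector[self.i] += 1
        return list(self.local_vector)


p0, p1, p2 = (VectorClock(i, 3) for i in range(3))
m1 = p0.event()      # p0 sends:    [1, 0, 0]
p1.receive(m1)       # p1 receives: [1, 1, 0]
m2 = p1.event()      # p1 sends:    [1, 2, 0]
p2.event()           # p2 local:    [0, 0, 1]
p2.receive(m2)       # p2 receives: [1, 2, 2]
print(p0.local_vector, p1.local_vector, p2.local_vector)
# [1, 0, 0] [1, 2, 0] [1, 2, 2]
```

Returning a copy from `event` matters in this sketch: the message must carry the vector as it was at send time, not a reference that later ticks would mutate.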

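What does this mean? Each process maintains one clock entry for every concurrently running process in the system. Every node increments only its own entry, whether it is sending or receiving, and on receiving a message it also merges the sender's vector into the whole vector it maintains. Even if a message is never received, the events recorded in a node's own entry local_vector[i] remain accurate, so the ordering derived from them stays accurate as well. FIFO delivery of messages is still needed, though; otherwise the accumulated counts can go wrong. The effect is shown in Figure 3: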


                                                          Figure 3
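
One common way to use such vectors, sketched here as an assumption since the text above does not spell it out, is to compare two of them entry by entry: if one is less than or equal to the other everywhere and strictly smaller somewhere, the first event precedes the second; if neither dominates, the events are concurrent. The function names `happened_before` and `concurrent` are hypothetical.

```python
def happened_before(u, v):
    """True if u <= v in every entry and u < v in at least one entry."""
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))


def concurrent(u, v):
    """True if the vectors differ and neither one precedes the other."""
    return u != v and not happened_before(u, v) and not happened_before(v, u)


send_p0 = [1, 0, 0]    # a send event on process 0
recv_p1 = [1, 1, 0]    # the matching receive on process 1
local_p2 = [0, 0, 1]   # an unrelated local event on process 2

print(happened_before(send_p0, recv_p1))  # True
print(concurrent(send_p0, local_p2))      # True
```

A scalar clock cannot make the second distinction: it would simply report one of the two unrelated events as "earlier".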


References:

vector clock:

lamport clock: