MySQL MVCC

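MVCC (Multi-Version Concurrency Control) was designed to improve database concurrency: reads take no locks, and reads and writes do not block each other, which greatly increases the concurrency the system can sustain.

InnoDB uses the undo log and the ReadView to implement row versioning and isolation (for the RC and RR levels), so concurrent reads need no locks.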

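How MVCC is implemented:

Each MySQL (InnoDB) row carries three hidden fields that matter here:

DB_TRX_ID: the id of the last transaction that inserted or updated the row.

DB_ROLL_PTR: a pointer to the row's record in the rollback segment. Following the pointer leads to the previous version; the versions are organized as a linked list.

DB_ROW_ID: a hidden row id, used to build the default clustered index.

On when transaction ids are assigned: a real transaction id, the value written to DB_TRX_ID, is allocated only when the transaction runs an insert, update, or delete. A transaction that has only run select gets just a placeholder id.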

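Undo log (rollback log)

insert undo log: the undo log produced by insert operations

update undo log: the undo log produced by delete and update operations

Multiple versions

Each modification to a row produces a new version of that row. In the example here, 103 is the newest version, and DB_ROLL_PTR records the address of the previous version in the undo log.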
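
The version chain described above can be pictured as a singly linked list that runs from the newest version to the oldest. The sketch below is only an illustration: RowVersion, its fields, and version_history are hypothetical names, and InnoDB itself stores older versions as undo log records rather than as full copies of the row.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory model of one row version. InnoDB actually keeps
// older versions as undo log records, not as full copies of the row.
struct RowVersion {
  std::uint64_t db_trx_id;     // id of the transaction that wrote this version
  const RowVersion *roll_ptr;  // previous version, or nullptr for the oldest
  std::string name;            // example column value
};

// Collect the writer ids from newest to oldest by following roll_ptr.
std::vector<std::uint64_t> version_history(const RowVersion *newest) {
  std::vector<std::uint64_t> ids;
  for (const RowVersion *v = newest; v != nullptr; v = v->roll_ptr) {
    ids.push_back(v->db_trx_id);
  }
  return ids;
}
```

A reader that cannot see the newest version moves one step along roll_ptr and checks again, which is the walk the case study later in this article performs by hand.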

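What problems it solves

1. Blocking between reads and writes

Plain lock: everything runs serially
Read-write lock: reads run concurrently with other reads
MVCC: reads and writes run concurrently

2. A lower chance of deadlock

MVCC takes an optimistic approach: a consistent read takes no locks, and a write locks only the rows it affects.

3. Transaction isolation levels (consistent reads)

Read committed and repeatable read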

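Supported isolation levels

MVCC only operates under the read committed and repeatable read isolation levels; the other two levels are incompatible with it.

Read uncommitted: always reads the latest version of a row rather than the version that matches the current transaction.

Serializable: locks every row it reads, so isolation is achieved through locking.

The ReadView mechanism

Under RC and RR a transaction sees different data depending on the isolation level, while concurrent reads and writes do not interfere with each other. So how is the visibility of a row version decided?

ReadView core fields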

private:
  // Disable copying
  ReadView(const ReadView &);
  ReadView &operator=(const ReadView &);

 private:
  /** The read should not see any transaction with trx id >= this
  value. In other words, this is the "high water mark". */
  trx_id_t m_low_limit_id;

  /** The read should see all trx ids which are strictly
  smaller (<) than this value.  In other words, this is the
  "low water mark". */
  trx_id_t m_up_limit_id;

  /** trx id of creating transaction, set to TRX_ID_MAX for free
  views. */
  trx_id_t m_creator_trx_id;

  /** Set of RW transactions that was active when this snapshot
  was taken */
  ids_t m_ids;

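m_low_limit_id (high water mark)

When the ReadView is created, this is the largest transaction id in the system plus 1, that is, the id the system will assign to the next transaction.

If a version's DB_TRX_ID >= m_low_limit_id, the version is not visible.

m_up_limit_id (low water mark)

When the ReadView is created, this is the smallest id among the active transactions.

If a version's DB_TRX_ID < m_up_limit_id, the version is visible.

m_creator_trx_id (id of the current transaction)

The id of the transaction that creates the ReadView. If the transaction has not yet run an insert, update, or delete, m_creator_trx_id is 0; once the transaction is assigned a real id, m_creator_trx_id is set to that id.

So even when the ReadView already exists, a transaction that updates a row itself will see its own latest value in a later select.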

void trx_set_rw_mode(trx_t *trx){ /*!< in/out: transaction that is RW */
  
  /* code omitted */

  /* So that we can see our own changes. */
  if (MVCC::is_view_active(trx->read_view)) {
    MVCC::set_view_creator_trx_id(trx->read_view, trx->id);
  }
}

void MVCC::set_view_creator_trx_id(ReadView *view, trx_id_t id) {
  ut_ad(id > 0);
  ut_ad(mutex_own(&trx_sys->mutex));

  view->creator_trx_id(id);
}
  

/**
  Set the creator transaction id, existing id must be 0 */
void creator_trx_id(trx_id_t id) {
  ut_ad(m_creator_trx_id == 0);
  m_creator_trx_id = id;
}

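m_ids (active transaction ids)

The set of ids of the transactions that were active when the ReadView was created.

Visibility rules for row versions

 /* Check whether the data is visible */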
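
To make the four fields concrete, here is a minimal sketch of how they could be derived from the system state at snapshot time. ReadViewSketch and make_read_view are hypothetical names, not InnoDB functions. The sketch assumes that when no transaction is active the low water mark falls back to the high water mark, which is my understanding of the InnoDB source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using trx_id_t = std::uint64_t;

// Simplified stand-in for ReadView: only the four fields discussed here.
struct ReadViewSketch {
  trx_id_t m_low_limit_id;      // high water mark
  trx_id_t m_up_limit_id;       // low water mark
  trx_id_t m_creator_trx_id;    // 0 while the creator has no real id
  std::vector<trx_id_t> m_ids;  // sorted ids of active RW transactions
};

// next_trx_id: the id the system would assign next (hypothetical parameter).
// active: ids of read-write transactions not yet committed, in any order.
ReadViewSketch make_read_view(trx_id_t next_trx_id,
                              std::vector<trx_id_t> active,
                              trx_id_t creator_id) {
  std::sort(active.begin(), active.end());
  // Every active id must already have been handed out.
  assert(active.empty() || active.back() < next_trx_id);

  ReadViewSketch view;
  view.m_low_limit_id = next_trx_id;
  // Assumption: with no active transactions the two marks coincide.
  view.m_up_limit_id = active.empty() ? next_trx_id : active.front();
  view.m_creator_trx_id = creator_id;
  view.m_ids = std::move(active);
  return view;
}
```

Keeping m_ids sorted is what allows the visibility check to use a binary search.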

bool changes_visible(trx_id_t id, const table_name_t &name) const
  MY_ATTRIBUTE((warn_unused_result)) {
  ut_ad(id > 0);
  
  // Below the low water mark, or written by this transaction: visible
  if (id < m_up_limit_id || id == m_creator_trx_id) {
    return (true);
  }

  check_trx_id_sanity(id, name);

  // At or above the high water mark: not visible
  if (id >= m_low_limit_id) {
    return (false);

  } else if (m_ids.empty()) {
    return (true);
  }

  const ids_t::value_type *p = m_ids.data();
	
  // Otherwise check whether id is in m_ids
  return (!std::binary_search(p, p + m_ids.size(), id));
}

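DB_TRX_ID below the low water mark: the version was written by a transaction that has already committed, so it is visible. A version written by the current transaction itself is visible as well.
DB_TRX_ID at or above the high water mark: the version was written by a transaction that started after the ReadView was created, so it is not visible.
DB_TRX_ID between the two marks:
- If DB_TRX_ID is in m_ids, the version was written by a transaction that has changed the row but not yet committed, so it is not visible.
- If DB_TRX_ID is not in m_ids, the version was written by a transaction that had already committed when the ReadView was created, so it is visible. The rule is the same under RR; what differs under RC is that each select builds a fresh ReadView, so transactions that have committed in the meantime drop out of m_ids.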
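
The zones above can be written out as an explicit classification. This is a sketch on a simplified type: ViewSketch, Zone, classify, and visible are hypothetical names, and the logic is meant to agree with changes_visible() rather than replace it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using trx_id_t = std::uint64_t;

// Simplified stand-in for ReadView (hypothetical type, same field names).
struct ViewSketch {
  trx_id_t m_low_limit_id;      // high water mark
  trx_id_t m_up_limit_id;       // low water mark
  trx_id_t m_creator_trx_id;    // 0 if the creator has no real id yet
  std::vector<trx_id_t> m_ids;  // sorted ids active at snapshot time
};

enum class Zone {
  OwnChange,               // written by the transaction that owns the view
  BelowLowWaterMark,       // committed before every active transaction
  AtOrAboveHighWaterMark,  // started after the snapshot was taken
  BetweenActive,           // between the marks and still uncommitted
  BetweenCommitted         // between the marks and already committed
};

// Decide which zone a version's DB_TRX_ID falls into for this view.
Zone classify(const ViewSketch &view, trx_id_t id) {
  assert(id > 0);
  if (id == view.m_creator_trx_id) return Zone::OwnChange;
  if (id < view.m_up_limit_id) return Zone::BelowLowWaterMark;
  if (id >= view.m_low_limit_id) return Zone::AtOrAboveHighWaterMark;
  const bool active =
      std::binary_search(view.m_ids.begin(), view.m_ids.end(), id);
  return active ? Zone::BetweenActive : Zone::BetweenCommitted;
}

// A version is visible in exactly three of the five zones.
bool visible(Zone zone) {
  return zone == Zone::OwnChange || zone == Zone::BelowLowWaterMark ||
         zone == Zone::BetweenCommitted;
}
```

The own-change test comes first on purpose: a transaction's own id can be equal to or larger than m_low_limit_id when the id was assigned after the snapshot was taken, and the version must still be visible.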

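Case study

Repeatable read (RR)

Running a select creates a ReadView with the following field values:

m_low_limit_id                          m_up_limit_id  m_creator_trx_id  m_ids
202 (largest trx id in the system + 1)  201            0                 [201]

m_creator_trx_id = 0 because no transaction id has been assigned yet.

(1) Which data the select sees

The newest version of the row has DB_TRX_ID = 201, which is in m_ids, so it was written by another transaction that has not committed yet and is not visible.
Follow the DB_ROLL_PTR of the version with DB_TRX_ID = 201 to the previous version, which has DB_TRX_ID = 101.
DB_TRX_ID = 101 is below the low water mark m_up_limit_id = 201, so this version is visible.

(2) Transaction 201 commits, then the select runs again

Under RR the ReadView stays unchanged until the transaction ends,
so the second select still sees only version 101.

(3) Update and then select within the current transaction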

update t_a set name='c1' where id =1;

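The update happens in the current transaction, so the ReadView changes:

m_low_limit_id                          m_up_limit_id  m_creator_trx_id  m_ids
202 (largest trx id in the system + 1)  201            0 -> 202          [201]

Running the select finds that the newest version has DB_TRX_ID = 202, equal to m_creator_trx_id = 202, so the version was written by the current transaction and is visible.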

select * from t_a where id =1;
+----+------+
| id | name | 
+----+------+
|  1 | c1   |
+----+------+
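
The three RR steps can be replayed with a small model. This is a sketch under stated assumptions: ViewSketch and first_visible are hypothetical names, the undo chain is flattened into a list of DB_TRX_ID values ordered newest first, and changes_visible here is a cut-down copy of the function shown earlier.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using trx_id_t = std::uint64_t;

// Simplified stand-in for ReadView (hypothetical type, same field names).
struct ViewSketch {
  trx_id_t m_low_limit_id;      // high water mark
  trx_id_t m_up_limit_id;       // low water mark
  trx_id_t m_creator_trx_id;    // 0 if the creator has no real id yet
  std::vector<trx_id_t> m_ids;  // sorted ids active at snapshot time
};

// Cut-down copy of the visibility check, on the simplified type.
bool changes_visible(const ViewSketch &view, trx_id_t id) {
  assert(id > 0);
  if (id < view.m_up_limit_id || id == view.m_creator_trx_id) return true;
  if (id >= view.m_low_limit_id) return false;
  return !std::binary_search(view.m_ids.begin(), view.m_ids.end(), id);
}

// chain: DB_TRX_ID of each version of one row, newest first.
// Returns the DB_TRX_ID of the first visible version, or 0 if none is.
trx_id_t first_visible(const ViewSketch &view,
                       const std::vector<trx_id_t> &chain) {
  for (trx_id_t id : chain) {
    if (changes_visible(view, id)) return id;
  }
  return 0;
}
```

With the view {202, 201, 0, [201]} and the chain [201, 101], first_visible returns 101, both before and after transaction 201 commits, because the view is reused. After the transaction's own update, setting m_creator_trx_id to 202 and putting 202 at the head of the chain makes it return 202.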

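Read committed (RC)

Every select creates a fresh ReadView.

Running a select creates a ReadView with the following field values:

m_low_limit_id                          m_up_limit_id  m_creator_trx_id  m_ids
202 (largest trx id in the system + 1)  201            0                 [201]

m_creator_trx_id = 0 because no transaction id has been assigned yet.

(1) Which data the select sees

The newest version of the row has DB_TRX_ID = 201, which is in m_ids, so it was written by another transaction that has not committed yet and is not visible.
Follow the DB_ROLL_PTR of the version with DB_TRX_ID = 201 to the previous version, which has DB_TRX_ID = 101.
DB_TRX_ID = 101 is below the low water mark m_up_limit_id = 201, so this version is visible.

(2) Transaction 201 commits, then the select runs again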

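Under RC every select builds a new ReadView, which now looks like this:

m_low_limit_id                          m_up_limit_id                                              m_creator_trx_id  m_ids
202 (largest trx id in the system + 1)  202 (no active transactions, so equal to m_low_limit_id)   0                 []

The newest version of the row has DB_TRX_ID = 201. It is below the low water mark and not in m_ids, so it was written by a transaction that has already committed and is visible.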
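
The difference from RR comes down to when the snapshot is taken. Here is a sketch of the RC behavior, again with hypothetical names (TrxSysSketch, ViewSketch, select_rc) and a flattened version chain. It assumes that an empty active set makes the low water mark equal to the high water mark.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using trx_id_t = std::uint64_t;

// Simplified stand-in for ReadView (hypothetical type, same field names).
struct ViewSketch {
  trx_id_t m_low_limit_id;
  trx_id_t m_up_limit_id;
  trx_id_t m_creator_trx_id;
  std::vector<trx_id_t> m_ids;
};

// Hypothetical stand-in for the global transaction system state.
struct TrxSysSketch {
  trx_id_t next_trx_id;          // id the next RW transaction will get
  std::vector<trx_id_t> active;  // sorted ids of uncommitted RW transactions

  // Committing removes the id from the active set.
  void commit(trx_id_t id) {
    active.erase(std::remove(active.begin(), active.end(), id), active.end());
  }

  // Snapshot for a reader that has no real id yet (creator id 0).
  ViewSketch snapshot() const {
    const trx_id_t up = active.empty() ? next_trx_id : active.front();
    return ViewSketch{next_trx_id, up, 0, active};
  }
};

// Cut-down copy of the visibility check, on the simplified type.
bool changes_visible(const ViewSketch &view, trx_id_t id) {
  assert(id > 0);
  if (id < view.m_up_limit_id || id == view.m_creator_trx_id) return true;
  if (id >= view.m_low_limit_id) return false;
  return !std::binary_search(view.m_ids.begin(), view.m_ids.end(), id);
}

// chain: DB_TRX_ID of each version of one row, newest first.
trx_id_t first_visible(const ViewSketch &view,
                       const std::vector<trx_id_t> &chain) {
  for (trx_id_t id : chain) {
    if (changes_visible(view, id)) return id;
  }
  return 0;
}

// RC: take a fresh snapshot for every statement.
trx_id_t select_rc(const TrxSysSketch &sys,
                   const std::vector<trx_id_t> &chain) {
  return first_visible(sys.snapshot(), chain);
}
```

Starting from next_trx_id = 202 and active = [201], select_rc on the chain [201, 101] returns 101; after commit(201) it returns 201. A view saved before the commit and reused, as RR would do, still returns 101.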

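Summary

Apart from a transaction's own changes, which are always visible to it, a row version falls into one of three cases with respect to a transaction's view:

1. The version is uncommitted: not visible.

2. The version was committed before the view was created: visible.

3. The version was committed after the view was created: not visible.

Q&A

Finding the active transaction ids

How transaction ids are generated

show engine innodb status shows the currently active transaction ids (only inserts, deletes, and updates get a real transaction id).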

TRANSACTIONS
------------
Trx id counter 329762
Purge done for trx's n:o < 329761 undo n:o < 0 state: running but idle
History list length 1
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421663433044664, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 421663433043080, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 421663433042288, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 421663433041496, not started
0 lock struct(s), heap size 1128, 0 row lock(s)


# The entries below are the active transaction ids (trx_id) in the system
---TRANSACTION 329761, ACTIVE 27 sec
2 lock struct(s), heap size 1128, 1 row lock(s)
MySQL thread id 15, OS thread handle 123145449336832, query id 231 localhost root
---TRANSACTION 329759, ACTIVE 158 sec
2 lock struct(s), heap size 1128, 1 row lock(s)
MySQL thread id 14, OS thread handle 123145453596672, query id 213 localhost root
---TRANSACTION 329758, ACTIVE 329 sec
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
MySQL thread id 13, OS thread handle 123145452531712, query id 216 localhost root
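
If the active ids are needed programmatically, the text above can be scanned for the lines marked ACTIVE. A rough sketch: active_trx_ids is a hypothetical helper, and it assumes the line layout shown in this output, which is meant for humans and may differ between MySQL versions.

```cpp
#include <cstdint>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Extract the ids of ACTIVE transactions from the TRANSACTIONS section.
// Assumes lines of the form "---TRANSACTION <id>, ACTIVE <n> sec".
std::vector<std::uint64_t> active_trx_ids(const std::string &status_text) {
  static const std::regex active_line(R"(^---TRANSACTION (\d+), ACTIVE)");
  std::vector<std::uint64_t> ids;
  std::istringstream in(status_text);
  std::string line;
  while (std::getline(in, line)) {
    std::smatch match;
    if (std::regex_search(line, match, active_line)) {
      ids.push_back(std::stoull(match[1].str()));
    }
  }
  return ids;
}
```

Sessions reported as "not started" are skipped, which matches the point above that only transactions that have written data hold a real id.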
