Haishan Database (He3DB) Source Code Explained: SyncRepWaitForLSN in Primary-Standby Replication

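Background

He3DB combines an advanced storage engine with query optimization techniques to process large data volumes and complex queries quickly, and it performs well in both OLTP (online transaction processing) and OLAP (online analytical processing) scenarios. Its backup and recovery mechanisms restore data quickly after a system failure or data corruption, which keeps the business running. It scales both horizontally and vertically to keep up with growing data demands, and it provides strict access control and data encryption to protect data security and privacy.

This article is based on He3DB and walks through the source code of the primary-standby replication module.

Streaming replication: `SyncRepWaitForLSN`

`SyncRepWaitForLSN` is used in synchronous replication to wait until a specific write-ahead log (WAL) position has been confirmed.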

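Preliminary checks and preparation

- Assert that interrupts are held off during transaction commit, so that the later cleanup of the shared-memory queue cannot be disturbed by an external interrupt.
- Fast-exit check: if the user did not request synchronous replication (`!SyncRepRequested()`), or no synchronous standby names are defined (`!((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined`), return immediately.
- Choose the wait mode according to the caller: a commit uses the configured `SyncRepWaitMode`, anything else is capped at remote flush.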

void
SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
{
	char	   *new_status = NULL;
	const char *old_status;
	int			mode;

	Assert(InterruptHoldoffCount > 0);

	if (!SyncRepRequested() ||
		!((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined)
		return;

	/* Cap the level for anything other than commit to remote flush only. */
	if (commit)
		mode = SyncRepWaitMode;
	else
		mode = Min(SyncRepWaitMode, SYNC_REP_WAIT_FLUSH);

	Assert(SHMQueueIsDetached(&(MyProc->syncRepLinks)));
	Assert(WalSndCtl != NULL);
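
To make the mode capping concrete, here is a minimal standalone sketch. `effective_wait_mode` is a hypothetical helper, not a He3DB function, and the numeric mode values are assumed to follow upstream PostgreSQL's `syncrep.h` (write < flush < apply).

```c
#include <assert.h>
#include <stdbool.h>

/* Wait modes; values assumed to match upstream PostgreSQL's syncrep.h. */
enum
{
	WAIT_WRITE = 0,
	WAIT_FLUSH = 1,
	WAIT_APPLY = 2
};

/*
 * Hypothetical helper: picks the wait mode the same way the
 * commit / non-commit branch in SyncRepWaitForLSN does.
 */
int
effective_wait_mode(int configured_mode, bool commit)
{
	assert(configured_mode >= WAIT_WRITE && configured_mode <= WAIT_APPLY);

	/* A commit waits at whatever level the user configured. */
	if (commit)
		return configured_mode;

	/* Anything else is capped at remote flush. */
	return configured_mode < WAIT_FLUSH ? configured_mode : WAIT_FLUSH;
}
```

Under these assumptions, a non-commit caller configured for remote apply is capped to flush, while a commit keeps the configured mode unchanged.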

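Acquiring the sync rep lock and rechecking

- Acquire the synchronous replication lock (`LWLockAcquire(SyncRepLock, LW_EXCLUSIVE)`) and assert that the current process is not already waiting.
- Check again whether a wait is needed: if `WalSndCtl->sync_standbys_defined` is false, or the given LSN has already been processed (`lsn <= WalSndCtl->lsn[mode]`), release the lock and return.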

	// Acquire the synchronous replication lock
	LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
	// Make sure the current process is not already waiting
	Assert(MyProc->syncRepState == SYNC_REP_NOT_WAITING);

	if (!WalSndCtl->sync_standbys_defined ||
		lsn <= WalSndCtl->lsn[mode])
	{
		LWLockRelease(SyncRepLock);
		return;
	}
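
The recheck under the lock can be read as a small predicate over the shared control state. The sketch below is an illustration only: `SyncCtlSketch` and `must_wait` are hypothetical names, the struct keeps just the two fields the check reads, and the three-mode layout is assumed to follow upstream PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_WAIT_MODES 3		/* write, flush, apply (assumed) */

/* Hypothetical stand-in for the WalSndCtl fields used by the check. */
typedef struct SyncCtlSketch
{
	bool		sync_standbys_defined;
	uint64_t	lsn[NUM_WAIT_MODES];	/* position already processed, per mode */
} SyncCtlSketch;

/* Returns true when the caller still has to queue itself and wait. */
bool
must_wait(const SyncCtlSketch *ctl, uint64_t lsn, int mode)
{
	assert(mode >= 0 && mode < NUM_WAIT_MODES);

	/* No sync standbys, or our LSN is already covered: nothing to wait for. */
	if (!ctl->sync_standbys_defined || lsn <= ctl->lsn[mode])
		return false;

	return true;
}
```

Because the position is tracked per mode, the same LSN can already be satisfied for one mode and still require a wait for a stricter one.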

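Setting the wait state and joining the queue

- Record the LSN to wait for (`MyProc->waitLSN = lsn`) and mark the process as waiting (`MyProc->syncRepState = SYNC_REP_WAITING`).
- Insert the process into the sync rep queue (`SyncRepQueueInsert(mode)`) and assert that the queue is still ordered by LSN.
- Release the synchronous replication lock.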

	MyProc->waitLSN = lsn;
	MyProc->syncRepState = SYNC_REP_WAITING;
	SyncRepQueueInsert(mode);
	Assert(SyncRepQueueIsOrderedByLSN(mode));
	LWLockRelease(SyncRepLock);
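
One way to keep such a queue ordered is to scan backwards from the tail, since a new waiter usually has the highest LSN so far. The following is a minimal sketch of that idea on a circular doubly linked list with a sentinel head. The `Waiter` type and the function names are hypothetical, and waiters are assumed to carry distinct LSNs; this is not the He3DB implementation of `SyncRepQueueInsert`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical queue node; the list is circular with a sentinel head. */
typedef struct Waiter
{
	uint64_t	wait_lsn;
	struct Waiter *prev;
	struct Waiter *next;
} Waiter;

void
queue_init(Waiter *head)
{
	head->wait_lsn = 0;
	head->prev = head;
	head->next = head;
}

/* Insert w so that the list stays sorted by ascending wait_lsn. */
void
queue_insert(Waiter *head, Waiter *w)
{
	Waiter	   *pos = head->prev;	/* start at the tail */

	assert(head != NULL && w != NULL);

	/* Walk backwards until we find an entry with a smaller LSN. */
	while (pos != head && pos->wait_lsn >= w->wait_lsn)
		pos = pos->prev;

	/* Link w immediately after pos. */
	w->prev = pos;
	w->next = pos->next;
	pos->next->prev = w;
	pos->next = w;
}

/* True when LSNs strictly increase from the front to the back. */
bool
queue_is_ordered(const Waiter *head)
{
	uint64_t	last = 0;

	for (const Waiter *w = head->next; w != head; w = w->next)
	{
		if (w->wait_lsn <= last)
			return false;
		last = w->wait_lsn;
	}
	return true;
}
```

Inserting waiters with LSNs 30, 10, and 20 in that order leaves the list as 10, 20, 30, so whoever releases waiters can walk from the front and stop at the first LSN that is not yet confirmed.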

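Updating the process title (optional)

If process title updates are enabled, append the LSN being waited for to the title, so that the process is shown as waiting for synchronous replication.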

	if (update_process_title)
	{
		int			len;

		old_status = get_ps_display(&len);
		new_status = (char *) palloc(len + 32 + 1);
		memcpy(new_status, old_status, len);
		sprintf(new_status + len, " waiting for %X/%X",
				LSN_FORMAT_ARGS(lsn));
		set_ps_display(new_status);
		new_status[len] = '\0'; /* truncate off " waiting ..." */
	}
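
The buffer handling above is easy to misread: the suffix is written after the old title, and truncating at `len` afterwards turns the same buffer back into the old title for the later restore. A standalone sketch of that trick follows, using `malloc` and `snprintf` in place of `palloc` and `sprintf`; `make_waiting_title` is a hypothetical helper, and splitting the LSN into two 32-bit halves is assumed to be what `LSN_FORMAT_ARGS` does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns a heap buffer holding
 * "<old_title> waiting for <hi>/<lo>" and stores the length of the
 * old title in *old_len.  The caller frees the buffer.
 */
char *
make_waiting_title(const char *old_title, uint64_t lsn, size_t *old_len)
{
	size_t		len;
	size_t		cap;
	char	   *buf;

	assert(old_title != NULL && old_len != NULL);

	len = strlen(old_title);
	cap = len + 32 + 1;			/* the suffix needs at most 30 characters */
	buf = malloc(cap);
	if (buf == NULL)
		return NULL;

	memcpy(buf, old_title, len);
	snprintf(buf + len, cap - len, " waiting for %X/%X",
			 (unsigned int) (lsn >> 32), (unsigned int) lsn);

	*old_len = len;
	return buf;
}
```

After the title has been displayed, writing `'\0'` at index `*old_len` leaves the original title in the buffer, which is why the cleanup step can pass the same pointer to restore it.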

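Wait loop

Enter an infinite loop that waits for the specified LSN to be confirmed:

- Reset the latch (`ResetLatch(MyLatch)`).
- If the process's sync rep state is complete (`MyProc->syncRepState == SYNC_REP_WAIT_COMPLETE`), leave the loop.
- If the die flag is set (`ProcDiePending`), emit a warning, cancel the wait, shut off further output, and get ready to terminate the connection.
- If a query cancel is pending (`QueryCancelPending`), clear the flag, emit a warning, and cancel the wait.
- Wait for the latch to be set or for the postmaster to die (`WaitLatch(MyLatch, WL_LATCH_SET | WL_POSTMASTER_DEATH, -1, WAIT_EVENT_SYNC_REP)`). The timeout of -1 means there is no timeout: the call blocks until one of those two conditions holds.
- If the postmaster has died (`rc & WL_POSTMASTER_DEATH`), set the die flag, shut off output, cancel the wait, and leave the loop.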

	for (;;)
	{
		int			rc;

		/* Must reset the latch before testing state. */
		// Reset the latch
		ResetLatch(MyLatch);

		/*
		 * Acquiring the lock is not needed, the latch ensures proper
		 * barriers. If it looks like we're done, we must really be done,
		 * because once walsender changes the state to SYNC_REP_WAIT_COMPLETE,
		 * it will never update it again, so we can't be seeing a stale value
		 * in that case.
		 */
		if (MyProc->syncRepState == SYNC_REP_WAIT_COMPLETE)
			break;

		/*
		 * If a wait for synchronous replication is pending, we can neither
		 * acknowledge the commit nor raise ERROR or FATAL.  The latter would
		 * lead the client to believe that the transaction aborted, which is
		 * not true: it's already committed locally. The former is no good
		 * either: the client has requested synchronous replication, and is
		 * entitled to assume that an acknowledged commit is also replicated,
		 * which might not be true. So in this case we issue a WARNING (which
		 * some clients may be able to interpret) and shut off further output.
		 * We do NOT reset ProcDiePending, so that the process will die after
		 * the commit is cleaned up.
		 */
		if (ProcDiePending)
		{
			ereport(WARNING,
					(errcode(ERRCODE_ADMIN_SHUTDOWN),
					 errmsg("canceling the wait for synchronous replication and terminating connection due to administrator command"),
					 errdetail("The transaction has already committed locally, but might not have been replicated to the standby.")));
			whereToSendOutput = DestNone;
			SyncRepCancelWait();
			break;
		}

		/*
		 * It's unclear what to do if a query cancel interrupt arrives.  We
		 * can't actually abort at this point, but ignoring the interrupt
		 * altogether is not helpful, so we just terminate the wait with a
		 * suitable warning.
		 */
		if (QueryCancelPending)
		{
			QueryCancelPending = false;
			ereport(WARNING,
					(errmsg("canceling wait for synchronous replication due to user request"),
					 errdetail("The transaction has already committed locally, but might not have been replicated to the standby.")));
			SyncRepCancelWait();
			break;
		}

		/*
		 * Wait on latch.  Any condition that should wake us up will set the
		 * latch, so no need for timeout.
		 */
		// Wait for the latch to be set or the postmaster to die
		rc = WaitLatch(MyLatch, WL_LATCH_SET | WL_POSTMASTER_DEATH, -1,
					   WAIT_EVENT_SYNC_REP);

		/*
		 * If the postmaster dies, we'll probably never get an acknowledgment,
		 * because all the wal sender processes will exit. So just bail out.
		 */
		if (rc & WL_POSTMASTER_DEATH)
		{
			ProcDiePending = true;
			whereToSendOutput = DestNone;
			SyncRepCancelWait();
			break;
		}
	}
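
The order of the checks inside the loop matters: a completed wait wins over a pending termination, which in turn wins over a pending query cancel, and only the cancel flag is cleared. The sketch below models the checks of one iteration as a pure function so that the priority is easy to see. `BackendFlags`, `WaitOutcome`, and `check_once` are hypothetical names, and the latch and postmaster handling are left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum WaitOutcome
{
	OUTCOME_KEEP_WAITING,		/* go on to wait on the latch */
	OUTCOME_REPLICATED,			/* state reached SYNC_REP_WAIT_COMPLETE */
	OUTCOME_CANCELED_DIE,		/* termination requested, wait canceled */
	OUTCOME_CANCELED_QUERY		/* query cancel requested, wait canceled */
} WaitOutcome;

/* Hypothetical snapshot of the flags the loop looks at. */
typedef struct BackendFlags
{
	bool		wait_complete;
	bool		proc_die_pending;
	bool		query_cancel_pending;
} BackendFlags;

/* One pass over the checks, in the same order as the loop body. */
WaitOutcome
check_once(BackendFlags *f)
{
	assert(f != NULL);

	if (f->wait_complete)
		return OUTCOME_REPLICATED;

	/* The die flag stays set so the process still exits after the commit. */
	if (f->proc_die_pending)
		return OUTCOME_CANCELED_DIE;

	/* The cancel flag is consumed: the wait ends, the session lives on. */
	if (f->query_cancel_pending)
	{
		f->query_cancel_pending = false;
		return OUTCOME_CANCELED_QUERY;
	}

	return OUTCOME_KEEP_WAITING;
}
```

In this model, a backend that is both acknowledged and asked to die still reports the replicated outcome, which matches the intent of never warning about a commit that the standby did confirm.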

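Cleanup

Once the wait is over, clean up the state:

- Execute `pg_read_barrier()`, so that this process sees the changes made to its queue links before it inspects them.
- `Assert(SHMQueueIsDetached(&(MyProc->syncRepLinks)))` checks that the `syncRepLinks` of the current process (`MyProc`) is detached from the queue. In assert-enabled builds, a failure here stops the process with an error, because the steps that follow assume the process is no longer on the queue.
- Set the sync rep state back to not waiting (`MyProc->syncRepState = SYNC_REP_NOT_WAITING`) and reset the wait LSN to 0.
- If the process title was updated, restore the original title and free the memory.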

	pg_read_barrier();
	Assert(SHMQueueIsDetached(&(MyProc->syncRepLinks)));
	MyProc->syncRepState = SYNC_REP_NOT_WAITING;
	MyProc->waitLSN = 0;

	if (new_status)
	{
		/* Reset ps display */
		set_ps_display(new_status);
		pfree(new_status);
	}
}

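Links to other He3DB articles

Haishan Database (He3DB) Source Code Explained: TransactionIdSetTreeStatus in the He3DB CLOG Log Manager

Haishan Database (He3DB) + AI (Part 5): A Reinforcement-Learning-Based Method for Database Knob Tuning

Haishan Database (He3DB) + AI (Part 4): A Transfer-Learning-Based Heuristic Method for Database Knob Tuning

Haishan Database (He3DB) Source Code Walkthrough: Lexical and Syntax Analysis in Haishan PG

Haishan Database (He3DB) Source Code Explained: The Free Space Map (FSM) in Haishan PG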

About the author

Zhou Yuhui (周雨慧), database kernel development engineer at China Mobile (Suzhou) Software Technology Co., Ltd.
