{kernel-io} The smp_mb in inode_switch_wbs


First, look at the paired side, cgroup_writeback_umount

/**
 * cgroup_writeback_umount - flush inode wb switches for umount
 *
 * This function is called when a super_block is about to be destroyed and
 * flushes in-flight inode wb switches.  An inode wb switch goes through
 * RCU and then workqueue, so the two need to be flushed in order to ensure
 * that all previously scheduled switches are finished.  As wb switches are
 * rare occurrences and synchronize_rcu() can take a while, perform
 * flushing iff wb switches are in flight.
 */
void cgroup_writeback_umount(void)
{
	/*
	 * SB_ACTIVE should be reliably cleared before checking
	 * isw_nr_in_flight, see generic_shutdown_super().
	 */
	smp_mb();

	if (atomic_read(&isw_nr_in_flight)) {
		/*
		 * Use rcu_barrier() to wait for all pending callbacks to
		 * ensure that all in-flight wb switches are in the workqueue.
		 */
		rcu_barrier();
		flush_workqueue(isw_wq);
	}
}

From the comment inside this function, we can see that this path is triggered by generic_shutdown_super

Next, look at generic_shutdown_super

/**
 *	generic_shutdown_super	-	common helper for ->kill_sb()
 *	@sb: superblock to kill
 *
 *	generic_shutdown_super() does all fs-independent work on superblock
 *	shutdown.  Typical ->kill_sb() should pick all fs-specific objects
 *	that need destruction out of superblock, call generic_shutdown_super()
 *	and release aforementioned objects.  Note: dentries and inodes _are_
 *	taken care of and do not need specific handling.
 *
 *	Upon calling this function, the filesystem may no longer alter or
 *	rearrange the set of dentries belonging to this super_block, nor may it
 *	change the attachments of dentries to inodes.
 */
void generic_shutdown_super(struct super_block *sb)
{
	const struct super_operations *sop = sb->s_op;

	if (sb->s_root) {
		shrink_dcache_for_umount(sb);
		sync_filesystem(sb);
		sb->s_flags &= ~SB_ACTIVE;

		cgroup_writeback_umount();

The other side of the pair: inode_switch_wbs -> inode_prepare_wbs_switch

inode_switch_wbs {
		atomic_inc(&isw_nr_in_flight);
		inode_prepare_wbs_switch
}



static bool inode_prepare_wbs_switch(struct inode *inode,
				     struct bdi_writeback *new_wb)
{
	/*
	 * Paired with smp_mb() in cgroup_writeback_umount().
	 * isw_nr_in_flight must be increased before checking SB_ACTIVE and
	 * grabbing an inode, otherwise isw_nr_in_flight can be observed as 0
	 * in cgroup_writeback_umount() and the isw_wq will be not flushed.
	 */
	smp_mb();

	if (IS_DAX(inode))
		return false;

	/* while holding I_WB_SWITCH, no one else can update the association */
	spin_lock(&inode->i_lock);
	if (!(inode->i_sb->s_flags & SB_ACTIVE) ||
	    inode->i_state & (I_WB_SWITCH | I_FREEING | I_WILL_FREE) ||
	    inode_to_wb(inode) == new_wb) {
		spin_unlock(&inode->i_lock);
		return false;
	}
	inode->i_state |= I_WB_SWITCH;
	__iget(inode);
	spin_unlock(&inode->i_lock);

	return true;
}

generic_shutdown_super (cpu0)           inode_switch_wbs (cpu1)
----------------------------------      ----------------------------------
sb->s_flags &= ~SB_ACTIVE;        (a)   atomic_inc(&isw_nr_in_flight);    (c)
smp_mb()                                smp_mb()
atomic_read(&isw_nr_in_flight)    (b)   inode->i_sb->s_flags & SB_ACTIVE  (d)
                                        __iget(inode);

a is a store to s_flags
b is a load of isw_nr_in_flight
c is a store to isw_nr_in_flight
d is a load of s_flags

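So smp_mb() is required here: each side does a store followed by a load of a different variable, and only a full barrier orders a store before a later load (smp_wmb() and smp_rmb() do not).

With the barriers in place: a->b, c->d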
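
As a userspace analogy for the a->b / c->d pairing, here is a minimal sketch using C11 atomics. It assumes atomic_thread_fence(memory_order_seq_cst) as a stand-in for smp_mb(); the names SB_ACTIVE_MODEL, s_flags_model, in_flight_model, umount_side and switch_side are hypothetical and are not kernel symbols. It only illustrates the shape of the pattern, not the real kernel code paths.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define SB_ACTIVE_MODEL 1u	/* hypothetical stand-in for SB_ACTIVE */

static atomic_uint s_flags_model = SB_ACTIVE_MODEL;	/* models sb->s_flags */
static atomic_int in_flight_model;			/* models isw_nr_in_flight */

/*
 * Models the umount side: clear the flag (a), full fence, then read the
 * counter (b). Returns true when a flush would be needed.
 */
static bool umount_side(void)
{
	atomic_fetch_and_explicit(&s_flags_model, ~SB_ACTIVE_MODEL,
				  memory_order_relaxed);		/* (a) */
	atomic_thread_fence(memory_order_seq_cst);	/* assumed smp_mb() analogue */
	return atomic_load_explicit(&in_flight_model,
				    memory_order_relaxed) != 0;	/* (b) */
}

/*
 * Models the switch side: bump the counter (c), full fence, then check the
 * flag (d). Returns true when the switch would go ahead.
 */
static bool switch_side(void)
{
	atomic_fetch_add_explicit(&in_flight_model, 1,
				  memory_order_relaxed);		/* (c) */
	atomic_thread_fence(memory_order_seq_cst);	/* assumed smp_mb() analogue */
	return (atomic_load_explicit(&s_flags_model, memory_order_relaxed)
		& SB_ACTIVE_MODEL) != 0;			/* (d) */
}
```

Run from two threads, the two fences rule out the case where umount_side() returns false while switch_side() returns true. With weaker fences on either side, that case would be allowed.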

What do we expect?

  1. isw_nr_in_flight must be increased before checking SB_ACTIVE and grabbing an inode otherwise isw_nr_in_flight can be observed as 0 in cgroup_writeback_umount() and the isw_wq will be not flushed.
  2. SB_ACTIVE should be reliably cleared before checking isw_nr_in_flight

cgroup_writeback_umount is operation (b): it only calls flush_workqueue(isw_wq) when isw_nr_in_flight is nonzero.

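So we want b to see c, i.e. c comes before b (cb).

Combined with a->b, both acb and cab work.

Adding d, both acbd and cabd work, and in these d always sees a! More generally, with both barriers in place at least one side sees the other's store: either b sees c and umount flushes isw_wq, or d sees a and inode_prepare_wbs_switch() returns false. What can never happen is b reading 0 while d still sees SB_ACTIVE set.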
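
The orderings above can also be checked mechanically. The sketch below is a simplified model, not kernel code: it treats a, b, c, d as four events taking effect in one global order, and it treats a missing barrier as simply allowing the two events on that CPU to swap. The helper names order_is_safe and count_unsafe are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Replay one global order of the four events, e.g. "acbd".
 * The outcome is unsafe only if the switch proceeds (d saw SB_ACTIVE set)
 * while umount skips the flush (b saw the counter at 0).
 */
static bool order_is_safe(const char *order)
{
	bool sb_active = true;		/* models sb->s_flags & SB_ACTIVE */
	int in_flight = 0;		/* models isw_nr_in_flight */
	bool umount_flushes = false;
	bool switch_proceeds = false;

	for (const char *p = order; *p; p++) {
		switch (*p) {
		case 'a': sb_active = false; break;
		case 'b': umount_flushes = (in_flight != 0); break;
		case 'c': in_flight++; break;
		case 'd': switch_proceeds = sb_active; break;
		}
	}
	return umount_flushes || !switch_proceeds;
}

/* Position of event e within an order string. */
static int pos(const char *order, char e)
{
	return (int)(strchr(order, e) - order);
}

/*
 * Count unsafe orders among all permutations of "abcd".
 * keep_ab models the barrier on cpu0 (a before b),
 * keep_cd models the barrier on cpu1 (c before d).
 */
static int count_unsafe(bool keep_ab, bool keep_cd)
{
	const char ev[] = "abcd";
	int unsafe = 0;

	for (int i = 0; i < 4; i++)
	for (int j = 0; j < 4; j++)
	for (int k = 0; k < 4; k++)
	for (int l = 0; l < 4; l++) {
		if (i == j || i == k || i == l || j == k || j == l || k == l)
			continue;	/* not a permutation */
		char order[5] = { ev[i], ev[j], ev[k], ev[l], '\0' };
		if (keep_ab && pos(order, 'a') > pos(order, 'b'))
			continue;
		if (keep_cd && pos(order, 'c') > pos(order, 'd'))
			continue;
		if (!order_is_safe(order))
			unsafe++;
	}
	return unsafe;
}
```

Under these assumptions, count_unsafe(true, true) finds no unsafe order, while dropping either constraint admits exactly one (bcda without a->b, dabc without c->d). That is the case both smp_mb() calls exist to exclude.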