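It is commonly said that adding the volatile keyword to C++ inline assembly (asm) keeps the compiler's optimizer from deleting or reordering the asm volatile statement. (GCC's documentation notes that even a volatile asm can still be moved relative to other code, so the guarantee against deletion is the stronger of the two.)
This article gives an example in which the volatile keyword keeps an asm statement from being deleted.
The experiment code, which checks the behavior of asm volatile with the GCC compiler on an x86 CPU, comes from: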
Extended Asm (Using the GNU Compiler Collection (GCC))
The difference between asm, asm volatile and clobbering memory
These sources discuss the following rule:
An asm instruction without any output operands will be treated identically to a volatile asm instruction.
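To make the rule above concrete, here is a minimal sketch of an asm statement with no output operands. The helper name store_via_asm is hypothetical and not taken from the cited sources; the sketch assumes x86-64 and GCC-style extended asm in AT&T syntax. Because the statement has no outputs, GCC treats it as if it were written asm volatile and should not delete it, even with optimization enabled.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only (assumes x86-64, GCC/Clang).
// Stores `value` through `dst` using an asm statement that has NO output
// operands, so the compiler treats it as implicitly volatile.
inline void store_via_asm(std::int32_t* dst, std::int32_t value) {
    asm("movl %1, (%0)"          // *dst = value
        :                        // no output operands
        : "r"(dst), "r"(value)   // inputs: destination address and value
        : "memory");             // tell the compiler that memory is modified
}
```

The "memory" clobber is what lets the compiler know that the pointed-to object may have changed; the implicit volatile only keeps the statement itself from being removed.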
Consider the following experiment code, time.cpp. The GCC documentation introduces it as follows:
The following example demonstrates a case where you need to use the volatile qualifier. It uses the x86 rdtsc instruction, which reads the computer's time-stamp counter. Without the volatile qualifier, the optimizers might assume that the asm block will always return the same value and therefore optimize away the second call.
#include <cinttypes>  // PRIx64
#include <cstdint>    // uint64_t
#include <cstdio>     // printf

int main() {
    uint64_t msr;

    // Read Time Stamp Counter (TSC)
    asm (
        "rdtsc\n\t"            // Returns the time in EDX:EAX.
        "shl $32, %%rdx\n\t"   // Shift the upper bits left.
        "or %%rdx, %0"         // 'Or' in the lower bits.
        : "=a" (msr)           // Output operand (result)
        :                      // No input operand
        : "rdx"                // Clobbered register (rdx)
    );
    printf("msr: %" PRIx64 "\n", msr);

    // Do other work...

    // Reprint the timestamp
    asm (
        "rdtsc\n\t"            // Returns the time in EDX:EAX.
        "shl $32, %%rdx\n\t"   // Shift the upper bits left.
        "or %%rdx, %0"         // 'Or' in the lower bits.
        : "=a" (msr)           // Output operand (result)
        :                      // No input operand
        : "rdx"                // Clobbered register (rdx)
    );
    printf("msr: %" PRIx64 "\n", msr);

    return 0;
}
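We start with the version above, which has no volatile. If GCC compiles it without optimization (-O0), the asm statements are left untouched, and the two reads print different values.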
[ws@eos test]$ gcc -O0 time.cpp -o time
[ws@eos test]$ ./time
msr: d4fc56d3b7
msr: d4fc5899f5
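At optimization levels -O1 through -O3, however, the second asm statement is optimized away, and both lines print the same value.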
[ws@eos test]$ gcc -O1 time.cpp -o time
[ws@eos test]$ ./time
msr: 2a234fa0031
msr: 2a234fa0031
[ws@eos test]$ gcc -O3 time.cpp -o time
[ws@eos test]$ ./time
msr: 2a4c2230e30
msr: 2a4c2230e30
Now consider the version that uses asm volatile:
#include <cinttypes>  // PRIx64
#include <cstdint>    // uint64_t
#include <cstdio>     // printf

int main() {
    uint64_t msr;

    // Read Time Stamp Counter (TSC)
    asm volatile (
        "rdtsc\n\t"            // Returns the time in EDX:EAX.
        "shl $32, %%rdx\n\t"   // Shift the upper bits left.
        "or %%rdx, %0"         // 'Or' in the lower bits.
        : "=a" (msr)           // Output operand (result)
        :                      // No input operand
        : "rdx"                // Clobbered register (rdx)
    );
    printf("msr: %" PRIx64 "\n", msr);

    // Do other work...

    // Reprint the timestamp
    asm volatile (
        "rdtsc\n\t"            // Returns the time in EDX:EAX.
        "shl $32, %%rdx\n\t"   // Shift the upper bits left.
        "or %%rdx, %0"         // 'Or' in the lower bits.
        : "=a" (msr)           // Output operand (result)
        :                      // No input operand
        : "rdx"                // Clobbered register (rdx)
    );
    printf("msr: %" PRIx64 "\n", msr);

    return 0;
}
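Even at -O3, the most aggressive of the -O1 to -O3 levels, the second asm volatile statement is not optimized away, and the two reads print different values.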
[ws@eos test]$ gcc -O3 time.cpp -o time
[ws@eos test]$ ./time
msr: 2ff9db6dbe3
msr: 2ff9db870eb
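The experiment above can be wrapped into a small reusable helper. The following is only a sketch under stated assumptions: the names read_tsc and ticks_elapsed are hypothetical and not taken from the cited sources, and the code assumes x86-64 with GCC-style extended asm. It uses a common variant that lets the compiler combine EDX:EAX instead of doing the shift and or inside the asm string.

```cpp
#include <cstdint>

// Hypothetical helper (assumes x86-64, GCC/Clang extended asm).
// Reads the time-stamp counter. `volatile` keeps the compiler from
// merging or deleting repeated reads, as in the experiment above.
inline std::uint64_t read_tsc() {
    std::uint32_t lo, hi;
    asm volatile("rdtsc"
                 : "=a"(lo), "=d"(hi));  // EAX -> lo, EDX -> hi
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Hypothetical helper: TSC ticks that elapse while `work` runs.
template <typename F>
std::uint64_t ticks_elapsed(F&& work) {
    const std::uint64_t start = read_tsc();
    work();
    const std::uint64_t stop = read_tsc();
    return stop - start;
}
```

A call such as ticks_elapsed([] { /* code to time */ }) returns a tick count rather than a wall-clock time. Since the compiler may still move other code across a volatile asm, and the CPU may execute rdtsc out of order, the result is best read as a rough estimate.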