Learning Objectives
1. Architecture Design
1.1 Traditional Copy vs. Zero-Copy
graph TB
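subgraph "Traditional transfer (4 copies)"
A1[Application buffer] -->|copy| K1[Kernel buffer]
K1 -->|copy| N1[NIC / network]
N1 -->|copy| K2[Peer kernel buffer]
K2 -->|copy| A2[Peer application buffer]
end
subgraph "Shared-memory zero-copy (0 copies)"
B1[Publisher application] -->|pointer handoff| SHM[(Shared memory segment<br/>/dev/shm/)]
SHM -->|pointer handoff| B2[Subscriber application]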
B1 -.->|mmap| SHM
B2 -.->|mmap| SHM
end
style SHM fill:#c8e6c9
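The shared-memory half of the diagram can be illustrated with a minimal sketch, assuming a Linux host with POSIX shared memory. It maps one object twice inside a single process to stand in for the publisher and the subscriber. The object name `/zero_copy_demo` and the function name `shared_mapping_demo` are hypothetical and are not part of Fast DDS.

```cpp
#include <cstddef>
#include <cstring>
#include <fcntl.h>     // O_CREAT, O_RDWR
#include <sys/mman.h>  // shm_open, shm_unlink, mmap, munmap
#include <unistd.h>    // ftruncate, close

// Maps one POSIX shared-memory object twice and checks that a write through
// the first mapping is visible through the second without any copy.
// The object name is a hypothetical example, not a Fast DDS segment name.
bool shared_mapping_demo() {
    const char* name = "/zero_copy_demo";
    const std::size_t size = 4096;

    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd < 0) return false;
    if (ftruncate(fd, static_cast<off_t>(size)) != 0) {
        close(fd);
        shm_unlink(name);
        return false;
    }

    // "Publisher" view (read-write) and "subscriber" view (read-only).
    void* pub = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void* sub = mmap(nullptr, size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);         // mappings stay valid after the descriptor is closed
    shm_unlink(name);  // drop the name; pages live until both views are unmapped

    bool ok = false;
    if (pub != MAP_FAILED && sub != MAP_FAILED) {
        std::strcpy(static_cast<char*>(pub), "hello");
        ok = std::strcmp(static_cast<const char*>(sub), "hello") == 0;
    }
    if (pub != MAP_FAILED) munmap(pub, size);
    if (sub != MAP_FAILED) munmap(sub, size);
    return ok;
}
```

The two views have different virtual addresses but are backed by the same physical pages, which is why only an offset or handle needs to travel between processes. On older glibc versions the program may need to be linked with `-lrt`.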
1.2 Fast-DDS SHM Architecture
flowchart TB
subgraph "Publisher process"
PUB[PublisherApp]
DW[DataWriter]
SHM_W[SharedMemTransport<br/>writer side]
SEG_W[(Shared memory segment<br/>Segment)]
end
subgraph "Operating system kernel"
MMAP[mmap/munmap<br/>memory mapping]
EVENT[eventfd<br/>event notification]
SEM[Semaphore / mutex<br/>synchronization]
end
subgraph "Subscriber process"
SHM_R[SharedMemTransport<br/>reader side]
SEG_R[(Shared memory segment<br/>Segment)]
DR[DataReader]
SUB[SubscriberApp]
end
PUB --> DW --> SHM_W --> SEG_W
SEG_W -.->|map| MMAP
SHM_W -.->|notify| EVENT
SHM_R -.->|listen| EVENT
MMAP -.->|map| SEG_R
SEG_R --> SHM_R --> DR --> SUB
style SEG_W fill:#e3f2fd
style SEG_R fill:#e3f2fd
style EVENT fill:#fff3e0
2. Core Mechanisms in Detail
2.1 Shared Memory Segment Management
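#include <atomic>
#include <cstddef>
#include <cstdint>
#include <string>
#include <fcntl.h>     // O_CREAT, O_RDWR
#include <sys/mman.h>  // shm_open, mmap
#include <unistd.h>    // ftruncate, close
// Simplified sketch of a segment wrapper, not the actual Fast DDS class.
class SharedMemSegment {
    int fd_ = -1;
    void* base_address_ = nullptr;
    size_t segment_size_ = 0;
    size_t next_free_ = 0;  // bump-allocator cursor (illustrative only)
    struct SegmentHeader {
        uint64_t magic_number;
        uint32_t version;
        std::atomic<uint32_t> ref_count;
    };
    // Each process maps the segment at a different address, so offsets are the portable handle.
    void* offset_to_address(size_t offset) { return static_cast<char*>(base_address_) + offset; }
public:
    bool open(const std::string& name, size_t size) {
        if (size <= sizeof(SegmentHeader)) return false;
        fd_ = shm_open(name.c_str(), O_CREAT | O_RDWR, 0666);
        if (fd_ < 0) return false;
        void* addr = MAP_FAILED;
        if (ftruncate(fd_, static_cast<off_t>(size)) == 0) {
            addr = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd_, 0);
        }
        if (addr == MAP_FAILED) { close(fd_); fd_ = -1; return false; }
        base_address_ = addr;
        segment_size_ = size;
        next_free_ = sizeof(SegmentHeader);  // payload area starts after the header
        return true;
    }
    // Returns nullptr when the segment is not open or has no room left.
    void* allocate_buffer(size_t size) {
        if (base_address_ == nullptr || size > segment_size_ - next_free_) return nullptr;
        void* p = offset_to_address(next_free_);
        next_free_ += size;
        return p;
    }
};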
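The header fields and the offset handling can be sketched separately from the mapping code. The example below is a minimal illustration under stated assumptions: the magic value, the version number, and the helper names `init_header`, `attach_header`, `offset_to_address`, and `address_to_offset` are all hypothetical and do not come from Fast DDS.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical header layout mirroring the SegmentHeader fields shown above.
struct SegmentHeader {
    uint64_t magic_number;
    uint32_t version;
    std::atomic<uint32_t> ref_count;
};

constexpr uint64_t kMagic = 0x46444453484D0001ULL;  // arbitrary illustrative value
constexpr uint32_t kVersion = 1;

// Called once by the process that creates the segment.
SegmentHeader* init_header(void* base) {
    return new (base) SegmentHeader{kMagic, kVersion, {1u}};
}

// Called by every other process after mapping. A mismatch means the memory
// does not hold a segment of the expected kind or version.
SegmentHeader* attach_header(void* base) {
    auto* h = static_cast<SegmentHeader*>(base);
    if (h->magic_number != kMagic || h->version != kVersion) return nullptr;
    h->ref_count.fetch_add(1);
    return h;
}

// Processes exchange offsets because each one maps the segment at a
// different virtual address.
void* offset_to_address(void* base, std::size_t offset) {
    return static_cast<char*>(base) + offset;
}

std::size_t address_to_offset(const void* base, const void* p) {
    return static_cast<std::size_t>(
        static_cast<const char*>(p) - static_cast<const char*>(base));
}
```

A process that receives an offset converts it back with its own base address, so the same handle is valid in every process that mapped the segment.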
2.2 Zero-Copy Data Flow
sequenceDiagram
participant P as Publisher
participant SHM as SharedMemManager
participant SEG as /dev/shm/fastdds_xxx
participant EVT as eventfd
participant S as Subscriber
Note over P,S: Initialization phase
P->>SHM: create_segment("fastdds_pub")
SHM->>SEG: shm_open + mmap
S->>SHM: open_segment("fastdds_pub")
SHM->>SEG: shm_open + mmap (read-only or read-write)
Note over P,S: Data transfer phase (zero-copy)
P->>SHM: get_buffer(size)
SHM-->>P: returns pointer into shared memory
P->>P: write directly into shared memory (serialize)
P->>SHM: notify(data_offset, size)
SHM->>EVT: write(eventfd, 1)
EVT->>S: read(eventfd) wakes up
S->>SHM: get_data(offset)
SHM-->>S: returns pointer (same physical pages)
S->>S: read in place (deserialize)
S->>SHM: release_buffer()
Note over P,S: No payload copy through the kernel, only the eventfd wake-up crosses into it
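The notify and wake-up steps of this sequence can be tried in isolation. The sketch below assumes Linux and uses an anonymous shared mapping plus `fork()` instead of a named segment, so it is an illustration of the eventfd handshake only, not of how Fast DDS implements its ports. The function name `notify_demo` is hypothetical.

```cpp
#include <csignal>        // kill, SIGKILL
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <sys/eventfd.h>  // eventfd
#include <sys/mman.h>     // mmap, munmap
#include <sys/wait.h>     // waitpid
#include <unistd.h>       // fork, read, write, close, _exit

// Parent acts as publisher, child as subscriber. Returns true when the child
// woke up on the eventfd and found the expected bytes in the shared region.
bool notify_demo() {
    const std::size_t size = 4096;
    void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return false;
    int efd = eventfd(0, 0);
    if (efd < 0) { munmap(mem, size); return false; }

    pid_t pid = fork();
    if (pid < 0) { close(efd); munmap(mem, size); return false; }
    if (pid == 0) {
        // Subscriber: block until notified, then read the payload in place.
        uint64_t count = 0;
        bool ok = read(efd, &count, sizeof(count)) == static_cast<ssize_t>(sizeof(count))
                  && count == 1
                  && std::strcmp(static_cast<const char*>(mem), "sample") == 0;
        _exit(ok ? 0 : 1);
    }

    // Publisher: write the payload first, then signal.
    std::strcpy(static_cast<char*>(mem), "sample");
    uint64_t one = 1;
    bool sent = write(efd, &one, sizeof(one)) == static_cast<ssize_t>(sizeof(one));
    if (!sent) kill(pid, SIGKILL);  // do not leave the child blocked forever

    int status = 0;
    waitpid(pid, &status, 0);
    close(efd);
    munmap(mem, size);
    return sent && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The payload itself never passes through a kernel buffer, but the `write` and `read` on the eventfd are system calls, so each notification still costs a kernel round trip and a wake-up of the reader.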
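2.3 Comparison with UDP/TCP
| Property | UDP | TCP | SHM (zero-copy) |
|---|---|---|---|
| Copies | 2-4 | 2-4 | 0 |
| Latency (indicative) | 10-100 μs | 1-10 ms | 0.5-2 μs |
| Throughput | High | Medium | Very high |
| Cross-host | Supported | Supported | Not supported (same host only) |
| Reliability | Unreliable | Reliable | Reliable (backed by shared memory) |
| CPU usage | Medium (kernel network stack) | High (kernel network stack) | Low (user space) |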
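3. Code Walkthrough
3.1 Key Files
| File | Class | Responsibility |
|---|---|---|
| SharedMemTransport.cpp | SharedMemTransport | Transport layer implementation |
| SharedMemManager.cpp | SharedMemManager | Shared memory management |
| SharedMemSegment.cpp | SharedMemSegment | Memory segment operations |
| SharedMemChannel.cpp | SharedMemChannel | Inter-process communication channel |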
3.2 Enabling SHM via Configuration
<transport_descriptors>
    <transport_descriptor>
        <transport_id>shm_transport</transport_id>
        <type>SHM</type>
        <segment_size>8388608</segment_size>
        <port_queue_capacity>1024</port_queue_capacity>
        <healthy_check_timeout_ms>1000</healthy_check_timeout_ms>
    </transport_descriptor>
</transport_descriptors>
<participant profile_name="shm_participant">
    <rtps>
        <userTransports>
            <transport_id>shm_transport</transport_id>
        </userTransports>
        <useBuiltinTransports>false</useBuiltinTransports>
    </rtps>
</participant>
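// C++ equivalent of the XML profile above (header paths as in Fast DDS 2.x; 3.x uses .hpp).
#include <memory>
#include <fastdds/dds/domain/DomainParticipantFactory.hpp>
#include <fastdds/rtps/transport/shared_mem/SharedMemTransportDescriptor.h>
using namespace eprosima::fastdds::dds;
using eprosima::fastdds::rtps::SharedMemTransportDescriptor;

DomainParticipantQos pqos;
auto shm_transport = std::make_shared<SharedMemTransportDescriptor>();
shm_transport->segment_size(8 * 1024 * 1024);     // 8 MiB, matches <segment_size>
shm_transport->port_queue_capacity(1024);
pqos.transport().user_transports.push_back(shm_transport);
pqos.transport().use_builtin_transports = false;  // SHM only, no default UDP
DomainParticipant* participant =
    DomainParticipantFactory::get_instance()->create_participant(0, pqos);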
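3.3 Key Code Path for Zero-Copy
// Simplified pseudocode of the send/receive path. The real Fast DDS signatures
// differ; the call flow in section 7 lists the actual entry points.
bool SharedMemTransport::send(
        const fastrtps::rtps::Locator_t& locator,
        const fastrtps::rtps::SerializedPayload_t& payload,
        const std::vector<GuidPrefix_t>& remote_participants)
{
    // Reserve a buffer inside the shared segment.
    SharedMemBuffer* buffer = segment_->alloc_buffer(payload.length);
    if (buffer == nullptr) {
        return false;  // segment exhausted
    }
    // One user-space copy into the segment: the send side shown here is not
    // copy-free, although no kernel buffer is involved.
    memcpy(buffer->data(), payload.data, payload.length);
    // Hand only the offset and length to every listener.
    for (auto& listener : port_listeners_) {
        listener->notify(buffer->offset(), payload.length);
    }
    return true;
}
bool SharedMemTransport::receive(
        fastrtps::rtps::Locator_t& locator,
        fastrtps::rtps::SerializedPayload_t& payload,
        std::chrono::milliseconds timeout)
{
    if (!channel_->wait_notification(timeout)) {
        return false;  // timed out without a notification
    }
    SharedMemBuffer::Offset offset;
    channel_->pop_notification(offset);
    SharedMemBuffer* buffer = segment_->get_buffer(offset);
    if (buffer == nullptr) {
        return false;  // offset no longer refers to a live buffer
    }
    // Zero-copy on the receive side: the payload points into the shared segment.
    payload.data = buffer->data();
    payload.length = buffer->size();
    return true;
}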
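Because the receive path hands out a pointer into the shared segment, the buffer must stay alive until the reader is done with it. A minimal sketch of that lifetime rule follows, assuming a per-buffer reader count. `BufferNode`, `BufferLease`, and `can_recycle` are hypothetical names for illustration and are not the Fast DDS types.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical per-buffer bookkeeping stored next to the payload in the segment.
struct BufferNode {
    std::atomic<uint32_t> readers{0};  // readers still holding the buffer
    uint32_t length = 0;
};

// RAII lease: takes a reference on construction and drops it on destruction,
// so a reader cannot forget to release the buffer.
class BufferLease {
    BufferNode* node_;
public:
    explicit BufferLease(BufferNode& n) : node_(&n) {
        node_->readers.fetch_add(1, std::memory_order_acq_rel);
    }
    BufferLease(const BufferLease&) = delete;
    BufferLease& operator=(const BufferLease&) = delete;
    ~BufferLease() {
        node_->readers.fetch_sub(1, std::memory_order_acq_rel);
    }
};

// The writer may reuse a buffer only when no reader holds it.
bool can_recycle(const BufferNode& n) {
    return n.readers.load(std::memory_order_acquire) == 0;
}
```

Tying the release to a destructor keeps the count correct on early returns and exceptions, which matters because a leaked reference would pin the buffer in the segment indefinitely.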
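4. Debugging SHM in VSCode
4.1 Inspecting the Shared Memory Segment
break SharedMemSegment::open
print name
print size
# fd_ is assigned by shm_open; step past that line with `next` before printing it
print fd_
shell ls -lh /dev/shm/ | grep fastdds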
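4.3 Debugging a Performance Comparison
# Run UDP and SHM separately and compare the latency.
# Timestamps are taken at the entry and exit of DataWriterImpl::write by calling
# clock_gettime(CLOCK_MONOTONIC) inside the process. They include breakpoint
# overhead, so treat them as relative figures only.
# Assumes 64-bit Linux (struct timespec is two 8-byte fields) and a running process.
set $ts = (long*)malloc(16)
set $start = 0
break DataWriterImpl::write
commands
  silent
  set $rc = (int)clock_gettime(1, $ts)
  set $start = $ts[0] * 1000000000 + $ts[1]
  continue
end
# No symbol marks the function's return. Put this breakpoint on the line of the
# return statement (placeholder below), or step out manually with `finish`.
break DataWriterImpl.cpp:<return-line>
commands
  silent
  set $rc = (int)clock_gettime(1, $ts)
  printf "Latency: %ld ns\n", $ts[0] * 1000000000 + $ts[1] - $start
  continue
end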
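Breakpoint-based timing adds debugger overhead to every sample, so an in-process measurement is often easier to compare across transports. The helper below is a minimal sketch: `measure_ns` is a hypothetical name, and the callable passed to it would wrap whatever write call is being measured.

```cpp
#include <chrono>
#include <cstdint>
#include <utility>

// Runs `f` once and returns the elapsed wall time in nanoseconds.
// steady_clock is used because it is monotonic; system_clock can jump
// when the wall clock is adjusted.
template <typename F>
int64_t measure_ns(F&& f) {
    const auto t0 = std::chrono::steady_clock::now();
    std::forward<F>(f)();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
}
```

A single sample is noisy. Collecting many samples and comparing medians or percentiles between the UDP and SHM runs gives a more reliable picture.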
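5. Production Use and Optimization
5.1 When to Use SHM
graph TB
A[Choose a transport]
A --> B{Same host?}
B -->|Yes| C{Latency requirement?}
B -->|No| D[UDP/TCP]
C -->|< 10μs| E[SHM zero-copy]
C -->|> 100μs| F[UDP is sufficient]
E --> G{Payload size?}
G -->|< 64KB| H[Plain SHM]
G -->|> 64KB| I[SHM with fragmentation / batching]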
style E fill:#c8e6c9
style H fill:#c8e6c9
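The decision flow can be written as a small function. This is a sketch under stated assumptions: the flowchart does not say what to do for latency budgets between 10 μs and 100 μs or for payloads of exactly 64 KB, so the code below treats any budget under 100 μs as a case for SHM and any payload of 64 KB or more as fragmented. The enum and function names are hypothetical.

```cpp
#include <cstddef>

enum class Transport { UdpOrTcp, Udp, Shm, ShmFragmented };

// Encodes the flowchart. Assumptions (not stated in the chart):
//  - budgets in the unspecified 10-100 us range are treated like "< 10 us"
//  - a payload of exactly 64 KB is treated like "> 64 KB"
Transport choose_transport(bool same_host,
                           double latency_budget_us,
                           std::size_t payload_bytes) {
    if (!same_host) return Transport::UdpOrTcp;             // SHM cannot cross hosts
    if (latency_budget_us >= 100.0) return Transport::Udp;  // UDP is sufficient
    constexpr std::size_t kFragmentThreshold = 64 * 1024;
    return payload_bytes < kFragmentThreshold ? Transport::Shm
                                              : Transport::ShmFragmented;
}
```

For example, `choose_transport(true, 5.0, 1024)` selects plain SHM, while the same call with a 128 KB payload selects the fragmented or batched variant.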
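5.2 Optimization Tips
| Optimization | Configuration | Effect |
|---|---|---|
| Huge pages | Align segment_size to 2 MB | Fewer TLB misses |
| CPU affinity | Pin both processes to the same NUMA node | Avoids cross-node memory access |
| Lock-free queue | Replace std::mutex | Lower contention latency |
| Preallocation | Increase port_queue_capacity | Avoids allocation at run time |
| Batching | Accumulate several samples per send | Amortizes notification overhead |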
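The lock-free queue row can be made concrete with a single-producer, single-consumer ring of buffer descriptors. This is a generic sketch of the technique, not the queue Fast DDS uses for its ports. `Descriptor` and `SpscRing` are hypothetical names, and the design is only safe with exactly one pushing thread and one popping thread.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// What travels through the queue: a handle to data in the segment, not the data.
struct Descriptor {
    uint32_t offset;
    uint32_t length;
};

// Single-producer / single-consumer ring buffer without locks.
template <std::size_t N>
class SpscRing {
    static_assert(N >= 2 && (N & (N - 1)) == 0, "N must be a power of two");
    std::array<Descriptor, N> slots_{};
    std::atomic<std::size_t> head_{0};  // next slot to read, advanced by the consumer
    std::atomic<std::size_t> tail_{0};  // next slot to write, advanced by the producer
public:
    bool try_push(const Descriptor& d) {
        const std::size_t t = tail_.load(std::memory_order_relaxed);
        if (t - head_.load(std::memory_order_acquire) == N) return false;  // full
        slots_[t & (N - 1)] = d;
        tail_.store(t + 1, std::memory_order_release);  // publish the slot
        return true;
    }
    bool try_pop(Descriptor& out) {
        const std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire)) return false;  // empty
        out = slots_[h & (N - 1)];
        head_.store(h + 1, std::memory_order_release);  // free the slot
        return true;
    }
};
```

The release store on `tail_` pairs with the acquire load in `try_pop`, so the consumer never observes a slot before its contents are written. A queue with several producers needs a different scheme, such as a compare-and-swap on the tail index or a lock.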
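5.3 Troubleshooting
# List Fast DDS segments and port files
ls -lh /dev/shm/ | grep fastdds
# Lists System V segments only; POSIX objects under /dev/shm do not appear here
ipcs -m
# Confirm that a process has the segment mapped
cat /proc/<pid>/maps | grep /dev/shm
# Watch segment files and tmpfs usage over time
watch -n 1 'ls -lh /dev/shm/ && df -h /dev/shm'
# Remove leftover segments; run only after every Fast DDS process has exited
rm /dev/shm/fastdds_*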
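The same check can be done from a program, for example in a health monitor that warns about leftover segments. The sketch below uses the POSIX directory API; the function name `list_segments` is hypothetical, and the `fastdds` prefix is taken from the commands above rather than from any guarantee about file naming.

```cpp
#include <dirent.h>  // opendir, readdir, closedir
#include <string>
#include <vector>

// Returns the entry names in `dir` that start with `prefix`.
// Returns an empty list if `dir` cannot be opened.
std::vector<std::string> list_segments(const std::string& dir,
                                       const std::string& prefix) {
    std::vector<std::string> out;
    DIR* d = opendir(dir.c_str());
    if (d == nullptr) return out;
    while (dirent* e = readdir(d)) {
        const std::string name = e->d_name;
        if (name.compare(0, prefix.size(), prefix) == 0) {
            out.push_back(name);
        }
    }
    closedir(d);
    return out;
}
```

Calling `list_segments("/dev/shm", "fastdds")` after all participants have shut down should return an empty list; any remaining names point to segments left behind by a crashed process.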
6. Day 3 Self-Check List
7. Code Flow
sequenceDiagram
participant App as Fast-DDS Application
participant SMTrans as SharedMemTransport
participant SMManager as SharedMemManager
participant SMPort as SharedMemManager::Port
participant SMGlobal as SharedMemGlobal::Port
participant SMChan as SharedMemChannelResource
participant Receiver as TransportReceiverInterface
participant SMListener as SharedMemManager::Listener
participant SMBuffer as SharedMemManager::SharedMemBuffer
App ->> SMTrans: init()
SMTrans ->> SMManager: SharedMemManager::create("fastdds")
SMTrans ->> SMManager: create_segment(segment_size, max_alloc)
SMTrans ->> SMManager: open_port(port, queue_size, timeout, Mode::Write)
App ->> SMTrans: OpenInputChannel(locator, receiver)
SMTrans ->> SMManager: open_port(port, ..., Mode::ReadShared/ReadExclusive)
SMManager ->> SMPort: create_listener()
SMTrans ->> SMChan: new SharedMemChannelResource(listener, ...)
App ->> SMTrans: send(buffers, locators)
SMTrans ->> SMTrans: copy_to_shared_buffer(buffers)
SMTrans ->> SMManager: find_port(locator.port)
SMManager ->> SMPort: try_push(shared_buffer)
SMPort ->> SMGlobal: push to ring buffer / notify
SMChan ->> SMListener: pop() (blocking)
SMListener ->> SMGlobal: wait_pop()
SMGlobal ->> SMListener: BufferDescriptor
SMListener ->> SMBuffer: construct
SMChan ->> Receiver: OnDataReceived(data,size,input,remote)
SMChan ->> SMListener: stop_processing_buffer()