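The full implementation is attached at the end of this post.
Background
When writing C/C++ code we often run into memory problems. They tend to be intermittent and unstable, so locating them usually requires a memory checking tool. Mature tools exist today, such as valgrind and the CRT on Windows. Even so, at work I ran into a situation where I had to implement a C memory checker by hand:
- The intranet environment is tightly controlled, so external tools cannot be brought in freely.
- The existing tools are not convenient enough: they cannot check memory for just one module. A problem caused by another module is often not triggered right away and ends up surfacing in a compute-heavy module, which makes analysis inefficient.
- The existing tools usually do a full check of every memory operation, so the program runs slowly.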
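Principle first
- Implement enhanced allocation and release functions that wrap each memory block and record the file name and line number of the allocation/release, for use in the later check report.
- When allocating, the enhanced function requests extra bytes before and after the target memory and writes a magic number into each. Later, checking whether the magic numbers have been modified reveals out-of-bounds writes.
- When releasing, the enhanced function records how many times the block has been freed. Checking this counter before a release detects double free, and checking it before the program exits detects memory leaks.
- Store all allocated blocks in linked lists: mallocList holds blocks that have not been freed and freedList holds blocks that have, which makes a full memory check easy.
- Implement the enhanced class as a per-thread singleton and use macros to replace the native allocation interfaces, so it is easy to use in a project.
- As a last resort, wrap memory reads and writes directly and check for out-of-bounds access before each one.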
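Step-by-step implementation
Detecting out-of-bounds writes with magic numbers
For out-of-bounds memory writes, I use magic numbers for detection.
- Define two magic numbers, one for the head and one for the tail.
- Implement an enhanced allocation function that requests 8 bytes on top of the requested size: 4 bytes at the head and 4 bytes at the tail hold the magic numbers. The enhanced function therefore actually allocates size + 8 bytes.
- Because the head magic number occupies 4 bytes, ptr + 4 is returned as the address of the allocated memory.
- Implement a check function: verifying that the head and tail magic numbers of every allocated block are untouched reveals whether an out-of-bounds write has occurred.
The resulting layout is: [head magic number, 4 bytes][user memory, size bytes][tail magic number, 4 bytes], with ptr + 4 pointing at the user memory.
The core code is as follows. I use a macro to simplify the call, so later EnhancedMalloc can simply replace malloc: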
#define MAGIC_HEAD_NUMBER 0xDEADBEEF
#define MAGIC_TAIL_NUMBER 0xD8675309
#define EnhancedMalloc(size) EnhancedMemoryManager::getInstance().enhancedMalloc(size, __FILE__, __LINE__)
void *EnhancedMemoryManager::enhancedMalloc(size_t size, const char* file_name, int line_number) {
    void* data = malloc(size + 8);
    // Check the result before touching the memory
    if (data == nullptr) {
        printf("Alloc failed.\n");
        return nullptr;
    }
    memset(data, 0, size + 8);
    // Write the magic numbers byte by byte, low byte first
    unsigned char* magic_ptr = (unsigned char*)data;
    magic_ptr[0] = (unsigned char)MAGIC_HEAD_NUMBER;
    magic_ptr[1] = (unsigned char)(MAGIC_HEAD_NUMBER >> 8);
    magic_ptr[2] = (unsigned char)(MAGIC_HEAD_NUMBER >> 16);
    magic_ptr[3] = (unsigned char)(MAGIC_HEAD_NUMBER >> 24);
    magic_ptr[size + 4] = (unsigned char)MAGIC_TAIL_NUMBER;
    magic_ptr[size + 4 + 1] = (unsigned char)(MAGIC_TAIL_NUMBER >> 8);
    magic_ptr[size + 4 + 2] = (unsigned char)(MAGIC_TAIL_NUMBER >> 16);
    magic_ptr[size + 4 + 3] = (unsigned char)(MAGIC_TAIL_NUMBER >> 24);
    // Record the block so it can be checked and released later
    MemoryBlock *block = new MemoryBlock(file_name, line_number);
    block->size = size;
    block->data = data;
    this->mallocList.push_back(block);
    // Hand out the address just past the head magic number
    return (void*) &magic_ptr[4];
}
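To make the check step from the list above concrete, here is a minimal, self-contained sketch of writing and verifying the two magic numbers. It is not the EnhancedMemoryManager code itself: the names guarded_alloc, guards_intact, guarded_free and demo_overrun_detected are made up for this example, and it copies the magic numbers with memcpy (native byte order) instead of byte by byte.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical names for this sketch only; the values match the post's magic numbers.
static const std::uint32_t kHeadGuard = 0xDEADBEEF;
static const std::uint32_t kTailGuard = 0xD8675309;
static const std::size_t kGuardSize = sizeof(std::uint32_t);

// Allocates size + 8 bytes, writes both guards, and returns the user pointer.
void* guarded_alloc(std::size_t size) {
    unsigned char* raw = static_cast<unsigned char*>(std::malloc(size + 2 * kGuardSize));
    if (raw == nullptr) return nullptr;
    std::memcpy(raw, &kHeadGuard, kGuardSize);
    std::memcpy(raw + kGuardSize + size, &kTailGuard, kGuardSize);
    return raw + kGuardSize;
}

// Returns true when both guards around the user region are still intact.
bool guards_intact(const void* user_ptr, std::size_t size) {
    const unsigned char* raw = static_cast<const unsigned char*>(user_ptr) - kGuardSize;
    return std::memcmp(raw, &kHeadGuard, kGuardSize) == 0 &&
           std::memcmp(raw + kGuardSize + size, &kTailGuard, kGuardSize) == 0;
}

// Releases the real allocation, which starts 4 bytes before the user pointer.
void guarded_free(void* user_ptr) {
    if (user_ptr != nullptr) {
        std::free(static_cast<unsigned char*>(user_ptr) - kGuardSize);
    }
}

// An in-bounds write leaves the guards alone; a one-byte overrun breaks the tail guard.
bool demo_overrun_detected() {
    const std::size_t size = 16;
    unsigned char* buf = static_cast<unsigned char*>(guarded_alloc(size));
    if (buf == nullptr) return false;
    buf[size - 1] = 0x7F;                       // last valid byte
    bool ok_before = guards_intact(buf, size);
    buf[size] = 0x00;                           // one byte past the end, inside the tail guard
    bool ok_after = guards_intact(buf, size);
    guarded_free(buf);
    return ok_before && !ok_after;
}
```

The overrun in the demo stays inside the size + 8 bytes that were really allocated, which is exactly why this kind of small overrun can be detected safely after the fact.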
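Detecting release status with a freed counter
For repeated-free errors I use a counter, and the principle is very simple.
- In MemoryManager, store every memory block in a linked list and record how many times each one has been freed.
- Implement an enhanced release function that looks up the block's record before freeing and checks whether its free counter is greater than or equal to 1.
- If it is, this release is a repeat: print a warning and do not free. Otherwise, free normally and increment the counter.
- When the program ends, checking the counters reveals whether any memory has leaked.
The core code is as follows: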
#define EnhancedFree(ptr) EnhancedMemoryManager::getInstance().enhancedFree(ptr)
void EnhancedMemoryManager::enhancedFree(void * ptr) {
    if (ptr == nullptr) {
        return;
    }
    // The user pointer sits 4 bytes past the start of the real allocation
    void* rawData = (unsigned char*)ptr - 4;
    auto matches = [rawData](const MemoryBlock *block) {
        return block->data == rawData;
    };
    MemoryBlock *targetBlock = nullptr;
    auto iterator = std::find_if(this->mallocList.begin(), this->mallocList.end(), matches);
    if (iterator != this->mallocList.end()) {
        targetBlock = *iterator;
    } else {
        // Not a live block: search the freed blocks and compare against freedList's own end()
        iterator = std::find_if(this->freedList.begin(), this->freedList.end(), matches);
        if (iterator == this->freedList.end()) {
            return;
        }
        targetBlock = *iterator;
    }
    // Mark as freed
    targetBlock->freed += 1;
    if (targetBlock->freed > 1) {
        printf("%sDouble free detected.\n", targetBlock->getAllocPos());
        return;
    }
    // Check for out-of-bounds writes before releasing.
    check(targetBlock);
    this->mallocList.remove(targetBlock);
    this->freedList.push_back(targetBlock);
    free(targetBlock->data);
}
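The counting idea can also be shown in isolation. The sketch below is a simplified illustration rather than the code above: it swaps the two linked lists for a std::unordered_map keyed by pointer, and the names FreeCounter, alloc, release and demo_double_free_caught are invented for the example.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <unordered_map>

// Hypothetical tracker: maps each pointer to the number of times it was freed.
class FreeCounter {
public:
    void* alloc(std::size_t size) {
        void* p = std::malloc(size);
        if (p != nullptr) {
            freed_[p] = 0;  // a reused address starts a fresh record
        }
        return p;
    }

    // Returns true if the memory was released, false for an unknown
    // pointer or a repeated free.
    bool release(void* p) {
        auto it = freed_.find(p);
        if (it == freed_.end()) {
            return false;  // not allocated through this tracker
        }
        it->second += 1;
        if (it->second > 1) {
            std::printf("Double free detected (freed %d times).\n", it->second);
            return false;  // do not call free() again
        }
        std::free(p);
        return true;
    }

private:
    std::unordered_map<void*, int> freed_;
};

// The first release succeeds, the second is rejected by the counter.
bool demo_double_free_caught() {
    FreeCounter counter;
    void* p = counter.alloc(32);
    if (p == nullptr) return false;
    bool first = counter.release(p);
    bool second = counter.release(p);
    return first && !second;
}
```

The important detail is the same as in enhancedFree: the counter is consulted before free() is called, so the second release never reaches the real allocator.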
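Storing memory blocks in linked lists for full checks
This part is simple and brute-force: I store the blocks directly in STL containers, so I won't go into detail. Here is the core code: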
class EnhancedMemoryManager {
private:
    // Hide the constructor and disable copying
    EnhancedMemoryManager() {};
    EnhancedMemoryManager(const EnhancedMemoryManager&) = delete;
    EnhancedMemoryManager& operator = (const EnhancedMemoryManager&) = delete;
public:
    std::list<MemoryBlock*> mallocList; // blocks that have not been freed
    std::list<MemoryBlock*> freedList; // blocks that have been freed
    void* enhancedMalloc(size_t, const char*, int);
    void enhancedFree(void*);
    void check(MemoryBlock*);
    void checkAll();
    void clear();
    static EnhancedMemoryManager& getInstance() {
        // Per-thread singleton: each thread only tracks the blocks it allocated itself
        static thread_local EnhancedMemoryManager enhancedMemoryManager;
        return enhancedMemoryManager;
    }
    void traverse();
};
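void EnhancedMemoryManager::checkAll() {
    if (this->mallocList.empty() && this->freedList.empty()) {
        printf("Memory OK: malloc or free has never been called.\n");
        return;
    }
    // Live blocks: verify the magic numbers
    for (std::list<MemoryBlock*>::iterator it = mallocList.begin(); it != mallocList.end() ; ++it) {
        check(*it);
    }
    // Freed blocks: their data has already been released, so only report
    // double frees and never read the magic numbers again
    for (std::list<MemoryBlock*>::iterator it = freedList.begin(); it != freedList.end() ; ++it) {
        if ((*it)->freed > 1) {
            check(*it);
        }
    }
}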
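The leak check mentioned earlier (looking at the counters before the program ends) is not shown separately in the snippets above, so here is a minimal sketch of how it could look. It is a standalone illustration, not part of EnhancedMemoryManager: AllocRecord, tracked_malloc, tracked_free, TRACKED_MALLOC, report_leaks and demo_leak_count are hypothetical names.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <list>

// Hypothetical record type: mirrors the idea of MemoryBlock, not its exact layout.
struct AllocRecord {
    void* ptr;
    const char* file;
    int line;
    int freed;
};

static std::list<AllocRecord> g_records;

void* tracked_malloc(std::size_t size, const char* file, int line) {
    void* p = std::malloc(size);
    if (p != nullptr) {
        g_records.push_back({p, file, line, 0});
    }
    return p;
}

void tracked_free(void* p) {
    for (AllocRecord& r : g_records) {
        if (r.ptr == p && r.freed == 0) {
            r.freed = 1;
            std::free(p);
            return;
        }
    }
}

// Captures the call site, in the same way EnhancedMalloc does.
#define TRACKED_MALLOC(size) tracked_malloc((size), __FILE__, __LINE__)

// Prints every block whose counter is still 0 and returns how many there are.
int report_leaks() {
    int leaks = 0;
    for (const AllocRecord& r : g_records) {
        if (r.freed == 0) {
            std::printf("[Allocated at %s:%d] never freed.\n", r.file, r.line);
            ++leaks;
        }
    }
    return leaks;
}

// Two allocations, one release: the report should list exactly one block.
int demo_leak_count() {
    void* a = TRACKED_MALLOC(8);
    void* b = TRACKED_MALLOC(8);
    tracked_free(a);
    int leaks = report_leaks();  // b is still outstanding here
    tracked_free(b);             // clean up after the demo
    return leaks;
}
```

In the real tool the equivalent information is already available: anything still in mallocList at exit has a counter of 0, and traverse() prints those blocks.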
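Wrapping memory access to prevent out-of-bounds access directly
Checking for out-of-bounds writes with magic numbers is cheap, but when the stride of an out-of-bounds write goes beyond the range of the head and tail magic numbers, the write can be missed.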
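A small simulation shows what such a miss looks like. To stay within defined behavior, the sketch below does not touch the real heap: it lays out the head magic number, the user bytes, the tail magic number and a neighbouring object inside one local array. All names here (canaries_unchanged, demo_large_stride_missed) are made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

static const std::uint32_t kHead = 0xDEADBEEF;
static const std::uint32_t kTail = 0xD8675309;

// Returns true when the 4-byte magic numbers on both sides of `user` are unchanged.
bool canaries_unchanged(const unsigned char* user, std::size_t size) {
    return std::memcmp(user - 4, &kHead, 4) == 0 &&
           std::memcmp(user + size, &kTail, 4) == 0;
}

bool demo_large_stride_missed() {
    // Simulated heap: [head, 4 bytes][user, 8 bytes][tail, 4 bytes][neighbour, 8 bytes]
    unsigned char arena[4 + 8 + 4 + 8] = {0};
    const std::size_t size = 8;
    unsigned char* user = arena + 4;
    std::memcpy(user - 4, &kHead, 4);
    std::memcpy(user + size, &kTail, 4);

    // Out-of-bounds write that jumps over the tail magic number entirely
    user[size + 4] = 0xFF;

    bool canaries_ok = canaries_unchanged(user, size);  // still intact
    bool neighbour_hit = (arena[16] == 0xFF);           // yet the neighbour was corrupted
    return canaries_ok && neighbour_hit;
}
```

The magic-number check passes even though memory outside the block was modified, which is the gap the next approach tries to close.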
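So if you suspect a "large-stride" out-of-bounds access, there is a more expensive method: record the size of every block, access the block only through a function, and check the bounds inside that function. This is suited to cases where nothing else works.
Since mine is a C++ project, I made a simple C++ implementation of this idea. The bounds check itself does not depend on MemoryManager's magic numbers, although the buffer is still allocated through it. The core code is as follows: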
template<typename T>
class SimBuffer {
public:
int size{};
T* data;
char const* file_name{};
int line_number = 0;
SimBuffer(int size, const char* file_name, int line_number);
// Bounds-checked element access
T& operator[](int index);
~SimBuffer();
};
template<typename T>
SimBuffer<T>::SimBuffer(int size, const char* file_name, int line_number) {
this->file_name = file_name;
this->line_number = line_number;
this->data = (T*)EnhancedMemoryManager::getInstance().enhancedMalloc(sizeof(T) * size, file_name, line_number);
this->size = size;
}
template<typename T>
T& SimBuffer<T>::operator[](int index) {
printf("DEBUG: using override [].\n");
if (index < 0 || index >= this->size) {
fprintf(stderr, "[Allocated at %s:%d]index out of bound, index=%d, size=%d.\n", this->file_name, this->line_number, index, this->size);
throw std::runtime_error("i am an exception");
}
return data[index];
}
template<typename T>
SimBuffer<T>::~SimBuffer() {
printf("DEBUG: releasing %s:%d.\n", this->file_name, this->line_number);
EnhancedMemoryManager::getInstance().enhancedFree(this->data);
file_name = nullptr;
}
Test code:
int main(int argc, char *argv[]) {
SimBuffer<unsigned short> *simP5, *simP6 = nullptr;
simP5 = new SimBuffer<unsigned short>(10, __FILE__, __LINE__);
simP6 = new SimBuffer<unsigned short>(10, __FILE__, __LINE__);
(*simP5)[10] = 10;
(*simP6)[-1] = 9;
delete simP5;
delete simP6;
}
// Output:
// [Allocated at E:\repo\CPlayground\Main.cpp:203]index out of bound, index=10, size=10.
// terminate called after throwing an instance of 'std::runtime_error'
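In the test above the first violation terminates the program because the exception is never caught, so the second one is not reported. If you would rather keep running and collect every violation, the accesses can be wrapped in try/catch. The sketch below illustrates this with a stand-in class backed by std::vector so that it compiles on its own; CheckedBuffer, count_rejected and demo_rejected are hypothetical names, not part of the project.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for SimBuffer, backed by std::vector to stay self-contained.
template <typename T>
class CheckedBuffer {
public:
    explicit CheckedBuffer(int size) : data_(static_cast<std::size_t>(size)) {}

    // Same rule as SimBuffer::operator[]: reject negative and too-large indices.
    T& operator[](int index) {
        if (index < 0 || index >= static_cast<int>(data_.size())) {
            throw std::out_of_range("index out of bound");
        }
        return data_[static_cast<std::size_t>(index)];
    }

private:
    std::vector<T> data_;
};

// Writes to every index in turn and counts how many accesses were rejected.
int count_rejected(CheckedBuffer<unsigned short>& buf, const std::vector<int>& indices) {
    int rejected = 0;
    for (int index : indices) {
        try {
            buf[index] = 1;
        } catch (const std::out_of_range&) {
            ++rejected;  // keep going so later violations are seen too
        }
    }
    return rejected;
}

// Indices 0 and 9 are valid for a buffer of 10 elements; 10 and -1 are not.
int demo_rejected() {
    CheckedBuffer<unsigned short> buf(10);
    return count_rejected(buf, {0, 9, 10, -1});
}
```

Whether to abort on the first violation or keep counting is a matter of taste: aborting gives a clean stack at the faulty access, while catching gives a fuller report in one run.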
For a C project, which has no operator overloading, a macro can be used to emulate the index operator.
#include <stdio.h>
#include <assert.h>
// A struct that models an array together with its length
typedef struct {
    int *array;
    size_t size;
} Array;
// A macro that emulates indexing and asserts that the index is in range
#define ARRAY_INDEX(arr, index) (assert((index) < (arr).size), (arr).array[(index)])
int main() {
    int data[] = {10, 20, 30, 40, 50};
    Array arr = {data, sizeof(data) / sizeof(data[0])};
    // Access an element through the macro
    printf("Element at index 2: %d\n", ARRAY_INDEX(arr, 2));
    // Accessing an out-of-range element would trigger the assertion
    // printf("Element at index 10: %d\n", ARRAY_INDEX(arr, 10));
    return 0;
}
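However, both methods require major changes to an existing project, which is not very friendly. I have not thought of a better way yet. If readers have better ideas, feel free to get in touch.
Summary
In this post I implemented a simple C/C++ memory checker, EnhancedMemoryManager, following the core idea of the memory checks in the Windows CRT library.
- It detects three kinds of memory problems (out-of-bounds writes, double frees, and memory leaks) and reports in detail which file and line allocated the offending memory.
- Because magic-number checking cannot detect out-of-bounds reads and can miss some writes, I also proposed a more expensive method that checks bounds directly. EnhancedMemoryManager can be used together with it.
- The direct bounds-checking method is very intrusive, and I have not thought of a better way yet.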
Full implementation code
Header file
#ifndef CPLAYGROUND_MEMORYMANAGER_H
#define CPLAYGROUND_MEMORYMANAGER_H
#include <stddef.h>
#include <thread>
#include <string>
#include <sstream>
#include <iostream>
#include <list>
#include <stdexcept>
#define EnhancedMalloc(size) EnhancedMemoryManager::getInstance().enhancedMalloc(size, __FILE__, __LINE__)
#define EnhancedFree(ptr) EnhancedMemoryManager::getInstance().enhancedFree(ptr)
class MemoryBlock {
public:
char const* file_name;
int line_number = 0;
int freed = 0;
size_t size;
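// 1. Magic numbers are inserted before and after the user memory
// 2. On release the whole real allocation (data) is freed and freed is incremented
// 3. The MemoryBlock objects themselves are only released by clear()
void* data = nullptr;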
MemoryBlock(char const*, int);
~MemoryBlock();
const char* getAllocPos();
bool operator == (const MemoryBlock &other) const;
bool operator == (const void* ptr) const;
};
class EnhancedMemoryManager {
private:
    // Hide the constructor and disable copying
    EnhancedMemoryManager() {};
    EnhancedMemoryManager(const EnhancedMemoryManager&) = delete;
    EnhancedMemoryManager& operator = (const EnhancedMemoryManager&) = delete;
public:
    std::list<MemoryBlock*> mallocList; // blocks that have not been freed
    std::list<MemoryBlock*> freedList; // blocks that have been freed
    void* enhancedMalloc(size_t, const char*, int);
    void enhancedFree(void*);
    void check(MemoryBlock*);
    void checkAll();
    void clear();
    static EnhancedMemoryManager& getInstance() {
        // Per-thread singleton: each thread only tracks the blocks it allocated itself
        static thread_local EnhancedMemoryManager enhancedMemoryManager;
        return enhancedMemoryManager;
    }
    void traverse();
};
template<typename T>
class SimBuffer {
public:
int size{};
T* data;
char const* file_name{};
int line_number = 0;
SimBuffer(int size, const char* file_name, int line_number);
T& operator[](int index);
~SimBuffer();
};
#endif //CPLAYGROUND_MEMORYMANAGER_H
cpp file
#include "MemoryManager.h"
#include <iostream>
#include <sstream>
#include <stdlib.h>
#ifdef __linux__
#include <execinfo.h>
#endif
#include <stdio.h>
#include <memory.h>
#include <algorithm>
#define MAGIC_HEAD_NUMBER 0xDEADBEEF
#define MAGIC_TAIL_NUMBER 0xD8675309
#ifdef __linux__
void print_stacktrace() {
    const int max_frames = 128;
    void* frame[max_frames];
    int frame_count = backtrace(frame, max_frames);
    char** symbols = backtrace_symbols(frame, frame_count);
    if (symbols) {
        for (int i = 0; i < frame_count; ++i) {
            printf("%s\n", symbols[i]);
        }
        for (int i = 1; i < frame_count; ++i) {
            printf("[bt] #%d %s\n", i, symbols[i]);
            /* find first occurrence of '(' or ' ' in symbols[i] and assume
             * everything before that is the file name. (Don't go beyond 0 though
             * (string terminator) */
            size_t p = 0;
            while (symbols[i][p] != '(' && symbols[i][p] != ' '
                   && symbols[i][p] != 0)
                ++p;
            char syscom[256];
            // frame[i] is the return address; %.*s expects an int length
            snprintf(syscom, sizeof(syscom), "addr2line %p -e %.*s", frame[i], (int)p, symbols[i]);
            // last parameter is the file name of the symbol
            system(syscom);
        }
        free(symbols);
    }
}
#endif
const char* MemoryBlock::getAllocPos() {
    // Reuse one per-thread buffer so each call does not leak a new char array.
    // The returned pointer stays valid until the next call on the same thread.
    static thread_local std::string result;
    std::stringstream ss;
    ss << "[Allocated at " << this->file_name << ":" << this->line_number << "] ";
    result = ss.str();
    // print_stacktrace();
    return result.c_str();
}
MemoryBlock::MemoryBlock(const char * file_name, int line_number) {
this->file_name = file_name;
this->line_number = line_number;
}
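MemoryBlock::~MemoryBlock() {
    printf("~MemoryBlock been called.\n");
    // data is the raw allocation (magic numbers included), so release it with
    // free() directly; enhancedFree() expects the user pointer (data + 4)
    if (this->freed < 1) {
        free(this->data);
    }
}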
bool MemoryBlock::operator==(const MemoryBlock &other) const {
if (this->data == other.data) {
return true;
}
return false;
}
bool MemoryBlock::operator==(const void *ptr) const {
if (this->data == ptr) {
return true;
}
return false;
}
void EnhancedMemoryManager::check(MemoryBlock *block) {
    if (block->freed > 1) {
        printf("%sDouble free detected.\n", block->getAllocPos());
        // print_stacktrace();
        return;
    }
    // Check the magic numbers, all 4 bytes of each
    unsigned char* magic_ptr = (unsigned char*)(block->data);
    bool beforeFlag = false, afterFlag = false;
    for (int i = 0; i < 4; ++i) {
        if (!beforeFlag) {
            unsigned char tmpChar = (unsigned char)(MAGIC_HEAD_NUMBER >> (8 * i));
            if (magic_ptr[i] != tmpChar) {
                beforeFlag = true;
            }
        }
        if (!afterFlag) {
            unsigned char tmpChar = (unsigned char)(MAGIC_TAIL_NUMBER >> (8 * i));
            if (magic_ptr[block->size + 4 + i] != tmpChar) {
                afterFlag = true;
            }
        }
    }
    if (beforeFlag) {
        printf("%sMemory clobbered before allocated block.\n", block->getAllocPos());
    }
    if (afterFlag) {
        printf("%sMemory clobbered after allocated block.\n", block->getAllocPos());
    }
}
void *EnhancedMemoryManager::enhancedMalloc(size_t size, const char* file_name, int line_number) {
    void* data = malloc(size + 8);
    // Check the result before touching the memory
    if (data == nullptr) {
        printf("Alloc failed.\n");
        return nullptr;
    }
    memset(data, 0, size + 8);
    // Write the magic numbers byte by byte, low byte first
    unsigned char* magic_ptr = (unsigned char*)data;
    magic_ptr[0] = (unsigned char)MAGIC_HEAD_NUMBER;
    magic_ptr[1] = (unsigned char)(MAGIC_HEAD_NUMBER >> 8);
    magic_ptr[2] = (unsigned char)(MAGIC_HEAD_NUMBER >> 16);
    magic_ptr[3] = (unsigned char)(MAGIC_HEAD_NUMBER >> 24);
    magic_ptr[size + 4] = (unsigned char)MAGIC_TAIL_NUMBER;
    magic_ptr[size + 4 + 1] = (unsigned char)(MAGIC_TAIL_NUMBER >> 8);
    magic_ptr[size + 4 + 2] = (unsigned char)(MAGIC_TAIL_NUMBER >> 16);
    magic_ptr[size + 4 + 3] = (unsigned char)(MAGIC_TAIL_NUMBER >> 24);
    // Record the block so it can be checked and released later
    MemoryBlock *block = new MemoryBlock(file_name, line_number);
    block->size = size;
    block->data = data;
    this->mallocList.push_back(block);
    // Hand out the address just past the head magic number
    return (void*) &magic_ptr[4];
}
void EnhancedMemoryManager::enhancedFree(void * ptr) {
    if (ptr == nullptr) {
        return;
    }
    // The user pointer sits 4 bytes past the start of the real allocation
    void* rawData = (unsigned char*)ptr - 4;
    auto matches = [rawData](const MemoryBlock *block) {
        return block->data == rawData;
    };
    MemoryBlock *targetBlock = nullptr;
    auto iterator = std::find_if(this->mallocList.begin(), this->mallocList.end(), matches);
    if (iterator != this->mallocList.end()) {
        targetBlock = *iterator;
    } else {
        // Not a live block: search the freed blocks and compare against freedList's own end()
        iterator = std::find_if(this->freedList.begin(), this->freedList.end(), matches);
        if (iterator == this->freedList.end()) {
            return;
        }
        targetBlock = *iterator;
    }
    // Mark as freed
    targetBlock->freed += 1;
    if (targetBlock->freed > 1) {
        printf("%sDouble free detected.\n", targetBlock->getAllocPos());
        return;
    }
    // Check for out-of-bounds writes before releasing.
    check(targetBlock);
    this->mallocList.remove(targetBlock);
    this->freedList.push_back(targetBlock);
    free(targetBlock->data);
}
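void EnhancedMemoryManager::checkAll() {
    if (this->mallocList.empty() && this->freedList.empty()) {
        printf("Memory OK: malloc or free has never been called.\n");
        return;
    }
    // Live blocks: verify the magic numbers
    for (std::list<MemoryBlock*>::iterator it = mallocList.begin(); it != mallocList.end() ; ++it) {
        check(*it);
    }
    // Freed blocks: their data has already been released, so only report
    // double frees and never read the magic numbers again
    for (std::list<MemoryBlock*>::iterator it = freedList.begin(); it != freedList.end() ; ++it) {
        if ((*it)->freed > 1) {
            check(*it);
        }
    }
}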
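void EnhancedMemoryManager::clear() {
    // Release memory that is still outstanding, then the records themselves
    for (MemoryBlock* block : this->mallocList) {
        free(block->data);
        block->freed = 1; // tells ~MemoryBlock that data is already released
        delete block;
    }
    for (MemoryBlock* block : this->freedList) {
        delete block;
    }
    this->mallocList.clear();
    this->freedList.clear();
}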
void EnhancedMemoryManager::traverse() {
printf("=> Allocations have NOT been freed.\n");
for (auto it = this->mallocList.begin(); it != this->mallocList.end() ; ++it) {
printf("%s \n", (*it)->getAllocPos());
}
printf("=> Allocations have been FREED.\n");
for (auto it = this->freedList.begin(); it != this->freedList.end() ; ++it) {
printf("%s \n", (*it)->getAllocPos());
}
}
template<typename T>
SimBuffer<T>::SimBuffer(int size, const char* file_name, int line_number) {
this->file_name = file_name;
this->line_number = line_number;
this->data = (T*)EnhancedMemoryManager::getInstance().enhancedMalloc(sizeof(T) * size, file_name, line_number);
this->size = size;
}
template<typename T>
T& SimBuffer<T>::operator[](int index) {
printf("DEBUG: using override [].\n");
if (index < 0 || index >= this->size) {
fprintf(stderr, "[Allocated at %s:%d]index out of bound, index=%d, size=%d.\n", this->file_name, this->line_number, index, this->size);
throw std::runtime_error("i am an exception");
}
return data[index];
}
template<typename T>
SimBuffer<T>::~SimBuffer() {
printf("DEBUG: releasing %s:%d.\n", this->file_name, this->line_number);
EnhancedMemoryManager::getInstance().enhancedFree(this->data);
file_name = nullptr;
}
template class SimBuffer<unsigned short>;
Test code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <thread>
#include "MemoryManager.h"
#include "hook_cxa_throw-lys.h"
#define VARIABLE_NAME(var) #var
#define IS_IN(lib) (IN_MODULE == MODULE_##lib)
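// TestStruct is not shown in the post; a minimal definition with an id field is assumed here.
struct TestStruct { int id; };
int main(int argc, char *argv[]) {
TestStruct *p1, *p2, *tmpp = NULL;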
setvbuf(stdout, NULL, _IONBF, 0);
EnhancedMemoryManager &memoryManager = EnhancedMemoryManager::getInstance();
p1 = (TestStruct*)EnhancedMalloc(sizeof(TestStruct) * 10);
// p1 = (TestStruct*)memoryManager.enhancedMalloc(sizeof(TestStruct) * 10, __FILE__, __LINE__);
p2 = (TestStruct*)memoryManager.enhancedMalloc(sizeof(TestStruct) * 10, __FILE__, __LINE__);
tmpp = p1 + 10;
tmpp[0].id = 10; // out-of-bounds write past the end of the block
tmpp = p2 - 1;
tmpp[0].id = 10; // out-of-bounds write before the start of the block
EnhancedFree(p1);
memoryManager.enhancedFree(p1);
memoryManager.enhancedFree(p2);
p1 = (TestStruct*)memoryManager.enhancedMalloc(sizeof(TestStruct) * 10, __FILE__, __LINE__);
p2 = (TestStruct*)memoryManager.enhancedMalloc(sizeof(TestStruct) * 10, __FILE__, __LINE__);
memoryManager.traverse();
TestStruct *p3, *p4;
p3 = (TestStruct*)memoryManager.enhancedMalloc(sizeof(TestStruct) * 10, __FILE__, __LINE__);
p4 = (TestStruct*)memoryManager.enhancedMalloc(sizeof(TestStruct) * 10, __FILE__, __LINE__);
memoryManager.enhancedFree(p3);
memoryManager.enhancedFree(p1);
memoryManager.enhancedFree(p4);
memoryManager.traverse();
p3 = (TestStruct*)memoryManager.enhancedMalloc(sizeof(TestStruct) * 10, __FILE__, __LINE__);
p4 = (TestStruct*)memoryManager.enhancedMalloc(sizeof(TestStruct) * 10, __FILE__, __LINE__);
// Check all blocks for out-of-bounds writes
printf("checkAllPtr starts...\n");
memoryManager.checkAll();
printf("checkAllPtr ends...\n");
printf("================================ new ==============================.\n");
SimBuffer<unsigned short> *simP5 = new SimBuffer<unsigned short>(10, __FILE__, __LINE__);
SimBuffer<unsigned short> *simP6 = new SimBuffer<unsigned short>(10, __FILE__, __LINE__);
(*simP5)[10] = 10;
(*simP6)[-1] = 9;
delete simP5;
delete simP6;
return 0;
}