Getting familiar with std::filesystem through examples


Filesystem through examples

Filesystem library (since C++17) | Filesystem source code of libstdc++

Some background

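Before C++17, C and C++ offered two mechanisms for working with files: the file stream fstream defined by the C++ standard library, and the FILE type defined by the C standard library. fstream integrates better with C++ features such as exception handling and type safety, which suits projects that rely on them, while FILE suits low-level or performance-sensitive code and code that must stay compatible with C.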

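Although fstream provides a stream for operating on a file, it still has problems. Compared with C's FILE* streams, fstream can be slower in some situations, especially under heavy I/O. It also cannot fully hide the differences between operating systems in how files and paths are represented (separators, length and character-set limits, permission models, line endings, and so on).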

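C++17 therefore introduced the <filesystem> library, the first library in the C++ standard dedicated to filesystem operations. Like shared pointers and regular expressions, it grew out of Boost, in this case Boost.Filesystem; the final proposal was P0218R0: Adopt the File System TS for C++17.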

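filesystem defines a few core concepts and types:

  • file: a filesystem object that holds data and can be read from and written to. It has a name, attributes, a status and so on, and may be a directory, a regular file, a symbolic link, etc. (This is a concept in the library's terminology, not a class.)
  • path
    • A path object converts implicitly to its native string type (std::string on POSIX, std::wstring on Windows), and since C++17 the file stream constructors also accept a path directly
    • A path can be initialized from string types such as std::string, const char* and string_view
    • It provides begin() and end() member functions, so it can be iterated like a container, visiting each component of the path
    • It handles the differences in path representation between operating systems, giving portable path manipulation
  • directory_entry: represents one entry in a directory (a path plus cached status information); this is what the directory iterators yield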
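
The path features listed above can be sketched roughly as follows. The helper names components and join are made up for this illustration; the calls they use (iteration over a path, operator/, generic_string) are standard std::filesystem API.

```cpp
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: split a path into its components by iterating
// over it like a container (uses path::begin() / path::end()).
std::vector<std::string> components(const fs::path& p) {
    std::vector<std::string> parts;
    for (const auto& part : p) {
        parts.push_back(part.generic_string());
    }
    return parts;
}

// Hypothetical helper: build a path from string pieces.
// operator/ inserts the platform's preferred separator where needed.
fs::path join(const std::string& dir, const std::string& name) {
    return fs::path(dir) / name;
}
```

Calling components("a/b/c.txt") yields the three parts a, b and c.txt, and join("a", "b.txt").generic_string() gives a/b.txt regardless of the platform's native separator.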

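For error handling, filesystem supports two styles: most functions have one overload that throws an exception (std::filesystem::filesystem_error) and another that reports failure through a std::error_code parameter.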
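
As a small sketch of the two styles side by side, here is the same query written both ways using std::filesystem::file_size. The function names size_throwing and size_with_error_code are invented for this example, and returning std::optional is just one possible way to surface the failure.

```cpp
#include <cstdint>
#include <filesystem>
#include <optional>
#include <system_error>

namespace fs = std::filesystem;

// Throwing style: failures arrive as std::filesystem::filesystem_error.
std::optional<std::uintmax_t> size_throwing(const fs::path& p) {
    try {
        return fs::file_size(p);
    } catch (const fs::filesystem_error&) {
        // The exception's code() and path1() describe what failed and where.
        return std::nullopt;
    }
}

// Error-code style: nothing is thrown for filesystem errors,
// the caller inspects ec after the call.
std::optional<std::uintmax_t> size_with_error_code(const fs::path& p) {
    std::error_code ec;
    const std::uintmax_t n = fs::file_size(p, ec);
    if (ec) {
        return std::nullopt;  // ec.message() gives a readable description
    }
    return n;
}
```

The error-code form tends to fit code paths where a missing file is an expected outcome, while the throwing form keeps the happy path shorter when failure is truly exceptional.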

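Working with directories

Creating: create_directory

create_directory creates a single directory level, so the parent path must already exist. If the directory already exists, this is not an error; the function simply returns false.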

// Throws on error
bool create_directory(const path& __p);
// Reports errors through the error code
bool create_directory(const path& __p, error_code& __ec) noexcept;
// Example
#include <filesystem>

{
    // example 1
    std::filesystem::path dir = "the_path_of_dir";
    if (std::filesystem::create_directory(dir)) {
        // do something
    } else {
        // do some other thing
    }
}

{
    // example 2
    std::filesystem::path dir = "the_path_of_dir";
    std::error_code ec{};
    if (std::filesystem::create_directory(dir, ec)) {
        // do something
    } else {
        // do some other thing, `ec.message()` returns a string
    }
}

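A further pair of overloads also copies the attributes (such as permissions) of an existing directory to the new one; existing_p must be an existing directory.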

bool create_directory(const std::filesystem::path& p,
                      const std::filesystem::path& existing_p );

bool create_directory(const std::filesystem::path& p,
                      const std::filesystem::path& existing_p,
                      std::error_code& ec ) noexcept;

Exactly which attributes are copied depends on the operating system. On POSIX systems the behavior is analogous to:

stat(existing_p.c_str(), &attributes_stat);
mkdir(p.c_str(), attributes_stat.st_mode);
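
A minimal sketch of the existing_p overload, assuming the temporary directory is writable. The function name demo_create_like and the scratch directory name create_like_demo are hypothetical, chosen only for this example.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Creates a "model" directory, then a second directory whose attributes are
// taken from the model. Returns true if the second directory was created.
bool demo_create_like() {
    const fs::path base = fs::temp_directory_path() / "create_like_demo";
    fs::remove_all(base);          // start from a clean slate
    fs::create_directory(base);

    const fs::path model = base / "model";
    const fs::path fresh = base / "fresh";
    fs::create_directory(model);   // existing_p must already exist

    // Overload under discussion: create `fresh` using `model`'s attributes.
    const bool created = fs::create_directory(fresh, model);
    const bool ok = created && fs::is_directory(fresh);

    fs::remove_all(base);          // clean up the scratch directory
    return ok;
}
```

Note that on POSIX the resulting mode is still filtered by the process umask, so the new directory's permissions are not guaranteed to match the model's bit for bit.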

Nested directories can be created in one call with create_directories:

bool create_directories( const std::filesystem::path& p );

bool create_directories( const std::filesystem::path& p, std::error_code& ec );
// Example
#include <filesystem>
#include <exception>

{
    std::filesystem::path nested = "a/b/c";
    try {
        if (std::filesystem::create_directories(nested)) {
            // do something
        } else {
            // do some other thing
        }
    }
    catch (const std::exception& ex) {
        // do exception handling
    }
}

Removing

bool remove( const std::filesystem::path& p );

bool remove( const std::filesystem::path& p, std::error_code& ec ) noexcept;

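remove deletes a file or an empty directory. A symbolic link is removed itself; its target is left alone. It returns true if something was deleted and false if the path did not exist.

// Example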
#include <filesystem>
#include <exception>

int main() {
    std::filesystem::path dir = "test";
    try {
        if (std::filesystem::create_directory(dir)) {
            // do something
        } else {
            // do some other thing
        }

        if (std::filesystem::remove(dir)) {
            // do something
        } else {
            // do some other thing
        }
    }
    catch (const std::exception& ex) {
        // do exception handling
    }
}
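
std::uintmax_t remove_all( const std::filesystem::path& p );
// LWG 3014: in C++17 the error_code overload of remove_all was wrongly marked noexcept even though it may allocate memory, so noexcept was removed.
std::uintmax_t remove_all( const std::filesystem::path& p, std::error_code& ec );

remove_all recursively deletes the contents of the directory p (including all subdirectories), then deletes p itself. It returns the number of files and directories deleted.

If the underlying operating system API reports an error, the throwing overloads of remove and remove_all throw std::filesystem::filesystem_error; the error_code overloads set ec instead.

// Example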
#include <filesystem>
#include <exception>

int main() {
    std::filesystem::path dir = "test";
    std::filesystem::path nested = dir / "a/b";
    std::filesystem::path more = dir / "x/y";
    try {
        if (std::filesystem::create_directories(nested) &&
            std::filesystem::create_directories(more)) {
            // do something
        } else {
            // do some other thing
        }

        const auto cnt = std::filesystem::remove_all(dir);
    }
    catch (const std::exception& ex) {
        // do exception handling
    }
}

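Iterating

directory_iterator makes it easy to walk a directory. It iterates over directory_entry objects without descending into subdirectories; the iteration order is unspecified, and each entry is visited only once. The special names . and .. are skipped.

The iterator behaves much like an ordinary container iterator:

  • When iteration finishes, the iterator becomes equal to the end iterator; incrementing the end iterator is undefined behavior
  • If entries are created or deleted in the directory after the iterator was created, it is unspecified whether the iteration observes the change

// Example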
#include <filesystem>
#include <iostream>

void ls() {
    for (const auto& entry : std::filesystem::directory_iterator(".")) 
        std::cout << entry.path() << '\n';
}

The iterator above does not recurse. For that, the standard library provides recursive_directory_iterator, which also descends into subdirectories.

// Example
#include <filesystem>
#include <iostream>

void ls() {
    for (const auto& entry : std::filesystem::recursive_directory_iterator(".")) 
        std::cout << entry.path() << '\n';
}
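
When the loop needs more than the path, the iterator itself and the directory_entry it yields carry extra information. The sketch below counts files and directories and tracks the deepest level reached; TreeStats, scan, demo_scan and the scratch directory name scan_demo are all hypothetical names for this example, and passing skip_permission_denied is simply one reasonable option.

```cpp
#include <algorithm>
#include <cstddef>
#include <filesystem>
#include <fstream>

namespace fs = std::filesystem;

struct TreeStats {
    std::size_t files = 0;
    std::size_t dirs = 0;
    int max_depth = 0;  // entries directly under the root have depth 0
};

// Walk `root` recursively, using directory_entry's cached type queries
// and the iterator's depth() member.
TreeStats scan(const fs::path& root) {
    TreeStats stats;
    fs::recursive_directory_iterator it(
        root, fs::directory_options::skip_permission_denied);
    for (; it != fs::recursive_directory_iterator(); ++it) {
        if (it->is_directory()) {
            ++stats.dirs;
        } else if (it->is_regular_file()) {
            ++stats.files;
        }
        stats.max_depth = std::max(stats.max_depth, it.depth());
    }
    return stats;
}

// Builds a tiny tree in the temp directory, scans it, and cleans up.
bool demo_scan() {
    const fs::path base = fs::temp_directory_path() / "scan_demo";
    fs::remove_all(base);
    fs::create_directories(base / "a" / "b");
    std::ofstream(base / "top.txt") << "x";
    std::ofstream(base / "a" / "b" / "deep.txt") << "y";

    const TreeStats s = scan(base);
    fs::remove_all(base);
    return s.files == 2 && s.dirs == 2 && s.max_depth == 2;
}
```

A plain range-based for loop is shorter, but it hides the iterator, so depth() is only reachable with the explicit loop form used here.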

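Temporary directory

filesystem also provides a function that returns a directory suitable for temporary files. On POSIX systems the path may be set through the environment variables TMPDIR, TMP, TEMP or TEMPDIR; if none is set, /tmp is returned. On Windows the path is typically the one returned by GetTempPath.

path temp_directory_path();
path temp_directory_path( std::error_code& ec );
// Example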
#include <filesystem>
#include <iostream>
namespace fs = std::filesystem;
 
int main()
{
    std::cout << "Temp directory is " << fs::temp_directory_path() << '\n';
}

// Possible Output: Temp directory is "C:\Windows\TEMP\"

Working with files

Copying

For copying, filesystem provides three main functions: copy, copy_file and copy_symlink.

void copy(const std::filesystem::path& from,
          const std::filesystem::path& to );

void copy(const std::filesystem::path& from,
          const std::filesystem::path& to,
          std::filesystem::copy_options options,
          std::error_code& ec );
// Example
#include <filesystem>

int main() {
    std::filesystem::path src = "source_file.txt";
    std::filesystem::path dest = "destination_file.txt";
    try {
        std::filesystem::copy(src, dest);
    } catch (std::filesystem::filesystem_error& e) {
        // do exception handling
    }
}
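
copy_file is the narrower tool when the source is known to be a single file. The sketch below shows how its behavior changes with copy_options when the destination already exists; demo_copy_file and the scratch directory name copy_file_demo are hypothetical, and the temp directory is assumed to be writable.

```cpp
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

bool demo_copy_file() {
    const fs::path base = fs::temp_directory_path() / "copy_file_demo";
    fs::remove_all(base);
    fs::create_directory(base);
    const fs::path src = base / "src.txt";
    const fs::path dst = base / "dst.txt";
    std::ofstream(src) << "v1";

    std::error_code ec;
    // Destination does not exist yet: the file is copied.
    const bool first = fs::copy_file(src, dst, ec);

    // Destination exists and no option is given: reported as an error.
    const bool second = fs::copy_file(src, dst, ec);
    const bool refused = !second && static_cast<bool>(ec);

    // With overwrite_existing the destination is replaced.
    std::ofstream(src) << "version 2";
    const bool third =
        fs::copy_file(src, dst, fs::copy_options::overwrite_existing, ec);
    const bool replaced = third && fs::file_size(dst) == 9;

    fs::remove_all(base);
    return first && refused && replaced;
}
```

copy_options::skip_existing and copy_options::update_existing are the other two choices for an existing destination: the first leaves it alone without reporting an error, the second replaces it only if the source is newer.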

To copy a directory recursively, pass copy_options to customize how the copy is performed:

#include <filesystem>
#include <fstream>

void create_temp_directories_and_files() {
    std::filesystem::create_directories("source_directory/subdir1");
    std::filesystem::create_directories("source_directory/subdir2");

    std::ofstream("source_directory/file1.txt") << "This is file 1";
    std::ofstream("source_directory/subdir1/file2.txt") << "This is file 2";
    std::ofstream("source_directory/subdir2/file3.txt") << "This is file 3";
}

int main() {
    create_temp_directories_and_files();

    std::filesystem::path src = "source_directory";
    std::filesystem::path dest = "destination_directory";
    try {
        std::filesystem::copy(src, dest, std::filesystem::copy_options::recursive);
        // do something
    } catch (std::filesystem::filesystem_error& e) {
        // do exception handling
    }

    for (const auto& entry : std::filesystem::recursive_directory_iterator(dest)) {
        // do something with entry
    }
}

Moving and renaming files

std::filesystem::rename renames a file or directory, which also covers moving it to a new location.

// Example
#include <filesystem>
#include <fstream>

{
    std::ofstream("old_file.txt") << "This is file 1";
    std::filesystem::path old_name = "old_file.txt";
    std::filesystem::path new_name = "new_file.txt";
    try {
        std::filesystem::rename(old_name, new_name);
        // do something
    } catch (std::filesystem::filesystem_error& e) {
        // do exception handling
    }
}
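
Since rename takes full paths, moving a file into another directory is the same call with a different parent. A minimal sketch, where demo_move and the scratch names rename_demo, archive and report.txt are made up for illustration:

```cpp
#include <filesystem>
#include <fstream>

namespace fs = std::filesystem;

// Moves a file into a subdirectory with rename and checks the result.
bool demo_move() {
    const fs::path base = fs::temp_directory_path() / "rename_demo";
    fs::remove_all(base);
    fs::create_directories(base / "archive");

    const fs::path from = base / "report.txt";
    const fs::path to = base / "archive" / "report.txt";
    std::ofstream(from) << "data";

    fs::rename(from, to);  // throws filesystem_error on failure

    const bool ok = !fs::exists(from) && fs::exists(to);
    fs::remove_all(base);
    return ok;
}
```

rename generally cannot move a file across filesystems; in that case a copy followed by a remove is the usual fallback.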

Creating links

create_hard_link: creates a hard link

create_symlink, create_directory_symlink: create symbolic links

// Example
#include <filesystem>
#include <fstream>

{
    std::ofstream("target_file.txt") << "This is file 1";
    std::filesystem::path target = "target_file.txt";
    std::filesystem::path link = "hard_link_file.txt";
    try {
        std::filesystem::create_hard_link(target, link);
        // do something
    } catch (std::filesystem::filesystem_error& e) {
        // do exception handling
    }
}

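The following example makes the difference between hard links and symbolic links easier to see:

// Example
#include <filesystem>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>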

void display_file_content(const std::filesystem::path& path) {
    if (std::filesystem::exists(path)) {
        std::ifstream file(path);
        std::string content((std::istreambuf_iterator<char>(file)), std::istreambuf_iterator<char>());
        std::cout << "Content of " << path << ": " << content << '\n';
    } else {
        std::cout << path << " does not exist.\n";
    }
}

int main() {
    std::filesystem::path original_file = "original_file.txt";
    std::filesystem::path symlink = "symlink_to_file.txt";
    std::filesystem::path hardlink = "hardlink_to_file.txt";

    // Step 1: Create the original file
    std::ofstream(original_file) << "Hello World!";
    std::cout << "Original file created.\n";
    display_file_content(original_file);

    // Step 2: Create a symbolic link to the original file
    try {
        std::filesystem::create_symlink(original_file, symlink);
        std::cout << "Symbolic link created successfully.\n";
        display_file_content(symlink);
    } catch (std::filesystem::filesystem_error& e) {
        std::cout << e.what() << '\n';
    }

    // Step 3: Create a hard link to the original file
    try {
        std::filesystem::create_hard_link(original_file, hardlink);
        std::cout << "Hard link created successfully.\n";
        display_file_content(hardlink);
    } catch (std::filesystem::filesystem_error& e) {
        std::cout << e.what() << '\n';
    }

    // Step 4: Delete the original file and compare: the symlink now dangles,
    // while the hard link still reaches the data
    std::filesystem::remove(original_file);
    std::cout << "Original file deleted.\n";
    display_file_content(symlink);
    display_file_content(hardlink);
}

Working with paths

Checking existence

Use std::filesystem::exists to check whether a file or directory exists.

// Example
#include <filesystem>

{
    std::filesystem::path p = "example_file.txt";
    if (std::filesystem::exists(p)) {
        // do something
    } else {
        // do some other thing
    }
}

Checking the file type

std::filesystem::is_regular_file identifies regular files, and std::filesystem::is_directory identifies directories.

// Example
#include <filesystem>

{
    std::filesystem::path p = "example_path";
    if (std::filesystem::is_regular_file(p)) {
        // do something
    } else if (std::filesystem::is_directory(p)) {
        // do something
    } else {
        // do some other thing
    }
}

Besides these two, filesystem provides other predicates that test whether a path is a symbolic link, a socket and so on; they are all used the same way.
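
A rough sketch that combines several of these predicates into one classifier. The function name classify and the returned labels are invented for this example. It queries symlink_status once, so a symbolic link is reported as a link rather than as whatever it points to.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: describe what kind of filesystem object p names.
std::string classify(const fs::path& p) {
    std::error_code ec;
    // symlink_status does not follow a symlink in the final path component.
    const fs::file_status st = fs::symlink_status(p, ec);

    if (st.type() == fs::file_type::not_found) return "not found";
    if (ec) return "error";
    if (fs::is_symlink(st)) return "symlink";
    if (fs::is_directory(st)) return "directory";
    if (fs::is_regular_file(st)) return "regular file";
    if (fs::is_socket(st)) return "socket";
    if (fs::is_fifo(st)) return "fifo";
    if (fs::is_block_file(st) || fs::is_character_file(st)) return "device";
    return "other";
}
```

Each predicate also has an overload taking a path directly; passing a file_status as done here avoids querying the filesystem once per check.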

Reading the target of a symbolic link

std::filesystem::read_symlink

std::filesystem::path read_symlink( const std::filesystem::path& p );
// Example
#include <filesystem>
#include <iostream>
#include <fstream>

int main() {
    std::filesystem::path original_file = "original_file.txt";
    std::filesystem::path symlink = "symlink_to_file.txt";

    std::ofstream(original_file) << "Hello World!";

    try {
        std::filesystem::create_symlink(original_file, symlink);
        
    } catch (std::filesystem::filesystem_error& e) {
        // do exception handling
    }

    if (std::filesystem::is_symlink(symlink)) {
        std::cout << symlink << " is a symbolic link.\n";
        std::filesystem::path target = std::filesystem::read_symlink(symlink);
        std::cout << "It points to: " << target << '\n';
    } else {
        // do some other thing
    }
}

Getting absolute and relative paths

std::filesystem::absolute

std::filesystem::relative

// Example
#include <filesystem>
#include <iostream>

int main() {
    {
        // Example relative paths
        std::filesystem::path relative_path1 = "example_directory";
        std::filesystem::path relative_path2 = "../parent_directory";
    
        // Convert to absolute paths
        std::filesystem::path absolute_path1 = std::filesystem::absolute(relative_path1);
        std::filesystem::path absolute_path2 = std::filesystem::absolute(relative_path2);

        // Display the absolute paths
        std::cout << "Relative path: " << relative_path1 << " -> Absolute path: " << absolute_path1 << '\n';
        std::cout << "Relative path: " << relative_path2 << " -> Absolute path: " << absolute_path2 << '\n';
    }

    {
        std::filesystem::path base_path = "/home/user";
        std::filesystem::path absolute_path = "/home/user/example_directory/file.txt";
        std::filesystem::path relative_path = std::filesystem::relative(absolute_path, base_path);
        std::cout << "Relative path: " << relative_path << '\n';
    }
}

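Removing the quotes when printing a path

The operator<< and operator>> overloads for std::filesystem::path automatically use std::quoted to protect the path, so that spaces in the path do not truncate the string when it is read back.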

Performs stream input or output on the path p. std::quoted is used so that spaces do not cause truncation when later read by stream input operator.
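
The round trip can be sketched with string streams. The helper names streamed_text and read_back are hypothetical; the behavior shown comes from the standard stream operators of path.

```cpp
#include <filesystem>
#include <sstream>
#include <string>

namespace fs = std::filesystem;

// Returns exactly what operator<< writes for p (quotes included).
std::string streamed_text(const fs::path& p) {
    std::ostringstream out;
    out << p;
    return out.str();
}

// Parses a path from text with operator>>.
fs::path read_back(const std::string& text) {
    std::istringstream in(text);
    fs::path p;
    in >> p;
    return p;
}
```

Streaming the path my dir/file.txt produces the text "my dir/file.txt" with the quotes, and read_back recovers the full path from it. Feeding read_back the unquoted text instead stops at the space and yields only my, which is the truncation the quoting is there to prevent.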

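To print the path without quotes, convert it to a string first. std::filesystem::path::c_str returns the native representation, which works with std::cout on POSIX; on Windows the native character type is wchar_t, so path::string() is the portable choice.

// Example
for (const auto& entry : std::filesystem::directory_iterator("."))
    std::cout << entry.path().string() << '\n';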

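Other operations

Querying sizes

std::filesystem::file_size returns the size of a regular file (symbolic links are followed); on POSIX systems it effectively reads the st_size field of the stat structure. Calling it on a directory gives an implementation-defined result.

std::filesystem::space reports the space on the filesystem that contains a path, similar to statvfs on POSIX. It returns a filesystem::space_info object holding the capacity, free space and available space.

// Example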
#include <filesystem>
#include <iostream>
#include <fstream>

// Function to create a test directory with some files and subdirectories
void create_test_directory(const std::filesystem::path& dir) {
    std::filesystem::create_directories(dir / "subdir1");
    std::filesystem::create_directories(dir / "subdir2");

    std::ofstream(dir / "file1.txt") << "ABC";
    std::ofstream(dir / "subdir1/file2.txt") << "XYZ";
    std::ofstream(dir / "subdir2/file3.txt") << "123";
}

// Function to calculate the total size of a directory
std::uintmax_t calculate_directory_size(const std::filesystem::path& dir) {
    std::uintmax_t size = 0;
    for (const auto& entry : std::filesystem::recursive_directory_iterator(dir)) {
        if (std::filesystem::is_regular_file(entry.path())) {
            size += std::filesystem::file_size(entry.path());
        }
    }
    return size;
}

int main() {
    {
        // Create a test directory with some files and subdirectories
        std::filesystem::path test_dir = "test_directory";
        create_test_directory(test_dir);
        std::cout << "Test directory created.\n";

        // Calculate the total size of the test directory
        std::uintmax_t total_size = calculate_directory_size(test_dir);
        std::cout << "Total size of directory " << test_dir << ": " << total_size << " bytes\n";

        // Clean up by removing the test directory and its contents
        std::filesystem::remove_all(test_dir);
        std::cout << "Test directory removed.\n";
    }

    {
        std::filesystem::path p = "/";
        auto space_info = std::filesystem::space(p);
        std::cout << "Free space: " << space_info.free << " bytes\n";
        std::cout << "Available space: " << space_info.available << " bytes\n";
        std::cout << "Capacity: " << space_info.capacity << " bytes\n";
    }
}

Permissions

std::filesystem::perms defines the file permission bits

std::filesystem::file_status::permissions gets a file's permissions

std::filesystem::permissions sets a file's permissions

// Example
#include <filesystem>
#include <fstream>
#include <iostream>
 
void demo_perms(std::filesystem::perms p)
{
    using std::filesystem::perms;
    auto show = [=](char op, perms perm)
    {
        std::cout << (perms::none == (perm & p) ? '-' : op);
    };
    show('r', perms::owner_read);
    show('w', perms::owner_write);
    show('x', perms::owner_exec);
    show('r', perms::group_read);
    show('w', perms::group_write);
    show('x', perms::group_exec);
    show('r', perms::others_read);
    show('w', perms::others_write);
    show('x', perms::others_exec);
    std::cout << '\n';
}
 
int main()
{
    std::ofstream("test.txt"); // create file
 
    std::cout << "Created file with permissions: ";
    demo_perms(std::filesystem::status("test.txt").permissions());
 
    std::filesystem::permissions(
        "test.txt",
        std::filesystem::perms::owner_all | std::filesystem::perms::group_all,
        std::filesystem::perm_options::add
    );
 
    std::cout << "After adding u+rwx and g+rwx:  ";
    demo_perms(std::filesystem::status("test.txt").permissions());
 
    std::filesystem::remove("test.txt");
}
