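1 File management under Linux

1.1 Static files
1.2 inode

2 Error handling under Linux

2.1 errno
2.2 The strerror() function
2.3 The perror() function

3 Program termination and exit under Linux

3.1 exit(), _exit(), _Exit()

4 Hole files

5 Two useful flags of open()

5.1 The O_APPEND flag
5.2 The O_TRUNC flag

6 Opening the same file multiple times

6.1 When a file is opened multiple times, only one dynamic file is kept in memory
6.2 Multiple processes opening the same file with O_APPEND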
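1 File management under Linux

1.1 Static files

Static file: a file stored in a fixed form on a storage device, for example a computer's hard disk or a USB flash drive.

The smallest storage unit of a hard disk is the sector, traditionally 512 bytes in size. Several sectors make up a block; the most common block size is 4 KB, that is, 8 sectors per block. The operating system generally does not read one sector at a time; it reads a whole block in one go.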

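1.2 inode

inode: index node. An inode records a file's metadata, including timestamps, owner and group, permissions, size, and the location of its data blocks. Each file has exactly one inode within its file system, and each inode is identified by a number.

When a disk is partitioned and formatted, it is divided into two areas: a data area and an inode area that holds the inode table. Because inodes record this file information, the inode table itself takes up disk space.

As shown in Figure 1.2.1 and Figure 1.2.2, the ls -il command displays a file's inode number, and so does the stat command.

Figure 1.2.1 Querying the inode with the ls command

Figure 1.2.2 Querying the inode with the stat command

How a file is opened: the system finds the inode number that corresponds to the file, uses that number to locate the inode structure in the inode table, reads the file information recorded there, determines which blocks hold the file's data, and reads the data.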

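2 Error handling under Linux

2.1 errno

errno: an int-typed variable that Linux uses to store an error number. When a system call or library function fails, it sets errno. To read and use errno in a program, include the <errno.h> header.

errno is only an error number and does not describe the cause of the error. For example, the following program prints the number: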

#include <stdio.h>
#include <errno.h>

int main(void)
{
    /* No call has failed yet, so errno still holds its startup value of 0 */
    printf("errno:%d\n", errno);
    return 0;
}

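2.2 The strerror() function

Because errno is only an int error number, it is not friendly for development and maintenance. The C library function strerror() translates it into a description of the error. strerror() is a library function, not a system call, and is declared in <string.h>.

strerror() takes an int error number such as errno and returns a string, which can be printed to show the specific cause of the error. For example: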

#include <stdio.h>
#include <errno.h>
#include <string.h>

int main(void)
{
    /* Prints the description that corresponds to the current errno value */
    printf("Error:%s\n", strerror(errno));
    return 0;
}

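2.3 The perror() function

Besides strerror(), the function used most often to report an error is perror(), declared in <stdio.h>. It does not take the error number as an argument: it reads the current value of errno itself and prints the corresponding error description to standard error. It also accepts a string of your own, which is printed in front of the error description.

For example: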

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    int fd;

    fd = open("./home/abc.c", O_RDONLY);
    if (fd == -1)
    {
        /* Prints "open error: " followed by the description of errno */
        perror("open error");
        return -1;
    }

    close(fd);
    return 0;
}

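3 Program termination and exit under Linux

3.1 exit(), _exit(), _Exit()

When a program hits an error and cannot continue, it usually returns from main(): return 0 signals normal termination and hands control back to the caller, while return -1 signals that the program failed. Besides return, a program under Linux can also terminate by calling exit(), _exit(), or _Exit().

_exit() and _Exit() are equivalent and terminate the process immediately; _exit() is declared in <unistd.h> and _Exit() in <stdlib.h>. exit() is a standard C library function declared in <stdlib.h>. It wraps _exit(), and it is called in the same way as _exit() and _Exit().

When exit() runs, it checks the streams the process has open and writes any data still sitting in their buffers to the files before terminating. _exit() terminates right away and does not write buffered data to the files.

These functions take a status argument: 0 means the program exited normally, and any other value means an error occurred during execution. After the call, the memory used by the process is released, its data structures in the kernel are destroyed, all of its file descriptors are closed, the process ends, and control returns to the operating system.

For example: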

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int fd;

    fd = open("./home/abc.c", O_RDONLY);
    if (fd == -1)
    {
        perror("open error");
        _exit(-1);
        //exit(-1);
        //_Exit(-1);
    }

    close(fd);
    _exit(0);
    //exit(0);
    //_Exit(0);
}

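4 Hole files

What is a hole file?

Suppose a file is currently 4 KB in size. If lseek() moves the file offset to 6 KB from the start of the file and data is then written there, the range from 4 KB to 6 KB becomes a hole.

What are holes good for?

Holes are useful when several threads write the same file. A very large file can be created up front and divided into many segments, and each thread then fills in its own region, which can be much faster than a single thread writing the whole file.

For example: create a file, move the file offset to 2 KB, and then write 4 KB of data. The file size is 6 KB, consisting of a 2 KB hole followed by 4 KB of data.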

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void)
{
    int fd, ret;
    unsigned char buf[4096];

    /* Create the file; O_EXCL makes open() fail if it already exists */
    fd = open("./home/abc.c", O_WRONLY | O_CREAT | O_EXCL,
                              S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
    if (fd == -1)
    {
        perror("open error");
        exit(-1);
    }

    /* Move the file offset 2 KB past the start, leaving a hole behind it */
    ret = lseek(fd, 2048, SEEK_SET);
    if (ret == -1)
    {
        perror("lseek error");
        exit(-1);
    }

    memset(buf, 0xff, sizeof(buf)); // fill buf with 0xff

    /* Write 4 KB of data after the hole */
    ret = write(fd, buf, sizeof(buf));
    if (ret == -1)
    {
        perror("write error");
        exit(-1);
    }

    close(fd);
    return 0;
}

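5 Two useful flags of open()

5.1 The O_APPEND flag

When a file is opened with open() and the O_APPEND flag, every write() automatically moves the file offset to the end of the file before writing the data.

Example: open a file that already contains data with the O_APPEND flag and write 4 bytes of 0xff with write() right away. Then move the file offset to 4 bytes before the end of the file, read 4 bytes, and print them. The result should be the 4 bytes of 0xff that were just written.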

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    int fd, ret;
    unsigned char buf[16];

    fd = open("./home/abc.c", O_RDWR | O_APPEND);
    if (fd == -1)
    {
        perror("open error");
        exit(-1);
    }

    memset(buf, 0xff, sizeof(buf)); // fill buf with 0xff

    /* With O_APPEND the data goes to the end of the file */
    ret = write(fd, buf, 4);
    if (ret == -1)
    {
        perror("write error");
        exit(-1);
    }

    memset(buf, 0x00, sizeof(buf)); // clear buf before reading back

    /* Move to 4 bytes before the end of the file */
    ret = lseek(fd, -4, SEEK_END);
    if (ret == -1)
    {
        perror("lseek error");
        exit(-1);
    }

    ret = read(fd, buf, 4);
    if (ret == -1)
    {
        perror("read error");
        exit(-1);
    }

    for (int i = 0; i < 4; i++)
    {
        printf("0x%x ", buf[i]);
    }
    printf("\n");

    close(fd);
    return 0;
}

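5.2 The O_TRUNC flag

This flag makes open() discard the file's contents when the file is opened, so the file size becomes 0. The file should be opened for writing (O_WRONLY or O_RDWR) for the truncation to take place. For example: open a file whose size is not 0 with the O_TRUNC flag, then check the file size with the ls -l command.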

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* O_TRUNC discards the existing contents as the file is opened */
    int fd = open("./home/abc.c", O_RDWR | O_TRUNC);
    if (fd == -1)
    {
        perror("open error");
        exit(-1);
    }
    close(fd);
    return 0;
}

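6 Opening the same file multiple times

6.1 When a file is opened multiple times, only one dynamic file is kept in memory

Opening the same file several times within one process yields a different file descriptor each time, and each of those descriptors must be closed in turn when the file is no longer needed.

For example: open the same file several times and print the file descriptors; they turn out to be different. When the file is opened with different access modes, each file descriptor carries the access mode it was opened with.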

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int fd1, fd2, fd3;

    /* Same file, three different access modes */
    fd1 = open("./home/hello.c", O_RDONLY);
    if (fd1 == -1)
    {
        perror("fd1 open error");
        exit(-1);
    }

    fd2 = open("./home/hello.c", O_WRONLY);
    if (fd2 == -1)
    {
        perror("fd2 open error");
        exit(-1);
    }

    fd3 = open("./home/hello.c", O_RDWR);
    if (fd3 == -1)
    {
        perror("fd3 open error");
        exit(-1);
    }

    printf("fd1=%d,fd2=%d,fd3=%d\n", fd1, fd2, fd3);

    close(fd1);
    close(fd2);
    close(fd3);
    return 0;
}

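When one process opens the same file several times, only one dynamic file (the in-memory representation of the file) exists in memory.

For example: create and open a file, giving descriptor fd1, then open it again, giving descriptor fd2. Write 4 bytes of 0x88 from the start of the file through fd1, then read from the start of the file through fd2; the 4 bytes read are 0x88.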

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    int fd1, fd2;
    int ret;
    unsigned char buf[4];

    memset(buf, 0x88, sizeof(buf));

    /* Create the file; O_EXCL makes open() fail if it already exists */
    fd1 = open("./home/a.c", O_RDWR | O_CREAT | O_EXCL,
               S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
    if (fd1 == -1)
    {
        perror("fd1 open error");
        exit(-1);
    }

    fd2 = open("./home/a.c", O_RDWR);
    if (fd2 == -1)
    {
        perror("fd2 open error");
        exit(-1);
    }

    /* Write through fd1 ... */
    ret = write(fd1, buf, 4);
    if (ret == -1)
    {
        perror("fd1 write error");
        exit(-1);
    }

    ret = lseek(fd2, 0, SEEK_SET);
    if (ret == -1)
    {
        perror("fd2 lseek error");
        exit(-1);
    }

    /* ... and read the same data back through fd2 */
    memset(buf, 0x00, sizeof(buf));
    ret = read(fd2, buf, 4);
    if (ret == -1)
    {
        perror("fd2 read error");
        exit(-1);
    }

    printf("0x%x,0x%x,0x%x,0x%x\n", buf[0], buf[1], buf[2], buf[3]);

    close(fd1);
    close(fd2);
    return 0;
}

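When the same file is opened by several different processes, each open() refers to the same file, and memory still holds only one dynamic file, which the processes share; each process has its own independent file offset. Every open of the file is counted: the file table records a reference count, and when that count drops to 0 the dynamic file is closed.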

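6.2 Multiple processes opening the same file with O_APPEND

With the O_APPEND flag, write() automatically moves the file offset to the end of the file, so when several processes open the same file with O_APPEND, every write lands at the current end of the file. For example, create and open a file with the O_APPEND flag, giving descriptor fd1, then open the file again with the O_APPEND flag, giving descriptor fd2. Starting from the empty file, fd1 and fd2 take turns writing one byte each, four times each. Finally, read 8 bytes from the start of the file and check whether the bytes alternate.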

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    unsigned char buf1[4], buf2[4], buf3[8];
    int fd1, fd2;
    int ret;
    int i;

    memset(buf1, 0x55, sizeof(buf1));
    memset(buf2, 0xAA, sizeof(buf2));
    memset(buf3, 0x00, sizeof(buf3));

    /* Both descriptors are opened with O_APPEND */
    fd1 = open("./home/a.c", O_RDWR | O_CREAT | O_EXCL | O_APPEND,
               S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
    if (fd1 == -1)
    {
        perror("fd1 open error");
        exit(-1);
    }

    fd2 = open("./home/a.c", O_RDWR | O_APPEND);
    if (fd2 == -1)
    {
        perror("fd2 open error");
        exit(-1);
    }

    /* fd1 and fd2 take turns writing one byte each */
    for (i = 0; i < 4; i++)
    {
        ret = write(fd1, &buf1[i], 1);
        if (ret == -1)
        {
            perror("fd1 write error");
            exit(-1);
        }

        ret = write(fd2, &buf2[i], 1);
        if (ret == -1)
        {
            perror("fd2 write error");
            exit(-1);
        }
    }

    /* Go back to the start of the file and read all 8 bytes */
    ret = lseek(fd1, 0, SEEK_SET);
    if (ret == -1)
    {
        perror("fd1 lseek error");
        exit(-1);
    }

    ret = read(fd1, buf3, 8);
    if (ret == -1)
    {
        perror("read error");
        exit(-1);
    }

    for (i = 0; i < 8; i++)
    {
        printf("0x%x\n", buf3[i]);
    }

    close(fd1);
    close(fd2);

    return 0;
}

A Deeper Look at Files under Linux (Part 1): Study Notes 2