HDP Study Notes: HDFS Storage (Part 2)

Continued from HDP Study Notes: HDFS Storage (Part 1)

8. HDFS Trash

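Trash works like a recycle bin: deleted files and directories are temporarily moved to the deleting user's .Trash/Current directory (/user/<username>/.Trash/Current) instead of being removed immediately, so a file deleted by mistake or by another user can be restored easily. Deleted files keep their original path under the Trash directory.
For example:
Deleting /user/steve/dir1/fileA
recreates it in the Trash directory as: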

/user/steve/.Trash/Current/user/steve/dir1/fileA
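
The path mapping in this example can be sketched in a few lines of Python. This is only an illustration of the naming rule, not HDFS code: the function name `trash_path` is hypothetical, and it assumes the trash root is `/user/<username>/.Trash` for the user who performs the delete.

```python
import posixpath


def trash_path(user: str, deleted_path: str) -> str:
    """Return where a deleted absolute HDFS path would appear in the user's trash.

    Assumption: the trash root is /user/<user>/.Trash and the original
    absolute path is recreated underneath its Current directory.
    """
    if not posixpath.isabs(deleted_path):
        raise ValueError("deleted_path must be an absolute HDFS path")
    trash_current = posixpath.join("/user", user, ".Trash", "Current")
    # Appending the normalized absolute path keeps the original directory layout.
    return trash_current + posixpath.normpath(deleted_path)


print(trash_path("steve", "/user/steve/dir1/fileA"))
```

Restoring a file then amounts to moving it from the returned location back to its original path.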

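There are limitations, however:
Files deleted with the HDFS Shell or the Ambari Files View are protected by Trash;
files deleted through the Java API, WebHDFS, the HDFS NFS Gateway, or HUE are not protected.
Trash behavior is controlled by two properties: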

Set in core-default.xml
property:
    fs.trash.checkpoint.interval
    Determines how often the NameNode should checkpoint the .Trash directory.
    0 means use the value set in fs.trash.interval

Set in core-site.xml
property:
    fs.trash.interval
    Determines how often checkpoints in the .Trash directory should be removed. 
    A value of 0 disables trash. 
    The HDP default value is 360 minutes. 

The HDFS Shell -rm command accepts one relevant option:

   -skipTrash  equivalent to a permanent delete in Windows; the file is not moved to the trash
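
As a rough illustration of how the option fits into a delete command, the sketch below assembles the argument list for `hdfs dfs -rm` in Python. The helper `build_rm_command` is hypothetical and only builds the list; it does not run anything. Passing the result to `subprocess.run` would require an `hdfs` client on the PATH.

```python
from typing import List


def build_rm_command(path: str, recursive: bool = False, skip_trash: bool = False) -> List[str]:
    """Build the argument list for an HDFS Shell delete (illustrative helper)."""
    cmd = ["hdfs", "dfs", "-rm"]
    if recursive:
        cmd.append("-r")          # delete a directory and its contents
    if skip_trash:
        cmd.append("-skipTrash")  # bypass .Trash: the delete is permanent
    cmd.append(path)
    return cmd


# Default: the file is moved to the user's .Trash/Current directory.
print(" ".join(build_rm_command("/user/steve/dir1/fileA")))
# Permanent delete: nothing is kept in the trash.
print(" ".join(build_rm_command("/user/steve/dir1/fileA", skip_trash=True)))
```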

9. HDFS Trash Operation

The figure below shows the Trash workflow:
[Figure: Trash workflow]
Explanation:
The fs.trash.checkpoint.interval determines the number of minutes between trash checkpoints. If zero, the value is set to the value of fs.trash.interval. Zero is the HDP default. The number for fs.trash.checkpoint.interval should be smaller than or equal to fs.trash.interval.
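
The fallback rule can be written as a small Python function. This is a sketch with a hypothetical name, `effective_checkpoint_interval`. Treating a checkpoint interval larger than fs.trash.interval as if it were equal to fs.trash.interval is an assumption made here; the text only states that the value should be smaller or equal.

```python
def effective_checkpoint_interval(trash_interval: int, checkpoint_interval: int) -> int:
    """Return the checkpoint interval (minutes) that would apply, or 0 if trash is off."""
    if trash_interval <= 0:
        # fs.trash.interval = 0 disables trash, so no checkpoints are taken.
        return 0
    if checkpoint_interval <= 0:
        # fs.trash.checkpoint.interval = 0 means "use fs.trash.interval".
        return trash_interval
    # Assumption: an oversized value is clamped to fs.trash.interval.
    return min(checkpoint_interval, trash_interval)


print(effective_checkpoint_interval(360, 0))   # HDP defaults
print(effective_checkpoint_interval(360, 60))
```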

Every time the checkpointer runs, it renames the .Trash/Current directory to a new numeric name. For example, .Trash/Current could be renamed to .Trash/150518175000. When new files or directories are deleted, HDFS creates a new .Trash/Current directory to hold them.
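
The renaming step can be illustrated as follows. The sketch assumes the numeric name is a two-digit-year timestamp (year, month, day, hour, minute, second), which is inferred from the example name 150518175000; the function names are hypothetical.

```python
from datetime import datetime

# Assumed layout of the numeric checkpoint name: yyMMddHHmmss.
CHECKPOINT_FORMAT = "%y%m%d%H%M%S"


def checkpoint_name(when: datetime) -> str:
    """Return the numeric directory name for a checkpoint taken at `when`."""
    return when.strftime(CHECKPOINT_FORMAT)


def checkpoint_path(trash_root: str, when: datetime) -> str:
    """Return the path that .Trash/Current would be renamed to."""
    return f"{trash_root}/{checkpoint_name(when)}"


taken_at = datetime(2015, 5, 18, 17, 50, 0)
print(checkpoint_path("/user/steve/.Trash", taken_at))
```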

How long the older, now renamed checkpoint directory (with its deleted files and directories) is retained is determined by the fs.trash.interval property in core-site.xml. It sets the number of minutes after which the checkpoint directory is deleted. If zero, the trash feature is disabled. The HDP default is 360 minutes. Note that deletion applies to whole checkpoint directories older than fs.trash.interval, not to individual files and directories older than fs.trash.interval.
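
To make the "whole checkpoint directory" rule concrete, here is a minimal Python sketch that decides which checkpoint directories have expired. It works on directory names only and assumes they are yyMMddHHmmss timestamps, as in the example 150518175000; `expired_checkpoints` is a hypothetical helper, not part of HDFS.

```python
from datetime import datetime, timedelta
from typing import Iterable, List

CHECKPOINT_FORMAT = "%y%m%d%H%M%S"  # assumed layout of checkpoint names


def expired_checkpoints(names: Iterable[str], now: datetime, trash_interval: int) -> List[str]:
    """Return checkpoint directory names older than trash_interval minutes."""
    if trash_interval <= 0:
        return []  # trash disabled: nothing to expire
    max_age = timedelta(minutes=trash_interval)
    expired = []
    for name in names:
        if not (name.isdigit() and len(name) == 12):
            continue  # skip "Current" and anything that is not a checkpoint
        created = datetime.strptime(name, CHECKPOINT_FORMAT)
        # The age of the directory decides, not the age of the files inside it.
        if now - created > max_age:
            expired.append(name)
    return expired


now = datetime(2015, 5, 18, 23, 50, 1)
print(expired_checkpoints(["Current", "150518175000", "150518200000"], now, 360))
```

With the 360-minute default, a file deleted just before a checkpoint is taken therefore stays recoverable for roughly 360 minutes after that checkpoint, while a file deleted right after the previous checkpoint stays longer, because it waits in Current first.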

The fs.trash.interval may be configured both on the server and the client. If trash is disabled on the server side then the client side configuration is checked. If trash is enabled on the server side then the value configured on the server is used and the client configuration value is ignored.
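
The precedence between server and client settings boils down to one conditional. A minimal sketch, using the hypothetical name `effective_trash_interval` and plain integers in place of real configuration objects:

```python
def effective_trash_interval(server_minutes: int, client_minutes: int) -> int:
    """Return the fs.trash.interval (minutes) that applies to a delete.

    A positive server value wins; the client value is consulted only
    when trash is disabled (0) on the server side.
    """
    if server_minutes > 0:
        return server_minutes
    return client_minutes


print(effective_trash_interval(360, 0))    # server enabled: client value ignored
print(effective_trash_interval(0, 120))    # server disabled: client value used
```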

10. Overriding HDFS Default Properties

[Figure: Overriding HDFS default properties]

11. Changing File and Directory Ownership

[Figure: Changing file and directory ownership]

12. Changing File and Directory Permissions

[Figure: Changing file and directory permissions]