HDFS Shell Operations

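Basic Syntax

hadoop fs COMMAND OR hdfs dfs COMMAND

Both forms invoke the same file system shell, so they behave identically against HDFS.

Full Command List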

[muyi@hadoop102 ~]$ hadoop fs
Usage: hadoop fs [generic options]
	[-appendToFile <localsrc> ... <dst>]
	[-cat [-ignoreCrc] <src> ...]
	[-checksum <src> ...]
	[-chgrp [-R] GROUP PATH...]
	[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
	[-chown [-R] [OWNER][:[GROUP]] PATH...]
	[-copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] <localsrc> ... <dst>]
	[-copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
	[-count [-q] [-h] [-v] [-t [<storage type>]] [-u] [-x] [-e] <path> ...]
	[-cp [-f] [-p | -p[topax]] [-d] <src> ... <dst>]
	[-createSnapshot <snapshotDir> [<snapshotName>]]
	[-deleteSnapshot <snapshotDir> <snapshotName>]
	[-df [-h] [<path> ...]]
	[-du [-s] [-h] [-v] [-x] <path> ...]
	[-expunge]
	[-find <path> ... <expression> ...]
	[-get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
	[-getfacl [-R] <path>]
	[-getfattr [-R] {-n name | -d} [-e en] <path>]
	[-getmerge [-nl] [-skip-empty-file] <src> <localdst>]
	[-head <file>]
	[-help [cmd ...]]
	[-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [-e] [<path> ...]]
	[-mkdir [-p] <path> ...]
	[-moveFromLocal <localsrc> ... <dst>]
	[-moveToLocal <src> <localdst>]
	[-mv <src> ... <dst>]
	[-put [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
	[-renameSnapshot <snapshotDir> <oldName> <newName>]
	[-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ...]
	[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
	[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
	[-setfattr {-n name [-v value] | -x name} <path>]
	[-setrep [-R] [-w] <rep> <path> ...]
	[-stat [format] <path> ...]
	[-tail [-f] [-s <sleep interval>] <file>]
	[-test -[defsz] <path>]
	[-text [-ignoreCrc] <src> ...]
	[-touch [-a] [-m] [-t TIMESTAMP ] [-c] <path> ...]
	[-touchz <path> ...]
	[-truncate [-w] <length> <path> ...]
	[-usage [cmd ...]]

Generic options supported are:
-conf <configuration file>        specify an application configuration file
-D <property=value>               define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>  specify a ResourceManager
-files <file1,...>                specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...>               specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...>          specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]

[muyi@hadoop102 ~]$ 

Common Commands

Preparation

Start the cluster

[muyi@hadoop102 ~]$ myhadoop.sh start
 =================== Starting hadoop cluster ===================
 --------------- Starting hdfs ---------------
Starting namenodes on [hadoop102]
Starting datanodes
Starting secondary namenodes [hadoop104]
 --------------- Starting yarn ---------------
Starting resourcemanager
Starting nodemanagers
 --------------- Starting historyserver ---------------
[muyi@hadoop102 ~]$ jps
2569 DataNode
3066 JobHistoryServer
3133 Jps
2878 NodeManager
2399 NameNode
[muyi@hadoop102 ~]$ 
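
Reading the jps listing by eye works on one host but gets tedious across several. The sketch below shows one way to automate the check. It assumes the daemon names you expect on a given host are known in advance; missing_daemons is a hypothetical helper, not part of Hadoop, and the sample text is copied from the hadoop102 output above.

```shell
# missing_daemons JPS_OUTPUT NAME...
# Hypothetical helper: prints every expected daemon name that does not
# appear in the captured jps output. The body runs in a subshell so its
# variables do not leak into the caller.
missing_daemons() (
    jps_output=$1
    shift
    for name in "$@"; do
        # jps prints "PID NAME" per line, so compare the second field.
        printf '%s\n' "$jps_output" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' ||
            printf '%s\n' "$name"
    done
)

# Sample copied from the hadoop102 transcript.
sample='2569 DataNode
3066 JobHistoryServer
3133 Jps
2878 NodeManager
2399 NameNode'

# Every name listed here is present, so this prints nothing.
missing_daemons "$sample" NameNode DataNode NodeManager JobHistoryServer

# ResourceManager is not in the sample, so this prints ResourceManager.
missing_daemons "$sample" NameNode ResourceManager
```

On a live node the first argument would be "$(jps)" instead of the sample text.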

-help: print the usage and options of a command

[muyi@hadoop102 ~]$ hadoop fs -help rm
-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ... :
  Delete all files that match the specified file pattern. Equivalent to the Unix
  command "rm <src>"
                                                                                 
  -f          If the file does not exist, do not display a diagnostic message or 
              modify the exit status to reflect an error.                        
  -[rR]       Recursively deletes directories.                                   
  -skipTrash  option bypasses trash, if enabled, and immediately deletes <src>.  
  -safely     option requires safety confirmation, if enabled, requires          
              confirmation before deleting large directory with more than        
              <hadoop.shell.delete.limit.num.files> files. Delay is expected when
              walking over large directory recursively to count the number of    
              files to be deleted before the confirmation. 

Create the /sanguo directory

[muyi@hadoop102 ~]$ hadoop fs -mkdir /sanguo

Upload

-moveFromLocal: move (cut and paste) a file from the local filesystem to HDFS

[muyi@hadoop102 HDFSDemo]$ vim shuguo.txt

shuguo

[muyi@hadoop102 HDFSDemo]$ hadoop fs -moveFromLocal ./shuguo.txt /sanguo
2024-11-15 07:28:15,351 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
[muyi@hadoop102 HDFSDemo]$ ll
total 0
[muyi@hadoop102 HDFSDemo]$ 

-copyFromLocal: copy a file from the local filesystem to an HDFS path

[muyi@hadoop102 HDFSDemo]$ ll
total 0
[muyi@hadoop102 HDFSDemo]$ vim weiguo.txt

weiguo

[muyi@hadoop102 HDFSDemo]$ hadoop fs -copyFromLocal weiguo.txt /sanguo
2024-11-15 07:30:55,456 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
[muyi@hadoop102 HDFSDemo]$ ll
total 4
-rw-rw-r--. 1 muyi muyi 7 Nov 15 07:30 weiguo.txt
[muyi@hadoop102 HDFSDemo]$ 

-put: equivalent to copyFromLocal; put is the more common choice in production

[muyi@hadoop102 HDFSDemo]$ ll
total 4
-rw-rw-r--. 1 muyi muyi 7 Nov 15 07:30 weiguo.txt
[muyi@hadoop102 HDFSDemo]$ cp weiguo.txt weiguo_2.txt
[muyi@hadoop102 HDFSDemo]$ ll
total 8
-rw-rw-r--. 1 muyi muyi 7 Nov 15 07:35 weiguo_2.txt
-rw-rw-r--. 1 muyi muyi 7 Nov 15 07:30 weiguo.txt
[muyi@hadoop102 HDFSDemo]$ hadoop fs -put weiguo_2.txt /sanguo
2024-11-15 07:36:23,968 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

-appendToFile: append a local file to the end of an existing HDFS file

[muyi@hadoop102 HDFSDemo]$ vim liubie.txt

liubei

[muyi@hadoop102 HDFSDemo]$ hadoop fs -appendToFile ./liubie.txt /sanguo/shuguo.txt
2024-11-15 07:41:32,623 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

Download

-copyToLocal: copy a file from HDFS to the local filesystem

[muyi@hadoop102 HDFSDemo]$ ll
total 0
[muyi@hadoop102 HDFSDemo]$ 
[muyi@hadoop102 HDFSDemo]$ hadoop fs -copyToLocal /sanguo/shuguo.txt ./
2024-11-15 07:44:36,215 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
[muyi@hadoop102 HDFSDemo]$ ll
total 4
-rw-r--r--. 1 muyi muyi 14 Nov 15 07:44 shuguo.txt

-get: equivalent to copyToLocal; get is the more common choice in production

[muyi@hadoop102 HDFSDemo]$ ll
total 4
-rw-r--r--. 1 muyi muyi 14 Nov 15 07:44 shuguo.txt
[muyi@hadoop102 HDFSDemo]$ hadoop fs -get /sanguo/weiguo.txt ./
2024-11-15 07:46:56,687 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
[muyi@hadoop102 HDFSDemo]$ ll
total 8
-rw-r--r--. 1 muyi muyi 14 Nov 15 07:44 shuguo.txt
-rw-r--r--. 1 muyi muyi  7 Nov 15 07:46 weiguo.txt

Operating Directly on HDFS

-ls: list directory information

[muyi@hadoop102 HDFSDemo]$ hadoop fs -ls /sanguo
Found 3 items
-rw-r--r--   3 muyi supergroup         14 2024-11-15 07:41 /sanguo/shuguo.txt
-rw-r--r--   3 muyi supergroup          7 2024-11-15 07:30 /sanguo/weiguo.txt
-rw-r--r--   3 muyi supergroup          7 2024-11-15 07:36 /sanguo/weiguo_2.txt
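
Each line of the listing packs eight whitespace-separated columns: permissions, replication factor, owner, group, size in bytes, modification date, modification time, and path. As a rough illustration, the sketch below pulls those columns apart with awk. parse_ls_line is a hypothetical helper, and it assumes the path contains no spaces. For directories the replication column is "-", as the listings of / later in this article show.

```shell
# parse_ls_line LINE
# Hypothetical helper: labels the columns of one "hadoop fs -ls" output line.
# Assumes the path has no embedded whitespace.
parse_ls_line() {
    printf '%s\n' "$1" | awk '{
        printf "perm=%s rep=%s owner=%s group=%s size=%s path=%s\n", $1, $2, $3, $4, $5, $8
    }'
}

# Line copied from the listing above.
parse_ls_line "-rw-r--r--   3 muyi supergroup         14 2024-11-15 07:41 /sanguo/shuguo.txt"
# prints: perm=-rw-r--r-- rep=3 owner=muyi group=supergroup size=14 path=/sanguo/shuguo.txt
```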

-cat: display the contents of a file

[muyi@hadoop102 HDFSDemo]$ hadoop fs -cat /sanguo/shuguo.txt
2024-11-15 07:49:18,862 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
shuguo
liubei

-chgrp, -chmod, -chown: used the same way as in the Linux filesystem; change a file's group, permissions, and owner

[muyi@hadoop102 HDFSDemo]$ hadoop fs -ls /sanguo
Found 3 items
-rw-r--r--   3 muyi supergroup         14 2024-11-15 07:41 /sanguo/shuguo.txt
-rw-r--r--   3 muyi supergroup          7 2024-11-15 07:30 /sanguo/weiguo.txt
-rw-r--r--   3 muyi supergroup          7 2024-11-15 07:36 /sanguo/weiguo_2.txt
[muyi@hadoop102 HDFSDemo]$ hadoop fs -chmod 666 /sanguo/shuguo.txt
[muyi@hadoop102 HDFSDemo]$ hadoop fs -ls /sanguo
Found 3 items
-rw-rw-rw-   3 muyi supergroup         14 2024-11-15 07:41 /sanguo/shuguo.txt
-rw-r--r--   3 muyi supergroup          7 2024-11-15 07:30 /sanguo/weiguo.txt
-rw-r--r--   3 muyi supergroup          7 2024-11-15 07:36 /sanguo/weiguo_2.txt
[muyi@hadoop102 HDFSDemo]$ hadoop fs -chown muyi:muyi /sanguo/shuguo.txt

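In Linux, chgrp, chmod, and chown are the three commands for managing file permissions and ownership. Each is described below:

  1. chgrp (Change Group)
    • Purpose: change the group that owns a file or directory.
    • Basic syntax: chgrp [options] new_group file...
    • Example:
      chgrp newgroup filename
      
    • This command changes the group ownership of filename to newgroup. You must own the file and belong to newgroup, or be the superuser.
  2. chmod (Change Mode)
    • Purpose: change the access permissions of a file or directory.
    • Basic syntax: chmod [options] mode file...
    • The mode can be given in symbolic form or in octal form.
    • Symbolic mode examples:
      chmod u+x filename  # add execute permission for the file owner
      chmod go-rw filename  # remove read and write permission for group and others
      
    • Octal mode example:
      chmod 745 filename  # set permissions to rwxr--r-x
      
    • Each digit means:
      • 7 (rwx) read, write, and execute
      • 6 (rw-) read and write
      • 5 (r-x) read and execute
      • 4 (r--) read only
      • 3 (-wx) write and execute
      • 2 (-w-) write only
      • 1 (--x) execute only
      • 0 (---) no permissions
    • The first digit sets the owner's permissions, the second the group's, and the third those of all other users.
  3. chown (Change Owner)
    • Purpose: change the owner and/or group of a file or directory.
    • Basic syntax: chown [options] [new_owner][:[new_group]] file...
    • Examples:
      chown newowner filename  # change the file owner
      chown newowner:newgroup filename  # change both the owner and the group
      chown :newgroup filename  # change only the group
      
    • Changing a file's owner requires superuser privileges. The current owner can change only the group, and only to a group they belong to.

Use these commands with care, especially when running as the superuser, because a mistake can cause security problems or data loss. Make sure you understand what each command does before running it.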
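
To make the digit-to-permission mapping concrete, here is a small sketch that expands a three-digit octal mode into its rwx string by testing the read (4), write (2), and execute (1) bits of each digit. mode_to_rwx is a hypothetical helper written for this article, not a Hadoop or coreutils command. It rejects any digit outside 0-7, since 8 and 9 are not valid octal digits.

```shell
# mode_to_rwx MODE
# Hypothetical helper: expands a three-digit octal mode such as 666 into
# its rwx string. The body runs in a subshell, so "exit" only leaves the
# function and the variables stay local.
mode_to_rwx() (
    mode=$1
    case $mode in
        [0-7][0-7][0-7]) ;;
        *) echo "invalid octal mode: $mode" >&2; exit 1 ;;
    esac
    # Split the mode into owner, group, and other digits.
    first=${mode%??}
    rest=${mode#?}
    second=${rest%?}
    third=${mode#??}
    out=""
    for digit in "$first" "$second" "$third"; do
        # Bit 4 = read, bit 2 = write, bit 1 = execute.
        if [ $(( digit & 4 )) -ne 0 ]; then out="${out}r"; else out="${out}-"; fi
        if [ $(( digit & 2 )) -ne 0 ]; then out="${out}w"; else out="${out}-"; fi
        if [ $(( digit & 1 )) -ne 0 ]; then out="${out}x"; else out="${out}-"; fi
    done
    echo "$out"
)

mode_to_rwx 666   # prints rw-rw-rw- (the mode applied to shuguo.txt earlier)
mode_to_rwx 644   # prints rw-r--r--
```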

-mkdir: create a directory

[muyi@hadoop102 HDFSDemo]$ hadoop fs -ls /
Found 5 items
drwxr-xr-x   - muyi supergroup          0 2024-11-13 09:04 /output
drwxr-xr-x   - muyi supergroup          0 2024-11-13 09:46 /output2
drwxr-xr-x   - muyi supergroup          0 2024-11-15 07:36 /sanguo
drwxrwx---   - muyi supergroup          0 2024-11-13 09:46 /tmp
drwxr-xr-x   - muyi supergroup          0 2024-11-13 08:59 /wcinput
[muyi@hadoop102 HDFSDemo]$ hadoop fs -mkdir /jinguo
[muyi@hadoop102 HDFSDemo]$ hadoop fs -ls /
Found 6 items
drwxr-xr-x   - muyi supergroup          0 2024-11-15 07:55 /jinguo
drwxr-xr-x   - muyi supergroup          0 2024-11-13 09:04 /output
drwxr-xr-x   - muyi supergroup          0 2024-11-13 09:46 /output2
drwxr-xr-x   - muyi supergroup          0 2024-11-15 07:36 /sanguo
drwxrwx---   - muyi supergroup          0 2024-11-13 09:46 /tmp
drwxr-xr-x   - muyi supergroup          0 2024-11-13 08:59 /wcinput

-cp: copy from one HDFS path to another HDFS path

[muyi@hadoop102 HDFSDemo]$ hadoop fs -ls /jinguo
[muyi@hadoop102 HDFSDemo]$ hadoop fs -cp /sanguo/shuguo.txt /jinguo
2024-11-15 07:57:02,267 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2024-11-15 07:57:02,381 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
[muyi@hadoop102 HDFSDemo]$ hadoop fs -ls /jinguo
Found 1 items
-rw-r--r--   3 muyi supergroup         14 2024-11-15 07:57 /jinguo/shuguo.txt

-mv: move a file between HDFS directories

[muyi@hadoop102 HDFSDemo]$ hadoop fs -ls /jinguo
Found 1 items
-rw-r--r--   3 muyi supergroup         14 2024-11-15 07:57 /jinguo/shuguo.txt
[muyi@hadoop102 HDFSDemo]$ hadoop fs -mv /sanguo/weiguo.txt /jinguo
[muyi@hadoop102 HDFSDemo]$ hadoop fs -ls /jinguo
Found 2 items
-rw-r--r--   3 muyi supergroup         14 2024-11-15 07:57 /jinguo/shuguo.txt
-rw-r--r--   3 muyi supergroup          7 2024-11-15 07:30 /jinguo/weiguo.txt

-tail: display the last 1 KB of a file

[muyi@hadoop102 HDFSDemo]$ hadoop fs -tail /jinguo/shuguo.txt
2024-11-15 08:09:19,961 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
shuguo
liubei

-rm: delete a file (add -r to delete a directory)

[muyi@hadoop102 HDFSDemo]$ hadoop fs -ls /sanguo
Found 2 items
-rw-rw-rw-   3 muyi supergroup         14 2024-11-15 07:41 /sanguo/shuguo.txt
-rw-r--r--   3 muyi supergroup          7 2024-11-15 07:36 /sanguo/weiguo_2.txt
[muyi@hadoop102 HDFSDemo]$ hadoop fs -rm /sanguo/weiguo_2.txt
Deleted /sanguo/weiguo_2.txt
[muyi@hadoop102 HDFSDemo]$ hadoop fs -ls /sanguo
Found 1 items
-rw-rw-rw-   3 muyi supergroup         14 2024-11-15 07:41 /sanguo/shuguo.txt

-rm -r: recursively delete a directory and everything in it

[muyi@hadoop102 HDFSDemo]$ hadoop fs -ls /sanguo
Found 1 items
-rw-rw-rw-   3 muyi supergroup         14 2024-11-15 07:41 /sanguo/shuguo.txt
[muyi@hadoop102 HDFSDemo]$ hadoop fs -rm -r /sanguo
Deleted /sanguo
[muyi@hadoop102 HDFSDemo]$ hadoop fs -ls /
Found 5 items
drwxr-xr-x   - muyi supergroup          0 2024-11-15 08:07 /jinguo
drwxr-xr-x   - muyi supergroup          0 2024-11-13 09:04 /output
drwxr-xr-x   - muyi supergroup          0 2024-11-13 09:46 /output2
drwxrwx---   - muyi supergroup          0 2024-11-13 09:46 /tmp
drwxr-xr-x   - muyi supergroup          0 2024-11-13 08:59 /wcinput
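
Because a recursive delete removes a whole subtree at once, it can be worth putting a small guard in front of it. The sketch below is one possible approach rather than an established practice. build_rm_command is a hypothetical helper that only prints the command it would run. It refuses an empty path, the root path, and relative paths, and it adds the -safely flag described in the -help rm output earlier.

```shell
# build_rm_command HDFS_PATH
# Hypothetical helper: validates the path, then prints (does not run) a
# recursive delete command that includes -safely.
build_rm_command() {
    case $1 in
        ""|/)
            echo "refusing to delete '$1'" >&2
            return 1
            ;;
        /*)
            ;;
        *)
            echo "expected an absolute HDFS path, got '$1'" >&2
            return 1
            ;;
    esac
    echo "hadoop fs -rm -r -safely $1"
}

build_rm_command /sanguo   # prints: hadoop fs -rm -r -safely /sanguo
```

Printing the command instead of executing it leaves the final decision to the operator, who can review the line before pasting it into a shell.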

-du: show size statistics for a directory

[muyi@hadoop102 HDFSDemo]$ hadoop fs -du -s -h /jinguo
21  63  /jinguo
[muyi@hadoop102 HDFSDemo]$ hadoop fs -du -h /jinguo
14  42  /jinguo/shuguo.txt
7   21  /jinguo/weiguo.txt

Note: 21 is the total size of the files in bytes; 63 is the space consumed by all replicas (21 × 3); /jinguo is the directory being inspected.
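
The relationship between the two columns can be checked with a little arithmetic. The sketch below uses two hypothetical helpers: space_consumed multiplies a size by a replication factor, and replication_from_du works backwards from one line of -du output. The second helper assumes both columns are plain byte counts, as in the output above; it does not handle values that -h prints with a unit suffix.

```shell
# space_consumed SIZE_BYTES REPLICATION
# Hypothetical helper: bytes used once every replica is counted.
space_consumed() {
    echo $(( $1 * $2 ))
}

# replication_from_du LINE
# Hypothetical helper: divides the second column of a -du line by the first.
# Prints nothing for an empty file, where the ratio is undefined.
replication_from_du() {
    printf '%s\n' "$1" | awk '$1 > 0 { print $2 / $1 }'
}

space_consumed 21 3                               # prints 63
replication_from_du "14  42  /jinguo/shuguo.txt"  # prints 3
```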

-setrep: set the replication factor of a file in HDFS

The command used below is hadoop fs -setrep 10 /jinguo/shuguo.txt, where:
  • hadoop fs: the command-line interface to the Hadoop filesystem.
  • -setrep: the subcommand that sets the replication factor.
  • 10: the target replication factor.
  • /jinguo/shuguo.txt: the path of the file whose replication factor is changed.

[muyi@hadoop102 HDFSDemo]$ hadoop fs -du -h /jinguo
14  42  /jinguo/shuguo.txt
7   21  /jinguo/weiguo.txt
[muyi@hadoop102 HDFSDemo]$ hadoop fs -setrep 10 /jinguo/shuguo.txt
Replication 10 set: /jinguo/shuguo.txt
[muyi@hadoop102 HDFSDemo]$ hadoop fs -du -h /jinguo
14  140  /jinguo/shuguo.txt
7   21   /jinguo/weiguo.txt

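The replication factor set here is only recorded in the NameNode's metadata. Whether that many replicas actually exist depends on the number of DataNodes. This cluster has only 3 nodes, so at most 3 replicas exist; the count reaches 10 only once the cluster grows to 10 nodes.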
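
Since HDFS keeps at most one replica of a given block on each DataNode, the number of replicas that can physically exist is the smaller of the requested factor and the DataNode count. A minimal sketch of that rule follows. effective_replicas is a hypothetical helper, and it ignores real-world factors such as nodes that are down or out of disk space.

```shell
# effective_replicas REQUESTED DATANODES
# Hypothetical helper: the smaller of the requested replication factor
# and the number of DataNodes in the cluster.
effective_replicas() {
    if [ "$1" -lt "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

effective_replicas 10 3    # prints 3  (the 3-node cluster in this article)
effective_replicas 10 10   # prints 10 (after growing to 10 nodes)
```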