Starting Hadoop 3.1.3: Troubleshooting Notes


Error 1: Attempting to operate on hdfs namenode as root

[root@VM-16-2-centos hadoop-3.1.3]# sbin/start-dfs.sh
Starting namenodes on [hadoop102]
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
Starting secondary namenodes [hadoop104]
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.

Solution: the start scripts refuse to run HDFS/YARN daemons as root unless the corresponding user variables are defined. Define them in /etc/profile:

vi /etc/profile

# Add the following lines to the file
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root

source /etc/profile
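Alternatively, the same variables can live in Hadoop's own environment file instead of /etc/profile, which keeps them scoped to Hadoop. A sketch; the install path below is an assumption, adjust it to your setup:

```shell
# Append to Hadoop's environment file (path is an assumption for illustration):
#   /opt/module/hadoop-3.1.3/etc/hadoop/hadoop-env.sh
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
```

Note that running HDFS daemons as root is generally discouraged; creating a dedicated hadoop user avoids this class of error entirely.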

Error 2: ERROR: JAVA_HOME is not set and could not be found.

Solution: set the JAVA_HOME variable in your Hadoop configuration. You can do this by editing the hadoop-env.sh file in the etc/hadoop/ directory under your Hadoop installation.

  • Open etc/hadoop/hadoop-env.sh in a text editor.
  • Find the line that sets JAVA_HOME and change it to point at your Java installation directory. Since your java -version output shows Java is already installed, you only need to set the path correctly, e.g.: export JAVA_HOME=/path/to/java/jdk
  • Replace /path/to/java/jdk with your actual Java installation path. If you are unsure of the path, run which java or readlink -f $(which java), then walk back up to the root of the JDK installation.
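The path detective work in the last bullet can be scripted. A sketch, where JAVA_BIN is hard-coded with an example path for illustration; on a real node you would use JAVA_BIN=$(readlink -f "$(which java)"):

```shell
# readlink -f follows the symlink chain (/usr/bin/java is usually a symlink)
# down to the real binary; stripping the /bin/java suffix yields the JDK root.
JAVA_BIN="/usr/lib/jvm/java-1.8.0-openjdk/bin/java"   # example; substitute readlink output
JAVA_HOME_DIR="${JAVA_BIN%/bin/java}"                 # strip the trailing /bin/java
echo "export JAVA_HOME=$JAVA_HOME_DIR"
```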

Error 3: Permission denied

An error like Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). indicates an SSH authentication problem between the master and worker nodes.

ssh-keygen -t rsa

# Press Enter through every prompt to generate the key pair

# Then copy the public key to the other nodes
ssh-copy-id hadoop103
ssh-copy-id hadoop104

Note: if you are currently on the hadoop102 node, you need to copy its public key to hadoop103 and hadoop104. Likewise, repeat the same steps on hadoop103 and hadoop104, copying each node's public key to the other two nodes.
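The full key exchange is a 3×3 matrix minus the diagonal. The loop below only prints that matrix as a checklist (a sketch assuming the three hostnames above; it executes nothing):

```shell
NODES="hadoop102 hadoop103 hadoop104"
for src in $NODES; do
  for dst in $NODES; do
    [ "$src" = "$dst" ] && continue       # no need to copy a key to yourself
    echo "on $src run: ssh-copy-id $dst"  # six commands in total
  done
done
```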

Error 4: port already in use

Here comes the biggest pitfall. The following error kept appearing; you can see it in the NameNode log:

2024-03-25 22:46:01,827 INFO org.apache.hadoop.http.HttpServer2: addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.namenode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/*
2024-03-25 22:46:01,840 INFO org.apache.hadoop.http.HttpServer2: HttpServer.start() threw a non Bind IOException
java.net.BindException: Port in use: hadoop102:9880
        at org.apache.hadoop.http.HttpServer2.constructBindException(HttpServer2.java:1213)
        at org.apache.hadoop.http.HttpServer2.bindForSinglePort(HttpServer2.java:1235)
        at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:1294)
        at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:1149)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:181)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:881)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:703)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:949)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:922)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1688)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1755)
Caused by: java.net.BindException: Cannot assign requested address
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:438)
        at sun.nio.ch.Net.bind(Net.java:430)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:225)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at org.eclipse.jetty.server.ServerConnector.openAcceptChannel(ServerConnector.java:351)
        at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:319)
        at org.apache.hadoop.http.HttpServer2.bindListener(HttpServer2.java:1200)
        at org.apache.hadoop.http.HttpServer2.bindForSinglePort(HttpServer2.java:1231)
        ... 9 more
2024-03-25 22:46:01,842 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2024-03-25 22:46:01,843 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2024-03-25 22:46:01,843 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2024-03-25 22:46:01,843 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.net.BindException: Port in use: hadoop102:9880
        at org.apache.hadoop.http.HttpServer2.constructBindException(HttpServer2.java:1213)
        at org.apache.hadoop.http.HttpServer2.bindForSinglePort(HttpServer2.java:1235)
        at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:1294)
        at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:1149)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:181)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:881)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:703)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:949)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:922)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1688)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1755)
Caused by: java.net.BindException: Cannot assign requested address
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:438)
        at sun.nio.ch.Net.bind(Net.java:430)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:225)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at org.eclipse.jetty.server.ServerConnector.openAcceptChannel(ServerConnector.java:351)
        at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:319)
        at org.apache.hadoop.http.HttpServer2.bindListener(HttpServer2.java:1200)
        at org.apache.hadoop.http.HttpServer2.bindForSinglePort(HttpServer2.java:1231)
        ... 9 more

But if you actually check the port, it is not occupied at all. Straight to the solution:

The "port in use" error is actually related to formatting the NameNode on the hadoop102 node.

My course notes contain this passage:

If the cluster is being started for the first time, you need to format the NameNode on the hadoop102 node. (Note: formatting the NameNode generates a new cluster id. It will then differ from the cluster id the DataNodes still hold, so the NameNode and DataNodes disagree and the cluster cannot find its previous data. If the cluster fails while running and you need to re-format the NameNode, be sure to stop the namenode and datanode processes first, and delete the data and logs directories on every machine, before formatting.)

Key point: delete the data and logs directories on all machines, then format.

So: delete the data and logs directories, format the NameNode on hadoop102, and restart. That resolves the "port in use" error.
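As a concrete checklist, the re-format procedure looks roughly like this. The install path /opt/module/hadoop-3.1.3 is an assumption, and DRY_RUN=1 (the default here) only prints the commands, because they destroy all HDFS data:

```shell
HADOOP_HOME=${HADOOP_HOME:-/opt/module/hadoop-3.1.3}   # assumed install path
DRY_RUN=${DRY_RUN:-1}                                  # 1 = just print, 0 = execute
run() { if [ "$DRY_RUN" = 1 ]; then echo "would run: $*"; else "$@"; fi; }

run "$HADOOP_HOME/sbin/stop-dfs.sh"                    # stop namenode/datanode first
# On EVERY node - this deletes all HDFS data:
run rm -rf "$HADOOP_HOME/data" "$HADOOP_HOME/logs"
# Only on hadoop102, then restart:
run hdfs namenode -format                              # generates a fresh cluster id
run "$HADOOP_HOME/sbin/start-dfs.sh"
```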


If you have tried the above and the "port in use" error still occurs, the cause may be the following:

Check the /etc/hosts file. The current machine's own entry must not use its public IP; it has to be the internal (private) IP. For example, the configuration on hadoop102:

xx.xx.xx.xx hadoop102
116.196.xx.xx hadoop103
150.158.xx.xx hadoop104

Likewise, on hadoop103 and hadoop104, replace each machine's own entry with its internal IP.
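A quick way to spot the mismatch: the address /etc/hosts assigns to the machine's own hostname must appear among the machine's interface addresses, or the bind fails with "Cannot assign requested address" as in the log above. The sketch below inlines both inputs for illustration; on a real node substitute $(getent hosts hadoop102) and $(hostname -I):

```shell
HOSTS_ENTRY="10.0.16.2 hadoop102"     # assumed line from /etc/hosts (internal ip)
LOCAL_IPS="10.0.16.2 172.17.0.1"      # assumed output of: hostname -I
MAPPED=${HOSTS_ENTRY%% *}             # first field = the mapped address
case " $LOCAL_IPS " in
  *" $MAPPED "*) echo "ok: hostname maps to a local address" ;;
  *)             echo "mismatch: $MAPPED is not local - use the internal ip" ;;
esac
```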

Error 5: hadoop102 cannot reach hadoop103

[root@VM-16-2-centos hadoop-3.1.3]# hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /www/wordcount /wcinput3
2024-03-31 11:02:59,509 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/116.196.110.59:8032
2024-03-31 11:03:19,841 INFO ipc.Client: Retrying connect to server: hadoop103/116.196.110.59:8032. Already tried 0 time(s); maxRetries=45

This is a firewall issue: open port 8032 in the cloud server's firewall (security group).
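To tell a firewall problem apart from a Hadoop problem, check raw TCP reachability first. A sketch using bash's /dev/tcp pseudo-device (so no netcat is needed); the hostname and port come from the log above:

```shell
# Returns a reachable/unreachable verdict for host:port using bash's /dev/tcp.
check_port() {
  host=$1; port=$2
  if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
    echo "$host:$port reachable"
  else
    echo "$host:$port unreachable - check the security group / firewall"
  fi
}
check_port hadoop103 8032   # the ResourceManager port from the error above
```

If the port is unreachable from hadoop102 but a local service is listening on hadoop103, the block is almost certainly in the cloud provider's security group rather than in Hadoop.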