Hadoop 3.x Pseudo-Distributed Deployment and Operations Guide

1. Pseudo-Distributed Environment Preparation
1.1 Architecture Diagram
```mermaid
graph TD
    NN[NameNode] --> DN[DataNode]
    RM[ResourceManager] --> NM[NodeManager]
    DN -->|Heartbeat| NN
    NM -->|Resource report| RM
    style NN fill:#4CAF50
    style RM fill:#2196F3
```
1.2 Prerequisite Checks
```python
# Environment verification script
import subprocess

def check_environment():
    checks = {
        "Java Version": "java -version",
        "SSH Localhost": "ssh localhost hostname",
        "Disk Space": "df -h /",
    }
    for desc, cmd in checks.items():
        try:
            subprocess.check_output(cmd, shell=True, stderr=subprocess.STDOUT)
            print(f"✅ {desc} check passed")
        except subprocess.CalledProcessError as e:
            print(f"❌ {desc} failed: {e.output.decode()}")

check_environment()
```
2. Hadoop 3.x Installation and Configuration
2.1 Installation Workflow
```mermaid
flowchart TD
    A[Download Hadoop] --> B[Extract and install]
    B --> C[Configure environment variables]
    C --> D[Edit configuration files]
    D --> E[Format HDFS]
    E --> F[Start services]
```
2.2 Detailed Steps
- Download and extract Hadoop
```shell
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz
tar -xzf hadoop-3.3.6.tar.gz -C /opt
sudo ln -s /opt/hadoop-3.3.6 /opt/hadoop
```
- Configure environment variables
```python
# Generate the environment configuration with Python
# (writing under /etc/profile.d requires root)
with open("/etc/profile.d/hadoop.sh", "w") as f:
    f.write("""\
export HADOOP_HOME=/opt/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
""")
# Note: running `source` via subprocess cannot change the current shell's
# environment; run `source /etc/profile.d/hadoop.sh` in your shell instead.
```
- Edit the core configuration files
core-site.xml:
```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/hadoop/data</value>
  </property>
</configuration>
```
hdfs-site.xml:
```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://${hadoop.tmp.dir}/namenode</value>
  </property>
</configuration>
```
mapred-site.xml:
```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```
yarn-site.xml:
```xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```
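The four files above share one `<configuration>`/`<property>` schema, so they can be sanity-checked mechanically. A minimal sketch using only the standard library (the helper name and inline sample are illustrative, not part of Hadoop):

```python
import xml.etree.ElementTree as ET

def parse_hadoop_config(xml_text):
    """Parse a Hadoop <configuration> document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name] = prop.findtext("value")
    return props

core_site = """
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://localhost:9000</value></property>
  <property><name>hadoop.tmp.dir</name><value>/var/hadoop/data</value></property>
</configuration>
"""
print(parse_hadoop_config(core_site)["fs.defaultFS"])  # hdfs://localhost:9000
```

Reading the files back this way after editing catches malformed XML before it surfaces as a confusing startup failure.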
3. HDFS File System Operations
3.1 Starting and Stopping Services
```shell
# Format the NameNode (first install only)
hdfs namenode -format
# Start HDFS
start-dfs.sh
# Start YARN
start-yarn.sh
# Stop all services (deprecated wrapper; prefer stop-dfs.sh and stop-yarn.sh)
stop-all.sh
```
3.2 File System Operation Examples
```python
import subprocess

class HDFSClient:
    def __init__(self, user="hadoop"):
        self.user = user

    def run_cmd(self, cmd):
        full_cmd = f"hdfs dfs -D fs.defaultFS=hdfs://localhost:9000 -{cmd}"
        result = subprocess.run(full_cmd.split(), capture_output=True)
        return result.stdout.decode()

    def mkdir(self, path):
        return self.run_cmd(f"mkdir -p /user/{self.user}/{path}")

    def put_file(self, local, remote):
        return self.run_cmd(f"put {local} /user/{self.user}/{remote}")

# Usage example
hdfs = HDFSClient()
print(hdfs.mkdir("input"))
print(hdfs.put_file("localfile.txt", "input/"))
```
3.3 Web UI Verification
- NameNode UI: http://localhost:9870
- DataNode UI: http://localhost:9864
- YARN UI: http://localhost:8088
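Whether each UI is actually listening can be probed without a browser. A minimal sketch with the standard library, assuming the default ports on localhost:

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for name, port in [("NameNode UI", 9870), ("DataNode UI", 9864), ("YARN UI", 8088)]:
    state = "open" if port_open("localhost", port) else "closed"
    print(f"{name} (port {port}): {state}")
```

A closed port immediately after startup usually means the corresponding daemon failed; check its log before retrying.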

4. YARN Resource Management
4.1 Resource Scheduling Model
Available resources are tracked per dimension and summed over all registered NodeManagers (memory and vcores are independent capacities, not a product):

\text{memory}_{total} = \sum_{i=1}^{n} NodeManager_i.memory \qquad \text{vcores}_{total} = \sum_{i=1}^{n} NodeManager_i.vcores
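A quick sketch of the per-dimension totals in code, with made-up node values (the dict keys are illustrative, not a YARN API):

```python
def cluster_totals(node_managers):
    """Sum memory (MB) and vcores separately across all NodeManagers."""
    total_mem = sum(nm["memory_mb"] for nm in node_managers)
    total_vcores = sum(nm["vcores"] for nm in node_managers)
    return total_mem, total_vcores

# Example: a single-node pseudo-distributed cluster
nodes = [{"memory_mb": 8192, "vcores": 4}]
print(cluster_totals(nodes))  # (8192, 4)
```

The scheduler grants containers against both totals at once, so a job can stall on whichever dimension runs out first.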
4.2 Submitting a MapReduce Job
```shell
# Run the WordCount example
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.6.jar \
    wordcount /user/hadoop/input /user/hadoop/output
```
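What the example job computes can be reproduced in miniature with pure Python, which is handy for spot-checking a small input against the HDFS output:

```python
from collections import Counter

def word_count(lines):
    """Count whitespace-delimited words, like the MapReduce WordCount example."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return dict(counts)

sample = ["hello hadoop", "hello yarn hadoop"]
print(word_count(sample))  # {'hello': 2, 'hadoop': 2, 'yarn': 1}
```

The MapReduce job does the same thing, except the map (split) and reduce (sum) phases run in parallel across containers.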
4.3 Resource Monitoring Commands
```python
# YARN application monitoring script
import subprocess

def yarn_app_monitor():
    cmd = "yarn application -list -appStates ALL"
    output = subprocess.check_output(cmd.split()).decode()
    # Skip the two header lines printed by `yarn application -list`
    apps = [line.split("\t") for line in output.split("\n")[2:] if line.strip()]
    return {
        # Match the state within fields rather than against whole rows,
        # since the tab-separated columns may carry padding
        "running": len([a for a in apps if any("RUNNING" in f for f in a)]),
        "completed": len([a for a in apps if any("SUCCEEDED" in f for f in a)]),
    }

print(yarn_app_monitor())
```
5. Troubleshooting Common Issues
5.1 Fault Diagnosis Table
| Symptom | Likely Cause | Fix |
|---|---|---|
| NameNode fails to start | Port conflict / directory permission problem | Check port 9000; fix permissions on hadoop.tmp.dir |
| DataNode not registering | Mismatched cluster IDs | Clear all data directories before reformatting |
| YARN job hangs | Insufficient memory allocation | Adjust yarn.nodemanager.resource.memory-mb |
5.2 Log Viewing Guide
```mermaid
flowchart LR
    A[Log type] --> B[NameNode]
    A --> C[DataNode]
    A --> D[ResourceManager]
    B --> E[$HADOOP_HOME/logs/hadoop-*-namenode-*.log]
    C --> F[$HADOOP_HOME/logs/hadoop-*-datanode-*.log]
    D --> G[$HADOOP_HOME/logs/yarn-*-resourcemanager-*.log]
```
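The log files above can be scanned for ERROR lines mechanically. A minimal sketch, assuming the default `*.log` naming and that log levels appear as a space-delimited ` ERROR ` token:

```python
import glob
import os

def find_error_lines(log_dir, pattern="*.log"):
    """Return {filename: [ERROR lines]} for log files under log_dir."""
    errors = {}
    for path in glob.glob(os.path.join(log_dir, pattern)):
        with open(path, errors="replace") as f:
            hits = [line.rstrip() for line in f if " ERROR " in line]
        if hits:
            errors[os.path.basename(path)] = hits
    return errors

log_dir = os.path.join(os.environ.get("HADOOP_HOME", "/opt/hadoop"), "logs")
for name, lines in find_error_lines(log_dir).items():
    print(f"{name}: {len(lines)} ERROR line(s)")
```

Running this after each start/stop cycle gives a quick pass/fail signal for the checklist in the next section.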
6. Environment Verification Checklist
- Basic HDFS operations work (create directory / upload file)
- Web UIs are reachable and show healthy node status
- YARN can run a MapReduce job to completion
- Log files contain no ERROR-level messages
Next chapter preview: With the single-node deployment mastered, we move on to Spark Standalone mode, learning to run PySpark applications and tune their performance.
Appendix: Hadoop 3.x Port Reference
| Service | Port | Protocol | Purpose |
|---|---|---|---|
| NameNode | 9000 | TCP | HDFS filesystem access (RPC) |
| NameNode Web | 9870 | HTTP | Metadata management UI |
| DataNode | 9864 | HTTP | Data block storage status |
| ResourceManager | 8088 | HTTP | Cluster resource management UI |
| NodeManager | 8042 | HTTP | Node resource status |