Getting started with Hadoop


Introduction

Hadoop consists of several components:

  • HDFS
  • YARN, among others

Installation

tar -xvf hadoop-3.4.0.tar.gz
mv hadoop-3.4.0/ hadoop

Edit the configuration

cd hadoop
vim etc/hadoop/hadoop-env.sh
## add JAVA_HOME to hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-23-openjdk-amd64/
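If you are not sure where the JDK lives, one way to discover it (a sketch, assuming a typical Linux layout where `java` on the PATH is a symlink into the JDK):

```shell
## Resolve the real JDK home from the java binary on the PATH;
## readlink -f follows the /etc/alternatives symlink chain.
JAVA_BIN=$(readlink -f "$(command -v java)" 2>/dev/null)
export JAVA_HOME="${JAVA_BIN%/bin/java}"
echo "$JAVA_HOME"
```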

Add the configuration: vim etc/hadoop/core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000/</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/dai/hdata/data</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/dai/hdata/name</value>
  </property>
</configuration>
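Note: `dfs.data.dir` and `dfs.name.dir` are legacy names that Hadoop 3.x treats as deprecated aliases; these two settings conventionally live in etc/hadoop/hdfs-site.xml under their current names:

```xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/dai/hdata/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/dai/hdata/data</value>
  </property>
</configuration>
```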

Passwordless SSH login

## Press Enter at every prompt; this generates .ssh/id_rsa.pub
ssh-keygen -t rsa

ssh-copy-id -i .ssh/id_rsa.pub  www@localhost 
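Before starting the daemons, you can verify that passwordless login actually works; BatchMode makes ssh fail instead of prompting for a password:

```shell
## Should print "ok" without asking for a password; a prompt or
## "Permission denied" means the key setup did not take effect.
ssh -o BatchMode=yes localhost 'echo ok'
```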

Start HDFS

## run from the Hadoop home directory
./bin/hadoop namenode -format
## change into sbin
cd sbin/
## start HDFS
./start-dfs.sh
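A quick way to confirm the daemons came up is `jps`, which lists the local JVM processes:

```shell
## NameNode, DataNode and SecondaryNameNode should all appear
## in the output after start-dfs.sh
jps
```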

Since Hadoop 3.0 the HDFS web UI listens on port 9870 (it was 50070 before), so the interface is available at:

http://127.0.0.1:9870/dfshealth.html#tab-overview

Start YARN

## change into sbin
cd sbin/
## start YARN
./start-yarn.sh
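Back in the Hadoop home directory, you can also confirm YARN is up from the command line (the node may take a few seconds to register after start-yarn.sh):

```shell
## jps should now additionally show ResourceManager and NodeManager
jps
## list the NodeManagers registered with the ResourceManager
./bin/yarn node -list
```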

YARN's web UI listens on port 8088, so it can be reached at:

http://127.0.0.1:8088/cluster

Common errors

Passwordless login not configured, or no SSH server installed

Starting namenodes on [dai]  
dai: ssh: connect to host dai port 22: Connection refused  
Starting datanodes  
localhost: ssh: connect to host localhost port 22: Connection refused  
Starting secondary namenodes [dai]  
dai: ssh: connect to host dai port 22: Connection refused

Solution:

sudo apt install openssh-server

NameNode not formatted

2024-07-20 09:14:23,975 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/hadoop-dai/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.

Solution:

./bin/hadoop namenode -format

Running on a newer Java version

Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make protected final java.lang.Class java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain) throws java.lang.ClassFormatError accessible: module java.base does not "opens java.lang" to unnamed module @72c94248

Edit etc/hadoop/yarn-env.sh and add the following environment variable:

export HADOOP_OPTS="--add-opens java.base/java.lang=ALL-UNNAMED"

Upload error

Permission denied: user=dr.who, access=WRITE, inode="/":hadoop:supergroup:drwxr-xr-x
	at org.apache.hadoop.hdfs.server.namenode.

Solution: add the configuration below to core-site.xml. The value is the username of a user that has write permission (here, the hadoop user):

<property> 
    <name>hadoop.http.staticuser.user</name> 
    <value>hadoop</value> 
</property>
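An alternative for a single-user sandbox (a different approach from the fix above; /upload is a hypothetical path): relax permissions on the directory you upload into, so the dr.who web user can write there:

```shell
## create a hypothetical upload directory and make it world-writable;
## fine for a local sandbox, too permissive for a shared cluster
./bin/hdfs dfs -mkdir -p /upload
./bin/hdfs dfs -chmod 777 /upload
```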
