HDFS API Programming


I. Development environment

  • 1. IDE: IntelliJ IDEA 2020.1
  • 2. Build tool: Maven
  • 3. A working Hadoop environment. Deployment mode: Hadoop pseudo-distributed setup
  • 4. Ecosystem version: [Hadoop]-2.6.0-cdh5.16.2.tar.gz

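II. Building a Hadoop project in IDEA

1. Change Maven's default local repository path

(1) Edit the settings.xml file under apache-maven-3.6.3\conf

(2) Make the same change to Maven's default local repository path on the Linux server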

[hadoop@xinxingdata ~]$ vim /home/hadoop/app/maven/conf/settings.xml
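Steps (1) and (2) both come down to setting the `<localRepository>` element in settings.xml. A minimal sketch follows; the directory shown is a hypothetical path chosen for illustration, not the one used in the original setup, so substitute your own.

```xml
<!-- Place this directly inside the root <settings> element of settings.xml. -->
<!-- If it is omitted, Maven falls back to ${user.home}/.m2/repository. -->
<!-- The path below is an assumed example; on Windows use a path such as D:\maven_repo. -->
<localRepository>/home/hadoop/maven_repo</localRepository>
```

Keeping the repository outside the home directory default is mostly a matter of disk layout; Maven re-downloads any missing artifacts into the new location on the next build.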

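2. Build the Hadoop project in IDEA with Maven

(1) Integrate Maven with IDEA

(2) Build the Hadoop project with Maven

A successful build prints BUILD SUCCESS.

If the build is too slow, switch to the Aliyun Maven mirror.
Go to your local Maven installation, open settings.xml in the conf folder, and paste the following inside the <mirrors> element: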

<mirror>
  <id>nexus-aliyun</id>
  <mirrorOf>central</mirrorOf>
  <name>nexus-aliyun</name>
  <url>http://maven.aliyun.com/nexus/content/groups/public</url>
</mirror>

3. Edit the Maven pom file

<?xml version="1.0" encoding="UTF-8"?>

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.xinxingdata.bigdata</groupId>
  <artifactId>xinxingdata</artifactId>
  <version>1.0</version>

  <name>xinxingdata</name>
  <!-- FIXME change it to the project's website -->
  <url>http://www.example.com</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
    <hadoop.version>2.6.0-cdh5.16.2</hadoop.version>
  </properties>


  <repositories>
    <!-- Aliyun repository -->
    <repository>
      <id>aliyun</id>
      <url>http://maven.aliyun.com/nexus/content/groups/public</url>
    </repository>

    <!-- CDH (Cloudera) repository -->
    <repository>
      <id>cloudera</id>
      <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
  </repositories>

  <dependencies>

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>${hadoop.version}</version>
    </dependency>

    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.12</version>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <pluginManagement><!-- lock down plugins versions to avoid using Maven defaults (may be moved to parent pom) -->
      <plugins>
        <!-- clean lifecycle, see https://maven.apache.org/ref/current/maven-core/lifecycles.html#clean_Lifecycle -->
        <plugin>
          <artifactId>maven-clean-plugin</artifactId>
          <version>3.1.0</version>
        </plugin>
        <!-- default lifecycle, jar packaging: see https://maven.apache.org/ref/current/maven-core/default-bindings.html#Plugin_bindings_for_jar_packaging -->
        <plugin>
          <artifactId>maven-resources-plugin</artifactId>
          <version>3.0.2</version>
        </plugin>
        <plugin>
          <artifactId>maven-compiler-plugin</artifactId>
          <version>3.8.0</version>
        </plugin>
        <plugin>
          <artifactId>maven-surefire-plugin</artifactId>
          <version>2.22.1</version>
        </plugin>
        <plugin>
          <artifactId>maven-jar-plugin</artifactId>
          <version>3.0.2</version>
        </plugin>
        <plugin>
          <artifactId>maven-install-plugin</artifactId>
          <version>2.5.2</version>
        </plugin>
        <plugin>
          <artifactId>maven-deploy-plugin</artifactId>
          <version>2.8.2</version>
        </plugin>
        <!-- site lifecycle, see https://maven.apache.org/ref/current/maven-core/lifecycles.html#site_Lifecycle -->
        <plugin>
          <artifactId>maven-site-plugin</artifactId>
          <version>3.7.1</version>
        </plugin>
        <plugin>
          <artifactId>maven-project-info-reports-plugin</artifactId>
          <version>3.0.0</version>
        </plugin>
      </plugins>
    </pluginManagement>
  </build>
</project>

After editing the pom, reload the Maven project so that the dependencies declared in it are downloaded and imported.

4. Add the dependencies the project needs to the Maven pom file

Maven repository search site: mvnrepository.com
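As a rough sketch of this step: each artifact page on mvnrepository.com offers a `<dependency>` snippet that is pasted inside the `<dependencies>` element of the pom. The artifact below (commons-lang3 3.9) is only a placeholder picked for illustration; nothing in this project is known to require it.

```xml
<!-- Goes inside the existing <dependencies> element of pom.xml. -->
<!-- commons-lang3 is an assumed example, not a dependency of the original project. -->
<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-lang3</artifactId>
  <version>3.9</version>
</dependency>
```

For Hadoop artifacts, reusing the ${hadoop.version} property already defined in the pom keeps every Hadoop module on the same CDH release.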