Created by Jerry Wang, last modified on Aug 17, 2015
The general steps can be found at this link: stackoverflow.com/questions/2…
1. mkdir example-java-build/; cd example-java-build
2. mvn archetype:generate \
-DarchetypeGroupId=org.apache.maven.archetypes \
-DgroupId=spark.examples \
-DartifactId=JavaWordCount \
-Dfilter=org.apache.maven.archetypes:maven-archetype-quickstart
The artifactId is used as the name of the generated project folder.
Below is my pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>spark.examples</groupId> <!-- same as the groupId given on the command line -->
<artifactId>JavaWordCount</artifactId> <!-- same as the artifactId given on the command line -->
<packaging>jar</packaging>
<version>1</version>
<name>JavaWordCount</name>
<url>http://maven.apache.org</url>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-examples_2.10</artifactId>
<version>1.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.4.1</version>
</dependency>
</dependencies>
</project>
3. cd example-java-build/JavaWordCount
mvn package
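This creates the jar file JavaWordCount-1.jar inside the target directory. With the pom.xml above it is a plain jar, not a fat jar: no shade or assembly plugin is configured, so the Spark dependencies are not bundled into it.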

The classes folder contains the individual .class files.

Copy the jar file to any location on the server, then go to the bin folder of your Spark installation.
Submit the Spark job: ./spark-submit --class "org.apache.spark.examples.JavaWordCount" --master local /root/devExpert/spark-1.4.1/example-java-build/JavaWordCount/target/JavaWordCount-1.jar
Use jd.exe to open the compiled Java class and make sure the value passed to --class equals the fully qualified name of the class;
in my example it is org.apache.spark.examples.JavaWordCount. Otherwise you will get a java.lang.ClassNotFoundException.
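The same check can also be done without a decompiler. The following is a minimal sketch, assuming only the JDK: it converts a fully qualified class name to the entry name a jar would use and looks that entry up. The class name JarClassCheck and its helper methods are illustrative names, not part of Spark or Maven.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Illustrative helper (hypothetical name): checks whether a fully qualified
// class name has a matching .class entry inside a jar file.
class JarClassCheck {

    // "org.apache.spark.examples.JavaWordCount"
    //   -> "org/apache/spark/examples/JavaWordCount.class"
    static String toEntryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns true if the jar contains an entry for the given class.
    static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(toEntryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JarClassCheck <jar> <fully.qualified.ClassName>");
            System.exit(2);
        }
        boolean found = containsClass(args[0], args[1]);
        System.out.println(args[1] + (found ? " found in " : " NOT found in ") + args[0]);
    }
}
```

After compiling with javac, it could be run as: java JarClassCheck target/JavaWordCount-1.jar org.apache.spark.examples.JavaWordCount. If the class is reported as not found, the value given to --class does not match the package and class name inside the jar.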

4. ./spark-submit --class "org.apache.spark.examples.JavaWordCount" --master local /root/devExpert/spark-1.4.1/example-java-build/JavaWordCount/target/JavaWordCount-1.jar /root/devExpert/spark-1.4.1/bin/test.txt
To debug, trace the script with sh -x: sh -x ./spark-submit --class "org.apache.spark.examples.JavaWordCount" --master local /root/devExpert/spark-1.4.1/example-java-build/JavaWordCount/target/JavaWordCount-1.jar /root/devExpert/spark-1.4.1/bin/test.txt
This is equivalent to: /usr/jdk1.7.0_79/bin/java -cp /root/devExpert/spark-1.4.1/conf/:/root/devExpert/spark-1.4.1/assembly/target/scala-2.10/spark-assembly-1.4.1-hadoop2.4.0.jar:/root/devExpert/spark-1.4.1/lib_managed/jars/datanucleus-rdbms-3.2.9.jar:/root/devExpert/spark-1.4.1/lib_managed/jars/datanucleus-core-3.2.10.jar:/root/devExpert/spark-1.4.1/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar -Xms512m -Xmx512m -XX:MaxPermSize=256m org.apache.spark.deploy.SparkSubmit --master local --class org.apache.spark.examples.JavaWordCount /root/devExpert/spark-1.4.1/example-java-build/JavaWordCount/target/JavaWordCount-1.jar /root/devExpert/spark-1.4.1/bin/test.txt
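-cp is the same as -classpath: it specifies where to find the other classes that the class depends on at run time, usually libraries and jar files. Each jar needs its full path. Entries are separated by a semicolon ";" on Windows and by a colon ":" on Linux.
A pattern such as *.jar is not supported, so the jars have to be listed one by one (since Java 6, an entry ending in /* is expanded to all jars in that directory). A single dot "." stands for the current directory.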
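The separator rule can be checked from Java itself. The sketch below is an illustration under stated assumptions: the class name ClasspathDemo and the jar paths are hypothetical, while File.pathSeparator and the java.class.path system property are standard JDK facilities.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative helper (hypothetical name): builds a -cp value and prints
// the classpath entries of the running JVM.
class ClasspathDemo {

    // Joins classpath entries with the given separator
    // (":" on Linux, ";" on Windows).
    static String join(List<String> entries, String separator) {
        return String.join(separator, entries);
    }

    public static void main(String[] args) {
        // Hypothetical entries: the current directory plus two jars.
        List<String> entries = Arrays.asList(".", "/opt/libs/a.jar", "/opt/libs/b.jar");

        // File.pathSeparator holds the separator of the current platform.
        System.out.println("platform separator: " + File.pathSeparator);
        System.out.println("-cp " + join(entries, File.pathSeparator));

        // Print the classpath this JVM was started with, one entry per line.
        String current = System.getProperty("java.class.path");
        for (String entry : current.split(Pattern.quote(File.pathSeparator))) {
            System.out.println(entry);
        }
    }
}
```

Using File.pathSeparator instead of a hard-coded character keeps the same code usable on both Windows and Linux.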
output:
