Source download locations
Apache website: archive.apache.org/dist/spark/…
GitHub: github.com/apache/spar…
Warning: run the build and installation in Git Bash.
1. Change the Scala version
./dev/change-scala-version.sh 2.12
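2. In ./dev/make-distribution.sh, comment out the following block, which resolves these variables through Maven, and set the variables explicitly instead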
# VERSION=$("$MVN" help:evaluate -Dexpression=project.version $@ 2>/dev/null\
# | grep -v "INFO"\
# | grep -v "WARNING"\
# | tail -n 1)
# SCALA_VERSION=$("$MVN" help:evaluate -Dexpression=scala.binary.version $@ 2>/dev/null\
# | grep -v "INFO"\
# | grep -v "WARNING"\
# | tail -n 1)
# SPARK_HADOOP_VERSION=$("$MVN" help:evaluate -Dexpression=hadoop.version $@ 2>/dev/null\
# | grep -v "INFO"\
# | grep -v "WARNING"\
# | tail -n 1)
# SPARK_HIVE=$("$MVN" help:evaluate -Dexpression=project.activeProfiles -pl sql/hive $@ 2>/dev/null\
# | grep -v "INFO"\
# | grep -v "WARNING"\
# | fgrep --count "<id>hive</id>";\
# # Reset exit status to 0, otherwise the script stops here if the last grep finds nothing\
# # because we use "set -o pipefail"
# echo -n)
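The explicit assignments that replace the commented-out block could look like the sketch below. SCALA_VERSION and SPARK_HADOOP_VERSION are taken from the commands used in steps 1 and 5, and SPARK_HIVE=1 reflects the -Phive profile. The VERSION value is only a placeholder assumption, since the Spark version being built is not stated here; use the version from your own source tree.

```shell
# Explicit values in place of the Maven help:evaluate lookups.
# ASSUMPTION: 2.4.5 is a placeholder; set it to the Spark version you downloaded.
VERSION=2.4.5
# Scala binary version, matching ./dev/change-scala-version.sh 2.12
SCALA_VERSION=2.12
# Hadoop version, matching -Dhadoop.version in the build command
SPARK_HADOOP_VERSION=2.6.0-cdh5.16.2
# 1 means the hive profile is active (the build uses -Phive)
SPARK_HIVE=1

echo "Spark ${VERSION}, Scala ${SCALA_VERSION}, Hadoop ${SPARK_HADOOP_VERSION}, Hive ${SPARK_HIVE}"
```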
3. In the pom.xml at the root of the Spark source tree, add the Cloudera repository inside the <repositories>...</repositories> element
<repository>
<id>cloudera</id>
<url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
4. Set the Maven environment variable
Linux: vim /etc/profile
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=1g"
Windows:
Add a system environment variable the same way you would for the JDK
Variable name: MAVEN_OPTS
Variable value: -Xmx2g -XX:ReservedCodeCacheSize=1g
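Because the build runs in Git Bash, the same setting can also be exported for the current session only, without touching /etc/profile or the Windows system settings. This is a minimal sketch of that alternative; it does not persist once the shell is closed.

```shell
# Set the Maven JVM options for the current shell session only.
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=1g"

# Print the value to confirm it is set before starting the build.
echo "MAVEN_OPTS=${MAVEN_OPTS}"
```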
5. Build a runnable distribution
./dev/make-distribution.sh --name 2.6.0-cdh5.16.2 --tgz -Phadoop-2.6 -Phive -Phive-thriftserver -Pyarn -DskipTests -Dscala.version=2.12.10 -Dhadoop.version=2.6.0-cdh5.16.2
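With --tgz, make-distribution.sh writes a tarball named spark-<VERSION>-bin-<NAME>.tgz into the source root. The sketch below checks for that file after the build finishes. NAME comes from the --name argument above; VERSION is a placeholder assumption and must match the value you set in step 2.

```shell
# ASSUMPTION: placeholder Spark version; use the same value as VERSION in step 2.
VERSION="2.4.5"
# Taken from the --name argument of the build command.
NAME="2.6.0-cdh5.16.2"
TARBALL="spark-${VERSION}-bin-${NAME}.tgz"

if [ -f "$TARBALL" ]; then
  # List the first few entries to confirm the archive is readable.
  tar -tzf "$TARBALL" | head -n 5
else
  echo "not found: $TARBALL"
fi
```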