Code repository: https://gitee.com/jikeh/BigData
Development environment: maven, IntelliJ IDEA
Environment setup: Hadoop single-node cluster setup
1. Start Hadoop
cd /usr/local/Env/Hadoop/hadoop-2.6.5/sbin/
./start-all.sh
2. Create the input directory on HDFS
hadoop fs -mkdir -p /jikeh/datasort/input
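Before uploading, it can help to confirm that the directory exists. The following is a minimal sketch, assuming the `hadoop` CLI is on PATH and HDFS is running; the variable names `INPUT_DIR` and `DIR_STATUS` are illustrative only and do not come from the repository.

```shell
# Assumption: the `hadoop` CLI is on PATH and the HDFS daemons are up.
INPUT_DIR=/jikeh/datasort/input

if ! command -v hadoop >/dev/null 2>&1; then
  # Nothing to check on a machine without Hadoop installed.
  DIR_STATUS="hadoop CLI not found on PATH"
elif hadoop fs -test -d "$INPUT_DIR"; then
  # `hadoop fs -test -d` exits with 0 when the path is a directory.
  DIR_STATUS="$INPUT_DIR exists"
else
  DIR_STATUS="$INPUT_DIR is missing"
fi

echo "$DIR_STATUS"
```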
3. Upload the text files to HDFS
Input files:
text1:
9
8
7
6
1
2
3
text2:
999
99999
99999
9999
999
99
9
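If the two files do not exist yet, they could be created locally along these lines. This is only a sketch: `SRC_DIR` is a hypothetical variable that defaults to a temporary directory, so set it to /usr/local/src to match the upload command used in this step. Note that the upload copies everything in that directory, so it should contain only the input files.

```shell
# Assumption: SRC_DIR is the local directory that will be uploaded to HDFS.
# It defaults to a throwaway temp directory so the sketch is safe to run.
SRC_DIR="${SRC_DIR:-$(mktemp -d)}"

# One number per line, in the same order as the listings above.
printf '%s\n' 9 8 7 6 1 2 3 > "$SRC_DIR/text1"
printf '%s\n' 999 99999 99999 9999 999 99 9 > "$SRC_DIR/text2"

ls -l "$SRC_DIR"
```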
hadoop fs -copyFromLocal /usr/local/src/* /jikeh/datasort/input
4. Run the datasort program
Syntax: hadoop jar ***.jar [main class] [input file] [output directory]
hadoop jar datasort-0.0.1-SNAPSHOT.jar com/jikeh/hadoop/datasort/DataSort
hadoop fs -ls /jikeh/datasort/output
hadoop fs -cat /jikeh/datasort/output/part-r-00000
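The exact layout of part-r-00000 depends on how DataSort writes its records, which is not shown here. As a rough cross-check of the ordering, the same numbers can be sorted locally without Hadoop. The sketch below assumes the job sorts in ascending numeric order and keeps duplicates; `WORK_DIR` and `SORTED` are illustrative names.

```shell
# Local cross-check only; this does not touch HDFS or the DataSort job.
WORK_DIR="$(mktemp -d)"

# Recreate the two input files from step 3.
printf '%s\n' 9 8 7 6 1 2 3 > "$WORK_DIR/text1"
printf '%s\n' 999 99999 99999 9999 999 99 9 > "$WORK_DIR/text2"

# Merge both files and sort numerically, keeping duplicate values.
SORTED="$(cat "$WORK_DIR/text1" "$WORK_DIR/text2" | sort -n)"

printf '%s\n' "$SORTED"
```

The values in the job output should appear in the same order as this list if the assumptions above hold; any extra column (for example a rank) would come from the program itself.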