Core Algorithms
Calculating the Number of Reducers
- Parameters: mapreduce.job.reduces (default: -1), hive.exec.reducers.max (default: 1009), hive.exec.reducers.bytes.per.reducer (default: 10G)
- Algorithm
1. If ReduceWork.getNumReduceTasks() >= 0, that value is used. This is the case where the execution plan forces the reducer count at compile time: the operations below are global, so Hadoop has no choice but to finish them with a single reducer.
a) count without a group by
-- incorrect example:
select count(1) from popt_tbaccountcopy_mes where pt = '2012-07-04';
-- correct example:
select pt,count(1) from popt_tbaccountcopy_mes where pt = '2012-07-04' group by pt;
b) order by (see the illustrative queries after this list)
c) Cartesian product / cross join (see the illustrative queries after this list)
2. If the mapreduce.job.reduces parameter is set, that value is used.
3. If neither applies, the number of reducers is estimated from the size of the input data.
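For b) and c), hedged illustrative queries against the same example table (the self cross join is only meant to show the shape of the plan):
-- order by is a global sort, so the plan is compiled down to a single reducer:
select pt from popt_tbaccountcopy_mes where pt = '2012-07-04' order by pt;
-- a Cartesian product likewise forces a single reducer:
select a.pt from popt_tbaccountcopy_mes a cross join popt_tbaccountcopy_mes b;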
- Code: MapRedTask.setNumberOfReducers()
if (numReducersFromWork >= 0) { // 1. the compiled plan forces the reducer count
console.printInfo("Number of reduce tasks determined at compile time: "
+ rWork.getNumReduceTasks());
} else if (job.getNumReduceTasks() > 0) { // 2. mapreduce.job.reduces was set explicitly
int reducers = job.getNumReduceTasks();
rWork.setNumReduceTasks(reducers);
console
.printInfo("Number of reduce tasks not specified. Defaulting to jobconf value of: "
+ reducers);
} else {
if (inputSummary == null) {
inputSummary = Utilities.getInputSummary(driverContext.getCtx(), work.getMapWork(), null);
}
int reducers = Utilities.estimateNumberOfReducers(conf, inputSummary, work.getMapWork(),
work.isFinalMapRed()); // 3. estimate the reducer count from the input data size
rWork.setNumReduceTasks(reducers);
console
.printInfo("Number of reduce tasks not specified. Estimated from input data size: "
+ reducers);
}
Utilities.estimateNumberOfReducers()
double bytes = Math.max(totalInputFileSize, bytesPerReducer); // bytesPerReducer = hive.exec.reducers.bytes.per.reducer
int reducers = (int) Math.ceil(bytes / bytesPerReducer);
reducers = Math.max(1, reducers);
reducers = Math.min(maxReducers, reducers);
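A quick worked example with the values above: assuming 100 GB of total input and the stated 10G per-reducer target, ceil(100 GB / 10 GB) = 10 reducers, then clamped into [1, hive.exec.reducers.max]. The estimate can be steered per session, e.g. (illustrative values):
-- shrink the per-reducer target so the same input yields more reducers
set hive.exec.reducers.bytes.per.reducer=1073741824;  -- 1 GB
-- or skip the estimate entirely with an explicit count
set mapreduce.job.reduces=50;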
Calculating the Number of Map Tasks
CombineHiveInputFormat
- Parameters: hive.hadoop.supports.splittable.combineinputformat (default: false), hive.input.format (default: org.apache.hadoop.hive.ql.io.CombineHiveInputFormat), mapreduce.input.fileinputformat.split.maxsize (default: 256MB), mapreduce.input.fileinputformat.split.minsize.per.node (default: 1 byte), mapreduce.input.fileinputformat.split.minsize.per.rack (default: 1 byte)
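A session-level sketch of the knobs above (illustrative values); a larger split.maxsize lets CombineHiveInputFormat pack more small files into a single split, which means fewer map tasks:
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
-- combine small files up to ~512 MB per split (default 256 MB)
set mapreduce.input.fileinputformat.split.maxsize=536870912;
set mapreduce.input.fileinputformat.split.minsize.per.node=134217728;  -- 128 MB
set mapreduce.input.fileinputformat.split.minsize.per.rack=134217728;  -- 128 MB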
- Source: org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits()
// Process the normal splits
if (nonCombinablePaths.size() > 0) { // 1. paths whose inputFileFormatClass cannot be combined fall back to HiveInputFormat (super) for split generation
FileInputFormat.setInputPaths(job, nonCombinablePaths.toArray
(new Path[nonCombinablePaths.size()]));
InputSplit[] splits = super.getSplits(job, numSplits);
for (InputSplit split : splits) {
result.add(split);
}
}
// Process the combine splits
if (combinablePaths.size() > 0) { // 2.1 combinable paths go through getCombineSplits
FileInputFormat.setInputPaths(job, combinablePaths.toArray
(new Path[combinablePaths.size()]));
Map<String, PartitionDesc> pathToPartitionInfo = this.pathToPartitionInfo != null ?
this.pathToPartitionInfo : Utilities.getMapWork(job).getPathToPartitionInfo();
InputSplit[] splits = getCombineSplits(job, numSplits, pathToPartitionInfo); // 2.2 which ultimately delegates to CombineFileInputFormat.getSplits(job, 1)
for (InputSplit split : splits) {
result.add(split);
}
}
HiveInputFormat
OrcInputFormat
hive.exec.orc.split.strategy (default: HYBRID) controls how splits are generated when reading ORC tables.
- BI: splits are generated at file granularity (files are not split further);
- ETL: files are split, and multiple stripes are combined into one split;
- HYBRID: when the average file size is larger than Hadoop's maximum split size (default 256 * 1024 * 1024), the ETL strategy is used; otherwise the BI strategy is used.
For some large ORC tables the footers can be sizable, so the ETL strategy may pull a lot of data from HDFS just to compute splits and can even OOM the driver; for such tables the BI strategy is recommended. For smaller tables, especially skewed ones (skew here meaning many stripes concentrated in a few files), the ETL strategy is recommended. In addition, spark.hadoop.mapreduce.input.fileinputformat.split.minsize controls how stripes are merged during ORC split generation: when several stripes together are smaller than this value, they are merged into a single task. Lowering it appropriately increases the parallelism when reading ORC tables.
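Following the recommendations above, the strategy can be overridden per session; a hedged example (under Spark the same keys carry the spark.hadoop. prefix mentioned in the text, values are illustrative):
-- large ORC tables with heavy footers: split at file granularity
set hive.exec.orc.split.strategy=BI;
-- small or stripe-skewed ORC tables: split by stripes instead
-- set hive.exec.orc.split.strategy=ETL;
-- controls how many small stripes get merged into one split/task
set mapreduce.input.fileinputformat.split.minsize=67108864;  -- 64 MB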
Mergefile Trigger Strategy
- Parameters: hive.merge.mapfiles (default: true), hive.merge.mapredfiles (default: false), hive.merge.size.per.task (default: 256M), hive.merge.smallfiles.avgsize (default: 16M), hive.merge.supports.splittable.combineinputformat (default: true)
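A hedged example of turning the merge step on for reduce-side output and adjusting the thresholds from the list above (values mirror the stated defaults):
set hive.merge.mapfiles=true;                 -- merge small files produced by map-only jobs
set hive.merge.mapredfiles=true;              -- also merge the output of full map-reduce jobs
set hive.merge.smallfiles.avgsize=16777216;   -- trigger the merge when the average output file is under 16 MB
set hive.merge.size.per.task=268435456;       -- aim for roughly 256 MB per merged file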
- Source: ConditionalResolverMergeFiles.getTasks()
if (dpCtx != null && dpCtx.getNumDPCols() > 0) { // dynamic partitions
int numDPCols = dpCtx.getNumDPCols();
int dpLbLevel = numDPCols + lbLevel;
generateActualTasks(conf, resTsks, trgtSize, avgConditionSize, mvTask, mrTask,
mrAndMvTask, dirPath, inpFs, ctx, work, dpLbLevel);
} else { // no dynamic partitions
if(lbLevel == 0) {
// static partition without list bucketing
long totalSz = getMergeSize(inpFs, dirPath, avgConditionSize); // static partition: the merge task is enabled when the average file size under the partition is below hive.merge.smallfiles.avgsize
if (totalSz >= 0) { // add the merge job
setupMapRedWork(conf, work, trgtSize, totalSz);
resTsks.add(mrTask);
} else { // don't need to merge, add the move job
resTsks.add(mvTask);
}
} else {
// static partition and list bucketing
generateActualTasks(conf, resTsks, trgtSize, avgConditionSize, mvTask, mrTask,
mrAndMvTask, dirPath, inpFs, ctx, work, lbLevel);
}
}
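Putting the static-partition branch together with the defaults above: if the files written to the partition average less than hive.merge.smallfiles.avgsize (16 MB), getMergeSize returns a non-negative total size and the merge job (mrTask) is added, targeting merged files of roughly hive.merge.size.per.task (256 MB); otherwise only the move task (mvTask) runs.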