HDP Troubleshooting Notes



Hive does not support sequences

table_src is an intermediate table produced by our business processing. We need to add an auto-increment column auto_increment_id to it and write the final result into table_dest.

insert into table table_dest
select (row_number() over (order by 1) + dest.max_id) as auto_increment_id, src.*
from table_src src
cross join (select max(auto_increment_id) as max_id from table_dest) dest;
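One caveat worth noting (my addition, not in the original post): if table_dest starts out empty, max(auto_increment_id) returns NULL and every generated id becomes NULL. Guarding the max with coalesce keeps the same pattern working on the very first load:

```sql
insert into table table_dest
select (row_number() over (order by 1) + dest.max_id) as auto_increment_id,
       src.*
from table_src src
cross join (
    -- coalesce so the first load starts the sequence at 1 instead of NULL
    select coalesce(max(auto_increment_id), 0) as max_id
    from table_dest
) dest;
```

Note that `order by 1` orders by the constant 1, so the numbering is arbitrary but still unique, which is all an auto-increment id needs.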

DataOS queries running longer than 60 s returned no result. When testing a DataOS job, the front end kept showing it as "executing" even though the back end had actually finished.

Fix: raise the DataOS nginx timeout from the default 65 s to 300 s, then restart nginx.
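A sketch of the corresponding nginx directives (the exact directives and block placement are my assumption; the post only records the 65 s → 300 s change):

```nginx
http {
    # raise proxy timeouts from the ~60 s defaults to 300 s
    proxy_connect_timeout 300s;
    proxy_send_timeout    300s;
    proxy_read_timeout    300s;
}
```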

After a restart, the HBase RegionServer would not start and kept dropping offline.

Fix: remove the package from the command line with yum remove and reinstall it.

413 Request Entity Too Large

Cause:

nginx accepts request bodies up to 1 MB by default, and the uploaded file exceeded that limit.

Solution:

1. Open the nginx.conf file on the host.
2. Add client_max_body_size 10m; inside the http{} block.
3. Restart nginx: /etc/init.d/nginx restart
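The resulting http block looks roughly like this (a minimal sketch; only the added directive comes from the post, the rest is placeholder):

```nginx
http {
    client_max_body_size 10m;  # raise the 1m default request-body limit
    # ... existing directives unchanged ...
}
```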

Why hive-staging files appear, and how to relocate them

Cause

When SQL such as SELECT or INSERT OVERWRITE runs against Hive, this directory is created to hold intermediate results. INSERT OVERWRITE, for example, stages its output there and copies it into the Hive table once the job finishes.

Solution

Default configuration:

<property>
    <name>hive.exec.stagingdir</name>
    <value>.hive-staging</value>
</property>

After the change:

<property>
    <name>hive.exec.stagingdir</name>
    <value>/tmp/hive/.hive-staging</value>
</property>

For Tez, the corresponding setting is tez.staging-dir, which defaults to /tmp/tez/staging.
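The Tez property can be set the same way (a sketch mirroring the Hive snippet above; the value shown is the default the post cites):

```xml
<property>
    <name>tez.staging-dir</name>
    <value>/tmp/tez/staging</value>
</property>
```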

Problem

ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask.
Vertex re-running, vertexName=Map 5, vertexId=vertex_1637642623048_49568_1_00
Vertex re-running, vertexName=Map 5, vertexId=vertex_1637642623048_49568_1_00
Vertex failed, vertexName=Reducer 3, vertexId=vertex_1637642623048_49568_1_02, diagnostics=[Task failed, taskId=task_1637642623048_49568_1_02_000046, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1637642623048_49568_1_02_000046_0:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:304)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:318)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
    ... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:378)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:294)
    ... 18 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Could not get block locations. Source file "/tmp/staging/.hive-staging_hive_2021-11-27_19-52-13_665_7246302871668790688-4691/_task_tmp.-ext-10000/acct_month=201410/eday_id=20211123/_tmp.000046_0" - Aborting...block==null
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:1041)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
    at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:126)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
    at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.handleOutputRows(PTFOperator.java:337)
    at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.processRow(PTFOperator.java:325)
    at org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:139)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:363)
    ... 19 more

Solution

Set the DataNode's max data transfer threads to 8192.
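In hdfs-site.xml this corresponds to the following (the property name is the standard HDFS one; the post only states the value):

```xml
<property>
    <name>dfs.datanode.max.transfer.threads</name>
    <value>8192</value>
</property>
```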

Reference: daimajiaoliu.com/daima/485d6…